CN113806316B

CN113806316B - File synchronization method, equipment and storage medium

Info

Publication number: CN113806316B
Application number: CN202111078965.9A
Authority: CN
Inventors: 刘熙
Original assignee: Xinghuan Zhongzhi Technology Beijing Co ltd
Current assignee: Xinghuan Zhongzhi Technology Beijing Co ltd
Priority date: 2021-09-15
Filing date: 2021-09-15
Publication date: 2022-06-21
Anticipated expiration: 2041-09-15
Also published as: CN113806316A

Abstract

An embodiment of the invention discloses a file synchronization method, a file synchronization device and a storage medium. The method comprises: in response to a synchronization operation by a client on a target file segment, determining the file identifier of the Blob file to which the target file segment belongs; writing the target file segment into a temporary file corresponding to the file identifier, generating a corresponding target log entry and storing the target log entry in a Raft log, wherein the target log entry comprises index information pointing to the temporary file; and generating segment index information corresponding to the target log entry and storing the segment index information in file index data corresponding to the file identifier, wherein the file index data comprises file state information and segment index information pointing to the log entries corresponding to the file identifier. The technical solution of the embodiment can reduce the IO (input/output) overhead incurred when synchronizing Blob files with the Raft protocol and improve system performance.

Description

File synchronization method, equipment and storage medium

Technical Field

The embodiment of the invention relates to the technical field of computers, in particular to a file synchronization method, file synchronization equipment and a storage medium.

Background

In a distributed system, multiple copies are generally used to ensure the reliability of the system, and a log-based Raft protocol is used to ensure the consistency of data of the multiple copies.

At present, file synchronization using the Raft protocol proceeds as follows: the leader node writes the file segment data sent by the client into its Raft log and notifies the follower nodes to fetch the Raft log entry. Each follower node writes the entry into its local Raft log after successfully fetching it. Once the Raft log entry has been received by a majority of nodes and written into their local logs, all participating nodes read the file segment data from the local log and write it into a local file. When all file segments of the file to be synchronized have been committed in the Raft group, the file has been copied to the local storage of every Raft member. Thus, synchronizing one file segment costs each participating node two write operations (one to the log and one to the local file) and one additional read operation, so the input/output (IO) overhead is large.

Disclosure of Invention

Embodiments of the present invention provide a file synchronization method, device, and storage medium, so as to reduce IO overhead generated in a process of synchronizing a Binary Large Object (Blob) file using a Raft protocol, and improve system performance.

In a first aspect, an embodiment of the present invention provides a file synchronization method, including:

responding to the synchronous operation of a client to a target file segment, and determining a file identifier of a Blob file corresponding to the target file segment;

writing the target file segments into a temporary file corresponding to the file identification, generating corresponding target log items and storing the target log items into a Raft log; the target log item comprises index information pointing to the temporary file;

generating segment index information corresponding to the target log item, and storing the segment index information into file index data corresponding to the file identification; the file index data includes: file state information and segment index information that points to the log entry corresponding to the file identification.

Optionally, writing the target file segment into a temporary file corresponding to the file identifier, generating a corresponding target log entry, and storing the target log entry into a Raft log, where the method includes:

searching file index data corresponding to the file identification, and extracting the file offset and the segment length of the last synchronous historical file segment from the file index data;

if the file offset of the target file segment is equal to the sum of the file offset and the segment length of the historical file segment, writing the target file segment into the temporary file after determining that the actual length of the temporary file is equal to the expected length, generating a corresponding target log item and storing the target log item into a Raft log;

and if the file offset of the target file segment is equal to the file offset of the historical file segment, and the checksum of the target file segment is equal to the checksum of the historical file segment, generating a target log item corresponding to the target file segment and storing the target log item into a Raft log when the log item of the historical file segment is not successfully written into the Raft log.

Optionally, the file status information includes: file identification, file synchronization state, starting log sequence number and ending log sequence number;

generating segment index information corresponding to the target log item, and storing the segment index information into file index data corresponding to the file identifier, including:

generating segment index information corresponding to the target log entry according to the log sequence number in the target log entry, the file offset of the target file segment in the temporary file, the data length of the target file segment and the checksum;

inserting the segment index information into file index data corresponding to the file identification, and acquiring a log sequence number minimum value and a log sequence number maximum value corresponding to each segment index information in the file index data;

and taking the minimum value of the log sequence number as the initial log sequence number in the file state information, and taking the maximum value of the log sequence number as the termination log sequence number in the file state information when the file synchronization state is in an uncompleted state.

Optionally, the method further includes:

in response to a Raft log read request, determining a destination address and a first log sequence number;

reading a corresponding log entry from a Raft log according to the first log sequence number, and reading a corresponding file segment from a local temporary file according to a local temporary file path, a file offset and a data length recorded in the log entry;

and sending the log entry and the file segment to a destination address.

Optionally, the method further includes:

in response to a trigger message for deleting log entries, determining a second log sequence number, and deleting the log entries in the Raft log whose log sequence number is greater than or equal to the second log sequence number;

traversing each file index data, and if the starting log sequence number in the file index data is greater than or equal to the second log sequence number, deleting the file index data;

if the termination log sequence number in the file index data is smaller than the second log sequence number, retaining the file index data;

and if the starting log sequence number is smaller than the second log sequence number and the termination log sequence number is greater than or equal to the second log sequence number, setting the file synchronization state to the uncompleted state, deleting the segment index information in the file index data whose log sequence number is greater than or equal to the second log sequence number, and updating the termination log sequence number in the file state information to the maximum log sequence number among the remaining segment index information in the file index data.

Optionally, the method further includes:

in response to a timed trigger message for reclaiming temporary files, determining a maximum reclaimable log sequence number;

traversing each file index data, and determining the candidate file identifiers and candidate termination log sequence numbers of the file index data whose file synchronization state is a specified state; the specified file synchronization state comprises a completed state or a cancelled state;

and if the maximum reclaimable log sequence number is greater than or equal to a candidate termination log sequence number, determining the candidate file identifier corresponding to that candidate termination log sequence number, and deleting the file index data and the temporary file corresponding to the candidate file identifier.
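The reclaiming rule above can be illustrated with a minimal Python sketch. It is not the patented implementation: the dictionary layout, the field names `status`, `max_seq` and `tfname`, the state names `committed` and `aborted`, and the `delete_file` callback are all assumptions made for illustration.

```python
from typing import Callable, Dict, List


def reclaim(indexes: Dict[str, dict], max_reclaim_seq: int,
            delete_file: Callable[[str], None]) -> List[str]:
    """Drop index data and temporary files of finished Blob files.

    A file qualifies when its state is completed or cancelled and its
    termination log sequence number does not exceed max_reclaim_seq.
    """
    removed = []
    for bid in list(indexes):  # copy the keys: we delete while iterating
        idx = indexes[bid]
        finished = idx["status"] in ("committed", "aborted")
        if finished and max_reclaim_seq >= idx["max_seq"]:
            delete_file(idx["tfname"])  # hypothetical file-removal hook
            del indexes[bid]
            removed.append(bid)
    return removed


deleted_paths: List[str] = []
indexes = {
    "0000": {"status": "committed", "max_seq": 8, "tfname": "0000.tmp"},
    "0001": {"status": "open", "max_seq": 9, "tfname": "0001.tmp"},
    "0002": {"status": "aborted", "max_seq": 12, "tfname": "0002.tmp"},
}
removed = reclaim(indexes, 10, deleted_paths.append)
```

In the example, only file 0000 is reclaimed: file 0001 is still open, and the termination log sequence number of file 0002 lies beyond the reclaimable bound.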

Optionally, the method further includes:

in response to a data snapshot request sent by a new node, determining a snapshot log sequence number;

traversing each file index data, and if the starting log sequence number in the file index data is less than or equal to the snapshot log sequence number and the file synchronization state in the file index data is the uncompleted state, acquiring the file identifier corresponding to the file index data;

and sending the temporary file data and the file index data corresponding to the file identification to the new node.
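As a rough illustration of the snapshot selection rule, the following Python sketch picks the files whose temporary data and index data would be shipped to the new node. The dictionary layout and the field names `min_seq` and `status` are assumptions, and the actual transfer is omitted.

```python
from typing import Dict, List


def select_for_snapshot(indexes: Dict[str, dict], snapshot_seq: int) -> List[str]:
    """Return identifiers of unfinished files that started at or before snapshot_seq."""
    return [
        bid
        for bid, idx in indexes.items()
        if idx["min_seq"] <= snapshot_seq and idx["status"] == "open"
    ]


indexes = {
    "0000": {"status": "open", "min_seq": 0},
    "0001": {"status": "committed", "min_seq": 2},
    "0002": {"status": "open", "min_seq": 15},
}
selected = select_for_snapshot(indexes, 10)
```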

Optionally, the method further includes:

receiving a synchronization state update request sent by a client, acquiring the target file identifier and target file state corresponding to the synchronization state update request, and searching for target file index data matching the target file identifier;

if the file synchronization state in the target file index data is the uncompleted state, generating a state log entry corresponding to the target file state and writing it into the Raft log, and updating the file synchronization state and the termination log sequence number in the target file index data according to the target file state and the state log entry;

and if the file synchronization state in the target file index data is not the uncompleted state, returning a corresponding feedback message to the client according to whether the current file synchronization state matches the target file state.
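The state update can be sketched in Python as below. The field names, the `append_log` callback (assumed to return the log sequence number of the stored state log entry) and the feedback strings `ok` and `conflict` are illustrative assumptions; the text only says that the feedback depends on whether the current state matches the requested one.

```python
from typing import Callable


def update_status(idx: dict, target: str,
                  append_log: Callable[[dict], int]) -> str:
    """Apply a commit/abort request to one file's index data."""
    if idx["status"] == "open":
        # Record the state change in the Raft log, then mirror it in the index.
        seq = append_log({"bid": idx["bid"], "op": target})
        idx["status"] = target
        idx["max_seq"] = seq  # termination log sequence = state log entry
        return "ok"
    # Already finished: a repeated identical request is treated as success.
    return "ok" if idx["status"] == target else "conflict"


log = []


def append_log(entry: dict) -> int:
    log.append(entry)
    return 8 + len(log)  # hypothetical next log sequence number


idx = {"bid": "0000", "status": "open", "min_seq": 0, "max_seq": 8}
first = update_status(idx, "committed", append_log)
repeat = update_status(idx, "committed", append_log)
clash = update_status(idx, "aborted", append_log)
```

Treating a repeated request as success makes the operation idempotent under RPC retries, which is one plausible reading of the feedback rule.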

In a second aspect, an embodiment of the present invention further provides a computer device, where the computer device includes:

one or more processors;

a storage device for storing one or more programs,

when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement a file synchronization method provided by any embodiment of the invention.

In a third aspect, an embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements a file synchronization method provided in any embodiment of the present invention.

In the embodiment of the invention, in response to a synchronization operation by a client on a target file segment, the file identifier of the Blob file to which the target file segment belongs is determined; the target file segment is written into a temporary file corresponding to the file identifier, and a corresponding target log entry, which comprises index information pointing to the temporary file, is generated and stored in a Raft log; and segment index information corresponding to the target log entry is generated and stored in file index data corresponding to the file identifier, the file index data comprising file state information and segment index information pointing to the log entries corresponding to the file identifier. This solves the problem in the prior art that file synchronization using the Raft protocol incurs large IO (input/output) overhead, reduces the IO overhead incurred when synchronizing Blob files with the Raft protocol, and improves system performance.

Drawings

FIG. 1a is a flowchart of a file synchronization method according to a first embodiment of the present invention;

FIG. 1b is a schematic structural diagram of a temporary file according to a first embodiment of the present invention;

FIG. 1c is a schematic diagram of a structure of a Raft log according to a first embodiment of the present invention;

FIG. 1d is a schematic structural diagram of file index data according to a first embodiment of the present invention;

FIG. 2a is a schematic structural diagram of a temporary file corresponding to multi-file synchronization in the second embodiment of the present invention;

FIG. 2b is a schematic structural diagram of a Raft log corresponding to multi-file synchronization in the second embodiment of the present invention;

FIG. 2c is a schematic structural diagram of index data corresponding to multi-file synchronization according to a second embodiment of the present invention;

FIG. 3 is a schematic structural diagram of a computer device in a third embodiment of the present invention.

Detailed Description

The present invention will be described in further detail with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.

Example one

FIG. 1a is a flowchart of a file synchronization method in a first embodiment of the present invention. This embodiment is applicable to the case where Blob files are synchronized using the Raft protocol, and the method may be executed by a computer device providing a file synchronization function, for example, the leader node of a Raft group. As shown in FIG. 1a, the method comprises:

Step 110, in response to a synchronization operation by a client on a target file segment, determining the file identifier of the Blob file to which the target file segment belongs.

The target file segment may be any file segment of a Blob file to be synchronized by the client. A Blob file is a file-like object containing read-only raw data, used to store large binary objects in a database. Each Blob file must be assigned a unique file identifier. It should be noted that this embodiment is also applicable to synchronizing files other than Blob files using the Raft protocol.

In this embodiment, because a Blob file is relatively large, the client divides it into a plurality of file segments of fixed length (for example, 512KB) and synchronizes the file segments serially and in order; that is, the next file segment is transmitted only after the previous one has been transmitted successfully. As a result, only Remote Procedure Call (RPC) retries need to be handled, and out-of-order file segments do not.
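A minimal Python sketch of the client-side segmentation described above follows. The function name `split_into_segments` is hypothetical; the 512KB segment length and the 2048KB file size are taken from the examples in the text.

```python
from typing import Iterator, Tuple

SEGMENT_SIZE = 512 * 1024  # fixed segment length used in the example (512KB)


def split_into_segments(data: bytes,
                        segment_size: int = SEGMENT_SIZE) -> Iterator[Tuple[int, bytes]]:
    """Yield (file_offset, segment_bytes) pairs in ascending offset order."""
    for offset in range(0, len(data), segment_size):
        yield offset, data[offset:offset + segment_size]


# A 2048KB Blob file is cut into four 512KB segments, sent one after another.
segments = list(split_into_segments(bytes(2048 * 1024)))
```

Because the generator yields segments strictly in offset order, a client that sends each one only after the previous send succeeds never produces out-of-order writes.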

In this embodiment, when a client needs to synchronize a target file segment of a Blob file and the target file segment is the first file segment of that Blob file, the client first sends a synchronization request message carrying the complete data length of the Blob file to the leader node. The leader node allocates a file identifier to the Blob file to be synchronized and returns it to the client; allocates a local temporary file path for the file identifier and uses the fallocate function to reserve disk space matching the complete data length of the Blob file, so that the storage space of all file segments is contiguous on disk; and initializes file index data corresponding to the Blob file in the synchronization index data, recording in it the file identifier, the temporary file path and a file synchronization state set to the uncompleted state.

For example, as shown in fig. 1c, assuming that a client wants to synchronize a Blob file a with a length of 2048KB, after receiving the synchronization request message sent by the client, the leader node allocates the file identifier 0000 to the Blob file a and allocates the local temporary file path 0000.tmp corresponding to the identifier 0000. The file path 0000.tmp refers to the contiguous disk space with a length of 2048KB shown in fig. 1b, and this disk space is divided into a plurality of storage segments according to the file segment length of 512KB. At the same time, file index data corresponding to the identifier 0000 is created as shown in fig. 1d, and bid (file identifier): 0000, tfname (temporary file path): 0000.tmp, and status (file synchronization state): open (uncompleted state) are written into the file index data.
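The initialization step might look like the Python sketch below. It is an approximation under stated assumptions: `begin_sync` and `FileIndex` are hypothetical names, and `os.posix_fallocate` is used as the closest standard-library call to fallocate. Note that `posix_fallocate` also extends the reported file size to the full length, so this sketch does not reproduce a scheme in which the file length tracks the write position.

```python
import os
import tempfile
from dataclasses import dataclass


@dataclass
class FileIndex:
    bid: str              # file identifier
    tfname: str           # temporary file path
    status: str = "open"  # uncompleted state


def begin_sync(directory: str, bid: str, total_length: int) -> FileIndex:
    """Create <bid>.tmp, reserve total_length bytes, and return fresh index data."""
    path = os.path.join(directory, f"{bid}.tmp")
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        try:
            os.posix_fallocate(fd, 0, total_length)  # Unix only
        except (AttributeError, OSError):
            # Fallback: sets the size but may not reserve disk blocks.
            os.ftruncate(fd, total_length)
    finally:
        os.close(fd)
    return FileIndex(bid=bid, tfname=path)


with tempfile.TemporaryDirectory() as workdir:
    index = begin_sync(workdir, "0000", 2048 * 1024)
    allocated_size = os.path.getsize(index.tfname)
```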

After acquiring the file identifier of the Blob file, the client combines the file identifier and the target file segment synchronized this time into an RPC message and sends the RPC message to the leader node. And the leader node determines the file identifier of the Blob file to be synchronized and the target file segment of the synchronization according to the received message.

Step 120, writing the target file segments into a temporary file corresponding to the file identification, generating corresponding target log items and storing the target log items into a Raft log; the target log entry includes index information pointing to the temporary file.

In this embodiment, the leader node queries file index data according to a file identifier, determines a path of a local temporary file for storing Blob file data, writes a target file segment synchronized this time into an assigned position in the temporary file pointed by the temporary file path, generates a target log item according to information related to the file segment write operation this time, and stores the target log item into a Raft log.

For a file segment write operation, each log entry may include the following information: bid: the file identifier of the Blob file to which the target file segment belongs; fname: the original file name of the Blob file; tfname: the local temporary file path; offset: the file offset of the target file segment in the local temporary file; length: the data length of the target file segment; checksum: the data checksum of the target file segment.
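The log entry fields listed above map naturally onto a small record type. The Python sketch below is illustrative only: the class and function names are hypothetical, and CRC32 is an assumed checksum algorithm, since the text does not name one.

```python
import zlib
from dataclasses import dataclass


@dataclass(frozen=True)
class SegmentLogEntry:
    bid: str       # file identifier of the Blob file
    fname: str     # original file name
    tfname: str    # local temporary file path
    offset: int    # offset of the segment inside the temporary file
    length: int    # data length of the segment
    checksum: int  # checksum of the segment data (CRC32 assumed here)


def make_entry(bid: str, fname: str, tfname: str,
               offset: int, payload: bytes) -> SegmentLogEntry:
    """Describe a segment by reference; the payload itself is not stored."""
    return SegmentLogEntry(bid, fname, tfname, offset,
                           len(payload), zlib.crc32(payload))


entry = make_entry("0000", "a.bin", "0000.tmp",
                   1536 * 1024, b"\x00" * (512 * 1024))
```

The entry carries only a pointer (path, offset, length) to the data in the temporary file, which is what keeps the segment payload out of the Raft log.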

It should be noted that each Raft log entry has two attributes: the term and the log sequence number (log sequence). Both attributes are monotonically increasing long integer values, and term + log sequence uniquely identifies a log entry. In order to ensure the performance of the Raft protocol, the log processing operations required by the Raft protocol must execute correctly, including: reading a log entry; writing a log entry; deleting all subsequent log entries, starting from a specified log sequence; deleting all preceding log entries, starting from a specified log sequence; and so on.
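The four log operations can be sketched with a small in-memory class. This is a toy model, not a real Raft log: the class and method names are hypothetical, and the boundary of the prefix deletion (entries strictly before the given sequence) is an assumption, since the text does not say whether the specified entry itself is removed.

```python
from typing import Dict, List, Optional, Tuple


class InMemoryRaftLog:
    """Toy log keyed by log sequence; each value is (term, payload)."""

    def __init__(self) -> None:
        self._entries: Dict[int, Tuple[int, object]] = {}

    def write(self, sequence: int, term: int, payload: object) -> None:
        self._entries[sequence] = (term, payload)

    def read(self, sequence: int) -> Optional[Tuple[int, object]]:
        return self._entries.get(sequence)

    def truncate_suffix(self, sequence: int) -> None:
        """Delete the entry at `sequence` and every later entry."""
        for seq in [s for s in self._entries if s >= sequence]:
            del self._entries[seq]

    def truncate_prefix(self, sequence: int) -> None:
        """Delete every entry before `sequence` (boundary is an assumption)."""
        for seq in [s for s in self._entries if s < sequence]:
            del self._entries[seq]

    def sequences(self) -> List[int]:
        return sorted(self._entries)
```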

Optionally, writing the target file segment into a temporary file corresponding to the file identifier, generating a corresponding target log entry, and storing the target log entry in a Raft log, may include: searching file index data corresponding to the file identification, and extracting the file offset and the segment length of the last synchronous historical file segment from the file index data; if the file offset of the target file segment is equal to the sum of the file offset and the segment length of the historical file segment, writing the target file segment into the temporary file after determining that the actual length of the temporary file is equal to the expected length, generating a corresponding target log item and storing the target log item into a Raft log; and if the file offset of the target file segment is equal to the file offset of the historical file segment, and the checksum of the target file segment is equal to the checksum of the historical file segment, generating a target log item corresponding to the target file segment and storing the target log item into a Raft log when the log item of the historical file segment is not successfully written into the Raft log.

In this embodiment, before finding the temporary file corresponding to the file identifier and writing the target file segment into the temporary file, it is further required to verify whether the previous file segments are correctly synchronized, so as to ensure that each file segment of the Blob file is sequentially written into the temporary file. The file index data corresponding to the file identifier bid may be searched first, and if the corresponding file index data cannot be found or the file synchronization state of the found file index data is not an open state, synchronization fails, and a synchronization failure message needs to be returned to the client. If the corresponding file index data is found and is in the open state, the last synchronized history file segment f1, and the file offset f1.offset and segment length f1.length of f1 are extracted from the file index data.

Let the target file segment be f0. If f0.offset = f1.offset + f1.length, that is, the position at which the target file segment is to be written in the temporary file immediately follows the end of the previous file segment, then the expected length of the temporary file is f0.offset. The actual length of the temporary file is then checked. If the actual length is greater than the expected length, the temporary file is truncated to the expected length with a truncate operation. Once the actual length equals the expected length, the target file segment is written into the temporary file, and a corresponding target log entry is generated and stored in the Raft log. If the log entry is stored successfully, success is returned; otherwise an error is returned. If the actual length of the temporary file is smaller than the expected length, the file is damaged; in that case the replica is marked as damaged and a failure is returned.

If f0.offset = f1.offset, that is, the position at which the target file segment is to be written in the temporary file equals the position at which the historical file segment was written, the checksum f0.checksum of the target file segment is calculated in order to judge whether this synchronization is a client retry. If f0.checksum = f1.checksum, the operation is a client retry. In this case, if the log entry corresponding to f1 has already been written into the Raft log successfully, success is returned; otherwise, a target log entry corresponding to the target file segment is generated and stored in the Raft log, and success is returned if it is stored successfully, or an error otherwise. If f0.checksum differs from f1.checksum, the operation is illegal and violates the synchronization flow, and a failure is returned.
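The two branches above (a sequential write and a client retry) can be sketched in Python. Several things here are assumptions for illustration: a `bytearray` stands in for the temporary file, CRC32 stands in for the unspecified checksum, `append_log` is a hypothetical callback returning whether the log entry was stored, a segment is recorded in `history` before its log entry is stored, and any offset matching neither branch is rejected.

```python
import zlib
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Segment:
    offset: int
    length: int
    checksum: int
    logged: bool  # True once its log entry is stored in the Raft log


class SyncError(Exception):
    pass


def write_segment(tmp: bytearray, history: List[Segment], offset: int,
                  data: bytes, append_log: Callable[[Segment], bool]) -> str:
    last: Optional[Segment] = history[-1] if history else None
    expected = last.offset + last.length if last else 0
    checksum = zlib.crc32(data)
    if offset == expected:
        if len(tmp) > expected:
            del tmp[expected:]  # drop redundant tail left by an earlier truncate
        elif len(tmp) < expected:
            raise SyncError("temporary file is damaged")
        tmp.extend(data)
        seg = Segment(offset, len(data), checksum, logged=False)
        history.append(seg)
    elif last is not None and offset == last.offset:
        if checksum != last.checksum:
            raise SyncError("illegal operation: same offset, different data")
        seg = last
        if seg.logged:
            return "ok"  # retry of a segment that is already logged
    else:
        raise SyncError("unexpected offset")  # assumed handling
    seg.logged = append_log(seg)
    if not seg.logged:
        raise SyncError("failed to store log entry")
    return "ok"


logged_offsets: List[int] = []


def store(seg: Segment) -> bool:
    logged_offsets.append(seg.offset)
    return True


tmp, history = bytearray(), []
write_segment(tmp, history, 0, b"aaaa", store)
write_segment(tmp, history, 4, b"bbbb", store)
write_segment(tmp, history, 4, b"bbbb", store)  # client retry: no new entry
```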

Step 130, generating segment index information corresponding to the target log entry, and storing the segment index information in file index data corresponding to the file identifier.

Wherein the file index data includes: file state information and segment index information that points to the log entry corresponding to the file identification.

In this embodiment, in order to ensure the performance of the Raft protocol and ensure that log processing operations related to the Raft protocol can be executed normally, corresponding segment index information may be generated for log entries in the Raft log, and each segment index information may be stored according to a Blob file, so as to obtain file index data corresponding to a file identifier, as shown in fig. 1 d.

Optionally, the file status information includes: file identification, file synchronization state, starting log sequence number and ending log sequence number; generating segment index information corresponding to the target log item, and storing the segment index information into file index data corresponding to the file identifier may include: generating fragment index information corresponding to the target log item according to the log serial number in the target log item, the file offset of the target file fragment in the temporary file, the data length of the target file fragment and the checksum; inserting the segment index information into file index data corresponding to the file identification, and acquiring a log sequence number minimum value and a log sequence number maximum value corresponding to each segment index information in the file index data; and taking the minimum value of the log sequence number as the initial log sequence number in the file state information, and taking the maximum value of the log sequence number as the termination log sequence number in the file state information when the file synchronization state is in an uncompleted state.

In this embodiment, the file index data includes: file state information reflecting the synchronization state of the Blob file, and segment index information corresponding to the log entry of each file segment. The file state information specifically includes: the file identifier bid, the file name fname, the local temporary file path tfname, the file synchronization state status, the starting log sequence number min log sequence, and the termination log sequence number max log sequence. The segment index information includes: the log sequence number log sequence, the file offset, the data length and the checksum of the corresponding file segment.

The file synchronization state is one of open, committed or aborted. The open state indicates that synchronization of the Blob file is not yet complete, and only a Blob file in the open state allows file segment write operations; the committed or aborted state indicates that synchronization of the Blob file has ended. min log sequence is the log sequence number of the log entry corresponding to the first file segment write operation of the Blob file. For a Blob file in the open state, max log sequence is the log sequence number of the log entry corresponding to the last file segment write operation; for a Blob file in the committed or aborted state, max log sequence is the log sequence number of the log entry corresponding to the commit or abort operation.

For example, as shown in fig. 1c, assuming that a log entry with a sequence number of 8 is generated according to the current file write operation, the log sequence number of 8, the file offset 1536KB, the data length of the file fragment of 512KB, and the checksum xxx are obtained from the log entry, and fragment index information corresponding to the log entry is generated. As shown in fig. 1d, the segment index information includes: log sequence: 8, offset: 1536KB, length: 512KB, checksum: xxx. And inquiring file index data corresponding to the file identification 0000 in the log entry, and inserting the fragment index information into the file index data. And then, updating a min log sequence in the file state information by using a minimum log sequence number 0 corresponding to the fragment index information in the file index data, updating a max log sequence by using a maximum log sequence number 8, and recording the current synchronization progress of the Blob file.
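The index update in this example can be sketched as follows in Python. The class and field names are hypothetical, and the intermediate log sequence numbers 3 and 5 and the checksum values are invented for the illustration; only the sequence numbers 0 and 8, the 512KB length and the 1536KB offset come from the text.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SegmentIndex:
    log_sequence: int
    offset: int
    length: int
    checksum: int


@dataclass
class FileIndex:
    bid: str
    status: str = "open"
    min_log_sequence: Optional[int] = None
    max_log_sequence: Optional[int] = None
    segments: List[SegmentIndex] = field(default_factory=list)


def insert_segment_index(index: FileIndex, seg: SegmentIndex) -> None:
    """Insert one segment index and refresh the min/max log sequence numbers."""
    index.segments.append(seg)
    sequences = [s.log_sequence for s in index.segments]
    index.min_log_sequence = min(sequences)
    if index.status == "open":  # max tracks segment writes only while open
        index.max_log_sequence = max(sequences)


KB = 1024
index = FileIndex(bid="0000")
for seq, off in [(0, 0), (3, 512 * KB), (5, 1024 * KB), (8, 1536 * KB)]:
    insert_segment_index(index, SegmentIndex(seq, off, 512 * KB, checksum=0))
```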

It should be noted that the leader node stores the target file segment in the local temporary file, generates and stores a corresponding target log entry and segment index information, and then sends the target log entry and the target file segment to the follower node, so as to implement file segment synchronization between devices in the Raft group.

In the embodiment of the invention, in response to a synchronization operation by a client on a target file segment, the file identifier of the Blob file to which the target file segment belongs is determined; the target file segment is written into a temporary file corresponding to the file identifier, and a corresponding target log entry, which comprises index information pointing to the temporary file, is generated and stored in a Raft log; and segment index information corresponding to the target log entry is generated and stored in file index data corresponding to the file identifier, the file index data comprising file state information and segment index information pointing to the log entries corresponding to the file identifier. This solves the problem in the prior art that file synchronization using the Raft protocol incurs large IO (input/output) overhead, reduces the IO overhead incurred when synchronizing Blob files with the Raft protocol, and improves system performance.

Example two

This embodiment further refines the above embodiment and provides specific steps for reading the Raft log, truncating Raft log entries, reclaiming temporary files, taking data snapshots, updating the file synchronization state, maintaining file index data, and synchronizing a plurality of files in parallel. Each operation is described in detail below.

Reading the Raft log may include: in response to a Raft log read request, determining a destination address and a first log sequence number; reading the corresponding log entry from the Raft log according to the first log sequence number, and reading the corresponding file segment from the local temporary file according to the local temporary file path, file offset and data length recorded in the log entry; and sending the log entry and the file segment to the destination address.

In this embodiment, the leader node reads the Raft log in two cases. First, after the target file segment has been written into the local temporary file and the corresponding log entry and segment index information have been updated, the leader node reads the Raft log in order to synchronize the target file segment and the target log entry to the follower nodes. Second, after the whole Blob file has been synchronized, a corresponding synchronization completion log entry has been generated and the file synchronization state in the file index data has become committed, the leader node reads the Raft log in order to submit the target file segments and target log entries to the database.

First, the first log sequence number of the log entry to be read is determined. The log entry corresponding to the first log sequence number is then looked up in the Raft log, the temporary file path is extracted from the log entry, and the corresponding temporary file is located. The position of the file segment in the temporary file is determined from the file offset, the size of the file segment is determined from the data length, and the file segment is read from the temporary file. Finally, the log entry and the file segment are sent to a follower node or to the database.
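The read path can be sketched in Python as below. The names `LogEntry` and `read_for_replication` are hypothetical, a plain dictionary stands in for the Raft log, and sending the result to the destination address is omitted.

```python
import os
import tempfile
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class LogEntry:
    tfname: str  # local temporary file path
    offset: int  # file offset of the segment
    length: int  # data length of the segment


def read_for_replication(raft_log: Dict[int, LogEntry],
                         sequence: int) -> Tuple[LogEntry, bytes]:
    """Look up the entry, then fetch the segment it points to."""
    entry = raft_log[sequence]
    with open(entry.tfname, "rb") as f:
        f.seek(entry.offset)
        segment = f.read(entry.length)
    return entry, segment


with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "0000.tmp")
    with open(path, "wb") as f:
        f.write(b"aaaabbbbcccc")
    entry, segment = read_for_replication({8: LogEntry(path, 4, 4)}, 8)
```

The segment data is read once from the temporary file at send time, rather than being written to and re-read from the log.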

The truncate operation on the Raft log may include: in response to a trigger message for deleting log entries, determining a second log sequence number, and deleting the log entries in the Raft log whose log sequence number is greater than or equal to the second log sequence number; traversing each file index data, and if the starting log sequence number in the file index data is greater than or equal to the second log sequence number, deleting the file index data; if the ending log sequence number in the file index data is smaller than the second log sequence number, retaining the file index data; and if the starting log sequence number is smaller than the second log sequence number and the ending log sequence number is greater than or equal to the second log sequence number, setting the file synchronization state to the uncompleted state, deleting the segment index information in the file index data whose log sequence number is greater than or equal to the second log sequence number, and updating the ending log sequence number in the file state information to the maximum log sequence number among the remaining segment index information.

In this embodiment, the Raft protocol requires that the log support an exact delete (truncate) operation, and when the log is truncated, the log entries, the file index data and the temporary files must remain consistent. The truncate log sequence denotes the log sequence from which the truncate operation deletes; that is, after the truncate operation completes, all log entries whose log sequence is greater than or equal to the truncate log sequence have been deleted. For example, after a leader change, the new leader node notifies a node to write a log entry with log sequence number 101, but the node already stores a log entry with sequence number 103 that was written at the request of the previous leader node; this triggers a truncate of all of that node's log entries after sequence number 100.

In this embodiment, after the truncate operation of the log is triggered, the second log sequence number, that is, the truncate log sequence, is determined, and all log entries whose log sequence is greater than or equal to the truncate log sequence are deleted. Each file index data is then traversed. If its min log sequence is greater than or equal to the truncate log sequence, the whole file index data is deleted. If its max log sequence is less than the truncate log sequence, the file index data does not need to be truncated. Otherwise, if the file synchronization state is committed or aborted, it is set to open; all segment index information whose log sequence is greater than or equal to the truncate log sequence is deleted from the file index data; and the max log sequence in the file state information is updated to the maximum log sequence among the remaining segment index information.
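The three truncate cases can be sketched as below. The `FileIndex` class, its field names, and the dictionary-based Raft log are hypothetical stand-ins chosen for illustration; only the comparison rules come from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class FileIndex:
    """Hypothetical file index data for one Blob file."""
    bid: str
    status: str   # "open", "committed" or "aborted"
    min_seq: int  # min log sequence
    max_seq: int  # max log sequence
    segment_seqs: list = field(default_factory=list)  # one per segment index


def truncate(raft_log: dict, index: dict, truncate_seq: int) -> None:
    """Delete log entries >= truncate_seq and keep the index consistent."""
    for seq in [s for s in raft_log if s >= truncate_seq]:
        del raft_log[seq]
    for bid in list(index):
        fi = index[bid]
        if fi.min_seq >= truncate_seq:
            # Every entry of this file is truncated: drop the whole index.
            del index[bid]
        elif fi.max_seq < truncate_seq:
            # No entry of this file is affected.
            continue
        else:
            # Partially truncated: a committed or aborted file becomes open.
            fi.status = "open"
            fi.segment_seqs = [s for s in fi.segment_seqs if s < truncate_seq]
            fi.max_seq = max(fi.segment_seqs)
```

In the partial case the remaining list cannot be empty, because the min log sequence belongs to the first segment and is below the truncate log sequence.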

It should be noted that the truncate operation on the log does not require truncating the temporary file. This is possible because the client guarantees the sending order of the file segments, so that when the log is truncated, at most the last file segment of each temporary file would need to be truncated, although, because of retries, several pieces of segment index information may need to be truncated. When a file segment is written into the temporary file, it is determined whether the write is a client retry and whether redundant data exists at the tail of the temporary file. If redundant data exists, it is handled at that point; in other words, truncation of the temporary file is postponed until the next file segment write operation is executed.

A file segment write operation first writes the file segment into the temporary file and then generates the corresponding log entry and writes it into the Raft log. If the node crashes and restarts while the file segment is being written into the temporary file, or after the file segment has been written successfully but before the log entry is written into the Raft log, redundant data is left at the tail of the temporary file.

The benefit of not truncating the temporary file during a log truncate is reduced logic complexity: because the client may retry a file segment write operation, several pieces of segment index information may point to the same file segment of the temporary file, and leaving the temporary file untouched avoids having to handle this case during the truncate.
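One way to picture the deferred truncation is the sketch below. It assumes the caller knows the valid length of the temporary file, for example from the end offset of the last segment index information; that parameter, the function name and the retry handling being left out are all assumptions of the example rather than details given in the text.

```python
import os
import tempfile


def append_segment(path: str, valid_length: int, data: bytes) -> int:
    """Write data after valid_length bytes, dropping any redundant tail first.

    valid_length is the end offset covered by the file index data (assumed to
    be known to the caller). Returns the offset at which data was written.
    """
    size = os.path.getsize(path)
    if size < valid_length:
        raise ValueError("temporary file is shorter than the index claims")
    with open(path, "r+b") as f:
        if size > valid_length:
            # Redundant tail left by a crash or a truncated log entry.
            f.truncate(valid_length)
        f.seek(valid_length)
        f.write(data)
    return valid_length


def demo() -> bytes:
    """Append to a file that carries a stale tail and return its content."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"AAAA" + b"garbage")  # 4 indexed bytes plus a stale tail
        append_segment(path, 4, b"BBBB")
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)
```

The stale tail is removed only when the next write arrives, so the log truncate itself never touches the temporary file.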

The operation of recovering temporary files may include: in response to a timed trigger message for recovering temporary files, determining a maximum recovery log sequence number; traversing each file index data, and determining the candidate file identifiers and candidate ending log sequence numbers of the file index data whose file synchronization state is a specified state, the specified file synchronization state being the completed state or the cancelled state; and if the maximum recovery log sequence number is greater than or equal to a candidate ending log sequence number, determining the candidate file identifier corresponding to that candidate ending log sequence number, and deleting the file index data and the temporary file corresponding to the candidate file identifier.

In this embodiment, useless Blob temporary files need to be recovered (reclaimed) periodically, and the recovery of temporary files depends on the recovery of the Raft log. The purge log sequence denotes the current maximum recoverable Raft log sequence. Whether a temporary file must be retained is decided as follows: each file index data is traversed to find all candidate Blob files in the committed or aborted state; if the max log sequence in the file index data of a candidate Blob file is less than or equal to the purge log sequence, the candidate Blob file can be recovered, and its file index data and temporary file are deleted.
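The recovery rule can be sketched as follows. The `FileIndex` fields and the injectable `delete_file` callback are assumptions made so that the example stays self-contained; by default the callback is `os.remove`.

```python
import os
from dataclasses import dataclass


@dataclass
class FileIndex:
    """Hypothetical file index data for one Blob file."""
    bid: str
    status: str     # "open", "committed" or "aborted"
    max_seq: int    # max log sequence
    temp_path: str  # path of the Blob temporary file


def purge(index: dict, purge_seq: int, delete_file=os.remove) -> list:
    """Delete finished Blob files whose max log sequence is <= purge_seq."""
    removed = []
    for bid in list(index):
        fi = index[bid]
        finished = fi.status in ("committed", "aborted")
        if finished and fi.max_seq <= purge_seq:
            delete_file(fi.temp_path)  # remove the temporary file
            del index[bid]             # remove the file index data
            removed.append(bid)
    return removed
```

Files in the open state are never candidates, regardless of their log sequences.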

The operations for the data snapshot may include: in response to a data snapshot request sent by a new node, determining a snapshot log sequence number; traversing each file index data, and if the starting log sequence number in the file index data is less than or equal to the snapshot log sequence number and the file synchronization state in the file index data is the uncompleted state, acquiring the file identifier corresponding to the file index data; and sending the temporary file data and the file index data corresponding to the file identifier to the new node.

In this embodiment, after a new node device is added to the Raft group, it first acts as a learner and pulls snapshot data from the Raft group; before Raft log synchronization starts, the data of the temporary files must be copied to the new node device as part of the snapshot. The snapshot log sequence denotes the log sequence of the snapshot operation, and the Raft log of the new node device is synchronized starting from the snapshot log sequence + 1. The logic for including temporary files in the snapshot is as follows. Each file index data is traversed. If the snapshot log sequence is smaller than the min log sequence in the file index data, the corresponding temporary file does not need to be included in the snapshot. If the snapshot log sequence is greater than or equal to the max log sequence in the file index data and the Blob file is in the committed or aborted state, the corresponding temporary file does not need to be included in the snapshot. If the min log sequence in the file index data is less than or equal to the snapshot log sequence and the corresponding Blob file is in the open state, both the file index data and the corresponding temporary file are included in the snapshot.

It should be noted that the file index data included in the snapshot is sent to the new node device. On the newly added node device, the file synchronization state of every Blob temporary file in the snapshot is the open state. For each file index data included in the snapshot, the max log sequence is the log sequence of its last segment index information.
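The snapshot selection rules can be sketched as below, using a hypothetical `FileIndex` class. The text lists only three cases; a file that is committed or aborted with a max log sequence above the snapshot point is not covered by them, and the sketch leaves it out rather than guessing.

```python
from dataclasses import dataclass


@dataclass
class FileIndex:
    """Hypothetical file index data for one Blob file."""
    bid: str
    status: str   # "open", "committed" or "aborted"
    min_seq: int  # min log sequence
    max_seq: int  # max log sequence


def select_for_snapshot(index: dict, snapshot_seq: int) -> list:
    """Return the file identifiers whose index and temp file go in the snapshot."""
    selected = []
    for fi in index.values():
        if snapshot_seq < fi.min_seq:
            continue  # all entries arrive later through normal log sync
        finished = fi.status in ("committed", "aborted")
        if finished and snapshot_seq >= fi.max_seq:
            continue  # finished at or before the snapshot point
        if fi.status == "open":
            selected.append(fi.bid)
    return selected
```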

The operation of updating the file synchronization state may include: receiving a synchronization state update request sent by a client, acquiring the target file identifier and target file state corresponding to the request, and searching for the target file index data matching the target file identifier; if the file synchronization state in the target file index data is the uncompleted state, generating a state log entry corresponding to the target file state and writing it into the Raft log, and updating the file synchronization state and the ending log sequence number in the target file index data according to the target file state and the state log entry; and if the file synchronization state in the target file index data is not the uncompleted state, returning a corresponding feedback message to the client according to whether the current file synchronization state matches the target file state.

In this embodiment, when the client notifies that synchronization of a Blob file is completed or cancelled, the corresponding file index data is looked up by the Blob file identifier bid. If the file synchronization status in the file index data is open, a corresponding commit or abort log entry is generated and written into the Raft log, the status in the file index data is updated to committed or aborted, and the max log sequence is updated to the log sequence of the commit or abort log entry. If the status in the file index data is already committed or aborted, it is compared with the target file state notified by the client: if they match, success is returned; otherwise, failure is returned. If no file index data exists for the bid, either failure or success may be returned.

It should be noted that the log entry generated by the operation of completing or cancelling file synchronization may contain only the file synchronization status and the file identifier, as shown in fig. 1b.
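The state update handling can be sketched as follows. The list-based Raft log, the way the next log sequence is assigned, and the choice to return failure for an unknown bid are assumptions of the example (the text allows either result in the last case).

```python
from dataclasses import dataclass


@dataclass
class FileIndex:
    """Hypothetical file index data for one Blob file."""
    bid: str
    status: str  # "open", "committed" or "aborted"
    max_seq: int


def update_status(raft_log: list, index: dict, bid: str, target: str) -> bool:
    """Handle a commit/abort notification; return True on success."""
    if target not in ("committed", "aborted"):
        raise ValueError("target must be 'committed' or 'aborted'")
    fi = index.get(bid)
    if fi is None:
        return False  # assumption: the text permits failure or success here
    if fi.status == "open":
        seq = len(raft_log) + 1              # assumed sequence assignment
        raft_log.append((seq, bid, target))  # entry holds only status and bid
        fi.status = target
        fi.max_seq = seq
        return True
    # Already finished: succeed only if the request matches the stored state.
    return fi.status == target
```

Because a repeated request with the same target state succeeds without writing a new log entry, client retries of the commit or abort notification are harmless.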

For the operation of maintaining the file index data, the file index data must be kept consistent with the Raft log, and the average IO overhead of each file segment write operation should approach 2. This embodiment therefore proposes two feasible schemes for maintaining the file index data: in the first scheme, the file index data is maintained entirely in memory together with checkpoints; in the second scheme, the file index data is maintained in a kv engine, such as RocksDB.

For the first scheme, because the volume of file index data is relatively small, it can be held entirely in memory, with a full checkpoint taken periodically and stored on disk. When a checkpoint is taken, the log sequence up to which the file index data has been committed is recorded in the checkpoint file. When a new node device joins a Raft group, the file index data in the snapshot is saved as a checkpoint. In the temporary file recovery flow, the purge log sequence must be less than or equal to the log sequence recorded by the latest checkpoint. In the recovery stage after a crash and restart, the checkpoint is loaded in full, and the Raft log is scanned sequentially from the log sequence + 1 recorded in the checkpoint file to rebuild the file index data.

For the second scheme, the file index data is encoded as key-value pairs and written into the kv engine, and the write-ahead log (WAL) function of the kv engine is turned off before writing, so as to reduce the IO overhead. A special record identifying the log sequence up to which the file index data has been committed is written into the kv engine periodically, followed by a flush operation. In the temporary file recovery flow, the purge log sequence must be less than or equal to the log sequence most recently written into the kv engine. In the recovery stage after a crash and restart, the recorded log sequence is read from the kv engine, and the Raft log is scanned sequentially from that log sequence + 1 to rebuild the file index data.
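The checkpoint-and-replay idea of the first scheme can be sketched as below. The JSON checkpoint layout, the dictionary shapes of the index and of the log entries, and the `kind` field are all assumptions of the example; the text only states that a log sequence is recorded and that the Raft log is scanned from that sequence + 1.

```python
import json


def make_checkpoint(index: dict, log_sequence: int) -> str:
    """Serialize the in-memory file index together with its log sequence."""
    return json.dumps({"log_sequence": log_sequence, "index": index})


def recover(checkpoint: str, raft_log: dict) -> dict:
    """Load the checkpoint, then replay the Raft log from log_sequence + 1."""
    state = json.loads(checkpoint)
    index = state["index"]
    seq = state["log_sequence"] + 1
    while seq in raft_log:
        entry = raft_log[seq]
        fi = index.setdefault(
            entry["bid"],
            {"status": "open", "min_seq": seq, "max_seq": seq, "segments": []},
        )
        if entry["kind"] == "write":
            fi["segments"].append(seq)    # one segment index per write entry
        else:
            fi["status"] = entry["kind"]  # "committed" or "aborted"
        fi["max_seq"] = seq
        seq += 1
    return index
```

Only the entries after the checkpoint are replayed, which is why the purge log sequence must not exceed the log sequence recorded by the latest checkpoint.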

Optionally, in some scenarios, for example when a distributed database implements BulkLoad, a plurality of Blob files must be synchronized by means of Raft, and the commit or abort of these Blob files must be atomic. One atomicity unit may be described as a group, and the unique identifier of the atomicity unit may be described as a gid (group id). The RPC message of a file segment write operation must carry, in addition to the identifier bid of the Blob file, the gid of the atomicity unit to which the Blob file belongs. Correspondingly, as shown in fig. 2b, a write operation on a file segment additionally records the corresponding gid in the Raft log, and the commit or abort logic acts on the group, indicating the commit or abort of the whole group. The data of each Blob file in a group is still stored in the temporary file corresponding to its file identifier, as shown in fig. 2a.

In this embodiment, as shown in fig. 2c, the index data of the atomicity unit is divided into two layers, the group and the Blob file, where one group includes a plurality of Blob files. The index information of the group includes: gid, status, min log sequence and max log sequence. Status represents the Blob group status: the open status means that synchronization of the Blob group is not yet complete, and only a Blob group in the open status allows file segment write operations; the committed or aborted status means that synchronization of the Blob group has ended. Min log sequence is the log sequence of the log entry corresponding to the first file segment write operation of the Blob group. Max log sequence, for a Blob group in the open status, is the log sequence of the log entry corresponding to the last file segment write operation; for a Blob group in the committed or aborted status, it is the log sequence of the Raft log entry corresponding to the commit or abort operation.

When the Raft log truncate flow is carried out on atomicity units, the index data of the Blob groups is traversed. For the index data of each Blob group, if the min log sequence is greater than or equal to the truncate log sequence, the whole Blob group is deleted; if the max log sequence is less than the truncate log sequence, the Blob group does not need to be truncated; otherwise, if the synchronization state of the Blob group is committed or aborted, it is set to open, a truncate operation is executed in turn on the file index data of each Blob file, and the max log sequence is updated.

When the data snapshot flow is carried out on atomicity units, the index data of each Blob group is traversed. For the index data of each Blob group, if the snapshot log sequence is smaller than the min log sequence, the Blob group does not need to be included in the snapshot; if the snapshot log sequence is greater than or equal to the max log sequence and the Blob group is in the committed or aborted state, the Blob group does not need to be included in the snapshot; if the snapshot log sequence is greater than or equal to the min log sequence and the Blob group is in the open state, the index data of the Blob group and the temporary file corresponding to each Blob file in the Blob group are included in the snapshot.
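The group-level truncate can be sketched as below. The two-layer `GroupIndex` and `BlobIndex` classes are hypothetical; in particular, dropping a Blob file whose segments are all truncated is this sketch's reading of "a truncate operation is executed on the file index data of each Blob file".

```python
from dataclasses import dataclass, field


@dataclass
class BlobIndex:
    """Hypothetical file index data of one Blob file inside a group."""
    bid: str
    segment_seqs: list


@dataclass
class GroupIndex:
    """Hypothetical index information of one atomicity unit (group)."""
    gid: str
    status: str   # "open", "committed" or "aborted"
    min_seq: int  # min log sequence
    max_seq: int  # max log sequence
    blobs: dict = field(default_factory=dict)  # bid -> BlobIndex


def truncate_groups(groups: dict, truncate_seq: int) -> None:
    """Apply the truncate log sequence to every Blob group."""
    for gid in list(groups):
        g = groups[gid]
        if g.min_seq >= truncate_seq:
            del groups[gid]  # the whole group is truncated
        elif g.max_seq < truncate_seq:
            continue         # the group is not affected
        else:
            g.status = "open"
            for bid in list(g.blobs):
                blob = g.blobs[bid]
                blob.segment_seqs = [
                    s for s in blob.segment_seqs if s < truncate_seq
                ]
                if not blob.segment_seqs:
                    del g.blobs[bid]
            # Recompute max log sequence from the surviving segments.
            g.max_seq = max(
                s for b in g.blobs.values() for s in b.segment_seqs
            )
```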

When the temporary file recovery process is performed on the atomicity unit, the index data of the Blob group is traversed, all Blob groups in the committed or aborted state are found, and if the purge log sequence is greater than or equal to the max log sequence of the Blob group, the Blob group can be recovered, that is, the index data and the temporary file of the Blob group are deleted.

This embodiment provides a method for synchronizing a Blob file by using the Raft protocol. During synchronization, the Blob file data is written into a separate temporary file, only index-like information is stored in the Raft log, and index data for the Blob temporary file is generated at the same time. Based on this index data, the performance of log operations on the Raft log is not affected, the IO (input/output) overhead incurred when synchronizing a Blob file with the Raft protocol is reduced, and the system performance is improved.

Example three

Fig. 3 is a schematic structural diagram of a computer device according to a third embodiment of the present invention, and fig. 3 shows a block diagram of an exemplary device 12 suitable for implementing an embodiment of the present invention. The device 12 shown in fig. 3 is only an example and should not bring any limitations to the functionality and scope of use of the embodiments of the present invention.

As shown in FIG. 3, device 12 is in the form of a general purpose computing device. The components of device 12 may include, but are not limited to: one or more processors or processing units 16, a system memory 28, and a bus 18 that couples various system components including the system memory 28 and the processing unit 16.

Bus 18 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures include, but are not limited to, Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, Enhanced ISA (EISA) bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus.

Device 12 typically includes a variety of computer system readable media. Such media may be any available media that is accessible by device 12 and includes both volatile and nonvolatile media, removable and non-removable media.

The system memory 28 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM) 30 and/or cache memory. Device 12 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 34 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in FIG. 3, and commonly referred to as a "hard drive"). Although not shown in FIG. 3, a magnetic disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In these cases, each drive may be connected to bus 18 by one or more data media interfaces. Memory 28 may include at least one program product having a set (e.g., at least one) of program modules that are configured to carry out the functions of embodiments of the invention.

A program/utility 40 having a set (at least one) of program modules 42 may be stored, for example, in memory 28, such program modules 42 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each of which examples or some combination thereof may comprise an implementation of a network environment. Program modules 42 generally carry out the functions and/or methodologies of embodiments of the invention as described.

Device 12 may also communicate with one or more external devices 14 (e.g., keyboard, pointing device, display 24, etc.), with one or more devices that enable a user to interact with device 12, and/or with any devices (e.g., network card, modem, etc.) that enable device 12 to communicate with one or more other computing devices. Such communication may be through an input/output (I/O) interface 22. Also, the device 12 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the Internet) via the network adapter 20. As shown, the network adapter 20 communicates with the other modules of the device 12 via the bus 18. It should be understood that although not shown in the figures, other hardware and/or software modules may be used in conjunction with device 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, to name a few.

The processing unit 16 executes various functional applications and data processing by executing programs stored in the system memory 28, for example, to implement a file synchronization method provided by an embodiment of the present invention, including:

Example four

The fourth embodiment of the present invention further discloses a computer storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements a file synchronization method, and includes:

Computer storage media for embodiments of the invention may employ any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.

A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.

Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.

Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++, or the like, as well as conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).

It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims

1. A method for synchronizing files, comprising:

responding to the synchronous operation of a client to a target file segment, and determining the file identifier of a binary large object Blob file corresponding to the target file segment;

generating fragment index information corresponding to the target log item, and storing the fragment index information into file index data corresponding to the file identifier; the file index data includes: file state information and segment index information pointing to a log item corresponding to the file identifier;

writing the target file segments into a temporary file corresponding to the file identifier, generating corresponding target log items, and storing the target log items into a Raft log, wherein the method comprises the following steps:

2. The method of claim 1, wherein the file status information comprises: file identification, file synchronization state, starting log sequence number and ending log sequence number;

and taking the minimum value of the log sequence number as an initial log sequence number in the file state information, and taking the maximum value of the log sequence number as an ending log sequence number in the file state information when the file synchronization state is an uncompleted state.

3. The method of claim 1, further comprising:

and sending the log entry and the file segment to the destination address.

4. The method of claim 2, further comprising:

if the ending log sequence number in the file index data is smaller than the second log sequence number, retaining the file index data;

5. The method of claim 1, further comprising:

responding to a timing trigger message of recovering the temporary file, and determining a maximum recovery log sequence number;

and if the maximum recovery log sequence number is greater than or equal to the candidate ending log sequence number, determining a candidate file identifier corresponding to the candidate ending log sequence number, and deleting the file index data and the temporary file corresponding to the candidate file identifier.

6. The method of claim 1, further comprising:

traversing each file index data, and if a starting log sequence number in the file index data is less than or equal to the snapshot log sequence number and a file synchronization state in the file index data is an uncompleted state, acquiring a file identifier corresponding to the file index data;

7. The method of claim 1, further comprising:

8. A computer device, the device comprising:

one or more processors;

a storage device to store one or more programs,

when executed by the one or more processors, cause the one or more processors to implement a method for file synchronization as recited in any of claims 1-7.

9. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out a method for file synchronization according to any one of claims 1 to 7.