WO2021093735A1 - Data synchronization method, apparatus and device for distributed storage system, and storage medium - Google Patents

Data synchronization method, apparatus and device for distributed storage system, and storage medium

Info

Publication number
WO2021093735A1
WO2021093735A1 PCT/CN2020/127873 CN2020127873W WO2021093735A1 WO 2021093735 A1 WO2021093735 A1 WO 2021093735A1 CN 2020127873 W CN2020127873 W CN 2020127873W WO 2021093735 A1 WO2021093735 A1 WO 2021093735A1
Authority
WO
WIPO (PCT)
Prior art keywords
log
storage server
log information
determined
written
Application number
PCT/CN2020/127873
Other languages
French (fr)
Chinese (zh)
Inventor
黎海兵
Original Assignee
北京金山云网络技术有限公司
北京金山云科技有限公司
Application filed by 北京金山云网络技术有限公司, 北京金山云科技有限公司
Publication of WO2021093735A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00: Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/40: Support for services or applications
    • H04L67/00: Network arrangements or protocols for supporting network services or applications
    • H04L67/01: Protocols
    • H04L67/10: Protocols in which an application is distributed across nodes in the network
    • H04L67/1095: Replication or mirroring of data, e.g. scheduling or transport for data synchronisation between network nodes
    • H04L67/1097: Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]

Definitions

  • This application relates to the field of communication technology, and more specifically, to a data synchronization method of a distributed storage system, a data synchronization device of a distributed storage system, a storage server of a distributed storage system, and a computer storage medium.
  • The primary storage server can synchronize log entries in its local log file to the secondary storage server.
  • The primary storage server also notifies the secondary storage server of the latest log entry that has been persisted by most storage servers, so that the secondary storage server can read that log entry and all previous log entries and store the read content in its own storage engine.
  • This method of data synchronization has the following drawback. After the primary storage server synchronizes the last log entry in its local log file to the secondary storage server, there may be a period during which no new data is written. Because the primary storage server interacts with the secondary storage server only when it needs to synchronize log entries, the two servers do not interact during that period. The primary storage server therefore cannot promptly notify the secondary storage server of the latest log entry that has been persisted by most storage servers. As a result, the log entries in the storage engine of the secondary storage server are updated with a lag and cannot be kept synchronized with the log entries in the storage engine of the primary storage server.
  • One purpose of this application is to provide a data synchronization method, apparatus, device, and storage medium for a distributed storage system, to solve the problem in the related art that the log entries in the storage engine of the secondary storage server are updated with a lag and cannot be kept synchronized with the log entries in the storage engine of the primary storage server.
  • The distributed storage system includes a primary storage server and a secondary storage server.
  • The method is implemented by the primary storage server and includes: determining whether the secondary storage server has successfully written first log information, where the first log information includes the latest log entry in the local log file of the primary storage server; after determining that the secondary storage server has successfully written the first log information, determining whether any data is written; and, if it is determined that no data is written, sending second log information to the secondary storage server to notify the secondary storage server to store third log information, where the second log information contains no log entry and the third log information includes the latest log entry persisted by a set number of other secondary storage servers.
  • Determining whether data is written after it is determined that the secondary storage server has successfully written the first log information includes: starting timing when it is determined that the secondary storage server has successfully written the first log information; during the period until the timing time reaches a preset time, monitoring whether log entries are added to the local log file; if it is determined that log entries are added to the local log file, determining that data is written; and if it is determined that no log entry is added to the local log file, determining that no data is written.
  • The method further includes: at any time before the timing time reaches the preset time, if it is monitored that a log entry is added to the local log file, stopping timing; and sending fourth log information to the secondary storage server, where the fourth log information includes at least the log entries added to the local log file.
  • The method further includes: when it is determined that the secondary storage server has not successfully written the first log information, receiving a first serial number sent by the secondary storage server, where the first serial number is the number of the latest log entry in the secondary storage server.
  • The method further includes: sending fifth log information to the secondary storage server according to the first serial number, where the fifth log information includes the log entries corresponding to all consecutive serial numbers between the first serial number and a second serial number, and the second serial number is the number corresponding to the log entry with the latest write time in the local log file.
  • Determining whether the secondary storage server has successfully written the first log information includes: receiving reply information from the secondary storage server; when the reply information is a notification of successful writing, determining that the secondary storage server has successfully written the first log information; and when the reply information is a notification of writing failure, determining that the secondary storage server has failed to write the first log information.
  • A data synchronization device of a distributed storage system is provided, including: a first determining module, configured to determine whether the secondary storage server has successfully written first log information, where the first log information includes the latest log entry in the local log file of the primary storage server; a second determining module, configured to determine whether any data is written after it is determined that the secondary storage server has successfully written the first log information; and a sending module, configured to send second log information to the secondary storage server if it is determined that no data is written, to notify the secondary storage server to store third log information, where the second log information contains no log entry and the third log information includes the latest log entry persisted by a set number of other secondary storage servers.
  • The second determining module is specifically configured to: start timing when it is determined that the secondary storage server has successfully written the first log information; during the period until the timing time reaches a preset time, monitor whether log entries are added to the local log file; if it is determined that log entries are added to the local log file, determine that data is written; and if it is determined that no log entry is added to the local log file, determine that no data is written.
  • A storage server of a distributed storage system is provided, including a processor and a memory, where the memory is used to store executable instructions, and the instructions are used to control the processor to execute the method of any one of the first aspect.
  • A computer storage medium is provided that stores computer instructions; when the computer instructions in the storage medium are executed by a processor, the method of any one of the first aspect is implemented.
  • It is first determined whether the secondary storage server has successfully written the first log information. Since the first log information includes the latest log entry in the local log file of the primary storage server, determining that the secondary storage server has successfully written the first log information indicates that the log entries in the local log file of the primary storage server are synchronized with the log entries in the local log file of the secondary storage server.
  • The secondary storage server reads, from its local log file, the latest log entry currently persisted by the set number of other secondary storage servers together with the previous log entries, and stores them in its storage engine. This avoids the lag in updating the log entries in the storage engine of the secondary storage server.
  • The log entries in the storage engine of the secondary storage server are thereby synchronized with the log entries in the storage engine of the primary storage server.
  • FIG. 1 is a block diagram of the hardware configuration of a storage server in a distributed storage system provided by an embodiment of the present application;
  • FIG. 2 is a flowchart of a data synchronization method of a distributed storage system provided by an embodiment of the present application;
  • FIG. 3 is a schematic structural diagram of a data synchronization device of a distributed storage system provided by an embodiment of the present application;
  • FIG. 4 is a schematic structural diagram of a storage server of a distributed storage system provided by an embodiment of the present application.
  • FIG. 1 is a block diagram of the hardware configuration of a storage server in a distributed storage system provided by an embodiment of the present application.
  • The storage server may be a primary storage server or a secondary storage server in the distributed storage system.
  • The storage server 1000 may be a virtual machine or a physical machine.
  • The storage server 1000 may include a processor 1100, a memory 1200, an interface device 1300, a communication device 1400, a display device 1500, an input device 1600, a speaker 1700, a microphone 1800, and so on.
  • The processor 1100 may be a central processing unit (CPU), a microprocessor (MCU), or the like.
  • The memory 1200 may include ROM (Read Only Memory), RAM (Random Access Memory), non-volatile memory such as a hard disk, and the like.
  • The interface device 1300 may include a USB (Universal Serial Bus) interface, a headphone interface, and the like.
  • The communication device 1400 can perform wired or wireless communication.
  • The display device 1500 may include a liquid crystal display screen, a touch display screen, and the like.
  • The input device 1600 may include a touch screen, a keyboard, and the like. The user can output and input voice information through the speaker 1700 and the microphone 1800, respectively.
  • The present application may involve only some of these devices.
  • For example, the storage server 1000 may involve only the memory 1200 and the processor 1100.
  • An embodiment of the present application provides a data synchronization method for a distributed storage system, where the distributed storage system includes a master storage server and a slave storage server, and the number of slave storage servers may be at least one.
  • the description of the primary storage server and secondary storage server in the distributed storage system is as follows:
  • In a distributed storage system, users upload table data to the primary storage server through a client.
  • The primary storage server generates a corresponding log entry for each piece of table data uploaded by the user and writes the log entry to the local log file of the primary storage server (this is also described as the primary storage server persisting the log entry).
  • The local log file of the primary storage server consists of at least one log entry.
  • The log entry corresponding to each piece of table data includes: the uploaded table data, the number of the generated log entry, and the term number of the primary storage server.
  • The numbers of the log entries generated by the primary storage server are increasing, for example strictly consecutive, so the log entry with the largest number in the primary storage server is the latest log entry generated in the primary storage server.
  • The uploaded table data is the data to be stored; the number of the generated log entry is the number assigned to the log entry when it is generated; and the term number of the primary storage server represents the period during which that server is valid as the primary storage server.
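The log entry structure described above can be sketched as follows. This is only an illustrative sketch: the names `LogEntry`, `LocalLog`, and `append_upload` are hypothetical, the local log file is modeled as an in-memory list, and numbering is assumed to start at 1.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LogEntry:
    number: int        # number assigned when the entry is generated
    term: int          # term number of the primary storage server
    table_data: bytes  # uploaded table data, i.e. the data to be stored


class LocalLog:
    """In-memory stand-in (assumption) for a storage server's local log file."""

    def __init__(self) -> None:
        self.entries: List[LogEntry] = []

    def latest_number(self) -> int:
        # 0 means "no entry yet"; this sketch starts numbering at 1.
        return self.entries[-1].number if self.entries else 0

    def append_upload(self, term: int, table_data: bytes) -> LogEntry:
        # Strictly consecutive numbering, so the largest number is the latest entry.
        entry = LogEntry(self.latest_number() + 1, term, table_data)
        self.entries.append(entry)
        return entry


log = LocalLog()
log.append_upload(term=3, table_data=b"row-a")
log.append_upload(term=3, table_data=b"row-b")
```

Because numbers are assigned consecutively at append time, "latest entry" and "entry with the largest number" coincide, which is the property the text relies on.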
  • After the primary storage server writes the log entries into the local log file, it obtains from the local log file, in a descending and consecutive manner, a batch of log entries that need to be synchronized to the secondary storage server, and forwards the obtained log entries to the secondary storage server.
  • When the secondary storage server receives the log entries sent by the primary storage server, it performs matching according to the current latest log entry (that is, the log entry corresponding to the largest number) in the local log file of the secondary storage server.
  • If the secondary storage server determines that the match succeeds, it writes the log entries forwarded by the primary storage server to its local log file (this is also described as the secondary storage server persisting the log entries) and sends a notification of successful writing to the primary storage server.
  • If the secondary storage server determines that the match fails, it does not write the log entries forwarded by the primary storage server to its local log file and sends a notification of writing failure to the primary storage server.
  • Matching by the secondary storage server according to the current latest log entry in its local log file may include: checking whether the smallest number among the received log entries is the number immediately following the number of the current latest log entry in the local log file; if so, the result is a match, otherwise a mismatch. Alternatively, it may include: checking whether the numbers of the received log entries increase consecutively and include the number immediately following the number of the current latest log entry in the local log file; if so, the result is a match, otherwise a mismatch.
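The two matching rules above might look like the following minimal sketch. The function names and the reply strings are hypothetical, log entries are represented only by their numbers, and an empty batch is assumed to be a mismatch because the text does not cover that case here.

```python
from typing import List


def matches_by_smallest(received: List[int], local_latest: int) -> bool:
    """Rule 1: the smallest received number is the one right after the
    secondary's current latest number."""
    if not received:
        return False  # assumption: empty batches are not matched by this rule
    return min(received) == local_latest + 1


def matches_by_consecutive(received: List[int], local_latest: int) -> bool:
    """Rule 2: received numbers increase consecutively and contain the
    number right after the secondary's current latest number."""
    if not received:
        return False
    consecutive = all(b == a + 1 for a, b in zip(received, received[1:]))
    return consecutive and (local_latest + 1) in received


def handle_batch(local_numbers: List[int], received: List[int]) -> str:
    """Write the batch on a match and report the outcome to the primary."""
    local_latest = local_numbers[-1] if local_numbers else 0
    if matches_by_smallest(received, local_latest):
        local_numbers.extend(received)  # persist to the local log
        return "write_success"
    return "write_failure"
```

For example, a secondary holding entries 1 to 3 accepts a batch `[4, 5]` but rejects `[7, 8]`, since entry 4 would be missing from its log.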
  • "Most storage servers" may include the primary storage server and other secondary storage servers, where the other secondary storage servers are storage servers other than the above-mentioned secondary storage server and the primary storage server.
  • The embodiment of the present application provides a data synchronization method of a distributed storage system, where the distributed storage system includes a primary storage server and a secondary storage server.
  • The method is implemented by the primary storage server in the distributed storage system, that is, the method is applied to the primary storage server in the distributed storage system.
  • The method includes the following S2100-S2300:
  • S2100 Determine whether the secondary storage server has successfully written the first log information, where the first log information includes the latest log entry in the log of the primary storage server.
  • The first log information is the latest batch of log entries forwarded by the primary storage server to the secondary storage server, that is, the batch forwarded during the current synchronization that contains the latest log entry.
  • The first log information includes the latest log entry in the local log file of the primary storage server.
  • The first log information may include one log entry or multiple log entries. It can be understood that the latest log entry in the local log file of the primary storage server is the log entry corresponding to the largest number in that file.
  • The above S2100 may be determined according to the reply information sent by the secondary storage server. Based on this, the above S2100 can be implemented through the following S2110-S2112:
  • The reply information from the secondary storage server is generated by the secondary storage server based on the first log information it receives.
  • When the secondary storage server successfully writes the first log information into its local log file, it sends a notification of successful writing to the primary storage server.
  • When the secondary storage server does not successfully write the first log information into its local log file, it sends a notification of writing failure to the primary storage server.
  • The reply information from the secondary storage server may therefore be a notification of successful writing or a notification of writing failure.
  • When the reply information is a notification of successful writing, the secondary storage server has written the first log information into its local log file, and it can be determined that the secondary storage server has successfully written the first log information.
  • When the reply information is a notification of writing failure, the secondary storage server has not successfully written the first log information into its local log file, and it can be determined that the secondary storage server has failed to write the first log information.
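On the primary side, interpreting the reply could be sketched as below. The `Reply` enum and the returned action names are hypothetical labels chosen for illustration; the actions only point at the follow-up behaviour this document describes for each outcome.

```python
from enum import Enum


class Reply(Enum):
    WRITE_SUCCESS = "write_success"   # notification of successful writing
    WRITE_FAILURE = "write_failure"   # notification of writing failure


def first_log_written(reply: Reply) -> bool:
    """True when the secondary reports it wrote the first log information."""
    return reply is Reply.WRITE_SUCCESS


def next_action(reply: Reply) -> str:
    # On success the primary goes on to check whether data is written;
    # on failure it needs the secondary's first serial number.
    if first_log_written(reply):
        return "check_for_written_data"
    return "receive_first_serial_number"
```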
  • When the secondary storage server has successfully written the first log information, the primary storage server has sent all log entries in its local log file to the secondary storage server, and the secondary storage server has written all of those log entries to its own local log file. At this point, the log entries in the local log file of the primary storage server are synchronized with the log entries in the local log file of the secondary storage server.
  • The primary storage server determines whether any data is written in order to determine whether the log entries in the primary storage server's local log file and those in the secondary storage server's local log file change from the synchronized state to the unsynchronized state. In other words, determining whether data is written may mean determining whether a new log entry has been added to the local log file of the primary storage server since the time point at which the secondary storage server successfully wrote the first log information. In addition, it can specifically be determined whether data is written within a predetermined period. The predetermined period can be set according to the actual situation, for example according to the log entry update delay in the storage engine that the distributed storage system can tolerate.
  • The second log information contains no log entries.
  • The third log information contains the latest log entry currently persisted by the set number of other secondary storage servers. It should be noted that although the second log information contains no log entries, in order to ensure that the secondary storage server can store the third log information in its storage engine, the second log information may carry the number of the latest log entry currently persisted by the set number of other secondary storage servers.
  • The set number of other secondary storage servers together with the primary storage server constitute most of the storage servers in the distributed storage system.
  • When it is determined that no data is written, the log entries in the local log file of the primary storage server are still synchronized with the log entries in the local log file of the secondary storage server.
  • The primary storage server sends second log information containing no log entry to the secondary storage server, to notify the secondary storage server to store the third log information in its storage engine.
  • In this way, the secondary storage server can learn the latest log entry currently persisted by the set number of other secondary storage servers.
  • The secondary storage server then reads, from its local log file, the log entry contained in the third log information and the previous log entries, and stores them in its own storage engine. It should be noted that the secondary storage server may read all or part of the content of these log entries and store the read content in the storage engine.
  • Where a log entry includes the uploaded table data, the number of the generated log entry, and the term number of the primary storage server, the partial content may be the uploaded table data, or the uploaded table data together with the number of the generated log entry.
  • Each storage server uses its own storage engine to perform create, query, update, and delete operations on data.
  • Storage engines in MySQL include, but are not limited to, the MyISAM, InnoDB, MEMORY, and ARCHIVE storage engines.
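One way to picture the entry-free second log information and its effect on the secondary is the sketch below. All names are hypothetical. It assumes the second log information carries only the number of the latest entry persisted by the set number of other secondary storage servers, that this number is computed by ranking the latest persisted numbers reported by those servers, and that the storage engine is a simple dictionary.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SecondLogInfo:
    entries: tuple = ()        # always empty: no log entries are carried
    persisted_number: int = 0  # latest entry persisted by the set number of other secondaries


def latest_persisted_by(other_latest: List[int], set_number: int) -> int:
    """Largest number N such that at least `set_number` other secondaries
    have persisted entries up to N (assumed interpretation)."""
    if not 1 <= set_number <= len(other_latest):
        raise ValueError("set_number must be between 1 and the number of servers")
    return sorted(other_latest, reverse=True)[set_number - 1]


@dataclass
class Secondary:
    local_log: Dict[int, bytes]                       # number -> table data
    storage_engine: Dict[int, bytes] = field(default_factory=dict)

    def on_second_log_info(self, info: SecondLogInfo) -> None:
        # Read the indicated entry and all previous entries from the local
        # log file and store their content in the storage engine.
        for number in sorted(self.local_log):
            if number <= info.persisted_number:
                self.storage_engine[number] = self.local_log[number]


secondary = Secondary(local_log={1: b"a", 2: b"b", 3: b"c"})
number = latest_persisted_by([3, 2, 1], set_number=2)
secondary.on_second_log_info(SecondLogInfo(persisted_number=number))
```

The point of the design is that the message costs almost nothing to send yet lets the secondary's storage engine catch up without waiting for the next real batch of log entries.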
  • It is first determined whether the secondary storage server has successfully written the first log information. Since the first log information includes the latest log entry in the local log file of the primary storage server, determining that the secondary storage server has successfully written the first log information indicates that the log entries in the local log file of the primary storage server are synchronized with the log entries in the local log file of the secondary storage server.
  • The secondary storage server reads, from its local log file, the latest log entry currently persisted by the set number of other secondary storage servers together with the previous log entries, and stores them in its storage engine. This avoids the lag in updating the log entries in the storage engine of the secondary storage server.
  • The log entries in the storage engine of the secondary storage server are thereby synchronized with the log entries in the storage engine of the primary storage server.
  • the above S2200 can be implemented by the following S2210-S2113:
  • S2210 Start timing when it is determined that the secondary storage server has successfully written the first log information.
  • The above S2210 may be specifically implemented as follows: upon receiving a notification of successful writing from the secondary storage server, the primary storage server starts timing.
  • The preset time may be 5 minutes or another value; this embodiment does not limit it.
  • The period during which the timing time reaches the preset time is the duration from the start of timing until the preset time is reached.
  • The primary storage server can monitor whether the storage amount of the local log file has increased relative to its storage amount when timing started, in order to monitor whether log entries are added to the local log file.
  • When the client uploads table data to the primary storage server, the primary storage server generates corresponding log entries for the uploaded table data and writes them to the local log file. Therefore, during the period in which the timing time reaches the preset time, the primary storage server can monitor whether it receives table data uploaded by the client, in order to monitor whether log entries are added to the local log file.
  • When the storage amount of the local log file becomes larger than its storage amount when timing started, it can be determined that log entries have been added to the local log file, and further that data is written.
  • When the storage amount of the local log file does not change relative to its storage amount when timing started, it can be determined that no log entries have been added to the local log file, and further that no data is written.
  • When the primary storage server does not receive table data uploaded by the client, it can be determined that no log entries have been added to the local log file, and further that no data is written.
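The timing-and-monitoring step can be sketched as a small poller that compares the current size of the local log file with its size when timing started. The class name `IdleMonitor`, the injectable `clock`, and the polling style are assumptions made for illustration; the text does not prescribe how monitoring is implemented.

```python
import time
from typing import Callable


class IdleMonitor:
    """Started when the secondary confirms it wrote the first log information."""

    def __init__(self, log_size: Callable[[], int], preset_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._log_size = log_size
        self._preset = preset_seconds
        self._clock = clock
        self._start_time = clock()      # timing starts here
        self._start_size = log_size()   # storage amount when timing started

    def data_written(self) -> bool:
        # A larger log file means log entries were added, i.e. data was written.
        return self._log_size() > self._start_size

    def expired(self) -> bool:
        return self._clock() - self._start_time >= self._preset

    def poll(self) -> str:
        if self.data_written():
            return "data_written"
        if self.expired():
            return "no_data_written"
        return "waiting"
```

With a preset time of 300 seconds (the 5 minutes mentioned above), `poll()` keeps returning `"waiting"` until either the log grows or the 300 seconds elapse.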
  • the data synchronization method of the distributed storage system provided in this embodiment further includes the following S2114 and S2115:
  • When a log entry is added to the local log file of the primary storage server, the log entries in the local log file of the primary storage server and the log entries in the local log file of the secondary storage server change from a synchronized state to an unsynchronized state.
  • The primary storage server therefore needs to send the log entries added to its local log file to the secondary storage server.
  • The primary storage server sends some or all of the newly added log entries to the secondary storage server as the fourth log information.
  • The secondary storage server can thereby be notified of the latest log entry that has been persisted by most storage servers, so that it reads that log entry and all previous log entries and stores the read content in its own storage engine.
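The choice between the fourth log information and the entry-free second log information can be sketched as follows. The function name, the dictionary message shape, and the `synced_up_to` parameter (the number of the last entry the secondary is known to have written) are hypothetical; log entries are again represented only by their numbers.

```python
from typing import Dict, List


def next_message(local_log: List[int], synced_up_to: int,
                 persisted_number: int) -> Dict[str, object]:
    """Pick what the primary sends once monitoring has produced a result."""
    added = [n for n in local_log if n > synced_up_to]
    if added:
        # New entries appeared: stop timing and send them as fourth log information.
        kind = "fourth_log_information"
    else:
        # Nothing was written: send second log information with no entries.
        kind = "second_log_information"
    return {"kind": kind, "entries": added, "persisted_number": persisted_number}
```

Either message lets the primary pass along the number of the latest entry persisted by most storage servers, so the secondary's storage engine is updated in both cases.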
  • the data synchronization method of the distributed storage system provided in this embodiment further includes the following S2400:
  • The secondary storage server matches the first log information against the first serial number in its own local log file. If the matching fails, the secondary storage server does not write the first log information to its local log file, that is, it does not successfully write the first log information, and it sends a notification of writing failure to the primary storage server.
  • When the primary storage server receives the reply information carrying the notification of writing failure from the secondary storage server, it determines that the secondary storage server has not successfully written the first log information.
  • For how the first log information is matched against the first serial number in the secondary storage server's local log file, refer to the above description of matching by the secondary storage server according to the current latest log entry in its local log file.
  • The primary storage server may instruct the secondary storage server to report the number of the latest log entry in its local log file, that is, the first serial number. It can be understood that the number of the latest log entry in the secondary storage server is the largest number among the log entries in the secondary storage server.
  • Alternatively, the secondary storage server may actively report the number of the latest log entry in its local log file, that is, the first serial number, when sending the notification of writing failure to the primary storage server.
  • When the primary storage server determines that the secondary storage server has not successfully written the first log information and receives the first serial number sent by the secondary storage server, it can learn which log entries have been written to the local log file of the secondary storage server.
  • the data synchronization method of the distributed storage system provided in this embodiment further includes the following S2500:
  • S2500 Send fifth log information to the secondary storage server according to the first serial number, where the fifth log information includes the log entries corresponding to all consecutive serial numbers between the first serial number and the second serial number, and the second serial number is the number corresponding to the log entry with the latest write time in the local log file.
  • Since the second serial number is the number corresponding to the log entry with the latest write time in the local log file, the primary storage server thereby forwards the latest log entry in its local log file to the secondary storage server. Therefore, after the primary storage server determines that the secondary storage server has successfully written the fifth log information, it can continue with the steps of determining whether any data is written and, if no data is written, sending the second log information to the secondary storage server to notify it to store the third log information.
  • The fifth log information includes the log entries corresponding to all consecutive serial numbers between the first serial number and the second serial number, including the log entry corresponding to the second serial number.
  • When it is determined that the secondary storage server has not successfully written the first log information, the primary storage server sends the fifth log information to the secondary storage server. Once the secondary storage server receives the fifth log information and writes its log entries to the local log file, the log entries in the local log file of the primary storage server and those in the local log file of the secondary storage server are synchronized. In the traditional approach, when it is determined that the secondary storage server has not successfully written the first log information, the primary storage server only sends the first log information together with the log entry preceding the smallest-numbered log entry in the first log information. Compared with that approach, the log entries in the local log file of the primary storage server can be synchronized quickly with those in the local log file of the secondary storage server.
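Selecting the fifth log information from the first serial number can be sketched as below. The function name is hypothetical and entries are represented by their numbers. The sketch assumes the range starts just after the first serial number, since the secondary already holds that entry, and ends at the second serial number inclusive, as the text states.

```python
from typing import List


def fifth_log_information(primary_numbers: List[int], first_serial: int) -> List[int]:
    """Entries the secondary is missing, in one consecutive batch.

    primary_numbers: numbers in the primary's local log file, in increasing order.
    first_serial: number of the latest log entry in the secondary.
    """
    if not primary_numbers:
        return []
    second_serial = primary_numbers[-1]  # entry with the latest write time
    return [n for n in primary_numbers if first_serial < n <= second_serial]
```

If the primary holds entries 1 to 6 and the secondary reports a first serial number of 3, the batch is `[4, 5, 6]`, which closes the gap in a single round instead of stepping back one entry at a time.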
  • this embodiment also provides a data synchronization device 30 of a distributed storage system.
  • the device 30 includes a first determining module 31, a second determining module 32, and a sending module 33.
  • The first determining module 31 is configured to determine whether the slave storage server has successfully written the first log information, the first log information including the latest log entry in the local log file of the master storage server.
  • The second determining module 32 is configured to determine, after it is determined that the slave storage server has successfully written the first log information, whether data has been written.
  • The sending module 33 is configured to send, if it is determined that no data has been written, second log information to the slave storage server to notify the slave storage server to store third log information, the second log information containing no log entry; the third log information includes the latest log entry that has currently been persisted by a set number of other slave storage servers.
  • The second determining module 32 is specifically configured to: start a timer when it is determined that the slave storage server has successfully written the first log information; monitor, until the timed duration reaches a preset time, whether a log entry has been added to the local log file; if it is determined that a log entry has been added to the local log file, determine that data has been written; and if it is determined that no log entry has been added to the local log file, determine that no data has been written.
  • the data synchronization device 30 of the distributed storage system provided in this embodiment further includes a second sending module.
  • The second sending module is configured to: at any moment before the timed duration reaches the preset time, if it is detected that a log entry has been added to the local log file, stop the timer; and send fourth log information to the slave storage server, where the fourth log information includes at least the log entry added to the local log file.
  • The second determining module 32 is further configured to: when it is determined that the slave storage server has not successfully written the first log information, receive the first serial number sent by the slave storage server, the first serial number being the number of the latest log entry in the slave storage server.
  • the data synchronization device 30 of the distributed storage system provided in this embodiment further includes a third sending module.
  • The third sending module is configured to: send fifth log information to the slave storage server according to the first serial number, where the fifth log information includes the log entries corresponding to all consecutive serial numbers between the first serial number and the second serial number, and the second serial number is the number of the log entry with the latest write time in the local log file.
  • The first determining module 31 is specifically configured to: receive reply information from the slave storage server; when the reply information is a notification of a successful write, determine that the slave storage server has successfully written the first log information; and when the reply information is a notification of a failed write, determine that the slave storage server has failed to write the first log information.
  • an embodiment of the present application also provides a storage server 40 of a distributed storage system.
  • the storage server 40 includes the data synchronization device 30 of the distributed storage system in the foregoing device embodiment.
  • The storage server of the distributed storage system includes a memory 41 and a processor 42.
  • the memory is used to store executable instructions, and the instructions are used to control the processor to execute the method according to any one of the above method embodiments.
  • the present application also provides a computer storage medium that stores computer instructions, and when the computer instructions in the storage medium are executed by a processor, the method according to any one of the above method embodiments is implemented.
  • a computer program product may include a computer-readable storage medium, which carries computer-readable program instructions for enabling a processor to implement various aspects of the present application.
  • the computer-readable storage medium may be a tangible device that can hold and store instructions used by the instruction execution device.
  • the computer-readable storage medium may be, for example, but not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing.
  • A non-exhaustive list of computer-readable storage media includes: portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disk read-only memory (CD-ROM), digital versatile disks (DVD), memory sticks, floppy disks, and mechanical encoding devices, such as a printer with instructions stored thereon.
  • The computer-readable storage medium used here is not to be interpreted as a transitory signal itself, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media (for example, light pulses through fiber optic cables), or electrical signals transmitted through wires.
  • the computer-readable program instructions described herein can be downloaded from a computer-readable storage medium to various computing/processing devices, or downloaded to an external computer or external storage device via a network, such as the Internet, a local area network, a wide area network, and/or a wireless network.
  • the network may include copper transmission cables, optical fiber transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers.
  • the network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network, and forwards the computer-readable program instructions for storage in the computer-readable storage medium in each computing/processing device.
  • The computer program instructions used to perform the operations of this application may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-related instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, and conventional procedural programming languages such as the "C" language or similar programming languages.
  • Computer-readable program instructions can be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server.
  • The remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (for example, through the Internet using an Internet service provider).
  • An electronic circuit, such as a programmable logic circuit, a field programmable gate array (FPGA), or a programmable logic array (PLA), can be customized by using the state information of the computer-readable program instructions.
  • the computer-readable program instructions are executed to realize various aspects of the present application.
  • These computer-readable program instructions can be provided to the processor of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, produce a device that implements the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams. These computer-readable program instructions can also be stored in a computer-readable storage medium. They cause a computer, a programmable data processing apparatus, and/or other devices to work in a specific manner, so that the computer-readable medium storing the instructions constitutes an article of manufacture that includes instructions implementing various aspects of the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams.
  • Each block in the flowchart or block diagram may represent a module, a program segment, or a part of an instruction, and the module, program segment, or part of an instruction contains one or more executable instructions for realizing the specified logical function.
  • The functions marked in the blocks may also occur in a different order than the order marked in the drawings. For example, two consecutive blocks can actually be executed substantially in parallel, or they can sometimes be executed in the reverse order, depending on the functions involved.
  • Each block in the block diagram and/or flowchart, and each combination of blocks in the block diagram and/or flowchart, can be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions. It is well known to those skilled in the art that implementation by hardware, implementation by software, and implementation by a combination of software and hardware are all equivalent.
  • It is first determined whether the slave storage server has successfully written the first log information. Because the first log information includes the latest log entry in the local log file of the master storage server, once it is determined that the slave storage server has successfully written the first log information, the log entries in the local log file of the master storage server are in sync with those in the local log file of the slave storage server.
  • The slave storage server reads, from its local log file, the latest log entry that has been persisted by a set number of other slave storage servers, together with the log entries before it, and stores them in its storage engine. This avoids a lag in updating the log entries in the storage engine of the slave storage server.
  • The log entries in the storage engine of the slave storage server are then in sync with the log entries in the storage engine of the master storage server.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A data synchronization method, apparatus and device for a distributed storage system, and a storage system. The distributed storage system comprises a master storage server and a slave storage server. The method is implemented by the master storage server, and comprises: determining whether the slave storage server successfully writes first log information, wherein the first log information comprises the latest log entry in a local log file of the master storage server; after it is determined that the first log information has been successfully written, determining whether data is written; and if it is determined that no data is written, sending second log information to the slave storage server to inform the slave storage server of storing third log information, wherein there is no log entry in the second log information, and the third log information includes a current set number of persistent latest log entries of other slave storage servers. By means of the solution, the problem of a log entry in a storage engine of the slave storage server being updated late and failing to be kept synchronous with a log entry in a storage engine of the master storage server can be solved.

Description

Data synchronization method, apparatus and device for distributed storage system, and storage medium
This application claims priority to Chinese patent application No. 201911121853.X, filed with the Chinese Patent Office on November 15, 2019 and entitled "Data synchronization method, apparatus and device for distributed storage system, and storage medium", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of communication technology and, more specifically, to a data synchronization method for a distributed storage system, a data synchronization apparatus for a distributed storage system, a storage server of a distributed storage system, and a computer storage medium.
Background
At present, in a distributed storage system, the master storage server can synchronize the log entries in its local log file to a slave storage server. At the same time, the master storage server also notifies the slave storage server of the latest log entry that has currently been persisted by a majority of the storage servers, so that the slave storage server reads that latest log entry and all log entries before it and stores what it has read in the storage engine corresponding to the slave storage server.
This way of synchronizing data has the following drawback. After the master storage server has synchronized the last log entry in its local log file to the slave storage server, no new data may be written for some time. Because the master storage server interacts with the slave storage server only when it needs to synchronize log entries, it will not interact with the slave storage server during that time. As a result, the master storage server cannot promptly notify the slave storage server of the latest log entry that has currently been persisted by a majority of the storage servers, so the log entries in the storage engine of the slave storage server are updated late and cannot be kept in sync with the log entries in the storage engine of the master storage server.
Summary of the Invention
One purpose of this application is to provide a data synchronization method, apparatus and device for a distributed storage system, and a storage medium, to solve the problem in the related art that the log entries in the storage engine of a slave storage server are updated late and cannot be kept in sync with the log entries in the storage engine of the master storage server.
According to a first aspect of the present application, a data synchronization method for a distributed storage system is provided. The distributed storage system includes a master storage server and a slave storage server, and the method is implemented by the master storage server and includes: determining whether the slave storage server has successfully written first log information, the first log information including the latest log entry in the local log file of the master storage server; after determining that the slave storage server has successfully written the first log information, determining whether data has been written; and, if it is determined that no data has been written, sending second log information to the slave storage server to notify the slave storage server to store third log information, the second log information containing no log entry; wherein the third log information includes the latest log entry that has currently been persisted by a set number of other slave storage servers.
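As an illustration of the last step of the first aspect, the following minimal Python sketch shows one way a master could work out which log entry has been persisted by a set number of other slave storage servers and wrap it in a message that carries no log entries. The `LogMessage` class, its field names, and the idea that the master tracks each slave's latest persisted entry number in a list are assumptions made for this example; the application does not specify a message format or a bookkeeping scheme.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LogMessage:
    """Hypothetical wire format for "log information" sent by the master."""
    entries: List[dict] = field(default_factory=list)  # stays empty for second log information
    persisted_number: int = 0  # number of the entry the slave is told to store up to


def latest_persisted_number(slave_latest_numbers: List[int], set_number: int) -> int:
    """Largest entry number that at least `set_number` slaves have persisted.

    `slave_latest_numbers` is assumed to hold, per other slave, the number of
    the latest entry that slave has persisted. Returns 0 if too few slaves
    have reported.
    """
    if set_number < 1 or len(slave_latest_numbers) < set_number:
        return 0
    # In descending order, the value at position set_number - 1 is reached
    # or exceeded by at least set_number of the reported numbers.
    return sorted(slave_latest_numbers, reverse=True)[set_number - 1]


def build_second_log_information(slave_latest_numbers: List[int], set_number: int) -> LogMessage:
    """Build a message with no log entries that only carries the persisted number."""
    number = latest_persisted_number(slave_latest_numbers, set_number)
    return LogMessage(entries=[], persisted_number=number)
```

With three other slaves reporting 7, 5 and 9 and a set number of 2, the sketch selects entry 7, since two slaves (those at 7 and 9) have persisted it.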
Optionally, determining whether data has been written after determining that the slave storage server has successfully written the first log information includes: starting a timer when it is determined that the slave storage server has successfully written the first log information; monitoring, until the timed duration reaches a preset time, whether a log entry has been added to the local log file; if it is determined that a log entry has been added to the local log file, determining that data has been written; and if it is determined that no log entry has been added to the local log file, determining that no data has been written.
Optionally, the method further includes: at any moment before the timed duration reaches the preset time, if it is detected that a log entry has been added to the local log file, stopping the timer; and sending fourth log information to the slave storage server, wherein the fourth log information includes at least the log entry added to the local log file.
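The timing behavior described in the two optional steps above can be sketched as a small polling loop. This is only an illustration under stated assumptions: the function name `wait_for_write`, the polling interval, and the injectable `clock` and `sleep` parameters are hypothetical, and a real implementation might use a file-system notification or a condition variable instead of polling.

```python
import time
from typing import Callable


def wait_for_write(
    entry_count: Callable[[], int],
    preset_seconds: float,
    poll_seconds: float = 0.01,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True if the local log file grew before the preset time elapsed.

    True  -> data was written: stop timing and send fourth log information.
    False -> no data was written: send second log information (no entries).
    """
    baseline = entry_count()            # entries present when timing starts
    deadline = clock() + preset_seconds
    while clock() < deadline:
        if entry_count() > baseline:    # a log entry was added: stop timing early
            return True
        sleep(poll_seconds)
    return entry_count() > baseline     # final check when the timer expires
```

Passing the clock and sleep functions as parameters keeps the sketch testable without real delays; callers that do not care can rely on the defaults.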
Optionally, the method further includes: when it is determined that the slave storage server has not successfully written the first log information, receiving a first serial number sent by the slave storage server, the first serial number being the number of the latest log entry in the slave storage server.
Optionally, the method further includes: sending fifth log information to the slave storage server according to the first serial number, wherein the fifth log information includes the log entries corresponding to all consecutive serial numbers between the first serial number and a second serial number, the second serial number being the number of the log entry with the latest write time in the local log file.
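A minimal sketch of how the fifth log information might be assembled from the first serial number is given below. The `Entry` tuple and the function name are hypothetical. The sketch also assumes that the slave already holds the entry numbered with the first serial number, so sending starts with the next number; the text only says "between" the two serial numbers and confirms that the second serial number is included.

```python
from typing import List, Tuple

Entry = Tuple[int, str]  # (serial number, payload): a simplified stand-in for a log entry


def fifth_log_information(local_log: List[Entry], first_serial: int) -> List[Entry]:
    """Entries the master sends so that the slave can catch up.

    `local_log` is assumed to be sorted by serial number with no gaps.
    """
    if not local_log:
        return []
    second_serial = local_log[-1][0]  # number of the most recently written entry
    # Assumption: the slave already holds first_serial, so start just after it.
    return [entry for entry in local_log if first_serial < entry[0] <= second_serial]
```

Because every missing entry up to the master's latest one is sent in a single batch, the slave's local log file matches the master's as soon as the batch has been written.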
Optionally, determining whether the slave storage server has successfully written the first log information includes: receiving reply information from the slave storage server; when the reply information is a notification of a successful write, determining that the slave storage server has successfully written the first log information; and when the reply information is a notification of a failed write, determining that the slave storage server has failed to write the first log information.
According to a second aspect of the present application, a data synchronization apparatus for a distributed storage system is provided, including: a first determining module, configured to determine whether the slave storage server has successfully written first log information, the first log information including the latest log entry in the local log file of the master storage server; a second determining module, configured to determine, after it is determined that the slave storage server has successfully written the first log information, whether data has been written; and a sending module, configured to send, if it is determined that no data has been written, second log information to the slave storage server to notify the slave storage server to store third log information, the second log information containing no log entry; wherein the third log information includes the latest log entry that has currently been persisted by a set number of other slave storage servers.
Optionally, the second determining module is specifically configured to: start a timer when it is determined that the slave storage server has successfully written the first log information; monitor, until the timed duration reaches a preset time, whether a log entry has been added to the local log file; if it is determined that a log entry has been added to the local log file, determine that data has been written; and if it is determined that no log entry has been added to the local log file, determine that no data has been written.
According to a third aspect of the present application, a storage server of a distributed storage system is provided, including a processor and a memory; the memory is used to store executable instructions, and the instructions are used to control the processor to perform the method according to any implementation of the first aspect.
According to a fourth aspect of the present application, a computer storage medium is provided. The storage medium stores computer instructions, and when the computer instructions in the storage medium are executed by a processor, the method according to any implementation of the first aspect is implemented.
In this embodiment, it is first determined whether the slave storage server has successfully written the first log information. Because the first log information includes the latest log entry in the local log file of the master storage server, once it is determined that the slave storage server has successfully written the first log information, the log entries in the local log file of the master storage server are in sync with those in the local log file of the slave storage server. At this point, it is determined whether data has been written to the local log file of the master storage server. If it is determined that no data has been written, second log information, which contains no log entry, is sent to the slave storage server to notify it to store third log information. Because the third log information includes the latest log entry that has currently been persisted by a set number of other slave storage servers, the master storage server can still promptly notify the slave storage server of that entry even though the two local log files are already in sync. The slave storage server then reads that log entry and all log entries before it from its local log file and stores them in its storage engine. This prevents the log entries in the storage engine of the slave storage server from being updated late, so that they are in sync with the log entries in the storage engine of the master storage server.
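The slave-side half of this behavior can be sketched as follows. The `SlaveStore` class, the dictionary used as a stand-in for the storage engine, and the `applied` counter that avoids storing the same entry twice are all assumptions for illustration; the application only states that the slave reads the notified entry and all earlier entries from its local log file and stores them in its storage engine.

```python
from typing import Dict, List, Tuple

Entry = Tuple[int, str]  # (entry number, payload): a simplified stand-in for a log entry


class SlaveStore:
    """Hypothetical slave-side state: a local log file plus a storage engine."""

    def __init__(self, local_log: List[Entry]) -> None:
        self.local_log = list(local_log)   # persisted entries, ascending by number
        self.engine: Dict[int, str] = {}   # stand-in for the storage engine
        self.applied = 0                   # highest entry number already stored (assumption)

    def on_second_log_information(self, persisted_number: int) -> int:
        """Store entries up to `persisted_number`; return how many were newly stored."""
        if not self.local_log:
            return 0
        # Never go past what this slave has actually persisted locally.
        upper = min(persisted_number, self.local_log[-1][0])
        stored = 0
        for number, payload in self.local_log:
            if self.applied < number <= upper:
                self.engine[number] = payload
                self.applied = number
                stored += 1
        return stored
```

Because the message carries no log entries, handling it touches only the storage engine and leaves the local log file unchanged.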
Brief Description of the Drawings
To describe the technical solutions of the embodiments of the present application and of the related art more clearly, the drawings used in the embodiments and the related art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a block diagram of the hardware configuration of a storage server in a distributed storage system provided by an embodiment of the present application;
FIG. 2 is a flowchart of a data synchronization method for a distributed storage system provided by an embodiment of the present application;
FIG. 3 is a schematic structural diagram of a data synchronization apparatus for a distributed storage system provided by an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a storage server of a distributed storage system provided by an embodiment of the present application.
Detailed Description
To make the purpose, technical solutions, and advantages of the present application clearer, the application is described in further detail below with reference to the drawings and embodiments. Obviously, the described embodiments are only some of the embodiments of the present application, not all of them. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in this application without creative effort fall within the protection scope of this application.
<Hardware Configuration>
FIG. 1 is a block diagram of the hardware configuration of a storage server in a distributed storage system provided by an embodiment of the present application. The storage server may be the master storage server in the distributed storage system, or it may be a slave storage server in the distributed storage system.
The storage server 1000 may be a virtual machine or a physical machine. The storage server 1000 may include a processor 1100, a memory 1200, an interface device 1300, a communication device 1400, a display device 1500, an input device 1600, a speaker 1700, a microphone 1800, and so on. For example, the processor 1100 may be a central processing unit (CPU), a microprocessor (MCU), or the like. The memory 1200 may include ROM (read-only memory), RAM (random access memory), non-volatile memory such as a hard disk, and the like. The interface device 1300 may include a USB (Universal Serial Bus) interface, a headphone interface, and the like. The communication device 1400 can perform wired or wireless communication. The display device 1500 may include a liquid crystal display, a touch display, and the like. The input device 1600 may include a touch screen, a keyboard, and the like. The user can input and output voice information through the speaker 1700 and the microphone 1800.
Although FIG. 1 shows multiple devices for the storage server 1000, the present application may involve only some of them; for example, the storage server 1000 may involve only the memory 1200 and the processor 1100.
Based on the above description, a skilled person can design instructions according to the solution disclosed in this application. How instructions control a processor to operate is well known in the art and is therefore not described in detail here.
<System Embodiment>
An embodiment of the present application provides a data synchronization method for a distributed storage system, where the distributed storage system includes a master storage server and slave storage servers, and there may be at least one slave storage server. The master storage server and the slave storage servers in the distributed storage system are described as follows:
In the distributed storage system, a user uploads table data to the master storage server through a client. For each item of table data uploaded by the user, the master storage server generates a corresponding log entry and writes it into the local log file of the master storage server (this is also referred to as the master storage server persisting the log entry). In other words, the local log file of the master storage server consists of at least one log entry. It should be emphasized that table data is only one example of the to-be-stored data that a user can upload. The user can also upload other types of data to the master storage server through the client; the specific type can correspond to the storage type of the database in the storage system, and the embodiments of this application place no limitation on it. For clarity, the solution is described below using table data as the example.
It should be noted that the master storage server generates one log entry for each item of table data uploaded by the user. The log entry corresponding to an item of table data includes: the uploaded table data, the number of the generated log entry, and the term number of the master storage server. In addition, the numbers of the log entries generated by the master storage server are increasing, for example strictly consecutively increasing; that is, the log entry with the largest number in the master storage server is the most recently generated log entry in the master storage server. Here, the uploaded table data is the data to be stored, the number of the generated log entry is the number assigned to the log entry when it is generated, and the term number of the master storage server can indicate the period during which that server is valid as the master.
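The three fields of a log entry and the strictly consecutive numbering can be sketched as follows. The class and field names are hypothetical, the in-memory list stands in for the local log file, and starting the numbering at 1 is an assumption; the application does not state the first number.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass(frozen=True)
class LogEntry:
    number: int  # strictly increasing entry number
    term: int    # term number of the master that generated the entry
    data: Any    # the uploaded data to be stored (for example, table data)


class MasterLog:
    """Hypothetical in-memory stand-in for the master's local log file."""

    def __init__(self, term: int) -> None:
        self.term = term
        self.entries: List[LogEntry] = []

    def append(self, data: Any) -> LogEntry:
        """Generate one log entry for one uploaded item and persist it."""
        # Assumption: numbering starts at 1 and increases by exactly 1.
        next_number = self.entries[-1].number + 1 if self.entries else 1
        entry = LogEntry(number=next_number, term=self.term, data=data)
        self.entries.append(entry)
        return entry
```

With this numbering, the last element of `entries` is always the entry with the largest number, which matches the statement that the largest number identifies the most recently generated log entry.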
After the primary storage server writes log entries to its local log file, it obtains from the local log file, in ascending and consecutive order of number, a batch of log entries to be synchronized to the secondary storage server, and forwards the obtained log entries to the secondary storage server.
When the secondary storage server receives the log entries sent by the primary storage server, it performs matching against the current latest log entry in its own local log file (that is, the log entry with the largest number). If the secondary storage server determines that the match succeeds, it writes the log entries forwarded by the primary storage server into its local log file (this may also be described as the secondary storage server persisting the log entries) and sends a write-success notification to the primary storage server. Correspondingly, if the secondary storage server determines that the match fails, it does not write the log entries forwarded by the primary storage server into its local log file, and sends a write-failure notification to the primary storage server. For example, the matching performed by the secondary storage server against the current latest log entry in its local log file may include: checking whether the smallest number among the received log entries is the number immediately following that of the current latest log entry in the local log file, and if so, determining a match, otherwise determining a mismatch; or checking whether the numbers of the received log entries increase consecutively and include the number immediately following that of the current latest log entry in the local log file, and if so, determining a match, otherwise determining a mismatch.
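The two example matching rules can be sketched as follows. The function names and the string reply values are hypothetical, chosen only for illustration; the sketch works on entry numbers alone and assumes the received batch is given in the order it was sent.

```python
from typing import Sequence


def matches_by_smallest(local_last: int, batch: Sequence[int]) -> bool:
    """Rule 1: the smallest received number is the one right after local_last."""
    return len(batch) > 0 and min(batch) == local_last + 1


def matches_by_containment(local_last: int, batch: Sequence[int]) -> bool:
    """Rule 2: received numbers increase consecutively and include local_last + 1."""
    if len(batch) == 0:
        return False
    consecutive = all(b - a == 1 for a, b in zip(batch, batch[1:]))
    return consecutive and (local_last + 1) in batch


def reply_for(local_last: int, batch: Sequence[int]) -> str:
    # Hypothetical reply values standing in for the two notifications.
    if matches_by_smallest(local_last, batch):
        return "write_success"
    return "write_failure"
```

For instance, a secondary whose latest entry is number 5 accepts a batch numbered 6, 7, 8 under rule 1 but rejects a batch that starts at 7, because entry 6 would be missing.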
After receiving the write-success notification, if the local log file of the primary storage server still contains log entries that have not been forwarded to the secondary storage server, the primary storage server obtains from the local log file the next batch of log entries to be forwarded and forwards them to the secondary storage server; at the same time, it informs the secondary storage server of the latest log entry that has currently been persisted by a majority of the storage servers. For example, the majority of storage servers may include the primary storage server and other secondary storage servers, where the other secondary storage servers are storage servers other than the above-mentioned secondary storage server and the primary storage server.
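The application does not spell out how the primary storage server works out which entry a majority has persisted. One common approach, shown here purely as an assumption, is to take the highest persisted number reported by each server (the primary included) and pick the largest number that more than half of them have reached. The function name `majority_persisted_index` is hypothetical.

```python
from typing import Sequence


def majority_persisted_index(persisted: Sequence[int]) -> int:
    """Largest entry number persisted by more than half of the servers.

    `persisted` holds, per server (primary included), the number of the
    latest log entry that server has written to its local log file.
    """
    if len(persisted) == 0:
        raise ValueError("at least one server is required")
    majority = len(persisted) // 2 + 1
    # After sorting in descending order, the value at position majority - 1
    # has been reached by exactly `majority` servers or more.
    return sorted(persisted, reverse=True)[majority - 1]
```

With three servers that have persisted up to entries 9, 7, and 5, the result is 7: two of the three servers hold every entry up to 7.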
<Method Embodiment>
An embodiment of the present application provides a data synchronization method for a distributed storage system, where the distributed storage system includes a primary storage server and a secondary storage server. As shown in FIG. 2, the method is performed by the primary storage server in the distributed storage system, that is, the method is applied to the primary storage server, and includes the following S2100-S2300:
S2100. Determine whether the secondary storage server has successfully written first log information, where the first log information includes the latest log entry in the log of the primary storage server.
In this embodiment, the first log information is the batch of log entries most recently forwarded by the primary storage server to the secondary storage server, that is, the last batch of log entries forwarded in the current synchronization process, which contains the latest log entry. Accordingly, the first log information includes the latest log entry in the local log file of the primary storage server. In addition, the first log information may contain one log entry or multiple log entries. It can be understood that the latest log entry in the local log file of the primary storage server is the log entry with the largest number in that file.
In one embodiment, S2100 may be determined according to reply information sent by the secondary storage server. Accordingly, S2100 may be implemented through the following S2110-S2112:
S2110. Receive reply information from the secondary storage server.
In this embodiment, the reply information is generated by the secondary storage server upon receiving the first log information. When the secondary storage server successfully writes the first log information into its local log file, it sends a write-success notification to the primary storage server.
Correspondingly, when the secondary storage server fails to write the first log information into its local log file, it sends a write-failure notification to the primary storage server.
It follows that the reply information from the secondary storage server may be either a write-success notification or a write-failure notification.
S2111. When the reply information is a write-success notification, determine that the secondary storage server has successfully written the first log information.
In this embodiment, a write-success notification indicates that the secondary storage server has written the first log information into its local log file, so it can be determined that the secondary storage server has successfully written the first log information.
S2112. When the reply information is a write-failure notification, determine that the secondary storage server has failed to write the first log information.
In this embodiment, a write-failure notification indicates that the secondary storage server has not successfully written the first log information into its local log file, so it can be determined that the secondary storage server has failed to write the first log information.
S2200. After determining that the secondary storage server has successfully written the first log information, determine whether any data has been written.
In this embodiment, when the secondary storage server has successfully written the first log information, the primary storage server has sent all log entries in its local log file to the secondary storage server, and the secondary storage server has written all of them into its own local log file. At this point, the log entries in the local log file of the primary storage server are synchronized with those in the local log file of the secondary storage server.
In this embodiment, after determining that the secondary storage server has successfully written the first log information, the primary storage server determines whether any data has been written, so as to determine whether the log entries in its local log file and those in the local log file of the secondary storage server have changed from a synchronized state to an unsynchronized state. In other words, determining whether data has been written may mean determining whether log entries have been added to the local log file of the primary storage server since the point in time at which the secondary storage server successfully wrote the first log information. Specifically, it may be determined whether data has been written within a predetermined period. The predetermined period can be set according to the actual situation, for example according to how long the distributed storage system can tolerate the log entries in the storage engine lagging behind.
S2300. If it is determined that no data has been written, send second log information to the secondary storage server to notify it to store third log information into the storage engine of the secondary storage server.
The second log information contains no log entries, and the third log information contains the latest log entry that has currently been persisted by a set number of other secondary storage servers. It should be noted that although the second log information contains no log entries, to ensure that the secondary storage server can store the third log information into its storage engine, the second log information may carry the number of the latest log entry currently persisted by the set number of other secondary storage servers.
In this embodiment, the set number of other storage servers together with the primary storage server constitute a majority of the storage servers in the distributed storage system. If it is determined that no data has been written, the log entries in the local log file of the primary storage server are still synchronized with those in the local log file of the secondary storage server. In this case, the primary storage server sends second log information containing no log entries to the secondary storage server, to notify it to store the third log information into its storage engine.
Because the third log information contains the latest log entry currently persisted by the set number of other secondary storage servers, once the secondary storage server obtains the third log information it knows which log entry that is. The secondary storage server then reads from its local log file the log entry contained in the third log information, together with all log entries before it, and stores them in its own storage engine. It should be noted that the secondary storage server may read all or only part of the content of these log entries and store what it has read in the storage engine. For example, when a log entry includes the uploaded table data, the number of the log entry, and the term number of the primary storage server, the partial content may be the uploaded table data alone, or the uploaded table data together with the number of the log entry.
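The secondary-side handling of the second log information can be sketched as below. This is an illustrative sketch under stated assumptions: the class `Secondary`, the method `on_second_log_info`, the dictionary used as a stand-in for the storage engine, and the `last_applied` bookkeeping (used to avoid storing the same entry twice) are all hypothetical and not taken from the application.

```python
from typing import Any, Dict, List, Tuple

Entry = Tuple[int, int, Any]  # (number, term number, uploaded data)


class Secondary:
    def __init__(self, local_log: List[Entry]) -> None:
        self.local_log = local_log        # persisted entries, ascending number
        self.engine: Dict[int, Any] = {}  # stand-in for the storage engine
        self.last_applied = 0             # assumption: 0 means nothing stored yet

    def on_second_log_info(self, persisted_index: int) -> int:
        """Store entries up to persisted_index; return how many were stored."""
        stored = 0
        for index, _term, data in self.local_log:
            if self.last_applied < index <= persisted_index:
                # Only part of the entry (the uploaded data) is stored here.
                self.engine[index] = data
                self.last_applied = index
                stored += 1
        return stored
```

Because the message carries only a number and no entries, the secondary can bring its storage engine up to date from its own local log file without any further data transfer.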
In addition, each storage server uses its own storage engine to create, query, update, and delete data. For example, storage engines in MySQL include, but are not limited to, the MyISAM, InnoDB, MEMORY, and ARCHIVE storage engines.
In this embodiment, it is first determined whether the secondary storage server has successfully written the first log information. Because the first log information includes the latest log entry in the local log file of the primary storage server, once it is determined that the secondary storage server has successfully written the first log information, the log entries in the local log file of the primary storage server are synchronized with those in the local log file of the secondary storage server. It is then determined whether any data has been written to the local log file of the primary storage server. If no data has been written, second log information, which contains no log entries, is sent to the secondary storage server to notify it to store the third log information. Because the third log information contains the latest log entry currently persisted by the set number of other secondary storage servers, the primary storage server can still promptly inform the secondary storage server of that entry even though the two local log files are already synchronized. The secondary storage server then reads that latest log entry and all log entries before it from its local log file and stores them in its storage engine. This prevents the log entries in the storage engine of the secondary storage server from lagging behind, so that they are synchronized with the log entries in the storage engine of the primary storage server.
In one embodiment, S2200 may be implemented through the following S2210-S2213:
S2210. Start timing when it is determined that the secondary storage server has successfully written the first log information.
In one embodiment, S2210 may be implemented as follows: the primary storage server starts timing upon receiving the write-success notification from the secondary storage server.
S2211. Until the elapsed time reaches a preset time, monitor whether log entries have been added to the local log file.
In one embodiment, the preset time may be 5 minutes or another value, which is not limited in this embodiment. The period until the elapsed time reaches the preset time is the interval from the start of timing to the preset time.
In one embodiment, during this period, the primary storage server may monitor whether log entries have been added to the local log file by checking whether the size of the local log file has grown relative to its size at the moment timing started.
In another embodiment, because the primary storage server generates a corresponding log entry for table data uploaded by the client and writes it to the local log file, the primary storage server may, during this period, monitor whether log entries have been added to the local log file by checking whether it has received table data uploaded by the client.
S2212. If it is determined that log entries have been added to the local log file, determine that data has been written.
In one embodiment, when the size of the local log file has grown relative to its size at the moment timing started, it can be determined that log entries have been added to the local log file, and therefore that data has been written.
In another embodiment, when the primary storage server detects that it has received table data uploaded by the client, it can be determined that log entries have been added to the local log file, and therefore that data has been written.
S2213. If it is determined that no log entries have been added to the local log file, determine that no data has been written.
In one embodiment, when the size of the local log file is unchanged relative to its size at the moment timing started, it can be determined that no log entries have been added to the local log file, and therefore that no data has been written.
In another embodiment, when the primary storage server detects that it has not received any table data uploaded by the client, it can be determined that no log entries have been added to the local log file, and therefore that no data has been written.
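The timing-based check can be sketched as a polling loop. The function name `has_data_written`, the polling interval, and the use of an entry count (rather than a file size in bytes) as the growth signal are assumptions made for this illustration; the clock and sleep functions are passed in so the loop can be exercised without waiting for a real preset time.

```python
import time
from typing import Callable


def has_data_written(
    entry_count: Callable[[], int],
    preset_seconds: float,
    poll_seconds: float = 0.01,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True if the log grew before the preset time elapsed."""
    baseline = entry_count()            # log size when timing starts
    deadline = clock() + preset_seconds
    while clock() < deadline:           # monitor until the preset time
        if entry_count() > baseline:
            return True                 # an entry was added: data was written
        sleep(poll_seconds)
    # One last check at the deadline; unchanged means no data was written.
    return entry_count() > baseline
```

Returning as soon as growth is seen means the caller does not have to wait out the full preset time before reacting to a new upload.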
On the basis of S2210-S2213 above, the data synchronization method for a distributed storage system provided in this embodiment further includes the following S2114 and S2115:
S2114. At any moment before the elapsed time reaches the preset time, if it is detected that log entries have been added to the local log file, stop timing.
In this embodiment, if it is detected at any moment before the elapsed time reaches the preset time that log entries have been added to the local log file, the log entries in the local log file of the primary storage server and those in the local log file of the secondary storage server have changed from a synchronized state to an unsynchronized state.
S2115. Send fourth log information to the secondary storage server, where the fourth log information contains at least the log entries added to the local log file.
In this embodiment, because the two local log files have changed from a synchronized state to an unsynchronized state, the primary storage server needs to send the added log entries to the secondary storage server. The primary storage server therefore takes some or all of the newly added log entries as the fourth log information and sends it to the secondary storage server. It should be noted that detecting added log entries at any moment before the elapsed time reaches the preset time indicates that new data was written after the primary storage server had synchronized the last log entry in its local log file to the secondary storage server. Therefore, while sending the fourth log information, the primary storage server may also inform the secondary storage server of the latest log entry currently persisted by a majority of the storage servers, so that the secondary storage server reads that latest log entry and all log entries before it and stores the content it has read in its storage engine.
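The primary's choice between an empty notification and a message carrying new entries can be sketched as follows. The message layout (a dictionary with `kind`, `entries`, and `persisted_index` keys) and the function name `next_message` are hypothetical. The sketch sends all newly added entries, although the text above also allows sending only some of them.

```python
from typing import Any, Dict, List, Tuple

Entry = Tuple[int, int, Any]  # (number, term number, uploaded data)


def next_message(
    log: List[Entry], synced_index: int, persisted_index: int
) -> Dict[str, Any]:
    """Build the next message for a secondary that holds entries up to synced_index."""
    new_entries = [e for e in log if e[0] > synced_index]
    # Entries were added: fourth log information. Otherwise: second log
    # information, which carries no entries.
    kind = "fourth_log_info" if new_entries else "second_log_info"
    return {
        "kind": kind,
        "entries": new_entries,
        # Number of the latest entry persisted by a majority of servers.
        "persisted_index": persisted_index,
    }
```

Either way the message carries the majority-persisted number, so the secondary can update its storage engine whether or not new entries arrive.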
In one embodiment, corresponding to S2300 above, the data synchronization method for a distributed storage system provided in this embodiment further includes the following S2400:
S2400. When it is determined that the secondary storage server has not successfully written the first log information, receive a first sequence number sent by the secondary storage server, where the first sequence number is the number of the latest log entry in the secondary storage server.
In this embodiment, after receiving the first log information, the secondary storage server matches the first log information against the first sequence number in its own local log file. If the match fails, the secondary storage server does not write the first log information into its local log file, that is, the first log information is not successfully written, and the secondary storage server sends a write-failure notification to the primary storage server. When the primary storage server receives this write-failure notification as the reply information, it determines that the secondary storage server has not successfully written the first log information. For how the first log information is matched against the first sequence number, refer to the description above of how the secondary storage server performs matching against the current latest log entry in its local log file.
In one embodiment, upon determining that the secondary storage server has not successfully written the first log information, the primary storage server may instruct the secondary storage server to report the number of the latest log entry in its local log file, that is, the first sequence number. It can be understood that the number of the latest log entry in the secondary storage server is the largest log entry number in the secondary storage server.
In another embodiment, the secondary storage server may proactively report the number of the latest log entry in its own local log file, that is, the first sequence number, when sending the write-failure notification to the primary storage server.
In this embodiment, by receiving the first sequence number from the secondary storage server after determining that the first log information was not successfully written, the primary storage server learns which log entries have been written to the local log file of the secondary storage server.
On the basis of S2400 above, the data synchronization method for a distributed storage system provided in this embodiment further includes the following S2500:
S2500. According to the first sequence number, send fifth log information to the secondary storage server, where the fifth log information includes the log entries corresponding to all consecutive sequence numbers between the first sequence number and a second sequence number, and the second sequence number is the number of the most recently written log entry in the local log file.
In this embodiment, because the second sequence number is the number of the most recently written log entry in the local log file, the primary storage server is in effect forwarding the latest log entry in its local log file to the secondary storage server. Therefore, once the primary storage server determines that the secondary storage server has successfully written the fifth log information, it can go on to determine whether any data has been written and, if not, send the second log information to the secondary storage server to notify it to store the third log information.
In this embodiment, the log entries included in the fifth log information, namely those corresponding to all consecutive sequence numbers between the first sequence number and the second sequence number, include the log entry corresponding to the second sequence number.
In this embodiment, when it is determined that the secondary storage server has not successfully written the first log information, the primary storage server sends the fifth log information to the secondary storage server. Once the secondary storage server, having received the first log information, writes the log entries in the fifth log information into its local log file, the log entries in the local log files of the primary and secondary storage servers are synchronized. In the conventional approach, by contrast, when it is determined that the secondary storage server has not successfully written the first log information, the primary storage server sends only the first log information together with the log entry immediately preceding the smallest-numbered entry in the first log information. Compared with that approach, this embodiment synchronizes the two local log files more quickly.
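Selecting the fifth log information can be sketched as a range query over the primary's log. The function name `fifth_log_info` is hypothetical. The text states that the entry for the second sequence number is included; whether the entry for the first sequence number is resent is not stated, and this sketch assumes it is not, because the secondary already holds it.

```python
from typing import Any, List, Tuple

Entry = Tuple[int, int, Any]  # (number, term number, uploaded data)


def fifth_log_info(log: List[Entry], first_seq: int) -> List[Entry]:
    """Entries the secondary is missing, given its reported latest number."""
    # Second sequence number: number of the most recently written entry.
    second_seq = log[-1][0] if log else 0
    # Assumption: exclusive of first_seq, inclusive of second_seq.
    return [e for e in log if first_seq < e[0] <= second_seq]
```

A single reply from the secondary is enough to pick the whole missing range, which is the source of the speed-up over stepping back one entry at a time.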
<Apparatus Embodiment>
如图3所示,本实施例还提供了一种分布式存储系统的数据同步装置30,该装置30包括第一确定模块31、第二确定模块32、发送模块33。其中:第一确定模块31,用于确定所述从存储服务器是否成功写入第一日志信息,所述第一日志信息包括所述主存储服务器的本地日志文件中最新一条日志条目;第二确定模块32,用于当确定所述从存储服务器已成功写入所述第一日志信息后,确定是否有数据写入;发送模块33,用于如果确定没有数据写入,向所述从存储服务器发送第二日志信息,以通知所述从存储服务器存储第三日志信息,所述第二日志信息不具有日志条目;其中,所述第三日志信息包含当前已被设定数量的其他从存储服务器持久化的最 新日志条目。As shown in FIG. 3, this embodiment also provides a data synchronization device 30 of a distributed storage system. The device 30 includes a first determining module 31, a second determining module 32, and a sending module 33. Wherein: the first determination module 31 is configured to determine whether the slave storage server successfully writes the first log information, the first log information includes the latest log entry in the local log file of the master storage server; the second determination The module 32 is used for determining whether there is data writing after it is determined that the secondary storage server has successfully written the first log information; the sending module 33 is used for sending data to the secondary storage server if it is determined that there is no data writing. Sending the second log information to notify the secondary storage server to store the third log information, the second log information does not have log entries; wherein the third log information includes the currently set number of other secondary storage servers The latest persistent log entry.
在一个实施例中,所述第二确定模块32,具体用于:在确定所述从服务器已成功写入所述第一日志信息时,启动计时;在计时时间到达预设时间的期间,监测所述本地日志文件中是否增加有日志条目;如果确定所述本地日志文件中增加有日志条目,则确定有数据写入;如果确定所述本地日志文件中未增加有日志条目,则确定没有数据写入。In one embodiment, the second determining module 32 is specifically configured to: start timing when it is determined that the slave server has successfully written the first log information; and monitor when the timing time reaches a preset time Whether there are log entries added to the local log file; if it is determined that there are log entries added to the local log file, it is determined that data is written; if it is determined that there are no log entries added to the local log file, it is determined that there is no data Write.
在一个实施例中,本实施例提供的分布式存储系统的数据同步装置30还包括第二发送模块。其中,第二发送模块用于:在所述计时时间到达预设时间之前的任意时刻,如果监测到所述本地日志文件中增加有日志条目,则停止计时;向所述从存储服务器发送第四日志信息,其中,所述第四日志信息至少包含所述本地日志文件中增加的日志条目。In an embodiment, the data synchronization device 30 of the distributed storage system provided in this embodiment further includes a second sending module. Wherein, the second sending module is configured to: at any time before the timing time reaches the preset time, if it is detected that a log entry is added to the local log file, stop timing; Log information, where the fourth log information includes at least log entries added to the local log file.
在一个实施例中,第二确定模块32还用于:当确定所述从存储服务器没有成功写入第一日志信息时,接收所述从存储服务器发送的第一序列号,所述第一序列号为所述从存储服务器中最新的日志条目的编号。In one embodiment, the second determining module 32 is further configured to: when it is determined that the slave storage server has not successfully written the first log information, receive the first sequence number sent by the slave storage server, and the first sequence The number is the number of the latest log entry from the storage server.
In one embodiment, the data synchronization apparatus 30 of the distributed storage system provided in this embodiment further includes a third sending module. The third sending module is configured to send fifth log information to the secondary storage server according to the first sequence number, where the fifth log information includes the log entries corresponding to all consecutive sequence numbers from the first sequence number to a second sequence number, and the second sequence number is the number of the most recently written log entry in the local log file.
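The construction of the fifth log information can be sketched as below. The sketch assumes that sequence numbers are consecutive integers and that the secondary storage server already holds the entry numbered by the first sequence number, so only later entries are sent; whether the range includes the first sequence number itself is not specified in the text. The function name and the dictionary representation of the local log file are hypothetical.

```python
from typing import Dict, List, Tuple


def build_fifth_log_info(local_log: Dict[int, str], first_seq: int) -> List[Tuple[int, str]]:
    """Collect the entries the secondary server is missing.

    `local_log` maps sequence number -> entry on the primary server.
    `first_seq` is the number of the newest entry on the secondary server.
    """
    if not local_log:
        return []
    # Second sequence number: the most recently written entry on the primary.
    second_seq = max(local_log)
    # Assumption: entries up to first_seq are already present on the secondary.
    return [(seq, local_log[seq]) for seq in range(first_seq + 1, second_seq + 1)]
```

For a primary log holding entries 1 to 4 and a first sequence number of 2, the sketch returns entries 3 and 4 in order.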
In one embodiment, the first determining module 31 is specifically configured to: receive reply information from the secondary storage server; determine that the secondary storage server has successfully written the first log information when the reply information is a notification of a successful write; and determine that the secondary storage server has failed to write the first log information when the reply information is a notification of a failed write.
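How the primary storage server might branch on the reply information is sketched below. The `Reply` type, its fields, and the returned action names are hypothetical labels chosen for illustration; the application only states that the reply is a notification of success or failure and that a first sequence number is received after a failure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Reply:
    success: bool                     # True for a write-success notification
    latest_seq: Optional[int] = None  # first sequence number, sent after a failure


def handle_reply(reply: Reply) -> str:
    """Map a reply from the secondary server to the primary server's next action."""
    if reply.success:
        # Logs are in sync: start timing and watch for new local writes.
        return "start_timer"
    if reply.latest_seq is None:
        # Failure reported, but the first sequence number has not arrived yet.
        return "await_first_sequence_number"
    # Failure with a known position: send the missing entries.
    return "send_fifth_log_information"
```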
<Device Embodiment>
As shown in FIG. 4, an embodiment of the present application further provides a storage server 40 of a distributed storage system. The storage server 40 includes the data synchronization apparatus 30 of the foregoing apparatus embodiment. Alternatively, the storage server includes a memory 41 and a processor 42, where the memory is configured to store executable instructions, and the instructions are used to control the processor to perform the method according to any one of the foregoing method embodiments.
<Computer Storage Medium>
The present application further provides a computer storage medium storing computer instructions. When the computer instructions in the storage medium are executed by a processor, the method according to any one of the foregoing method embodiments is implemented.
In yet another embodiment of the present application, a computer program product is further provided. The computer program product may include a computer-readable storage medium carrying computer-readable program instructions for causing a processor to implement aspects of the present application.
In yet another embodiment of the present application, a computer program product containing instructions is further provided. When run on a computer, it causes the computer to perform the steps of any data synchronization method in the foregoing embodiments.
A computer-readable storage medium may be a tangible device that can hold and store instructions for use by an instruction execution device. The computer-readable storage medium may be, for example, but is not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of computer-readable storage media include: a portable computer diskette, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disc read-only memory (CD-ROM), a digital versatile disc (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch cards or raised structures in a groove having instructions stored thereon, and any suitable combination of the foregoing. A computer-readable storage medium, as used herein, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (for example, light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.
The computer-readable program instructions described herein can be downloaded from a computer-readable storage medium to respective computing/processing devices, or downloaded to an external computer or external storage device via a network such as the Internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, optical fiber transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers. A network adapter card or network interface in each computing/processing device receives the computer-readable program instructions from the network and forwards them for storage in a computer-readable storage medium within the respective computing/processing device.
The computer program instructions for carrying out the operations of the present application may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, and conventional procedural programming languages such as the "C" language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, an electronic circuit, such as a programmable logic circuit, a field-programmable gate array (FPGA), or a programmable logic array (PLA), is personalized by using state information of the computer-readable program instructions, and the electronic circuit can execute the computer-readable program instructions to implement aspects of the present application.
Aspects of the present application are described herein with reference to flowcharts and/or block diagrams of methods, apparatuses (systems), and computer program products according to embodiments of the present application. It should be understood that each block of the flowcharts and/or block diagrams, and combinations of blocks in the flowcharts and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams. These computer-readable program instructions may also be stored in a computer-readable storage medium and cause a computer, a programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer-readable medium storing the instructions comprises an article of manufacture including instructions that implement aspects of the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
The computer-readable program instructions may also be loaded onto a computer, another programmable data processing apparatus, or another device, causing a series of operational steps to be performed on the computer, other programmable data processing apparatus, or other device to produce a computer-implemented process, such that the instructions executed on the computer, other programmable data processing apparatus, or other device implement the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to multiple embodiments of the present application. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of instructions, which contains one or more executable instructions for implementing the specified logical function. In some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functionality involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or acts, or by a combination of dedicated hardware and computer instructions. It is well known to those skilled in the art that implementation by hardware, implementation by software, and implementation by a combination of software and hardware are all equivalent.
The embodiments of the present application have been described above. The foregoing description is exemplary, not exhaustive, and is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, their practical application, or their technical improvement over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein. The scope of the present application is defined by the appended claims.
Industrial Applicability
In the solution provided by the present application, it is first determined whether the secondary storage server has successfully written the first log information. Because the first log information includes the latest log entry in the local log file of the primary storage server, once it is determined that the secondary storage server has successfully written the first log information, the log entries in the local log file of the primary storage server are synchronized with the log entries in the local log file of the secondary storage server. At this point, it is determined whether any data is being written to the local log file of the primary storage server. If no data is being written, second log information, which contains no log entries, is sent to the secondary storage server to notify it to store third log information. Because the third log information contains the latest log entry that has currently been persisted by a set number of other secondary storage servers, the primary storage server can still promptly inform the secondary storage server of that entry even though the two local log files are already synchronized and there is no new log entry to send. Further, the secondary storage server reads that latest log entry, together with all log entries preceding it, from its local log file and stores them in its storage engine. This prevents the log entries in the storage engine of the secondary storage server from lagging behind, so that they remain synchronized with the log entries in the storage engine of the primary storage server.
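The secondary-side behavior summarized above can be illustrated with a minimal sketch. The class, its in-memory dictionaries standing in for the local log file and the storage engine, and the `quorum_seq` argument carrying the third log information are assumptions made for illustration; consecutive integer sequence numbers are also assumed.

```python
from typing import Dict, List, Tuple


class SecondaryServer:
    """Hypothetical secondary storage server with a local log and a storage engine."""

    def __init__(self) -> None:
        self.local_log: Dict[int, str] = {}  # sequence number -> entry (local log file)
        self.engine: Dict[int, str] = {}     # entries already stored in the storage engine
        self.applied_seq = 0                 # newest sequence number stored in the engine

    def on_log_information(self, entries: List[Tuple[int, str]], quorum_seq: int) -> None:
        # Append any entries carried by the message (none for the second log information).
        for seq, entry in entries:
            self.local_log[seq] = entry
        # Never apply past what is actually present in the local log file.
        newest_local = max(self.local_log, default=0)
        target = min(quorum_seq, newest_local)
        # Move the quorum-persisted entry and everything before it into the engine.
        for seq in range(self.applied_seq + 1, target + 1):
            self.engine[seq] = self.local_log[seq]
        self.applied_seq = max(self.applied_seq, target)
```

In this sketch an empty message with a larger `quorum_seq` is enough to advance the storage engine, which is the lag the solution is intended to avoid.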

Claims (11)

  1. A data synchronization method for a distributed storage system, the distributed storage system comprising a primary storage server and a secondary storage server, the method being performed by the primary storage server and comprising:
    determining whether the secondary storage server has successfully written first log information, the first log information comprising the latest log entry in the local log file of the primary storage server;
    after it is determined that the secondary storage server has successfully written the first log information, determining whether any data is being written;
    if it is determined that no data is being written, sending second log information to the secondary storage server to notify the secondary storage server to store third log information, the second log information containing no log entries;
    wherein the third log information contains the latest log entry that has currently been persisted by a set number of other secondary storage servers.
  2. The method according to claim 1, wherein determining whether any data is being written after it is determined that the secondary storage server has successfully written the first log information comprises:
    starting a timer when it is determined that the secondary storage server has successfully written the first log information;
    monitoring, until the timer reaches a preset time, whether any log entry is added to the local log file;
    if it is determined that a log entry has been added to the local log file, determining that data has been written;
    if it is determined that no log entry has been added to the local log file, determining that no data has been written.
  3. The method according to claim 2, wherein the method further comprises:
    at any moment before the timer reaches the preset time, stopping the timer if a log entry is detected to have been added to the local log file;
    sending fourth log information to the secondary storage server, wherein the fourth log information contains at least the log entries added to the local log file.
  4. The method according to any one of claims 1-3, wherein the method further comprises:
    when it is determined that the secondary storage server has not successfully written the first log information, receiving a first sequence number sent by the secondary storage server, the first sequence number being the number of the latest log entry on the secondary storage server.
  5. The method according to claim 4, wherein the method further comprises:
    sending fifth log information to the secondary storage server according to the first sequence number, wherein the fifth log information comprises the log entries corresponding to all consecutive sequence numbers from the first sequence number to a second sequence number;
    wherein the second sequence number is the number of the most recently written log entry in the local log file.
  6. The method according to any one of claims 1-3, wherein determining whether the secondary storage server has successfully written the first log information comprises:
    receiving reply information from the secondary storage server;
    when the reply information is a notification of a successful write, determining that the secondary storage server has successfully written the first log information;
    when the reply information is a notification of a failed write, determining that the secondary storage server has failed to write the first log information.
  7. A data synchronization apparatus for a distributed storage system, comprising:
    a first determining module, configured to determine whether the secondary storage server has successfully written first log information, the first log information comprising the latest log entry in the local log file of the primary storage server;
    a second determining module, configured to determine, after it is determined that the secondary storage server has successfully written the first log information, whether any data is being written;
    a sending module, configured to, if it is determined that no data is being written, send second log information to the secondary storage server to notify the secondary storage server to store third log information, the second log information containing no log entries;
    wherein the third log information contains the latest log entry that has currently been persisted by a set number of other secondary storage servers.
  8. The apparatus according to claim 7, wherein the second determining module is configured to:
    start a timer when it is determined that the secondary storage server has successfully written the first log information;
    monitor, until the timer reaches a preset time, whether any log entry is added to the local log file;
    if it is determined that a log entry has been added to the local log file, determine that data has been written;
    if it is determined that no log entry has been added to the local log file, determine that no data has been written.
  9. A storage server of a distributed storage system, comprising a processor and a memory;
    wherein the memory is configured to store executable instructions, and the instructions are used to control the processor to perform the method according to any one of claims 1-6.
  10. A computer storage medium storing computer instructions, wherein, when the computer instructions in the storage medium are executed by a processor, the method according to any one of claims 1-6 is implemented.
  11. A computer program product which, when run on a computer, causes the computer to perform the method according to any one of claims 1-6.
PCT/CN2020/127873 2019-11-15 2020-11-10 Data synchronization method, apparatus and device for distributed storage system, and storage medium WO2021093735A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911121853.XA CN112822227B (en) 2019-11-15 2019-11-15 Data synchronization method, device, equipment and storage medium of distributed storage system
CN201911121853.X 2019-11-15

Publications (1)

Publication Number Publication Date
WO2021093735A1 true WO2021093735A1 (en) 2021-05-20

Family

ID=75852186

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/127873 WO2021093735A1 (en) 2019-11-15 2020-11-10 Data synchronization method, apparatus and device for distributed storage system, and storage medium

Country Status (2)

Country Link
CN (1) CN112822227B (en)
WO (1) WO2021093735A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114661339A (en) * 2022-05-26 2022-06-24 浙江所托瑞安科技集团有限公司 Method and device for automatically submitting local data to remote server
CN116166477A (en) * 2022-11-30 2023-05-26 贵州华谊联盛科技有限公司 Dual-activity gateway system and method for storing different brands of objects

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105912628A (en) * 2016-04-07 2016-08-31 北京奇虎科技有限公司 Synchronization method and device for master database and slave database
CN106648994A (en) * 2017-01-04 2017-05-10 华为技术有限公司 Method, equipment and system for backup operation on log
CN108616598A (en) * 2018-05-10 2018-10-02 新华三技术有限公司成都分公司 Method of data synchronization, device and distributed memory system
US20190095293A1 (en) * 2016-07-27 2019-03-28 Tencent Technology (Shenzhen) Company Limited Data disaster recovery method, device and system

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170063986A1 (en) * 2015-08-31 2017-03-02 Microsoft Technology Licensing, Llc Target-driven tenant identity synchronization
US11216346B2 (en) * 2017-11-20 2022-01-04 Sap Se Coordinated replication of heterogeneous database stores
CN109857523B (en) * 2017-11-30 2023-05-09 阿里巴巴集团控股有限公司 Method and device for realizing high availability of database
CN108763578B (en) * 2018-06-07 2023-03-10 腾讯科技(深圳)有限公司 Index file updating method and server
CN110213317B (en) * 2018-07-18 2021-10-29 腾讯科技(深圳)有限公司 Message storage method, device and storage medium
CN110119329B (en) * 2019-02-27 2024-02-23 咪咕音乐有限公司 Data replication disaster recovery method and disaster recovery system

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114661339A (en) * 2022-05-26 2022-06-24 浙江所托瑞安科技集团有限公司 Method and device for automatically submitting local data to remote server
CN114661339B (en) * 2022-05-26 2022-08-16 浙江所托瑞安科技集团有限公司 Method and device for automatically submitting local data to remote server
CN116166477A (en) * 2022-11-30 2023-05-26 贵州华谊联盛科技有限公司 Dual-activity gateway system and method for storing different brands of objects
CN116166477B (en) * 2022-11-30 2024-02-13 郭东升 Dual-activity gateway system and method for storing different brands of objects

Also Published As

Publication number Publication date
CN112822227A (en) 2021-05-18
CN112822227B (en) 2022-02-25

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 20887688; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 20887688; Country of ref document: EP; Kind code of ref document: A1)