WO2021078294A1 - Service coordination method and apparatus for a distributed storage system, and electronic device - Google Patents

Service coordination method and apparatus for a distributed storage system, and electronic device

Info

Publication number
WO2021078294A1
Authority
WO
WIPO (PCT)
Prior art keywords
control server
service coordination
coordination device
servers
main control
Prior art date
Application number
PCT/CN2020/123516
Other languages
English (en)
Chinese (zh)
Inventor
黎海兵
Original Assignee
北京金山云网络技术有限公司
北京金山云科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京金山云网络技术有限公司, 北京金山云科技有限公司
Publication of WO2021078294A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/40 Support for services or applications
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/104 Peer-to-peer [P2P] networks
    • H04L67/1044 Group management mechanisms
    • H04L67/1046 Joining mechanisms
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/104 Peer-to-peer [P2P] networks
    • H04L67/1044 Group management mechanisms
    • H04L67/1048 Departure or maintenance mechanisms
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/104 Peer-to-peer [P2P] networks
    • H04L67/1044 Group management mechanisms
    • H04L67/1051 Group master selection mechanisms

Definitions

  • This application relates to the field of distributed storage technology, and more specifically, to a service coordination method of a distributed storage system, a service coordination device of a distributed storage system, an electronic device, and a distributed storage system.
  • Distributed storage is a storage solution that distributes data to multiple independent devices.
  • the distributed network storage system adopts an expandable system structure and utilizes multiple storage servers to share the storage load. It not only improves the reliability, availability, and access efficiency of the system, it is also easy to expand.
  • a dedicated control server usually coordinates the storage of data on multiple data servers. Metadata used to describe data attributes is stored in the control server, which can realize functions such as storage location records, historical data records, and resource search.
  • the number of control servers is usually multiple, and the main control server provides control services, and the other control servers serve as backups.
  • the distributed application coordination service can be used to coordinate the operation of multiple control servers, such as notifying multiple control servers to perform master selection operations.
  • existing coordination schemes are prone to masterless and dual-master states, which affect the stability of the distributed storage system.
  • One purpose of this application is to provide a new technical solution for service coordination of a distributed storage system.
  • a service coordination method for a distributed storage system includes a service coordination device and a plurality of control servers.
  • the method is implemented by any of the control servers.
  • the method includes: sending a query request to the service coordination device to obtain a query result, the query result indicating whether the multiple control servers include a main control server; and determining, according to the query result, whether to send a master selection instruction to other control servers, where the master selection instruction is used to determine a main control server from the plurality of control servers.
  • the sending a query request to the service coordination device to obtain a query result, the query result indicating whether the plurality of control servers include a main control server, includes: sending a query request to the service coordination device at a preset first time interval; and receiving a query result sent by the service coordination device in response to the query request, the query result indicating whether the multiple control servers include a main control server; wherein the first time interval is less than a preset session timeout duration of the service coordination device.
  • the query result is whether the service coordination device has recorded the identification of the main control server.
  • the method further includes: when the control server is the main control server, periodically sending a connection request to the service coordination device; and if no response from the service coordination device to the connection request is received within a set time window, stopping the service provided as the main control server.
  • periodically sending a connection request to the service coordination device includes: sending the connection request to the service coordination device at a preset second time interval; wherein the second time interval is less than a preset session timeout duration of the service coordination device.
  • the set time window is less than the set session timeout duration of the service coordination device.
  • the determining, according to the query result, whether to send a master selection instruction to other control servers includes: when the query result indicates that the plurality of control servers do not include a main control server, sending a master selection instruction to the other control servers to determine a main control server from the plurality of control servers.
  • the method further includes: after determining a main control server from the plurality of control servers, sending the determined identification of the main control server to the service coordination device.
  • a service coordination method for a distributed storage system includes a service coordination device and a plurality of control servers that implement the method described in the first aspect of the present application.
  • the method includes: receiving a query request sent by the control server; in response to the query request, obtaining a query result, the query result indicating whether the multiple control servers include a main control server; and sending the query result to the control server.
  • the query result is whether the service coordination device has recorded the identification of the main control server.
  • the method further includes: receiving a connection request periodically sent by a main control server of the plurality of control servers; and sending a response message for the connection request to the main control server.
  • the method further includes: receiving the determined identification of the main control server sent by the control server; and recording the identification of the main control server.
  • the service coordination device provides coordination services based on the distributed application coordination service Zookeeper.
  • a service coordination device for a distributed storage system.
  • the distributed storage system includes a service coordination device and a plurality of control servers.
  • the device is applied to any of the control servers and includes:
  • the query module is configured to send a query request to the service coordination device to obtain a query result, the query result indicating whether a main control server is included in the plurality of control servers;
  • the judgment module is configured to determine, according to the query result, whether to send a master selection instruction to other control servers, where the master selection instruction is used to determine a main control server from the multiple control servers.
  • when sending a query request to the service coordination device to obtain a query result, the query module is configured to: send a query request to the service coordination device at a preset first time interval; and receive a query result sent by the service coordination device in response to the query request, the query result indicating whether the multiple control servers include a main control server; wherein the first time interval is less than a preset session timeout duration of the service coordination device.
  • the query result is whether the service coordination device has recorded the identification of the main control server.
  • the device further includes a connection detection module, the connection detection module being configured to: when the control server is the main control server, periodically send a connection request to the service coordination device; and if no response from the service coordination device to the connection request is received within a set time window, stop the service provided as the main control server.
  • when periodically sending a connection request to the service coordination device while the control server is the main control server, the connection detection module is configured to: send the connection request to the service coordination device at a preset second time interval; wherein the second time interval is less than a preset session timeout duration of the service coordination device.
  • the set time window is less than the set session timeout duration of the service coordination device.
  • when determining, according to the query result, whether to send a master selection instruction to other control servers, the judgment module is configured to: when the query result indicates that the plurality of control servers do not include a main control server, send a master selection instruction to the other control servers to determine a main control server from the multiple control servers.
  • the device further includes an identification sending module, the identification sending module being configured to: after a main control server is determined from the plurality of control servers, send the identification of the determined main control server to the service coordination device.
  • a service coordination device for a distributed storage system, the distributed storage system including a service coordination device and a plurality of control servers that implement the method described in the first aspect of the present application; the device is applied to the service coordination device and includes: a first receiving module configured to receive a query request sent by the control server; a result obtaining module configured to obtain a query result in response to the query request, the query result indicating whether the multiple control servers include a main control server; and a first sending module configured to send the query result to the control server.
  • the query result is whether the service coordination device has recorded the identification of the main control server.
  • the device further includes a second receiving module and a second sending module: the second receiving module is configured to receive a connection request periodically sent by a main control server among the plurality of control servers; The second sending module is configured to send a response message for the connection request to the main control server.
  • the device further includes a third receiving module and a recording module: the third receiving module is configured to receive the determined identification of the main control server sent by the control server; the recording module is configured to record The identifier of the main control server.
  • the service coordination device provides coordination services based on the distributed application coordination service Zookeeper.
  • an electronic device including a processor and a memory, the memory storing machine-executable instructions that can be executed by the processor, and the processor executing the machine-executable instructions to implement the service coordination method of the distributed storage system described in the first aspect or the second aspect of the present application.
  • a distributed storage system including a user agent server, multiple storage servers, multiple control servers that implement the method described in the first aspect of the present application, and a service coordination device that implements the method described in the second aspect of the present application, wherein the control server is in communication connection with the user agent server, the plurality of storage servers, and the service coordination device respectively.
  • the control server actively inquires whether a main control server is included in the multiple control servers, and determines whether to send the master selection instruction to other control servers according to the query result, which can avoid the situation in which the system has no master and improve the stability of the system.
  • FIG. 1 shows a schematic diagram of the hardware configuration of a distributed storage system that can be used to implement the embodiments of the present application.
  • Figure 2 shows a schematic structural diagram of a server that can be used to implement the embodiments of the present application.
  • Fig. 3 shows a flowchart of a service coordination method of a distributed storage system according to an embodiment of the present application.
  • Fig. 4 shows a flow chart of a specific example of the implementation of the service coordination method of the distributed storage system according to the embodiment of the present application.
  • Figure 1 shows a schematic structural diagram of a distributed storage system that can be used to implement embodiments of the present application.
  • the distributed storage system 100 includes a user agent server 1000, a storage server 2000, a control server 3000, and a service coordination device 4000.
  • the number of storage servers 2000 and control servers 3000 are both multiple (two or more).
  • the storage server 2000 is set to store target data.
  • the user proxy server 1000 is configured to receive a data read and write request for target data sent by the user terminal, and forward the data read and write request to the control server 3000.
  • the control server 3000 is configured to query the storage server 2000 corresponding to the target data from the metadata stored by itself, and return the identification information of the storage server 2000 to the user agent server 1000.
  • the user agent server 1000 interacts with the corresponding storage server 2000 according to the identification information to complete the read and write operations on the target data.
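The read path described above can be pictured with a minimal sketch. The class names, the `locate` and `read` methods, and the dictionary-based metadata are assumptions made for illustration; the application does not prescribe any of them.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class StorageServer:
    """Hypothetical storage server: holds target data keyed by name."""
    server_id: str
    data: Dict[str, bytes] = field(default_factory=dict)


@dataclass
class ControlServer:
    """Hypothetical control server: metadata maps a data key to a storage server id."""
    metadata: Dict[str, str] = field(default_factory=dict)

    def locate(self, key: str) -> str:
        # Look up which storage server holds the target data.
        return self.metadata[key]


@dataclass
class UserAgentServer:
    """Hypothetical user agent server: asks the control server, then talks to storage."""
    control: ControlServer
    storage: Dict[str, StorageServer]

    def read(self, key: str) -> bytes:
        # Forward the request to the control server to obtain the storage server id.
        server_id = self.control.locate(key)
        # Interact with the identified storage server to complete the read.
        return self.storage[server_id].data[key]
```

Only the read direction is sketched here, because the text does not say how the control server chooses a storage server for new data.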
  • the service coordination device 4000 is configured to coordinate the operation of multiple control servers 3000, such as assigning an identity to the control server 3000, notifying the control server 3000 to elect a master control server, and so on.
  • the service coordination device 4000 is, for example, an electronic device installed with distributed application coordination software.
  • the distributed application coordination service can be arranged in multiple control servers 3000.
  • the multiple control servers 3000 can implement their own services based on the distributed application coordination service. Coordination, no additional service coordination equipment 4000 is needed.
  • the user agent server 1000, the storage server 2000, the control server 3000, and the service coordination device 4000 may communicate with each other through a wired network or a wireless network.
  • in this case, the communication between the control server and the service coordination device is actually communication between different processes in the same device.
  • the user agent server 1000, the storage server 2000, the control server 3000, and the service coordination device 4000 all have the hardware configuration of the server 1100 as shown in FIG. 2.
  • the server 1100 may include a processor 1110, a memory 1120, an interface device 1130, a communication device 1140, a display device 1150, and an input device 1160.
  • the processor 1110 may be, for example, a central processing unit CPU or the like.
  • the memory 1120 includes, for example, ROM (Read Only Memory), RAM (Random Access Memory), nonvolatile memory such as a hard disk, and the like.
  • the interface device 1130 includes, for example, a USB interface, a serial interface, and the like.
  • the communication device 1140 can perform wired or wireless communication, for example.
  • the display device 1150 is, for example, a liquid crystal display.
  • the input device 1160 may include, for example, a touch screen, a keyboard, and the like.
  • the memory 1120 of the server 1100 is configured to store instructions, which are used to control the processor 1110 to operate to support the implementation of the service coordination method according to any embodiment of this specification.
  • Technicians can design instructions according to the scheme disclosed in this specification. How the instruction controls the processor to operate is well known in the art, so it will not be described in detail here.
  • the server 1100 in the embodiment of this specification may involve only some of these devices, for example, only the processor 1110, the memory 1120, and the communication device 1140.
  • the distributed storage system 100 shown in FIG. 1 is only for explanatory purposes, and is by no means intended to limit this specification, its application or purpose.
  • This embodiment provides a service coordination method for a distributed storage system.
  • the method is implemented by, for example, any control server 3000 in FIG. 1. As shown in Figure 3, the method includes the following steps S1100-S1200:
  • Step S1100 Send a query request to the service coordination device to obtain a query result.
  • the query result represents whether a main control server is included in the multiple control servers.
  • the distributed storage system includes multiple control servers.
  • One main control server is selected from multiple control servers through elections, etc., and the main control server provides control services, and other control servers serve as backups.
  • control server sends a registration request to the service coordination device during the initial startup phase.
  • service coordination device assigns a unique identity to each control server according to preset rules.
  • the service coordination device coordinates the master selection process and records the identification of the master control server.
  • the query request is sent to the service coordination device.
  • the query request may ask the service coordination device which of the current control servers holds the main control server identifier, and what the status of that main control server is.
  • the system state of the current distributed storage system may include: a state in which the plurality of control servers include a main control server (hereinafter referred to as the master state; the master state includes, for example, one main control server or two main control servers), and a state in which the plurality of control servers do not include a main control server (hereinafter referred to as the master-lacking state).
  • control server actively sends a query request to the service coordination device to obtain the query result.
  • the query result indicates whether the main control server is included in the multiple control servers.
  • step S1100 further includes: sending a query request to the service coordination device at a preset first time interval; receiving a query result sent from the service coordination device in response to the query request, and the query result represents a plurality of control servers Whether a main control server is included; wherein, the first time interval is less than the preset session timeout duration of the service coordination device.
  • a long connection is maintained between the control server and the service coordination device.
  • when the service coordination device receives a message (such as a heartbeat packet) sent by the control server indicating that it is in an active state, the service coordination device judges that the connection is in a normal state, and continues to monitor the connection state between the two.
  • control server periodically queries the current system status at a preset first time interval.
  • the first time interval is less than the session timeout duration set by the service coordination device, which is beneficial to avoid session expiration.
  • the query result is whether the service coordination device records the identification of the main control server.
  • the identification of the main control server is, for example, the IP address or another identifier of the main control server, or a unique identifier assigned by the service coordination device (for example, a specific character string assigned as the identifier, or a character string appended to the IP address of the main control server).
  • the process of the control server inquiring whether the current system state is the missing master state includes: requesting the service coordination device to provide the current system state; and determining the current system state according to the current system state provided by the service coordination device in response to the request.
  • the service coordination device records whether the current system is in the master state or the master-lacking state and, in the master state, the identity of the current main control server. After the control server sends the service coordination device a request for the current system state, it determines whether the current system is in the master state or the master-lacking state according to the information returned by the service coordination device.
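Step S1100 can be sketched as a simple polling loop. The function name, its parameters, and the convention that the query returns the recorded master identifier or `None` are assumptions for illustration; the only constraint taken from the text is that the first time interval must be less than the session timeout duration.

```python
import time
from typing import Callable, List, Optional


def poll_master_state(
    query: Callable[[], Optional[str]],
    first_interval: float,
    session_timeout: float,
    rounds: int,
    sleep: Callable[[float], None] = time.sleep,
) -> List[bool]:
    """Query the coordination device `rounds` times and report, per round,
    whether a main control server identifier was recorded."""
    # Constraint from the text: query more often than the session can expire.
    if not first_interval < session_timeout:
        raise ValueError("first_interval must be less than session_timeout")
    results = []
    for _ in range(rounds):
        master_id = query()  # hypothetical: recorded identifier, or None if absent
        results.append(master_id is not None)
        sleep(first_interval)
    return results
```

Passing `sleep` as a parameter is a design convenience so the loop can be exercised without real delays.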
  • Step S1200 Determine whether to send a master selection instruction to other control servers according to the query result.
  • the master selection instruction is used to determine a master control server from a plurality of control servers.
  • the control server when the query result indicates that the main control server is not included in the plurality of control servers, the control server sends a master selection instruction to other control servers to determine a main control server from the plurality of control servers.
  • control server actively sends the master selection instruction when the current system state queried is the master lack state, thereby initiating the master selection operation of co-electing the master control server with other control servers, without waiting for notification from the service coordination device Then initiate the master election operation.
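The decision in step S1200 can be sketched as follows. The message name `SELECT_MASTER`, the function name, and the `send` callback are hypothetical; the sketch only shows that the instruction goes out when, and only when, the query result reports no main control server.

```python
from typing import Callable, Iterable, List

SELECT_MASTER = "SELECT_MASTER"  # hypothetical name for the master selection instruction


def maybe_start_master_selection(
    has_master: bool,
    peers: Iterable[str],
    send: Callable[[str, str], None],
) -> List[str]:
    """Send the master selection instruction to every other control server
    when the query result says no main control server exists."""
    if has_master:
        return []  # master state: nothing to do
    notified = []
    for peer in peers:
        send(peer, SELECT_MASTER)  # initiate selection without waiting for the coordinator
        notified.append(peer)
    return notified
```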
  • control servers other than the main control server are in a non-election state, and may only communicate with the service coordination device, without communicating with other control servers.
  • the control server other than the master control server enters the election state, communicates with other control servers, and initiates the master election operation.
  • the process of selecting the master is, for example: each control server sends a vote and receives votes from other control servers; processes and counts votes according to preset election rules; updates its own status according to the election results, for example, Update its own status as master control server or non-master (slave) control server.
  • multiple control servers vote according to the principle that the one with the smallest identity is elected to select the main control server.
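A minimal sketch of the smallest-identity election, assuming integer identities and a strict-majority rule for accepting a winner. The majority rule and the `views` structure (which peers each server can currently reach) are assumptions; the text only states that votes are counted under preset rules and that the smallest identity is elected.

```python
from collections import Counter
from typing import Dict, Optional, Set


def elect_master(views: Dict[int, Set[int]]) -> Optional[int]:
    """`views` maps each control server id to the peer ids it can reach.
    Returns the elected id, or None if no candidate gets a majority."""
    if not views:
        return None
    # Each server votes for the smallest identity it knows of, itself included.
    votes = Counter(min(peers | {server_id}) for server_id, peers in views.items())
    candidate, count = votes.most_common(1)[0]
    # Assumption: a candidate wins only with a strict majority of all servers.
    return candidate if count > len(views) // 2 else None
```

With three fully connected servers 1, 2, and 3, every vote goes to 1. If the servers cannot reach each other, each votes for itself and no master is elected under the assumed majority rule.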
  • control servers can only communicate with the service coordination device, without mutual communication.
  • the communication between the control server and the service coordination device is manifested as communication between the process corresponding to the control service and the process corresponding to the coordination service.
  • the control server actively inquires whether a main control server is included in the multiple control servers, and determines whether to send the master selection instruction to other control servers according to the query result, which helps avoid the situation in which the system has no master.
  • control servers can actively initiate the master selection based on the queried system status, thereby avoiding the system having no master status.
  • the service coordination method of the distributed storage system further includes: when the control server is the main control server, periodically sending a connection request to the service coordination device; and if no response from the service coordination device to the connection request is received within the set time window, stopping the service provided as the main control server.
  • the main control server actively connects with the service coordination device, for example, actively sends a request to obtain the connection status, and judges whether the connection is successful according to whether the service coordination device returns a corresponding message. If all connections within the set time window fail, the main control server actively stops the service provided as the main control server, that is, withdraws from the node position of the main control server. In this way, it is helpful to avoid the dual-master state of the system.
  • the current main control server of the system can work normally, but the communication connection with the service coordination device is interrupted.
  • the service coordination device clears the recorded identity of the main control server and notifies other control servers to re-initiate the election of the master. After the re-election, there will be two main control servers in the system.
  • when the original main control server cannot successfully connect with the service coordination device, it actively withdraws from the position of the master node, thereby avoiding the dual-master state of the system.
  • the process of the main control server actively connecting to the service coordination device includes: actively connecting to the service coordination device at a preset second time interval, where the second time interval is less than the session timeout duration set by the service coordination device .
  • the main control server actively connects to the service coordination device at a preset second time interval.
  • the second time interval is less than the session timeout duration set by the service coordination device, which is beneficial to avoid the situation that the session between the main control server and the service coordination device expires.
  • the above-mentioned set time window is also smaller than the session timeout duration set by the service coordination device.
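The self-demotion behaviour of the main control server can be sketched as below. The class name, the `connect` callback, and the injected clock are assumptions for illustration. The two constraints checked in the constructor come from the text: both the second time interval and the set time window are smaller than the session timeout duration, so the old master steps down before the coordination device's session with it can expire.

```python
import time
from typing import Callable


class MasterConnectionGuard:
    """Hypothetical helper run by the current main control server."""

    def __init__(
        self,
        connect: Callable[[], bool],
        second_interval: float,
        time_window: float,
        session_timeout: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        if not (second_interval < session_timeout and time_window < session_timeout):
            raise ValueError("interval and window must be less than session_timeout")
        self.second_interval = second_interval  # how often attempt() should be called
        self._connect = connect
        self._time_window = time_window
        self._clock = clock
        self._last_success = clock()
        self.is_master = True

    def attempt(self) -> bool:
        """Make one connection attempt; return whether this server is still master."""
        now = self._clock()
        if self._connect():
            self._last_success = now
        elif now - self._last_success >= self._time_window:
            # Every attempt within the window failed: stop serving as master.
            self.is_master = False
        return self.is_master
```

A caller would invoke `attempt()` once every `second_interval` seconds and stop providing control services as soon as it returns `False`.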
  • the service coordination method of the distributed storage system further includes: after determining a master control server from the plurality of control servers, sending the determined identification of the master control server to the service coordination device.
  • the identifier of the main control server is, for example, the IP or other identifiers of the main control server, such as a unique identifier assigned by the service coordination device. In this way, it is beneficial for the service coordination device to obtain the identification of the re-selected main control server in time, thereby maintaining the normal operation of the distributed storage system.
  • This embodiment also provides another service coordination method for a distributed storage system.
  • the method is implemented by, for example, the service coordination device 4000 in FIG. 1, or, when the distributed application coordination service is arranged in multiple control servers, by the multiple control servers.
  • the method includes the following steps: receiving a query request sent by a control server; in response to the query request, obtaining a query result, the query result indicating whether a main control server is included in a plurality of control servers; and sending the query result to the control server.
  • the query result is whether the service coordination device records the identification of the main control server.
  • the identifier of the main control server is, for example, the IP or other identifiers of the main control server, such as a unique identifier assigned by the service coordination device.
  • the service coordination method of the distributed storage system further includes: receiving a connection request periodically sent by a main control server among the multiple control servers; and sending a response message for the connection request to the main control server.
  • the main control server actively connects with the service coordination device, for example, actively sends a request to obtain the connection status, and judges whether the connection is successful according to whether the service coordination device returns a corresponding message. If all connections within the set time window fail, the main control server actively stops the service provided as the main control server, that is, withdraws from the node position of the main control server. In this way, it is helpful to avoid the dual-master state of the system.
  • the process of the service coordination device regularly acquiring the survival status of the main control server is, for example: the service coordination device receives a message about its own active status periodically sent by the main control server, and judges that the main control server is alive when the message is obtained.
  • the service coordination method of the distributed storage system further includes: receiving a connection request periodically sent by the main control server among the multiple control servers; receiving the determined identification of the main control server sent by the control server; and recording the identification of the main control server. In this way, it is beneficial for the service coordination device to obtain the identification of the re-selected main control server in time, thereby maintaining the normal operation of the distributed storage system.
  • the service coordination device is a service coordination device that provides coordination services based on a distributed application coordination service (Zookeeper).
  • The distributed application coordination service (Zookeeper) is an open-source coordination service for distributed applications. It acts as the manager of the cluster and monitors the status of each node in the cluster, so that the next appropriate operation can be performed according to the feedback submitted by the nodes.
  • the service coordination device can effectively manage a cluster composed of multiple control servers, and coordinate multiple control servers to provide external control services.
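  • The coordination-device side of the method described above (answering queries, responding to the main control server's periodic connection requests, and recording the main control server's identification) can be sketched as follows. This is a minimal in-memory illustration, not the Zookeeper API: the class `ServiceCoordinator`, its method names, and the injectable `clock` are assumptions introduced here, as is the rule that the record is cleared once no connection request arrives within the session timeout.

```python
import time
from typing import Callable, Optional


class ServiceCoordinator:
    """Hypothetical in-memory stand-in for the service coordination device."""

    def __init__(self, session_timeout: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.session_timeout = session_timeout
        self._clock = clock
        self._master_id: Optional[int] = None
        self._last_heartbeat: Optional[float] = None

    def _expire_if_stale(self) -> None:
        # Assumption: the master record is cleared when no connection
        # request was received within the session timeout.
        if (self._master_id is not None
                and self._clock() - self._last_heartbeat > self.session_timeout):
            self._master_id = None
            self._last_heartbeat = None

    def record_master(self, server_id: int) -> None:
        """Record the identification of the newly determined main control server."""
        self._master_id = server_id
        self._last_heartbeat = self._clock()

    def handle_connection_request(self, server_id: int) -> bool:
        """Answer a periodic connection request; True stands for the response message."""
        self._expire_if_stale()
        if server_id != self._master_id:
            return False
        self._last_heartbeat = self._clock()
        return True

    def handle_query(self) -> bool:
        """Query result: whether a main control server is currently recorded."""
        self._expire_if_stale()
        return self._master_id is not None
```

  • Passing the clock in as a parameter is only a convenience of this sketch: it lets the timeout behaviour be exercised without real waiting.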
  • FIG. 4 shows a specific example of the implementation of the service coordination method of the distributed storage system provided by this embodiment.
  • each control server sends a registration request to the service coordination device, that is, step S101 is executed.
  • the service coordination device allocates a globally unique identity to each control server, that is, step S102 is executed.
  • The control servers hold an election based on the identities assigned to them, and the control server with the smallest identity is elected as the main control server.
  • The main control server provides the control service externally, and the other control servers serve as backups, that is, step S103 is executed.
  • the main control server actively connects to the service coordination device periodically, and the time interval of the regular connection is less than the session timeout time set by the service coordination device, that is, step S104a is executed.
  • the slave control server also actively queries the service coordination device for the system status periodically, and the time interval of the regular query is less than the session timeout time set by the service coordination device, that is, step S104b is executed.
  • During operation of the system, the main control server may still be able to provide services normally while its communication with the service coordination device is disconnected. In this case, the service coordination device cannot obtain the survival status reported by the main control server.
  • The record of the main control server is then cleared, and the system enters a master-less state, that is, step S105 is executed.
  • After that, a slave control server learns, by periodically querying the system status, that the system is currently in a master-less state. On this basis, it sends a master selection instruction to the other control servers, actively initiating re-election to select a new main control server, that is, steps S106 and S107 are executed.
  • The original main control server fails in all of its connections to the service coordination device within the set time window and, on this basis, actively exits the master node position, that is, steps S108 and S109 are executed.
  • The slave control server actively initiates re-election once its query shows that the system is in a master-less state, which helps avoid leaving the system without a master.
  • The original main control server actively withdraws from the master node position when its connection with the service coordination device fails, so that the new main control server and the original main control server do not exist at the same time, which helps avoid a dual-master situation.
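  • The flow of steps S101 to S109 can be traced with a small discrete-time simulation. This is a sketch under stated assumptions rather than the implementation of the embodiment: the class and function names, the tick-based clock, the concrete values (session timeout 6, connection and query interval 2, step-down window 4), and the rule that only servers able to reach the coordinator take part in the election are all introduced here for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Coordinator:
    session_timeout: int
    next_id: int = 1
    master_id: Optional[int] = None
    last_seen: int = 0

    def register(self) -> int:
        """S101/S102: hand out a globally unique, increasing identity."""
        assigned, self.next_id = self.next_id, self.next_id + 1
        return assigned

    def record_master(self, server_id: int, now: int) -> None:
        self.master_id, self.last_seen = server_id, now

    def heartbeat(self, server_id: int, now: int) -> bool:
        """S104a: refresh the master's session; True stands for the response."""
        if self.has_master(now) and server_id == self.master_id:
            self.last_seen = now
            return True
        return False

    def has_master(self, now: int) -> bool:
        """S104b/S105: the record is cleared once the session has timed out."""
        if self.master_id is not None and now - self.last_seen > self.session_timeout:
            self.master_id = None
        return self.master_id is not None


@dataclass
class ControlServer:
    server_id: int
    is_master: bool = False
    reachable: bool = True   # can this server talk to the coordinator?
    last_ok: int = 0         # time of the last answered connection request


def simulate(ticks: int = 14, partition_at: int = 5) -> List[Tuple[int, List[int]]]:
    coord = Coordinator(session_timeout=6)
    interval, window = 2, 4  # both chosen below the session timeout
    servers = [ControlServer(coord.register()) for _ in range(3)]

    def elect(now: int) -> None:
        # S103/S107: smallest identity wins (assumption: only reachable servers run).
        winner = min((s for s in servers if s.reachable), key=lambda s: s.server_id)
        winner.is_master, winner.last_ok = True, now
        coord.record_master(winner.server_id, now)

    elect(0)
    trace = []
    for now in range(1, ticks + 1):
        if now == partition_at:
            servers[0].reachable = False          # master loses its link
        if now % interval == 0:
            for s in servers:
                if s.is_master:
                    if s.reachable and coord.heartbeat(s.server_id, now):
                        s.last_ok = now
                    elif now - s.last_ok >= window:
                        s.is_master = False       # S108/S109: voluntary step-down
            slave_online = any(s.reachable and not s.is_master for s in servers)
            if slave_online and not coord.has_master(now):
                elect(now)                        # S106/S107: slave-initiated re-election
        trace.append((now, [s.server_id for s in servers if s.is_master]))
    return trace
```

  • With these values the original master steps down before the coordinator's record expires, so the trace returned by `simulate()` never contains two masters at the same tick. Between the step-down and the re-election the sketch has no master for a few ticks; shortening the query interval narrows that gap.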
  • This embodiment provides a service coordination device for a distributed storage system.
  • the distributed storage system includes a service coordination device and a plurality of control servers.
  • the device is applied to any control server and includes a query module and a judgment module.
  • the query module is configured to send a query request to the service coordination device to obtain a query result, and the query result represents whether a main control server is included in a plurality of control servers.
  • the judgment module is configured to determine whether to send a master selection instruction to other control servers according to the query result, and the master selection instruction is used to determine a master control server from a plurality of control servers.
  • When sending a query request to the service coordination device to obtain the query result, the query module is configured to: send a query request to the service coordination device at a preset first time interval; and receive the query result sent by the service coordination device in response to the query request, the query result indicating whether a main control server is included in the multiple control servers; wherein the first time interval is less than the preset session timeout duration of the service coordination device.
  • the query result is whether the service coordination device records the identification of the main control server.
  • The device further includes a connection detection module, which is configured to: when the control server is the main control server, periodically send a connection request to the service coordination device; and, if no response to the connection request is received from the service coordination device within the set time window, stop the service provided as the main control server.
  • When periodically sending a connection request to the service coordination device while the control server is the main control server, the connection detection module is configured to send the connection request to the service coordination device at a preset second time interval, wherein the second time interval is less than the preset session timeout duration of the service coordination device.
  • the set time window is smaller than the session timeout duration set by the service coordination device.
  • When determining whether to send the master selection instruction to other control servers according to the query result, the judgment module is configured to send the master selection instruction to the other control servers when the query result indicates that no main control server is included in the plurality of control servers, the master selection instruction being used to determine a main control server from the plurality of control servers.
  • the device further includes an identification sending module, and the identification sending module is configured to send the determined identification of the master control server to the service coordination device after determining a master control server from the plurality of control servers.
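  • The modules of this embodiment (query, judgment, connection detection, and identification sending) can be sketched as one small class whose network calls are injected as callables. Everything named here is hypothetical: `Timing`, `ControlServerAgent`, the callable parameters, and the explicit `now` argument are assumptions made for illustration, and the actual transport to the service coordination device is left out.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Timing:
    query_interval: float     # the "first time interval"
    connect_interval: float   # the "second time interval"
    stepdown_window: float    # the "set time window"
    session_timeout: float    # configured on the service coordination device

    def validate(self) -> None:
        # All three client-side durations must be below the session timeout.
        for name in ("query_interval", "connect_interval", "stepdown_window"):
            if getattr(self, name) >= self.session_timeout:
                raise ValueError(f"{name} must be less than session_timeout")


class ControlServerAgent:
    def __init__(
        self,
        timing: Timing,
        query_has_master: Callable[[], bool],
        elect_master: Callable[[], int],
        report_master: Callable[[int], None],
        send_connection_request: Callable[[], bool],
    ) -> None:
        timing.validate()
        self._timing = timing
        self._query_has_master = query_has_master
        self._elect_master = elect_master
        self._report_master = report_master
        self._send_connection_request = send_connection_request
        self.is_master = False
        self._last_ok = 0.0

    def query_and_judge(self) -> bool:
        """Query + judgment modules: trigger an election only if no master exists."""
        if self._query_has_master():
            return False
        master_id = self._elect_master()   # stands for the master selection instruction
        self._report_master(master_id)     # identification sending module
        return True

    def become_master(self, now: float) -> None:
        self.is_master = True
        self._last_ok = now

    def check_connection(self, now: float) -> None:
        """Connection detection module: step down after a window without responses."""
        if not self.is_master:
            return
        if self._send_connection_request():
            self._last_ok = now
        elif now - self._last_ok >= self._timing.stepdown_window:
            self.is_master = False
```

  • A caller would invoke `query_and_judge()` every `query_interval` and `check_connection(now)` every `connect_interval`; the scheduling loop is omitted to keep the sketch short.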
  • This embodiment also provides a service coordination device for a distributed storage system.
  • the distributed storage system includes a service coordination device and a plurality of control servers implementing the methods described in the method embodiments.
  • The device is applied to the service coordination device and includes a first receiving module, a result obtaining module, and a first sending module.
  • the first receiving module is configured to receive the query request sent by the control server.
  • the result obtaining module is configured to obtain the query result in response to the query request, and the query result represents whether the main control server is included in the plurality of control servers.
  • the first sending module is configured to send the query result to the control server.
  • The query result indicates whether the service coordination device has recorded the identification of the main control server.
  • The device further includes a second receiving module and a second sending module: the second receiving module is configured to receive connection requests periodically sent by the main control server among the plurality of control servers; the second sending module is configured to send a response message for the connection request to the main control server.
  • the device further includes a third receiving module and a recording module: the third receiving module is set to receive the determined identification of the main control server sent by the control server; the recording module is set to record the identification of the main control server.
  • the service coordination device provides coordination services based on a distributed application coordination service (Zookeeper).
  • This embodiment provides an electronic device that includes a processor and a memory.
  • the memory stores machine-executable instructions that can be executed by the processor.
  • The processor executes the machine-executable instructions to implement the service coordination method of the distributed storage system described in the method embodiments of the present application.
  • This embodiment provides a distributed storage system, including a user agent server, multiple storage servers, multiple control servers that implement the first method described in the method embodiments of this application, and a service coordination device that implements the second method described in the method embodiments of this application, wherein each control server is communicatively connected to the user agent server, to the multiple storage servers, and to the service coordination device.
  • This embodiment provides a machine-readable storage medium.
  • the machine-readable storage medium stores machine-executable instructions.
  • When the machine-executable instructions are called and executed by a processor, they cause the processor to implement the service coordination method of the distributed storage system described in the method embodiments of the present application.
  • the computer program product may include a computer-readable storage medium loaded with computer-readable program instructions for enabling a processor to implement various aspects of the present application.
  • the computer-readable storage medium may be a tangible device that can hold and store instructions used by the instruction execution device.
  • the computer-readable storage medium may be, for example, but not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing.
  • A non-exhaustive list of computer-readable storage media includes: portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disk read-only memory (CD-ROM), digital versatile disk (DVD), memory stick, floppy disk, mechanical encoding device, such as a printer with instructions stored thereon
  • The computer-readable storage medium used here is not to be interpreted as a transient signal itself, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media (for example, light pulses through fiber optic cables), or electrical signals transmitted through wires.
  • the computer-readable program instructions described herein can be downloaded from a computer-readable storage medium to various computing/processing devices, or downloaded to an external computer or external storage device via a network, such as the Internet, a local area network, a wide area network, and/or a wireless network.
  • the network may include copper transmission cables, optical fiber transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers.
  • The network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards them for storage in the computer-readable storage medium in that computing/processing device.
  • The computer program instructions used to perform the operations of this application may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-related instructions, microcode, firmware instructions, state setting data, or source code or object code written in one or more programming languages.
  • Programming languages include object-oriented programming languages, such as Smalltalk and C++, and conventional procedural programming languages, such as the "C" language or similar programming languages.
  • Computer-readable program instructions can be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server.
  • The remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (for example, through the Internet using an Internet service provider).
  • an electronic circuit such as a programmable logic circuit, a field programmable gate array (FPGA), or a programmable logic array (PLA), can be customized by using the status information of the computer-readable program instructions.
  • the computer-readable program instructions are executed to realize various aspects of the present application.
  • These computer-readable program instructions can be provided to the processor of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, produce a device that implements the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams. These computer-readable program instructions can also be stored in a computer-readable storage medium. They cause computers, programmable data processing apparatuses, and/or other devices to work in a specific manner, so that the computer-readable medium storing the instructions constitutes an article of manufacture that includes instructions implementing various aspects of the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams.
  • Each block in the flowchart or block diagram can represent a module, program segment, or part of an instruction, which contains one or more executable instructions for implementing the specified logical functions.
  • the functions marked in the block may also occur in a different order than the order marked in the drawings. For example, two consecutive blocks can actually be executed in parallel, or they can sometimes be executed in the reverse order, depending on the functions involved.
  • Each block in the block diagram and/or flowchart, and each combination of blocks in the block diagram and/or flowchart, can be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions. It is well known to those skilled in the art that implementation by hardware, implementation by software, and implementation by a combination of software and hardware are all equivalent.
  • a distributed application coordination service (Zookeeper) is used to coordinate the operation of multiple control servers, for example, multiple control servers are notified to perform a master selection operation.
  • Existing coordination schemes are prone to master-less and dual-master states, which affect the stability of the distributed storage system.
  • The control server actively queries whether a main control server is included in the multiple control servers and determines, according to the query result, whether to send a master selection instruction to other control servers. This avoids leaving the system without a master and improves the stability of the system.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Computer And Data Communications (AREA)

Abstract

The present application relates to a service coordination method and apparatus for a distributed storage system, and an electronic device. The method comprises: sending a query request to a service coordination device to obtain a query result, the query result indicating whether a plurality of control servers include a main control server; and determining, according to the query result, whether to send a master selection instruction to another control server, the master selection instruction being used to determine a main control server from the plurality of control servers.
PCT/CN2020/123516 2019-10-25 2020-10-26 Service coordination method and apparatus for distributed storage system, and electronic device WO2021078294A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911024570.3A CN112714143A (zh) 2019-10-25 2019-10-25 Service coordination method and apparatus for distributed storage system, and electronic device
CN201911024570.3 2019-10-25

Publications (1)

Publication Number Publication Date
WO2021078294A1 true WO2021078294A1 (fr) 2021-04-29

Family

ID=75541527

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/123516 WO2021078294A1 (fr) 2019-10-25 2020-10-26 Service coordination method and apparatus for distributed storage system, and electronic device

Country Status (2)

Country Link
CN (1) CN112714143A (fr)
WO (1) WO2021078294A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115242720A (zh) * 2022-08-03 2022-10-25 北京达佳互联信息技术有限公司 Connection method and apparatus for long connection service, electronic device, and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105306566A (zh) * 2015-10-22 2016-02-03 创新科存储技术(深圳)有限公司 Method and system for electing a master control node in a cloud storage system
US9819541B2 (en) * 2015-03-20 2017-11-14 Cisco Technology, Inc. PTP over IP in a network topology with clock redundancy for better PTP accuracy and stability
CN107579860A (zh) * 2017-09-29 2018-01-12 新华三技术有限公司 Node election method and apparatus

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
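CN101436209B (zh) * 2008-12-15 2011-01-05 中兴通讯股份有限公司 Method and apparatus for multi-database synchronization
CN104754029B (zh) * 2014-12-31 2018-04-27 北京天诚盛业科技有限公司 Method, apparatus and system for determining a master management server
CN106533738B (zh) * 2016-10-20 2019-09-10 中国民生银行股份有限公司 Method, apparatus and system for distributed batch processing
CN106789197A (zh) * 2016-12-07 2017-05-31 高新兴科技集团股份有限公司 Cluster election method and system
CN107528730B (zh) * 2017-08-28 2021-08-27 北京格是菁华信息技术有限公司 Multiple redundancy method, multiple redundancy server, and system
CN108717379B (zh) * 2018-05-08 2023-07-25 平安证券股份有限公司 Electronic device, distributed task scheduling method, and storage medium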

Also Published As

Publication number Publication date
CN112714143A (zh) 2021-04-27

Similar Documents

Publication Publication Date Title
JP4637842B2 (ja) クラスタ化されたコンピューティングシステムにおける高速なアプリケーション通知
US11249788B2 (en) Cloud management platform, and virtual machine management method and system
US9960964B2 (en) System, method and apparatus to manage services in a network
US9367261B2 (en) Computer system, data management method and data management program
WO2018137572A1 (fr) Procédé, dispositif et système de gestion de stratégie
US9390156B2 (en) Distributed directory environment using clustered LDAP servers
US11445013B2 (en) Method for changing member in distributed system and distributed system
US10963353B2 (en) Systems and methods for cross-regional back up of distributed databases on a cloud service
US11546228B2 (en) Zero-touch configuration of network devices using hardware metadata
US11330078B1 (en) Method and system for managing updates of a data manager
WO2021078294A1 (fr) Procédé et appareil de coordination de services pour un système de stockage distribué, et dispositif électronique
EP3648405A1 (fr) Système et procédé pour créer un quorum hautement disponible pour solutions groupées
WO2021082868A1 (fr) Procédé de gestion de données pour système de stockage distribué, appareil et dispositif électronique
US10992770B2 (en) Method and system for managing network service
EP3570169A1 (fr) Procédé et système de traitement de défaillance de dispositif
EP3399696A1 (fr) Système et procédé d'automatisation du processus de découverte
US11637737B2 (en) Network data management framework
JP6644902B2 (ja) ハイパースケール環境における近隣監視
US11321185B2 (en) Method to detect and exclude orphaned virtual machines from backup
US10972343B2 (en) System and method for device configuration update
US9548940B2 (en) Master election among resource managers
US20200233853A1 (en) Group membership and leader election coordination for distributed applications using a consistent database
JP2006113828A (ja) 作業負荷管理可能なクラスタシステム
US20240028478A1 (en) Clustered asset backup in non-federated way
US20240171463A1 (en) Configuration update method, apparatus, and system, and computer-readable storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20879706

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20879706

Country of ref document: EP

Kind code of ref document: A1