WO2017113261A1 - Method and server for processing a lock request - Google Patents

Method and server for processing a lock request

Info

Publication number
WO2017113261A1
Authority
WO
WIPO (PCT)
Prior art keywords
lock
server
resource
request
lock request
Prior art date
Application number
PCT/CN2015/100006
Other languages
English (en)
French (fr)
Inventor
冯锐
陈�光
刘军
Original Assignee
华为技术有限公司
Priority date
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to CA2960982A priority Critical patent/CA2960982C/en
Priority to AU2015408848A priority patent/AU2015408848B2/en
Priority to JP2017522597A priority patent/JP6357587B2/ja
Priority to PCT/CN2015/100006 priority patent/WO2017113261A1/zh
Priority to SG11201703260QA priority patent/SG11201703260QA/en
Priority to KR1020177008985A priority patent/KR102016702B1/ko
Priority to EP15911889.2A priority patent/EP3232609B1/en
Priority to CN201580008587.3A priority patent/CN107466456B/zh
Priority to BR112017011541-7A priority patent/BR112017011541B1/pt
Publication of WO2017113261A1 publication Critical patent/WO2017113261A1/zh
Priority to US16/013,175 priority patent/US10846185B2/en

Classifications

    • H04L67/1031 Controlling of the operation of servers by a load balancer, e.g. adding or removing servers that serve requests
    • G06F11/2023 Failover techniques
    • G06F11/16 Error detection or correction of the data by redundancy in hardware
    • G06F11/203 Failover techniques using migration
    • H04L67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
    • H04L67/56 Provisioning of proxy services
    • G06F2201/805 Real-time
    • G06F2201/825 Indexing scheme relating to error detection, to error correction, and to monitoring the problem or solution involving locking

Definitions

  • the present invention relates to computer technology, and in particular, to a method and a system for processing a lock request.
  • the lock server implements mutually exclusive access to the same resource by multiple nodes at the same time.
  • When the host needs to perform some operations on a resource, it first needs to request the lock permission from the lock server. After the host acquires the lock permission, it can perform the corresponding operations on the resource, such as a read operation or a write operation. Therefore, the performance, availability, and reliability of the lock server directly affect the performance, availability, and reliability of the entire distributed system.
  • a host communicates with a node through a NAS (Network Attached Storage) network.
  • Each node is provided with a lock server, and each node is also connected to a storage system, in which resources such as files are stored.
  • When the host needs to perform operations (such as read operations or write operations) on a resource in the storage system, the application on the host first applies to the lock server for the lock permission, obtains the lock permission that the lock server assigns for the resource, and then operates on the file.
  • the correspondence between the allocated lock permissions of the resource and the application can be stored in each node or in the shared storage accessible by each node. For example, when a host needs to read a file in the storage system, it first requests the lock permission of the file from the lock server in a node, and the host can read the file after obtaining the lock permission of the file.
  • the correspondence between the lock permission of the file and the holder of the lock permission is stored in the node; the holder may be the node itself or an application in the node. If the node holds the lock permission, the node can further determine which application in the node needs to use the resources in the storage system.
  • When a lock server fails, the service on the failed lock server needs to be switched to a lock server that has not failed (hereafter referred to as a non-failed lock server).
  • a protocol such as NFS (Network File System) or Samba
  • NFS Network File System
  • after the service on the failed lock server is switched to the non-failed lock server, the host can apply for the lock.
  • a first aspect of the present invention provides a method for processing a lock request, which may be applied to a first lock server, wherein the first lock server is the takeover lock server of a second lock server and stores the lock management scope of the second lock server. The method includes: the first lock server enters a silent state after learning that the second lock server is faulty, the silent range of the silent state being the resources for which the second lock server has already allocated permissions;
  • the first lock server receives a first lock request, which is used to request locking of a first resource and carries an identifier of the first resource;
  • the first lock server detects that the first resource belongs to the management scope of the second lock server; the first lock server queries a first resource information record table, which records the IDs of the resources for which the second lock server has allocated lock permissions;
  • if the first resource identifier is not recorded in the first resource information record table, the first lock server allocates a lock permission for the first resource according to the first lock request.
  • when the first lock server is silent, the resources in the original management scope of the first lock server are not included in the silent range and can therefore be processed normally. Furthermore, in a distributed lock management system composed of the first lock server, the second lock server, and other lock servers, the lock servers other than the first lock server and the second lock server need not enter the silent state during the silent period of the first lock server and can continue to work normally.
  • the method further includes: the first lock server receives a second lock request, which is used to request locking of a second resource and carries an identifier of the second resource; the first lock server detects that the second resource belongs to the management scope of the first lock server; and the first lock server allocates a lock permission for the second resource according to the second lock request.
  • the resources in the original management scope of the first lock server are not included in the silent range, and therefore can be processed normally.
  • the method further includes: after the first lock server enters the silent state, the first lock server receives a third lock request, which is used to request locking of a third resource and carries an identifier of the third resource; the first lock server detects that the third resource belongs to the management scope of the second lock server; the first lock server queries the first resource information record table, and if the ID of the resource requested by the third lock request is already recorded in the first resource information record table, the first lock server refuses to allocate a lock permission for the third resource according to the third lock request.
  • a lock request for a resource to which the second lock server has already allocated a permission is rejected, thereby avoiding lock conflicts.
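The three cases above (a request in the first lock server's own scope, a request in the failed server's scope for an unrecorded resource, and a request in the failed server's scope for a recorded resource) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the class name, the field names, and the simplification that every lock is exclusive are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class TakeoverLockServer:
    """Hypothetical model of the first (takeover) lock server."""
    own_scope: set        # resources originally managed by this server
    peer_scope: set       # management scope of the second (failed) lock server
    peer_allocated: set   # first resource information record table (IDs only)
    silent: bool = False
    granted: dict = field(default_factory=dict)  # resource id -> holder

    def on_peer_failure(self):
        # Enter the silent state; the silent range is peer_allocated only.
        self.silent = True

    def handle_lock_request(self, resource_id, requester):
        if resource_id in self.own_scope:
            # Own scope is outside the silent range: process normally.
            return self._grant(resource_id, requester)
        if self.silent and resource_id in self.peer_scope:
            if resource_id in self.peer_allocated:
                # The failed server already allocated this resource: reject.
                return False
            return self._grant(resource_id, requester)
        raise ValueError("resource is outside this server's scope")

    def _grant(self, resource_id, requester):
        # Assumption: all locks are exclusive, so a second holder is refused.
        holder = self.granted.setdefault(resource_id, requester)
        return holder == requester
```

With `own_scope={"a"}`, `peer_scope={"b", "c"}` and `peer_allocated={"c"}`, a request for `"a"` or `"b"` is granted during the silent state, while a request for `"c"` is refused until the original holder reclaims it or the silent state ends.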
  • the method further includes: the first lock server recording the first resource identifier into the second resource information record table
  • the second resource information record table is configured to record an ID of a resource to which the first lock server has been assigned a lock right, and the second resource information record table is stored in a third lock server.
  • in this way, the locks allocated by the first lock server are recorded.
  • If the first lock server fails in the future, it can be taken over by its corresponding takeover lock server.
  • the takeover method is similar to that described above.
  • the step of the first lock server storing the lock management scope of the second lock server includes: the first lock server receives a first notification message, where the first notification message carries identification information of the second lock server; the first lock server determines, according to the identifier of the second lock server and the lock server takeover relationship, that the first lock server is the takeover lock server of the second lock server; and the first lock server receives and stores the lock management scope of the second lock server.
  • Applying the method provides a solution for how the first lock server obtains the lock management scope of the second lock server.
  • the method may further include: a protocol server receives a packet from the host and parses the first lock request from the packet; the protocol server forwards the first lock request to a lock agent; when the lock agent determines, according to the first resource identifier carried in the first lock request, that the first resource is managed by the first lock server, it sends the first lock request to the first lock server.
  • the protocol server and the lock agent are added, and a lock management technology executed by the lock server, the protocol server and the lock agent is provided.
  • the method may further include: after the first lock server enters the silent state, it receives a lock reaffirmation request, where the lock reaffirmation request carries an identifier of a fourth resource, and the fourth resource is a resource for which the second lock server has allocated a permission; the first lock server reallocates the same permission for the fourth resource according to the permission allocated by the second lock server.
  • the method may further include: after permissions have been reallocated for all the resources for which the second lock server had allocated permissions, the first lock server exits the silent state; or, after a preset time is reached, the first lock server exits the silent state.
  • in an eighth possible implementation manner of the first aspect, after the first lock server exits the silent state, the first lock server updates its management scope;
  • the updated management scope includes the management scope of the first lock server before the update and the management scope of the second lock server.
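The two exit conditions for the silent state (every previously allocated resource has been reaffirmed, or a preset time has elapsed) and the subsequent scope update can be sketched as below. The class name, the injectable clock, and the use of plain sets for scopes are assumptions made for the illustration; the source does not specify the timeout value or data structures.

```python
import time


class SilentPeriod:
    """Hypothetical tracker for the takeover server's silent state."""

    def __init__(self, peer_allocated, timeout_s, now=time.monotonic):
        self.pending = set(peer_allocated)  # resources not yet reaffirmed
        self.now = now                      # injectable clock (assumption, eases testing)
        self.deadline = now() + timeout_s   # preset time after which silence ends

    def reaffirm(self, resource_id):
        # Only resources the failed server had allocated can be reaffirmed.
        if resource_id not in self.pending:
            return False
        self.pending.discard(resource_id)
        return True

    def should_exit(self):
        # Exit when everything was reaffirmed, or when the preset time is reached.
        return not self.pending or self.now() >= self.deadline


def merged_scope(own_scope, peer_scope):
    # After exiting silence, the takeover server manages both scopes.
    return set(own_scope) | set(peer_scope)
```

A caller would poll `should_exit()` after each reaffirmation and on a timer, then replace the server's management scope with `merged_scope(...)`.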
  • the takeover relationship may be calculated by the management node and broadcast to each lock server. It can also be updated by each lock server.
  • the first resource information record table may be stored in the first lock server, in another lock server, or in a server that is not a lock server, as long as it can be acquired by the first lock server.
  • each aspect and each possible implementation manner can be run in a virtual machine environment, that is, the lock server runs in a virtual machine. Therefore, the lock server has three possible implementations: hardware, software executed on hardware, and software that runs in a virtual machine.
  • after the takeover starts, for example during the silent period, the first lock server further sends a query message to the lock agents of the non-faulty nodes; after receiving the query message, the lock agent of a non-faulty node sends a feedback message to the first lock server; the feedback message carries the lock permissions that the lock agent applied for through the second lock server, and the first lock server records them in a detailed resource information record table.
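The query-and-feedback step can be pictured as the takeover server collecting, from each surviving lock agent, the permissions that agent obtained through the failed server. The sketch below is only an illustration under assumed data shapes: the function name and the layout of `agent_holdings` are hypothetical, and real message exchange is replaced by a dictionary lookup.

```python
def rebuild_detailed_table(agent_holdings, failed_server):
    """Collect the permissions that lock agents obtained through the failed server.

    agent_holdings maps a lock agent name to
    {resource_id: (server_name, permission)} (an assumed layout).
    """
    table = {}
    for agent, holdings in agent_holdings.items():   # one query per surviving agent
        for resource_id, (server, permission) in holdings.items():
            if server == failed_server:              # feedback covers the failed server only
                table.setdefault(resource_id, []).append((agent, permission))
    return table
```

Unlike the ID-only record table, the result here carries the permission details, which is what allows the takeover server to know what was granted and to whom.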
  • the invention also provides a lock request management device and an implementation manner of the server, having the functions of the above first aspect and of each possible implementation.
  • the present invention also provides a non-transitory computer readable storage medium and a computer program product. When the computer instructions contained in the storage medium or the computer program product are loaded into the memory of a storage device and the central processing unit (CPU) of the storage device executes the computer instructions, the storage device is caused to perform the first aspect and its various possible implementations. The instructions can be run in a device or a server.
  • FIG. 1 is a topological diagram of a usage environment of a lock management system according to an embodiment of the present invention.
  • FIG. 2 is a schematic diagram of an embodiment of a lock server management scope and a lock server takeover relationship provided by the present invention.
  • FIG. 3 is a flowchart of a method for processing a lock request according to an embodiment of the present invention.
  • FIG. 4 is a structural diagram of an embodiment of a lock request management apparatus of the present invention.
  • FIG. 5 is a block diagram showing an embodiment of a server of the present invention.
  • the embodiment of the present invention proposes establishing a takeover relationship among the lock servers, so that when a lock server fails, the takeover lock server of the failed lock server can be obtained according to the takeover relationship.
  • a lock server is a server that can handle lock requests.
  • the lock request may be an acquire lock request or a reclaim lock request.
  • the lock request can be a read lock request or a write lock request.
  • the lock request is applied to lock a resource. After the lock is applied, the owner of the lock acquires the lock permission. That is to say, only the owner has the corresponding operation authority for the resource.
  • a read lock request is used to apply for permission to read a resource
  • a write lock request is used to apply for permission to write data to a resource.
  • a lock reiterate request is a request in which the permission owner re-applies for a lock permission that it has already obtained.
  • the host originally accesses the storage system through node 1, and then node 1 fails.
  • the host changes access to the storage system through node 2.
  • the host issues a lock reiterate request to node 2 to regain the lock permission that it obtained before.
  • the lock request may also include a release lock request to release lock permissions on the file so that other hosts can apply for lock permissions on the file.
  • When a lock server fails, it is called a failed lock server.
  • the lock management of the failed lock server is taken over by its takeover lock server. Only the takeover lock server enters the silent state; the rest of the lock servers do not enter the silent state and can process lock requests normally. Compared with the prior art, the impact of the lock server failure on the entire system is reduced.
  • because the takeover lock server that has entered the silent state is silent only for some resources, it can still respond normally to some lock requests (lock requests for resources outside the silent range).
  • the utilization of the lock server is further improved, and the impact on the system after the lock server enters silence is reduced.
  • the lock server does not process the lock request.
  • the lock reiterate request can be processed.
  • the lock server may process the lock request: for example, for a read lock request to the resource, it gives a lock permission; for a write lock request to the resource, it gives a lock permission after recalling the allocated write lock.
  • this part of the resources is the resources for which the failed lock server has allocated permissions
  • the lock server that enters the silent state is the takeover lock server of the failed lock server.
  • lock requests for resources originally managed by the takeover lock server are handled normally and are not affected by the silent state. If a received lock request falls within the management scope of the failed lock server and the failed lock server had not allocated a lock permission for the requested resource, the takeover lock server can respond normally to this lock request and allocate the lock permission. If a received lock request falls within the management scope of the failed lock server and the failed lock server had already allocated a lock permission for the requested resource, the request is denied the lock permission.
  • the "lock request" here refers to a lock request that falls within the management scope of the failed lock server and is taken over by the takeover lock server after the failure.
  • the embodiment of the present invention can be applied to a distributed system, where the distributed system is composed of multiple nodes, and each node manages the lock authority of a part of files.
  • a node is, for example, a lock server and may include a processor, an external interface, and a memory.
  • for the case where a lock server fails in a distributed system and a non-failed lock server in the distributed system enters a silent state, a management method for lock permissions is proposed.
  • the node can also integrate the protocol server and the lock agent into a combination of a lock server, a protocol server, and a lock agent.
  • the identifier of the resource to which the lock authority is assigned is backed up to the designated lock server.
  • the specified lock server may be the takeover lock server of the lock server, or may be another lock server that the takeover lock server can access.
  • After receiving a lock request, the takeover lock server of the failed lock server determines, according to the backed-up resource identifiers, whether the lock permission requested by the lock request has already been allocated; if it has been allocated, the takeover lock server returns a rejection response; if not, it assigns the lock permission requested by the lock request to the host.
  • a node can include only a lock server, and can also integrate other functional modules, such as a protocol server and a lock agent.
  • when a lock permission is allocated, an allocation record is generated. The allocation record is, for example: {node 1, file A, write permission}, which means that node 1 is assigned the write permission to file A; or {node 2, file B, read permission}, which means that node 2 has read access to file B.
  • the protocol server can convert the node's allocation record into the host's allocation record. For example, if host 1 issued the lock request, node 1 is converted to host 1 and the record becomes {host 1, file A, write permission}, which means that host 1 has write access to file A; the node can send this information to the corresponding host for storage.
  • the identifier of the resource to which the lock permission is assigned is backed up to the specified server, such as the backup lock server of the lock server that assigned the permission, or another lock server. The specific content of the lock permission does not need to be backed up. That is, the specified server knows which resources have been assigned lock permissions, but does not know what the lock permissions are. Since only the identifier of the resource is backed up, and not the specific content of the lock permission, few system resources are occupied and the resources of the distributed system are not greatly affected.
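The allocation records and the ID-only backup described above can be sketched as follows. The names `AllocationRecord`, `to_host_record`, and `LockServer.allocate` are hypothetical, and a shared Python set stands in for the record table held on the specified backup server.

```python
from typing import NamedTuple


class AllocationRecord(NamedTuple):
    holder: str       # e.g. "node 1", or "host 1" after conversion
    resource: str     # resource identifier, e.g. "file A"
    permission: str   # e.g. "write permission"


def to_host_record(record, host):
    # Protocol-server view: same resource and permission, holder becomes the host.
    return record._replace(holder=host)


class LockServer:
    def __init__(self, backup_ids):
        self.records = {}             # full records stay on this lock server
        self.backup_ids = backup_ids  # stand-in for the table on the backup server

    def allocate(self, holder, resource, permission):
        record = AllocationRecord(holder, resource, permission)
        self.records[resource] = record
        self.backup_ids.add(resource)  # only the identifier is backed up
        return record
```

Keeping only identifiers on the backup side is what lets a takeover server answer "was this resource already locked?" without replicating every permission detail.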
  • the distributed system involved is composed of multiple nodes; the host communicates with the nodes through the NAS network, and the nodes are connected to the storage system.
  • the resources stored in the storage system are used by the host.
  • the host applies for the lock permission of the resource through the node, and the lock server in the node manages the lock permission.
  • Nodes and storage devices can be separate or combined.
  • the lock request from the host can be based on the Network File System (NFS) protocol or based on the Server Message Block (SMB) protocol.
  • the protocol server can handle one or more protocols from the host.
  • the NFS server supports the NFS protocol
  • the SMB server supports the SMB protocol.
  • the communication between different protocol servers and the upper-layer host works similarly.
  • a lock request processed by the protocol server can be used by the lock agent.
  • the protocol server has a one-to-one correspondence with the lock agent.
  • the protocol server 1 and the lock agent 1 are in one-to-one correspondence
  • the protocol server 2 and the lock agent 2 in the node 2 are in one-to-one correspondence, and so on.
  • the signal transmission between the protocol server and the lock agent is performed according to the correspondence.
  • the lock server can be located in the same node as the protocol server and the lock agent, or it can be located in a separate node or in other nodes.
  • the internal communication of the node is performed using a computer internal protocol such as a bus. Network communication such as FC and Ethernet can be used between nodes.
  • in the following, the case where the lock server, the protocol server, and the lock agent are located in one node is used as an example.
  • the node 1 has the protocol server 1, the lock agent 1 and the lock server 1.
  • Each lock server can grant different lock permissions to lock agents on different nodes.
  • the lock agent on a node can apply for lock permissions from the lock server of the same node, and can also apply for lock permissions from the lock servers on other nodes.
  • a management node can be set up to control each node, or any node can control and manage all nodes.
  • the nodes that manage and control each node are generally the master node, and can also be called the management node. In the embodiment of the present invention, this is not limited, and is not shown separately in the figure.
  • When a read/write operation is required on a resource (such as a file, a directory, a file block, or a data block) in the storage system, the host sends a lock request to the corresponding protocol server through the network. The host can determine the corresponding protocol server according to the information carried in the lock request, or according to the IP address segment; an existing implementation may be used, which is not limited in the embodiment of the present invention. After receiving the lock request, the protocol server sends the lock request to the lock agent corresponding to the protocol server.
  • the lock agent determines which lock server handles the lock request based on the lock server management scope, and then sends the lock request to the determined lock server for processing.
  • the lock server management scope can be set in advance or can be determined by using a consistent hash ring.
  • the lock server management scope can be stored in the cache of the node where the lock agent is located; it can also be stored in the shared storage for sharing by lock agents in the distributed system.
  • for example, the lock agent 2 determines, according to the locally stored lock server management scope, that the lock request should be processed by the lock server 3, and sends the lock request to the lock server 3 for processing. It is also possible not to store the lock server management scope locally: the lock request carries the file ID, and the lock agent can learn which lock server manages the file by query or calculation. The lock agent can also directly send the lock request to the lock server located in the same node, and that lock server forwards it to the lock server responsible for processing the lock request according to the lock server management scope.
  • for example, the lock agent 2 sends the received lock request to the lock server 2, the lock server 2 determines, according to the locally stored lock server management scope, that the lock request should be handled by the lock server 4, and the lock server 2 forwards the lock request to the lock server 4 for processing.
  • Both of these processing methods can utilize existing technologies and will not be described separately herein.
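The two delivery paths can be contrasted in a small sketch that only records the hops a lock request takes. The function names and the `owner_of` lookup are hypothetical; `owner_of` stands in for whatever scope table, query, or calculation resolves the responsible lock server.

```python
def send_direct(owner_of, resource_id):
    """The lock agent resolves the responsible lock server itself."""
    return ["lock agent", f"lock server {owner_of(resource_id)}"]


def send_via_local(owner_of, local_id, resource_id):
    """The lock agent hands the request to the lock server on its own node,
    which forwards it if another lock server is responsible."""
    target = owner_of(resource_id)
    hops = ["lock agent", f"lock server {local_id}"]
    if target != local_id:
        hops.append(f"lock server {target}")
    return hops
```

The direct path saves a hop but requires every lock agent to hold or compute the management scope; the forwarding path keeps that knowledge in the lock servers.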
  • the lock server stores its own assigned lock permissions.
  • the lock agent stores the lock permissions that it applies to the lock server.
  • the management node in the distributed system notifies the lock server and the lock agent to update the corresponding lock server management scope. After the management node updates the lock server management scope, the update result is broadcasted to each lock agent and lock server in the distributed system.
  • After the lock server receives a lock request while it is in the normal working state (that is, not in the silent state), the lock server handles the lock request in the same manner as the prior art, for example, assigning the lock permission to the host according to the lock request. This is not described separately.
  • the distributed system in the embodiment of the present invention may also be a virtualized distributed system, and the lock server runs in the virtual machine.
  • the lock agent and protocol server can also run in the virtual machine. Since its functionality is identical to that of a non-virtualized environment, it is not covered separately.
  • the lock server management scope and lock server takeover relationship in the distributed system can be seen in FIG. 2.
  • the lock server logically forms a ring.
  • the lock server management scope in the distributed system is determined according to the counterclockwise direction of the consistent hash ring (in another embodiment, it can also be clockwise), and the consistent hash ring is obtained by hashing the IDs of the lock servers in the distributed system.
  • the ID of the lock server 1 is 1
  • the ID of the lock server 2 is 2
  • the ID of the lock server 3 is 3
  • the ID of the lock server 4 is 4.
  • Each lock server uses the consistent hash algorithm to perform a hash calculation on the IDs, and the calculation results are arranged clockwise in ascending order, forming a consistent hash ring.
  • the consistent hash loops obtained in each lock server are the same.
  • the consistent hash ring covers the range 0 to 2^32
  • hash(1) = 5000
  • hash(2) = 8000
  • hash(3) = 1024
  • hash(4) = 512
  • clockwise from 0, the order of the lock servers on the hash ring is the lock server 4, the lock server 3, the lock server 1, and the lock server 2.
  • the management scope of the lock server 4 is (8000, 2^32] and [0, 512]
  • the management scope of the lock server 3 is (512, 1024]
  • the management scope of the lock server 1 is (1024, 5000]
  • the management scope of the lock server 2 is (5000, 8000].
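The management scopes in this example can be derived mechanically from the servers' ring positions: each server manages the arc from its predecessor (exclusive) up to its own position (inclusive), and the server with the smallest position also covers the wrap-around arc. The sketch below uses the hash values from the example; `hash(1) = 5000` is an assumption inferred from the stated scope of the lock server 1, and the function names are hypothetical.

```python
RING_SIZE = 2**32

# Ring positions from the example; 5000 for server 1 is inferred (assumption).
SERVER_HASHES = {1: 5000, 2: 8000, 3: 1024, 4: 512}


def management_scopes(server_hashes, ring_size=RING_SIZE):
    """Map each server ID to intervals (low, high), meaning low < x <= high."""
    ring = sorted((pos, sid) for sid, pos in server_hashes.items())
    scopes = {}
    for i, (pos, sid) in enumerate(ring):
        if i == 0:
            # The first server also covers the arc past the last server;
            # a low bound of -1 makes position 0 itself part of the scope.
            scopes[sid] = [(ring[-1][0], ring_size), (-1, pos)]
        else:
            scopes[sid] = [(ring[i - 1][0], pos)]
    return scopes


def owner(scopes, position):
    """Return the server whose scope contains the given ring position."""
    for sid, intervals in scopes.items():
        if any(low < position <= high for low, high in intervals):
            return sid
    raise ValueError("position is outside the ring")
```

Treating each scope as closed on its upper end means a key that hashes exactly to a server's position belongs to that server, which matches the `[0, 512]` bound given for the lock server 4.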
  • the takeover relationship between the lock servers is determined according to the clockwise direction of the consistent hash ring, that is:
  • the takeover lock server of the lock server 1 is the lock server 2
  • the takeover lock server of the lock server 2 is the lock server 4
  • the takeover lock server of the lock server 4 is the lock server 3
  • the takeover lock server of the lock server 3 is the lock server 1.
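With the servers listed in clockwise ring order (4, 3, 1, 2 in this example), the takeover relationship is simply "each server is taken over by its clockwise successor". A minimal sketch, with a hypothetical function name:

```python
def takeover_map(ring_order):
    """Map each lock server to its clockwise successor on the ring."""
    n = len(ring_order)
    return {sid: ring_order[(i + 1) % n] for i, sid in enumerate(ring_order)}


# Clockwise order of the lock servers in the example above.
RING_ORDER = [4, 3, 1, 2]
TAKEOVER = takeover_map(RING_ORDER)
```

Because the mapping is derived from the ring alone, every node that computes the same ring arrives at the same takeover relationship without extra coordination.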
  • the takeover relationship is not unique, as long as it can make each lock server have a takeover server.
  • the administrator can also configure the takeover server of each lock server.
  • the takeover lock server of the lock server 1 is configured as the lock server 2
  • the takeover lock server of the lock server 2 is configured as the lock server 3
  • the takeover lock server of the lock server 3 is configured as the lock server 4
  • the takeover lock server of the lock server 4 is configured as a lock server 1.
  • a lock server can take over multiple lock servers. For example, if the lock server 3 and the lock server 1 fail at the same time, the takeover lock server of both of them is the lock server 4.
  • the lock agent determines, according to the stored lock server management scope, which lock server should process the lock request. If the lock server determined to handle the lock request has failed (when a lock server fails, the management node broadcasts a notification message to the lock agents in the distributed system), the lock agent determines the takeover lock server according to the lock server takeover relationship and sends the lock request to the takeover lock server for processing.
  • the lock server management scope and the lock server takeover relationship can be uniformly configured by the management node and sent to all lock agents for storage; or the management node can calculate the consistent hash ring and send it to each lock agent;
  • alternatively, each lock agent can be configured in advance by the management node and calculate the same consistent hash ring itself.
  • the lock agent After receiving the lock request, the lock agent uses the consistent hash algorithm to perform hash calculation on the file identifier carried in the lock request. To see which range the calculated result falls into, the corresponding lock server is responsible for processing.
  • For example, suppose the lock request is a locking request and the file identifier (such as the file name) carried in it is (foo1.txt). The lock agent performs a hash calculation on (foo1.txt); the result is 4500, which falls within the management scope of the lock server 1, so the lock agent sends the lock request to the lock server 1.
  • As another example, suppose the lock request is a lock reaffirmation request and the file information carried in it is (foo8.txt). The lock agent performs a hash calculation on (foo8.txt); the result is 9000, which falls within the management scope of the lock server 4, so the lock agent sends the lock reaffirmation request to the lock server 4.
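  • The lookup performed by the lock agent can be sketched as follows. This is an illustration only: the use of MD5 truncated to 32 bits is an assumption (the embodiment does not name a hash function), the upper bound of each scope is assumed to be inclusive, and `POSITIONS`, `OWNERS`, `hash_identifier`, and `owner_of` are hypothetical names. The values 4500, 9000, and 6500 are the hash results quoted in the examples, not the output of this hash function.

```python
import bisect
import hashlib

# Upper bounds of the example management scopes, in ring order, and the
# lock server that owns each scope. Hashes above 8000 wrap to server 4.
POSITIONS = [512, 1024, 5000, 8000]
OWNERS = [4, 3, 1, 2]


def hash_identifier(file_id):
    """Assumed stand-in hash: first 4 bytes of MD5, giving a 32-bit value."""
    digest = hashlib.md5(file_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def owner_of(hash_value):
    """Return the lock server whose management scope contains hash_value."""
    index = bisect.bisect_left(POSITIONS, hash_value)
    return OWNERS[index % len(OWNERS)]  # index 4 wraps around to server 4


server_for_foo1 = owner_of(4500)  # hash value quoted for foo1.txt
server_for_foo8 = owner_of(9000)  # hash value quoted for foo8.txt
```

  • A sorted list with binary search keeps the lookup logarithmic in the number of lock servers; any structure that maps a hash value to the first ring position at or after it would serve equally well.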
  • The host can use a lock reaffirmation request to regain, from the takeover lock server, the permissions it previously obtained from the failed lock server. If the takeover lock server exits the silent period before all lock reaffirmation requests have been completed, the unexecuted lock reaffirmation requests are no longer executed. For details on the lock reaffirmation request, refer to step 309.
  • When a lock server fails, the lock agent marks the failed lock server in the consistent hash ring as faulty. After receiving a lock request, the lock agent performs a hash calculation on the file identifier carried in the lock request and determines, according to the lock server management scope, which lock server's management scope the result falls within. If the determined lock server is in a fault state, the lock agent determines the takeover lock server of the faulty lock server based on the lock server takeover relationship and sends the lock request to the takeover lock server for processing.
  • After receiving the lock request, the takeover lock server performs a hash calculation on the file identifier to obtain a hash value. If the hash value falls within its own takeover scope, it processes the lock request. If a lock server other than the takeover lock server receives the lock request, hashes the file identifier, and finds that the hash value does not fall within its own takeover scope, it performs no processing.
  • For example, after receiving the notification message, the lock agent marks the lock server 2 in the consistent hash ring as faulty.
  • The lock agent then receives a lock reaffirmation request whose file information is (foo5.txt). The lock agent performs a hash calculation on (foo5.txt), and the result is 7000.
  • According to the management scope, the lock server 2 should be responsible for processing this request.
  • However, the lock server 2 is in a fault state at this time.
  • The takeover lock server of the faulty lock server 2 is the lock server 4, so the lock agent sends the lock reaffirmation request to the lock server 4 for processing.
  • The lock server 4 performs a hash calculation on (foo5.txt); the result is 7000, which belongs to its own takeover scope, so it handles the lock reaffirmation request.
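  • The rerouting in this example can be sketched as follows. The class `LockAgent`, its methods, and `in_takeover_scope` are hypothetical names, and following the takeover chain when a takeover lock server has itself failed is an assumption that goes beyond what the embodiment states. Scope upper bounds are again assumed to be inclusive.

```python
import bisect

POSITIONS = [512, 1024, 5000, 8000]    # upper bounds of the example scopes
OWNERS = [4, 3, 1, 2]                  # lock server owning each scope
TAKEOVER = {1: 2, 2: 4, 4: 3, 3: 1}    # clockwise takeover relationship


def owner_of(hash_value):
    """Lock server whose management scope contains hash_value."""
    return OWNERS[bisect.bisect_left(POSITIONS, hash_value) % len(OWNERS)]


class LockAgent:
    """Routes a hashed lock request, skipping lock servers marked faulty."""

    def __init__(self):
        self.failed = set()

    def on_failure_notice(self, server_id):
        # Mark the lock server as faulty on the local copy of the ring.
        self.failed.add(server_id)

    def route(self, hash_value):
        target = owner_of(hash_value)
        visited = set()
        while target in self.failed:
            if target in visited:
                raise RuntimeError("no healthy lock server available")
            visited.add(target)
            target = TAKEOVER[target]  # assumption: follow the chain
        return target


def in_takeover_scope(server_id, hash_value, failed):
    """Server-side check: does hash_value belong to a failed server I take over?"""
    owner = owner_of(hash_value)
    return owner in failed and TAKEOVER[owner] == server_id


agent = LockAgent()
before = agent.route(7000)   # lock server 2 is healthy
agent.on_failure_notice(2)
after = agent.route(7000)    # lock server 2 is faulty
```

  • Here `before` is lock server 2 and `after` is lock server 4, matching the (foo5.txt) example; `in_takeover_scope(4, 7000, {2})` mirrors the check that the lock server 4 performs before handling the request.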
  • An application on the host sends a lock request to the protocol server, and the protocol server forwards the lock request to the corresponding lock agent.
  • The lock agent hashes the identifier of the file carried in the lock request (such as the FSID or FID), determines from the result which lock server manages the file, and sends the lock request to that lock server for processing.
  • The hash algorithm used to hash the file identifier must be the same as the hash algorithm used to generate the consistent hash ring.
  • For example, the file identifier carried in the lock request is (foo2.txt); the lock agent performs a hash calculation on (foo2.txt), and the result is 6500.
  • This value falls in the range between the lock server 1 and the lock server 2 on the consistent hash ring, which is the management scope of the lock server 2, so the lock request is processed by the lock server 2.
  • When the lock server 2 fails, the lock agent marks the lock server 2 in the consistent hash ring as faulty. If the lock agent then receives a lock request carrying the file information (foo3.txt), it performs a hash calculation on (foo3.txt) and obtains 7500, which falls in the range between the lock server 1 and the lock server 2 on the consistent hash ring. Because the lock server 2 is in a fault state and, according to the consistent hash ring, the takeover lock server of the lock server 2 is the lock server 4, this range now belongs to the lock server 4, and the lock agent therefore sends the lock request to the lock server 4 for processing.
  • The method for obtaining a consistent hash ring by applying the consistent hash algorithm to the name of the node or the lock server ID may be an existing technology, and details are not described herein again.
  • The embodiment of the present invention provides a method for processing lock permissions in a distributed system, and the method embodiment is applied to a lock server.
  • The protocol server and the lock agent in this embodiment are implemented in the same way as described above and are not described again in this method embodiment.
  • The specific process is shown in FIG. 3.
  • The method can be applied to the distributed system shown in FIG.
  • In the distributed system of the embodiment of the present invention, there are four lock servers, namely, a lock server 1, a lock server 2, a lock server 3, and a lock server 4.
  • The number of lock servers in this embodiment is for illustrative purposes only; the actual number depends on service requirements, and the implementation principle is the same as in this embodiment.
  • The following describes the processing of a lock request, as shown in FIG. 3, taking the takeover lock server as the first lock server and the faulty lock server as the second lock server.
  • Step 301 When a lock server fails in the distributed system, the management node broadcasts a notification message to the lock servers in the distributed system.
  • In this example the second lock server is faulty, so the notification message carries the ID of the second lock server as the identification information of the second lock server.
  • A notification message notifying that a lock server has failed is referred to as a first notification message.
  • A lock server receiving the first notification message determines, according to the ID carried in the first notification message and the locally stored lock server management scope, whether it is the takeover lock server of the second lock server. If so, it enters the silent state for the resources to which the second lock server has assigned permissions; otherwise, it does not enter the silent state.
  • The first lock server can start a timer. When the timer reaches the preset time, the first lock server exits the silent state and updates the takeover relationship of the first lock server.
  • Another way to detect a fault is for the takeover lock server, according to the takeover relationship information, to send a detection message to the lock server that it is responsible for taking over; when it detects that the corresponding lock server is faulty, it enters the silent state.
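  • The handling of the first notification message and the silent-state timer can be sketched as follows. This is a hedged illustration: the class `LockServer`, its fields, and the injectable `clock` parameter are hypothetical, the takeover decision is taken from a stored takeover table (the clockwise relationship from the earlier example), and the 30-second default is an arbitrary placeholder rather than a value from the embodiment.

```python
import time

TAKEOVER = {1: 2, 2: 4, 4: 3, 3: 1}  # failed server -> its takeover server


class LockServer:
    """Tracks whether this lock server is silent on behalf of a failed peer."""

    def __init__(self, server_id, silent_seconds=30.0, clock=time.monotonic):
        self.server_id = server_id
        self.silent_seconds = silent_seconds  # assumed preset takeover time
        self.clock = clock
        self.silent_for = None     # ID of the failed server being taken over
        self.silent_until = None   # deadline at which silence ends

    def on_first_notification(self, failed_id):
        """Enter the silent state only if this server takes over failed_id."""
        if TAKEOVER.get(failed_id) != self.server_id:
            return False
        self.silent_for = failed_id
        self.silent_until = self.clock() + self.silent_seconds
        return True

    def is_silent(self):
        """Report the silent state, leaving it once the timer has expired."""
        if self.silent_until is not None and self.clock() >= self.silent_until:
            self.silent_for = None
            self.silent_until = None
        return self.silent_until is not None
```

  • Passing the clock in as a parameter makes the timer behaviour checkable without waiting; in use, `LockServer(4).on_first_notification(2)` returns `True` because the lock server 4 is the takeover lock server of the lock server 2, while the same call on the lock server 3 returns `False`.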
  • Step 302 The first lock server receives a lock request, where the lock request carries an identifier of the target resource.
  • The target resource is the resource that needs to be locked; it is the object of the lock request, that is, a resource waiting to be allocated lock permission.
  • The lock request is sent to the lock server through the protocol server and the lock agent, and carries the resource identifier of the resource for which the lock is requested.
  • The resource identifier may be the ID of the file to be operated on or the ID of the logical unit number (LUN) to be operated on.
  • The permission requested by the lock request is permission to read the resource or permission to write the resource.
  • The first lock server determines, according to the resource identifier, whether the requested resource belongs to its own management scope. For example, the resource identifier is hashed; if the obtained value belongs to the hash value range preset for the first lock server, the resource belongs to the management scope of the first lock server; otherwise, it does not.
  • The target resource requested by the lock request is located in the storage system, and the lock server manages its lock permission. This process has been described above and is not repeated here.
  • There are several ways in which a lock request originally destined for the second lock server can be sent to the first lock server for processing.
  • For example, a router is placed between the lock server and the host, and the router records the takeover relationship.
  • The router sends the lock request originally destined for the second lock server to the takeover lock server of the second lock server. If the lock agent on the same node as the second lock server is not faulty, the foregoing scheme may also be adopted, in which the lock agent sends the lock request originally destined for the second lock server to the takeover lock server of the second lock server.
  • For example, if the lock server 1 is the second lock server
  • and the lock server 2 is the takeover lock server of the lock server 1, then the first lock server here is the lock server 2.
  • Step 303 The first lock server queries a first resource information record table, where the first resource information record table records the resource identifiers of the resources to which the second lock server has assigned lock permission.
  • At this time, the takeover lock server of the second lock server is in the silent state.
  • Each lock server, including the first lock server, checks whether it is currently in the silent state after receiving a lock request. If it is, it further examines the identifier carried in the lock request; if the lock request falls within its takeover scope, step 303 is performed. If the lock request is received not by the first lock server but by another lock server, step 303 is not performed and the entire process ends.
  • The protocol server in the faulty node becomes the faulty protocol server.
  • A host that originally accessed the storage system through the faulty protocol server needs to access the storage system through the takeover protocol server of the faulty protocol server. That is, the work of the faulty protocol server is taken over by the takeover protocol server.
  • Until the takeover protocol server completes the takeover, its node is in the silent state (the silent range is the resources to which the faulty lock server has assigned permissions). After the takeover is completed, the silent node exits the silent state. In addition, if the preset time is exceeded, the silent node exits the silent state even if the takeover is not completed.
  • The node where the silent lock server is located is also in the silent state; that is, if the node is composed of a lock server, a protocol server, and a lock agent, the protocol server and the lock agent of the node also enter the silent state.
  • The takeover protocol server takes over the work of the faulty protocol server.
  • The takeover process includes the host re-applying, through lock reaffirmation requests, for the permissions it previously held through the faulty protocol server.
  • A lock server in the silent state directly returns a rejection response message for any lock request.
  • The first resource information record table is queried.
  • The first resource information record table stores the resource identifiers of the resources to which the second lock server has assigned lock permission.
  • The first resource information record table need not store the specific content of the lock permission, for example whether it is a read permission or a write permission, so the storage space it occupies is greatly reduced.
  • The first resource information record table may be stored locally on the first lock server, in other servers, or both locally on the first lock server and in other designated servers. After a lock server enters the silent state, the resource information record table is stored in other designated lock servers.
  • If the resource information record table is stored locally on the first lock server, then after the first lock server enters the silent state, the first lock server sends the resource information record table to the takeover lock server of the first lock server for storage. Alternatively, whenever the resource information record of the first lock server changes, the change is synchronized to the takeover lock server of the first lock server so that the two copies stay consistent.
  • When a lock server receives a lock request for a resource for the first time, it sends the information that the resource has been assigned lock permission to its takeover lock server, and the takeover lock server stores this information in the first resource information record table. The lock server may also send this information after it has assigned lock permission to the resource. If the lock server subsequently receives a lock request for the same resource, it does not send the information to its takeover lock server again, regardless of whether the requested permission is the same.
  • A specific implementation is as follows: when the lock server receives a lock request, it determines whether the notification "this resource has been assigned lock permission" has already been sent to its takeover lock server; if not, it sends the notification message; otherwise, no notification message is sent.
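  • The once-per-resource notification can be sketched as follows. The classes `TakeoverLockServer` and `LockServer`, the `notifications_sent` counter, and the direct method call standing in for a network message are all hypothetical; the actual granting of lock permission is omitted.

```python
class TakeoverLockServer:
    """Holds the first resource information record table for a peer."""

    def __init__(self):
        # Only resource identifiers are kept, not the kind of permission.
        self.first_resource_table = set()

    def record_assigned(self, resource_id):
        self.first_resource_table.add(resource_id)


class LockServer:
    """Notifies its takeover server the first time a resource is locked."""

    def __init__(self, takeover):
        self.takeover = takeover
        self.notified = set()        # resources already reported
        self.notifications_sent = 0  # counter used only for illustration

    def on_lock_request(self, resource_id, permission):
        if resource_id not in self.notified:
            # Stand-in for sending "this resource has been assigned lock
            # permission" to the takeover lock server.
            self.takeover.record_assigned(resource_id)
            self.notified.add(resource_id)
            self.notifications_sent += 1
        # Granting the requested permission is omitted in this sketch.


backup = TakeoverLockServer()
server = LockServer(backup)
server.on_lock_request("foo2.txt", "read")
server.on_lock_request("foo2.txt", "write")  # same resource: no new message
```

  • After the two requests, `backup.first_resource_table` contains only `"foo2.txt"` and a single notification has been sent, which is the saving in traffic and storage that the table is designed to achieve.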
  • In addition to the resource identifiers of the resources to which the second lock server has allocated lock permission, the first resource information record table may also store the resource identifiers of the resources to which the first lock server has allocated lock permission, for the first lock server to query when processing lock requests after exiting the silent state.
  • After step 303, step 304 or step 305 is performed.
  • Step 304 When the target resource identifier is included in the first resource information record table, the first lock server returns a rejection response message.
  • When the resource identifier is stored in the resource information record table, the resource has already been assigned lock permission by the second lock server. To avoid a lock permission conflict on the same resource, the first lock server cannot process the lock request at this time, and it returns a rejection response message to the host through the lock agent and the protocol server.
  • In addition, if the query finds that the first lock server itself has already allocated permission for the resource named in the lock request, the first lock server also returns a rejection message. This is not described in detail here.
  • Step 305 When the resource identifier is not in the resource information record table, the first lock server allocates lock permission to the resource according to the permission requested by the lock request, and returns the assigned lock permission to the host through the lock agent and the protocol server.
  • The first lock server may allocate lock permission for the resource corresponding to the resource identifier.
  • The first lock server returns the assigned lock permission to the requesting host through the corresponding lock agent and protocol server, allowing the requesting host to operate on the resource.
  • In the silent state, the takeover lock server of the second lock server can therefore still process some lock requests; only lock requests for resources that have already been assigned lock permission cannot be processed. The present embodiment thus controls more precisely, and reduces, the scope affected by a lock server failure in the distributed system, and improves the performance and reliability of the distributed system.
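  • Steps 303 to 305 can be sketched as a single decision function. The function name `process_takeover_request`, the string results, and the use of a set and a dictionary for the two tables are assumptions made for illustration; a real lock server would also check permission compatibility, which is omitted here.

```python
GRANTED = "granted"
REJECTED = "rejected"


def process_takeover_request(resource_id, permission, first_table, granted):
    """Decide a lock request that falls in the takeover scope during silence.

    first_table: set of resource IDs the failed (second) lock server had
                 already assigned lock permission to.
    granted:     dict of resource ID -> permission assigned by this (first)
                 lock server since it entered the silent state.
    """
    # Step 304: the failed server already assigned a lock on this resource.
    if resource_id in first_table:
        return REJECTED
    # This server has itself already assigned a lock on the resource.
    if resource_id in granted:
        return REJECTED
    # Step 305: no record of an existing lock, so assign the permission.
    granted[resource_id] = permission
    return GRANTED


first_table = {"foo5.txt"}
granted = {}
r1 = process_takeover_request("foo5.txt", "write", first_table, granted)
r2 = process_takeover_request("foo9.txt", "write", first_table, granted)
r3 = process_takeover_request("foo9.txt", "read", first_table, granted)
```

  • Only the request for `foo5.txt`, which the failed lock server had already locked, and the repeated request for `foo9.txt` are rejected; the first request for `foo9.txt` is served even though the server is silent, which is the reduction in impact that the embodiment describes.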
  • The first lock server can record the assigned permission in a local detailed resource information record table.
  • The detailed resource information record table records the specific content of the permission, such as the resource identifier, the lock permission, the lock permission type, and the current status of the lock permission.
  • The detailed resource information record table and the first resource information record table may be separate or integrated.
  • After the first lock server takes over the failed lock server, the first lock server also sends a query message to the lock agents of the non-faulty nodes. After receiving the query message, the lock agent of each non-faulty node sends a feedback message to the first lock server; the feedback message carries the lock permissions that the lock agent applied for through the second lock server, and these are recorded in the detailed resource information record table of the first lock server. The detailed resource information record table is thereby updated so that, in addition to the specific content of the permissions assigned by the first lock server, it also records the specific content of the permissions assigned by the second lock server.
  • The lock requests mentioned in steps 304 and 305 are lock requests that, according to the takeover scope, would otherwise be handled by the second lock server. Resources within the takeover lock server's own original management scope are not silenced, so even while the takeover lock server is in the silent state, lock requests for those resources can be handled as if it had not entered the silent state.
  • The method embodiment may further include step 306.
  • Step 306 The first lock server stores the target resource identifier into the second resource information record table.
  • The second resource information record table is similar in form to the first resource information record table and is used to record the resource identifiers of the resources to which the first lock server has assigned lock permission. Therefore, if the first lock server becomes faulty, the takeover lock server of the first lock server can take over the first lock server.
  • The specific steps are similar to steps 302-305 and are not detailed here.
  • When the first lock server is not in the silent state, the first lock server records the target resource identifier into the second resource information record table after assigning lock permission to the target resource in the lock request.
  • The second resource information record table stores the resource identifiers of the resources to which the first lock server has assigned lock permission.
  • After step 306, when the first lock server fails, the takeover lock server of the first lock server transitions from the non-silent state to the silent state, and the silent range is the resources to which the first lock server has assigned permissions.
  • The takeover lock server of the first lock server may allocate lock permission for the target resource as in step 305; otherwise, as in step 304, it returns a rejection response message.
  • The lock server stores the necessary information locally, such as the resource identifier, the lock permission, the lock permission type, and the current state of the lock permission, which are not further described herein.
  • The resource identifier of each resource to which lock permission is assigned is stored separately.
  • The lock server stores the resource identifier in a separate resource information record table, and the resource information record table is stored in the takeover lock server of the lock server.
  • After step 304 or step 306, the method embodiment may further include the following step 307.
  • Step 307 The first lock server exits the silent state.
  • A takeover time may be preset; when the predetermined time is reached, the first lock server exits the silent state regardless of whether the takeover work is completed.
  • At this point, the management scope of the first lock server has been expanded: the new scope is the union of its old management scope and the management scope of the second lock server.
  • The takeover of the second lock server by the first lock server is then complete.
  • The takeover scopes in the system also need to be changed.
  • The takeover scope of the takeover lock server of the first lock server (named the third lock server) is updated accordingly to follow the management scope of the first lock server.
  • A lock server in the distributed system may start a timer, and when the predetermined time is reached, the lock server in the silent state exits the silent state.
  • The first notification message is broadcast by the management node to notify the lock servers in the distributed system that a lock server has failed.
  • After a non-second lock server in the distributed system receives the first notification message, it determines, according to the lock server takeover relationship stored locally or in shared storage, whether it is the takeover lock server of the second lock server. If it is, it enters the silent state and starts the timer.
  • When the predetermined time is reached, it exits the silent state and updates the lock server management scope and the lock server takeover relationship. If it is not the takeover lock server of the second lock server, it does not enter the silent state and continues normal operation.
  • A non-takeover lock server may also mark the second lock server as faulty in the locally stored lock server management scope and lock server takeover relationship, and then update the lock server management scope and the lock server takeover relationship.
  • All lock servers in the distributed system use the same algorithm to update the lock server management scope and the lock server takeover relationship.
  • The specific method, hashing the ID of the lock server as described above, is not described in detail again herein.
  • There are several ways to trigger the update of the takeover relationship. It can also be triggered by the management node; that is, after receiving a notification message from the management node, the lock server updates the management scope and the lock server takeover relationship. In this case, the management node needs to start a timer and, when the timer reaches a predetermined time, broadcast a notification message to the distributed system. After receiving the notification message from the management node, each non-second lock server that can work normally in the distributed system updates its locally stored lock server management scope and lock server takeover relationship.
  • After step 307, the following step may be included:
  • Step 308 After the first lock server exits the silent state, the first resource information record table is deleted.
  • The first resource information record table may be stored locally on the first lock server or in other servers. When it is stored in another server, the first lock server can notify that server to delete the first resource information record table.
  • The first resource information record table records the resource identifiers of the resources to which the second lock server has assigned lock permission; its content is, for example, "resource ID: assigned rights".
  • While in the silent state, the first lock server can process lock reaffirmation requests for resources within the silent range. Therefore, the method may further include step 309 between step 301 and step 307.
  • Step 309 The first lock server receives a lock reaffirmation request, where the lock reaffirmation request carries the identifier of another target resource and the lock permission that the second lock server assigned to that target resource before the second lock server failed. The first lock server then reassigns lock permission to the other target resource according to the lock permission that the second lock server had assigned; the reassigned lock permission is the same as the lock permission that the second lock server assigned to the other target resource before the fault, and the owner of the reassigned lock permission is likewise the same as the previous owner. The lock reaffirmation request is initiated by the host, and the first lock server can process multiple lock reaffirmation requests before exiting the silent state. After it exits the silent state, lock reaffirmation requests are no longer processed.
  • For example, the second lock server fails after assigning a permission owner the permission to write to a resource.
  • After receiving the lock reaffirmation request, the first lock server again assigns that permission owner the permission to write to the resource.
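  • The handling of lock reaffirmation requests in step 309, together with the table deletion of step 308, can be sketched as follows. The class `SilentTakeover` and its methods are hypothetical names. Checking the reaffirmed resource against the first resource information record table is an added assumption used here as a sanity check; the embodiment does not state that such a check is performed.

```python
class SilentTakeover:
    """First lock server while it is silent on behalf of the second one."""

    def __init__(self, first_table):
        self.first_table = set(first_table)  # resources locked by the peer
        self.silent = True
        self.detailed = {}  # resource ID -> (owner, permission)

    def on_reaffirm(self, resource_id, owner, permission):
        """Restore a lock that the failed server had assigned before failing."""
        if not self.silent:
            return False  # reaffirmation is only honoured during silence
        if resource_id not in self.first_table:
            return False  # assumption: only previously locked resources
        # Reassign the same permission to the same owner as before the fault.
        self.detailed[resource_id] = (owner, permission)
        return True

    def exit_silent(self):
        """Step 307 followed by step 308: leave silence, drop the table."""
        self.silent = False
        self.first_table.clear()


server = SilentTakeover(first_table={"foo5.txt"})
restored = server.on_reaffirm("foo5.txt", "host-A", "write")
server.exit_silent()
too_late = server.on_reaffirm("foo5.txt", "host-B", "write")
```

  • In this run `restored` is `True` and `too_late` is `False`: the write permission held by the (hypothetical) owner `host-A` is reinstated during the silent period, and a reaffirmation arriving after the silent state has ended is not processed.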
  • In this embodiment, the resource identifiers of the resources to which lock permission has been assigned are stored in the takeover lock server; when a lock server fails, its takeover lock server, in the silent state,
  • determines whether a received lock request can be processed according to the stored resource identifiers.
  • To occupy as few system resources as possible, only the resource identifiers of the resources to which lock permission has been assigned are backed up.
  • If system resources permit, the information on the lock server can also be backed up in full.
  • For example, the detailed resource information record table of a lock server is backed up to the takeover lock server of that lock server.
  • The processing in this case follows the same principle as the foregoing method, except that more information is backed up, which occupies more system resources.
  • Because the backup on the first lock server contains the full lock permission information, when the second lock server is taken over, the lock agents on all nodes are not required to re-apply to the takeover lock server for the lock permissions they have already obtained. That is, the step mentioned in step 305, in which the first lock server sends a query message to the lock agents of the non-faulty nodes, can be omitted, so that the silent time is minimized.
  • The above method can be applied in a virtualized distributed system.
  • In such a system, the lock server runs in a virtual machine.
  • If a lock server and its takeover lock server are placed in the same physical node, then when the lock server fails, the takeover time can be shortened because data transfer within the same physical node is faster.
  • A new lock server can be deployed on the node.
  • The other first lock server may be migrated directly to the node, that is, the address mapping relationship of the other first lock server is modified; or a new lock server may be created on the node and the lock service on the other first lock server
  • migrated to the newly created lock server.
  • Alternatively, it is sufficient to migrate the other first lock server directly to the node.
  • The lock server management scope and the lock server takeover relationship of a lock server can be updated under certain conditions.
  • The non-faulty lock servers update the lock server management scope and the lock server takeover relationship according to predetermined rules.
  • The management node may notify the non-second lock servers in the distributed system to update the lock server management scope and the lock server takeover relationship, or the management node may itself update the lock server takeover relationship and then broadcast the updated lock server takeover relationship to
  • the lock servers in the distributed system. For example, when a new lock server is added, the management node notifies the lock servers in the distributed system to update their lock server takeover relationships.
  • The takeover relationship is updated in two situations: one is that a lock server has failed or is otherwise no longer in use; the other is that a new lock server is added. The two cases are explained below.
  • In the first case, the non-second lock servers in the distributed system receive the first notification message from the management node (the first notification message is used to notify the non-second lock servers in the distributed system that the second lock server has failed).
  • The non-faulty lock servers in the distributed system then update their own lock server management scope and lock server takeover relationship.
  • A non-second lock server may update its own lock server management scope and lock server takeover relationship according to a preset method (such as a consistent hash algorithm), or the management node may update the lock server management scope and the lock server takeover relationship and then broadcast them to the lock servers in the distributed system.
  • The lock server management scope and the lock server takeover relationship of a non-second lock server may be stored in the non-second lock server itself or in shared storage, which is not limited in this method embodiment.
  • In the second case, when a new lock server is added to the distributed system, the lock servers in the distributed system also need to update their own lock server management scope and lock server takeover relationship.
  • The lock servers in the distributed system receive the second notification message, where the second notification message carries the identifier of the newly added lock server.
  • The lock servers in the distributed system then update their own lock server management scope and lock server takeover relationship.
  • When a new lock server is added to the distributed system, the management node sends a second notification message to the lock servers in the distributed system, and the second notification message carries the ID of the newly added lock server. After the lock servers (including the newly added lock server) receive the second notification message, they calculate the new lock server management scope and lock server takeover relationship according to the predetermined rule (such as the consistent hash algorithm). Alternatively, the management node updates the lock server management scope and the lock server takeover relationship and then sends the updated lock server management scope and lock server takeover relationship to each lock server in the distributed system.
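  • The recalculation after a second notification message can be sketched as follows. The function `build_views`, the ID 5 for the newly added lock server, and its ring position 3000 are hypothetical; in the embodiment the position would come from hashing the lock server ID. Each scope is represented as a `(lower, upper)` pair meaning the half-open range above `lower` up to and including `upper`, with the first server's scope wrapping around the end of the ring.

```python
def build_views(ring):
    """Derive management scopes and takeover relationship from ring positions.

    ring: dict of lock server ID -> position on the consistent hash ring.
    Returns (scopes, takeover), where scopes[s] = (lower, upper) and
    takeover[s] is the next lock server clockwise from s.
    """
    ordered = sorted(ring.items(), key=lambda item: item[1])
    scopes = {}
    takeover = {}
    for i, (server, position) in enumerate(ordered):
        # The previous server's position is the exclusive lower bound; for
        # the first server this is the last position (wrap-around scope).
        lower = ordered[i - 1][1]
        scopes[server] = (lower, position)
        takeover[server] = ordered[(i + 1) % len(ordered)][0]
    return scopes, takeover


RING = {4: 512, 3: 1024, 1: 5000, 2: 8000}
old_scopes, old_takeover = build_views(RING)

# A second notification message announces lock server 5 (assumed position).
new_scopes, new_takeover = build_views({**RING, 5: 3000})
```

  • With the assumed position, the new lock server 5 receives the scope above 1024 up to 3000, the scope of the lock server 1 shrinks to the range above 3000 up to 5000, and the takeover lock server of the lock server 3 changes from the lock server 1 to the lock server 5. Because every lock server runs the same deterministic calculation on the same input, all of them arrive at the same result without further coordination.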
  • each lock server determines its own new backup lock server according to the updated lock server management scope and the lock server takeover relationship, and sends the stored resource information record table or resource identifier. Give the new backup lock server.
  • the embodiment of the present invention further provides a lock request management device 4 for processing a lock request.
  • The lock request management device 4 is, for example, a lock server, and its structure is shown in FIG. 4. It can be applied to FIG. 3 and the above method embodiments. Because the method embodiments and the corresponding FIG. 3 have been described in detail, only the functions of each module of the lock request management device 4 are briefly described below; for details, refer to the foregoing method embodiments.
  • a plurality of lock request management devices 4 may constitute a distributed lock management system.
  • The lock request management device 4 includes a receiving module 41, a storage module 42, a lock request processing module 43, and a silent module 44.
  • The receiving module 41 is configured to receive a first acquire lock request and a notification message, where the first acquire lock request carries a first resource identifier. The storage module 42 is configured to store the lock management scope of another lock request management device and a first resource information record table, which records the resource identifiers of resources for which the other lock request management device has assigned lock permissions. The lock request processing module 43 is configured to process the acquire lock requests received by the receiving module 41. The silent module 44 is configured to set the lock request management device 4 to a silent state after learning from the notification message that the other lock request management device has failed, where the silent scope is the resources for which the other lock request management device has already assigned permissions. After the silent state is entered, the lock request processing module 43 is specifically configured to: when the acquire lock request falls within the silent scope, query the first resource information record table and, if the first resource identifier is not recorded there, assign a lock permission for the first resource according to the first acquire lock request.
  • Optionally, after the lock request management device 4 enters the silent state, the receiving module 41 is further configured to receive a second acquire lock request, which requests a lock on a second resource and carries an identifier of the second resource; the lock request processing module 43 is further configured to assign a lock permission for the second resource according to the second acquire lock request after detecting that the second resource belongs to the management scope of the lock request management device 4.
  • Optionally, after the lock request management device 4 enters the silent state, the receiving module 41 is further configured to receive a third acquire lock request, which requests a lock on a third resource and carries an identifier of the third resource; the lock request processing module 43 is further configured to query the first resource information record table after detecting that the third resource belongs to the management scope of the other lock request management device and, if the resource identifier of the resource requested by the third acquire lock request is already recorded there, to refuse to assign a lock permission for the third resource according to the third acquire lock request.
  • Optionally, after the lock request management device enters the silent state, the lock request processing module 43 is further configured to: receive a lock reclaim request carrying an identifier of a fourth resource and the permission assigned for the fourth resource by the other lock request management device, where the fourth resource is a resource for which the other lock request management device has already assigned a permission; and, according to the permission already assigned by the other lock request management device, reassign the same permission for the fourth resource.
  • Optionally, the storage module 42 is further configured to receive a first notification message carrying identification information of the other lock request management device; the receiving module 41 is further configured to determine, according to the identifier of the other lock request management device and the takeover relationship of the lock request management devices, that the lock request management device 4 is the takeover lock request management device of the other lock request management device, and then to send the lock management scope of the other lock request management device to the storage module 42. Accordingly, the storage module 42 being configured to store the lock management scope of the other lock request management device specifically means that the storage module 42 receives that lock management scope from the receiving module 41 and stores it.
  • Optionally, the lock request management device 4 may further include a protocol server module 45 and a lock proxy module 46. The protocol server module 45 is configured to receive a packet from the host, parse the first acquire lock request from the packet, and forward the first acquire lock request to the lock proxy module 46. The lock proxy module 46 is configured to make a determination according to the first resource identifier carried in the first acquire lock request and, when it determines that the first resource is managed by the lock request processing module 43, to send the first acquire lock request to the lock request processing module 43 through the receiving module 41.
  • Optionally, the silent module 44 is further configured so that the lock request management device 4 exits the silent state after permissions have been reassigned for all resources for which the other lock request management device had assigned permissions, or after a preset time is reached.
  • Optionally, the storage module 42 is further configured to update the management scope of the lock request management device 4 after it exits the silent state; the updated management scope includes the management scope of the lock request management device before the update and the management scope of the other lock request management device.
  • the server 5 includes an interface 51, a memory 52, and a processor 53.
  • the server 5 can perform the method in the method embodiment, in particular, the steps of the method performed by its processor 53.
  • Interface 51 provides an external data interface
  • memory 52 provides a data storage space.
  • The interface 51 provides an external interface, for example for receiving acquire lock requests and lock reclaim requests.
  • The memory 52 is configured to store the lock management scope of another server and a first resource information record table, which records the resource identifiers of resources for which the other server has assigned lock permissions. As the method embodiments show, the memory 52 can also store other information, such as a second resource information record table and a detailed resource information record table.
  • The processor 53 is configured to perform the steps of the method embodiments by running a program, for example: setting the server to a silent state after learning that the other server has failed, where the silent scope of the silent state is the resources for which the other server has already assigned permissions; receiving a first acquire lock request, which requests a lock on a first resource and carries a first resource identifier; detecting that the first resource belongs to the management scope of the other server; and querying the first resource information record table and, if the first resource identifier is not recorded there, assigning a lock permission for the first resource according to the first acquire lock request.
  • Each operation in the method embodiments can be performed by the processor 53, for example entering the silent state, exiting the silent state, querying, determining, and assigning permissions.
  • Optionally, the server 5 may further include a protocol server module 54 and a lock proxy module 55.
  • The protocol server module 54 is configured to receive a packet from the host, parse the first acquire lock request from the packet, and forward the first acquire lock request to the lock proxy module 55.
  • The lock proxy module 55 is configured to make a determination according to the first resource identifier carried in the first acquire lock request and, when it determines that the first resource is managed by the server 5, to send the first acquire lock request to the interface 51.
  • A "table" in the embodiments of the present invention, such as the first resource information record table, is not limited in form to a table or a form; it is defined by the content it stores.
  • aspects of the invention, or possible implementations of various aspects may be embodied as a system, method, or computer program product.
  • aspects of the invention, or possible implementations of various aspects may be in the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, etc.), or a combination of software and hardware aspects, They are collectively referred to herein as "circuits," “modules,” or “systems.”
  • aspects of the invention, or possible implementations of various aspects may take the form of a computer program product, which is a computer readable program code stored in a computer readable medium.
  • the computer readable medium can be a computer readable signal medium or a computer readable storage medium.
  • The computer readable storage medium includes, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing, such as random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, and portable compact disc read-only memory (CD-ROM).
  • The processor in the computer reads the computer readable program code stored in the computer readable medium, so that the processor can perform the functional actions specified in each step, or combination of steps, of the flowchart, and so that a device is produced that implements the functional actions specified in each block, or combination of blocks, of the block diagram.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Hardware Redundancy (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Telephonic Communication Services (AREA)
  • Computer And Data Communications (AREA)

Abstract

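The present invention provides a technique for processing lock requests. A first lock server is the takeover lock server of a second lock server. After learning that the second lock server has failed, the first lock server enters a silent state whose silent scope is the resources for which the second lock server has already assigned permissions. When the first lock server receives an acquire lock request originally destined for the second lock server, and the second lock server has never assigned a lock permission for the requested resource, the first lock server assigns a lock permission for that resource according to the acquire lock request. This solution narrows the scope affected by a lock server failure and improves the stability of the managed system.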

Description

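Method and Server for Processing Acquire Lock Requests
Technical Field
The present invention relates to computer technologies, and in particular to a method for processing lock requests and a system thereof.
Background
In a distributed system composed of multiple hosts, lock servers enforce mutually exclusive access by multiple nodes to the same resource at the same time. When a host needs to operate on a resource, it must first request a lock permission from a lock server; only after obtaining the lock permission can the host perform the corresponding operation, such as a read or a write. The performance, availability, and reliability of the lock servers therefore directly affect the performance, availability, and reliability of the whole distributed system.
In the distributed system, hosts communicate with nodes over a NAS (Network Attached Storage) network. Each node contains a lock server, and each node is also connected to a storage system that stores resources such as files.
When a host needs to operate on a resource in the storage system (for example, read or write it), an application on the host first requests a lock permission from a lock server, and operates on the file only after obtaining the lock permission that the lock server assigns for the resource. The correspondence between a resource's assigned lock permissions and applications may be stored in each node or in shared storage accessible to all nodes. For example, when a host needs to read a file in the storage system, it first requests the lock permission for that file from the lock server in some node, and can read the file only after obtaining that permission. The correspondence between the file's lock permission and the application holding it is stored in the node; the holder of the lock permission is either the node or an application in the node. Even when the holder is the node, the node can determine through further analysis which of its applications needs to use the resource in the storage system.
When a lock server fails, the services on the failed lock server must be switched to a lock server that has not failed (hereinafter, a non-failed lock server). With protocols such as NFS (Network File System) or Samba, to speed up host access after the services of the failed lock server are switched to a non-failed lock server, a host can use lock reclaim requests to re-apply for the lock permissions on files that its applications had already obtained. In a distributed lock server cluster, lock reclaim requests and acquire lock requests must therefore be controlled safely, to prevent improper lock permission control from causing multiple applications to see inconsistent data, or even data corruption when multiple applications read and write data simultaneously.
In the prior art, when a lock server fails, all remaining lock servers in the distributed system enter a silent state. During the silent period, the lock servers can process only lock reclaim requests that re-apply for lock permissions, and cannot process acquire lock requests that apply for new lock permissions. Only after the lock reclaim requests for the lock permissions granted by the failed lock server have been processed do the lock servers exit the silent state and resume normal processing of acquire lock requests.
In this prior-art solution, when a lock server in the distributed system fails, every lock server must enter the silent state, and an acquire lock request sent to any lock server is rejected, which markedly degrades system performance and reliability.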
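Summary
A first aspect of the present invention provides a method for processing lock requests, applicable to a first lock server, where the first lock server is the takeover lock server of a second lock server and stores the lock management scope of the second lock server. The method includes: the first lock server enters a silent state after learning that the second lock server has failed, where the silent scope of the silent state is the resources for which the second lock server has already assigned permissions; the first lock server receives a first acquire lock request, which requests a lock on a first resource and carries a first resource identifier; the first lock server detects that the first resource belongs to the management scope of the second lock server; the first lock server queries a first resource information record table, which records the IDs of resources for which the second lock server has assigned lock permissions, and if the first resource identifier is not recorded in the first resource information record table, the first lock server assigns a lock permission for the first resource according to the first acquire lock request.
With this method, after the second lock server fails, the first lock server is only partially silent: during the silent period it can still process acquire lock requests for resources to which the second lock server never assigned a lock permission, which improves system efficiency.
In addition, during the silent period of the first lock server, resources within the first lock server's original management scope are not part of the silent scope and are processed normally. Furthermore, in a distributed lock management system composed of the first lock server, the second lock server, and other lock servers, the lock servers other than the first and second lock servers need not enter the silent state and continue to work normally.
In a first possible implementation of the first aspect, the method further includes: the first lock server receives a second acquire lock request, which requests a lock on a second resource and carries an identifier of the second resource; the first lock server detects that the second resource belongs to the management scope of the first lock server; and the first lock server assigns a lock permission for the second resource according to the second acquire lock request.
With this method, during the silent period of the first lock server, resources within its original management scope are not part of the silent scope and are processed normally.
In a second possible implementation of the first aspect, on the basis of any of the foregoing aspects or implementations, the method may further include: after the first lock server enters the silent state, the first lock server receives a third acquire lock request, which requests a lock on a third resource and carries an identifier of the third resource; the first lock server detects that the third resource belongs to the management scope of the second lock server; the first lock server queries the first resource information record table, and if the ID of the resource requested by the third acquire lock request is already recorded there, the first lock server refuses to assign a lock permission for the third resource according to the third acquire lock request.
With this method, acquire lock requests for resources to which the second lock server has already assigned permissions are rejected, which avoids lock conflicts.
In a third possible implementation of the first aspect, on the basis of any of the foregoing aspects or implementations, the method may further include: the first lock server records the first resource identifier in a second resource information record table, where the second resource information record table records the IDs of resources for which the first lock server has assigned lock permissions and is stored in a third lock server.
With this method, the locks granted by the first lock server are recorded, so that if the first lock server later fails, its own takeover lock server can take over in a manner similar to that described above.
In a fourth possible implementation of the first aspect, on the basis of any of the foregoing aspects or implementations, the step in which the first lock server stores the lock management scope of the second lock server includes: the first lock server receives a first notification message carrying identification information of the second lock server; the first lock server determines, according to the identifier of the second lock server and the lock server takeover relationship, that it is the takeover lock server of the second lock server; and the first lock server receives and stores the lock management scope of the second lock server.
This provides one way for the first lock server to obtain the lock management scope of the second lock server.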
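In a fifth possible implementation of the first aspect, on the basis of any of the foregoing aspects or implementations, the method may further include: a protocol server receives a packet from a host and parses the first acquire lock request from the packet; the protocol server forwards the first acquire lock request to a lock proxy; the lock proxy makes a determination according to the first resource identifier carried in the first acquire lock request and, when it determines that the first resource is managed by the first lock server, sends the first acquire lock request to the first lock server.
This adds a protocol server and a lock proxy, providing a lock management technique carried out jointly by the lock server, the protocol server, and the lock proxy.
In a sixth possible implementation of the first aspect, on the basis of any of the foregoing aspects or implementations, the method may further include: after entering the silent state, the first lock server receives a lock reclaim request carrying an identifier of a fourth resource and the permission assigned for the fourth resource by the second lock server, where the fourth resource is a resource for which the second lock server has already assigned a permission; and, according to the permission already assigned by the second lock server, the first lock server reassigns the same permission for the fourth resource.
With this method, lock permissions assigned before the second lock server failed are restored during the silent period by processing lock reclaim requests.
In a seventh possible implementation of the first aspect, on the basis of any of the foregoing aspects or implementations, the method may further include: after permissions have been reassigned for all resources for which the second lock server had assigned permissions, the first lock server exits the silent state; or, after a preset time is reached, the first lock server exits the silent state.
This provides a mechanism for exiting the silent state and prevents a lock server from remaining silent for a long time.
Optionally, in an eighth possible implementation of the first aspect, after exiting the silent state the first lock server updates its management scope; the updated management scope includes its management scope before the update and the management scope of the second lock server.
Optionally, the takeover relationship may be computed by a management node and broadcast to each lock server, or each lock server may update it on its own.
This completes the entire process in which the first lock server takes over the second lock server.
In a ninth possible implementation of the first aspect, the first resource information record table may be stored in the first lock server, in another lock server, or in a server that is not a lock server, as long as the first lock server can obtain it.
This increases flexibility in where the first resource information record table is stored, making it easier to design products according to actual needs.
In a tenth possible implementation of the first aspect, each aspect and each possible implementation may run in a virtual machine environment; that is, the lock server runs in a virtual machine. A lock server may therefore be implemented in three ways: as hardware, as software executed on hardware, or as software running in a virtual machine.
In an eleventh possible implementation of the first aspect, after the takeover starts, for example during the silent period, the first lock server also sends a query message to the lock proxies of the non-failed nodes; on receiving the query message, each such lock proxy sends the first lock server a feedback message carrying the lock permissions that the lock proxy obtained through the second lock server, and the first lock server records them in a detailed resource information record table.
The present invention also provides implementations of a lock request management device and a server that have the functions of the first aspect and its possible implementations.
Correspondingly, the present invention further provides a non-volatile computer-readable storage medium and a computer program product. When the memory of the storage device provided by the present invention loads the computer instructions contained in the non-volatile computer-readable storage medium and computer program product, and the central processing unit (CPU) of the storage device executes those instructions, the storage device performs the first aspect and the various possible implementations thereof. The instructions may run and be executed in the device or the server.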
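Brief Description of the Drawings
FIG. 1 is a topology diagram of the environment in which a lock management system according to an embodiment of the present invention is used.
FIG. 2 is a schematic diagram of an embodiment of the lock server management scope and lock server takeover relationship provided by the present invention.
FIG. 3 is a flowchart of a lock request processing method according to an embodiment of the present invention.
FIG. 4 is a structural diagram of an embodiment of the lock request management device of the present invention.
FIG. 5 is a structural diagram of an embodiment of the server of the present invention.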
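Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the present invention.
The embodiments of the present invention propose establishing a takeover relationship among the lock servers, so that when one lock server fails, the takeover lock server of the failed lock server can be obtained from the takeover relationship.
A lock server is a server capable of processing lock requests. A lock request may be an acquire lock request or a reclaim lock request (referred to herein as a lock reclaim request). An acquire lock request may be a read lock request or a write lock request. An acquire lock request applies to lock a resource; after locking, the permission holder obtains the lock permission, meaning that only the permission holder has the corresponding operation permission on the resource. For example, a read lock request applies for permission to read a resource, and a write lock request applies for permission to write data to a resource. A lock reclaim request is one in which a permission holder re-applies for a lock permission it had already obtained. For example, a host originally accesses the storage system through node 1; node 1 then fails, and the host switches to accessing the storage system through node 2, sending node 2 a lock reclaim request to regain the lock permission it previously held.
Lock requests may also include a release lock request, which releases the lock permission on a file so that other hosts can apply for a lock permission on it.
A lock server that fails is called a failed lock server. Its lock management work is taken over by its takeover lock server. Only the takeover lock server enters the silent state; the other lock servers do not, and can process lock requests normally. Compared with the prior art, this reduces the impact of a lock server failure on the whole system.
Further, even a takeover lock server that has entered the silent state is silent only for some resources, so it can still respond normally to some acquire lock requests (those for resources not in the silent state). This further improves lock server utilization and reduces the impact on the system of a lock server entering the silent state. For resources covered by the silent state, the lock server does not process acquire lock requests but can process lock reclaim requests. For resources not covered by the silent state, the lock server can process acquire lock requests: for example, it grants a read lock request on a resource, and grants a write lock request on a resource by recalling the write lock already assigned. Being silent for some resources can also be viewed as the acquire lock requests for those resources entering the silent state; those resources are the ones for which the failed lock server has assigned permissions, and the server that enters the silent state is the takeover lock server of the failed lock server.
Acquire lock requests that were originally managed by the takeover lock server remain in the normal state and are not affected by the silent state. If a received acquire lock request falls within the management scope of the failed lock server and the failed lock server has never assigned a lock permission for the requested resource, the takeover lock server can respond normally and assign the lock permission. If a received acquire lock request falls within the management scope of the failed lock server and the failed lock server has previously assigned a lock permission for the requested resource, the takeover lock server refuses to assign the lock permission.
Acquire lock requests originally managed by the takeover lock server are unaffected by the silent state and are handled exactly as before, so they are not described in detail. Hereinafter, unless otherwise stated, an "acquire lock request" means one that originally fell within the management scope of the failed lock server and is taken over by the takeover lock server after the failure.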
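The embodiments of the present invention can be applied to a distributed system composed of multiple nodes, each of which manages the lock permissions for a portion of the files. A node is, for example, a lock server, and may include a processor, an external interface, and a memory. A lock permission management method is proposed for the case in which a lock server in the distributed system fails and a non-failed lock server enters the silent state. A node may also integrate a protocol server and a lock proxy, becoming a combination of lock server, protocol server, and lock proxy.
After assigning a lock permission, a lock server backs up the identifier of the resource for which the permission was assigned to a designated lock server. The designated lock server may be the lock server's takeover lock server, or another lock server that the lock server's backup server can access. After receiving an acquire lock request, the takeover lock server of a failed lock server determines from the backed-up resource identifiers whether the requested lock permission has already been assigned: if so, it returns a rejection response; if not, it assigns the requested lock permission to the host. A node may include only a lock server, or may integrate other functional modules such as a protocol server and a lock proxy.
After assigning a permission, the lock server generates an allocation record, for example: {node 1, file A, write permission}, meaning that node 1 has been assigned write permission on file A; or {node 2, file B, read permission}, meaning that node 2 holds read permission on file B. Depending on which host issued the acquire lock request, the protocol server can convert the node's allocation record into a host allocation record. For example, if host 1 issued the request, node 1 is replaced with host 1, giving {host 1, file A, write permission}, meaning that host 1 holds write permission on file A; the node can send this information to the corresponding host for storage.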
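The record conversion described above can be sketched as follows. This is only an illustration: the `AllocationRecord` type and the `to_host_record` helper are hypothetical names introduced here, and the text does not specify how records are actually represented.

```python
from typing import NamedTuple


class AllocationRecord(NamedTuple):
    """Hypothetical shape for a record such as {node 1, file A, write permission}."""
    holder: str       # node or host that holds the lock permission
    resource: str     # resource the permission applies to
    permission: str   # "read" or "write"


def to_host_record(node_record: AllocationRecord, requesting_host: str) -> AllocationRecord:
    """Replace the node holder with the host that issued the acquire lock request."""
    return node_record._replace(holder=requesting_host)


node_record = AllocationRecord("node1", "fileA", "write")
host_record = to_host_record(node_record, "host1")
```

Only the holder field changes; the resource and the permission are carried over unchanged.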
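The identifiers of resources for which lock permissions have been assigned are backed up to a designated server, for example the backup lock server of the assigning lock server, or another lock server. The specific content of the lock permission need not be backed up. That is, the designated server knows which resources have lock permissions assigned, but not what those permissions are. Because only the identifiers are backed up, and not the permission content, little system resource is consumed and the distributed system's resources are not significantly affected.
In the embodiments of the present invention, the distributed system is composed of multiple nodes; hosts communicate with the nodes over a NAS network, and the nodes are connected to a storage system. The resources stored in the storage system are used by the hosts; a host applies for a lock permission on a resource through a node, and the lock server in the node manages the lock permissions.
Nodes and storage devices may be separate or combined. Each node has a protocol server and a lock proxy. Lock requests from hosts may be based on the Network File System (NFS) protocol or the Server Message Block (SMB) protocol. A protocol server can handle one or more protocols from hosts; for example, an NFS server supports NFS and an SMB server supports SMB, and different protocol servers communicate with upper-layer hosts on similar principles. A lock request processed by the protocol server can be used by the lock proxy.
As shown in FIG. 1, a distributed system with 2 hosts and 4 nodes is used as an example. The numbers of hosts and nodes can be adjusted as required without changing the implementation principle. Hosts access the storage system through nodes. Within a node, the protocol server and the lock proxy are in one-to-one correspondence: protocol server 1 corresponds to lock proxy 1 in node 1, protocol server 2 corresponds to lock proxy 2 in node 2, and so on. Signals pass between protocol servers and lock proxies according to this correspondence.
A lock server may be in the same node as a protocol server and lock proxy, in a separate node of its own, or in another node. Communication within a node uses internal computer protocols such as a bus; communication between nodes can use networks such as FC or Ethernet. In the embodiments of the present invention, the case in which the lock server, protocol server, and lock proxy are in one node is used as an example; for instance, node 1 contains protocol server 1, lock proxy 1, and lock server 1. Each lock server can grant different lock permissions to lock proxies on different nodes. A lock proxy can apply for permissions from the lock server on its own node or from lock servers on other nodes.
A dedicated management node may be set up in the distributed system to control and manage the nodes, or any node may additionally serve this role. The node that manages and controls the other nodes is generally the master node, and may also be called the management node. The embodiments of the present invention do not limit this, and it is not shown separately in the figures.
When a read or write operation is needed on a resource in the storage system (for example a file, directory, file block, or data block), the host sends a lock request over the network to the corresponding protocol server. The host may determine the corresponding protocol server from information carried in the lock request or from an IP address range; either can use existing implementations and is not limited in the embodiments of the present invention. After receiving the lock request, the protocol server sends it to the lock proxy corresponding to that protocol server.
The lock proxy determines from the lock server management scope which lock server should process the lock request, and sends the request to that lock server. The lock server management scope may be preset or may be determined using a consistent hash ring. It may be stored in the cache of the node where the lock proxy resides, or in shared storage shared by all lock proxies in the distributed system.
For example, after receiving an acquire lock request, lock proxy 2 determines from the locally stored lock server management scope that the request should be processed by lock server 3, and sends it to lock server 3. Alternatively, the management scope need not be stored locally: the lock request carries a file ID, and the lock proxy can determine by query or computation which lock server manages that file's lock permission. The lock proxy may also send the lock request directly to the lock server in its own node, which then forwards it, according to the lock server management scope, to the lock server responsible for it. For example, lock proxy 2 sends a received acquire lock request to lock server 2; lock server 2 determines from the locally stored management scope that lock server 4 is responsible, and forwards the request to lock server 4. Both approaches can use existing technology and are not described further here.
A lock server stores the lock permissions it has assigned, and a lock proxy stores the lock permissions it has obtained from lock servers. When the lock server management scope in the distributed system changes, the management node notifies the lock servers and lock proxies to update it. Alternatively, the management node may update the lock server management scope itself and broadcast the result to every lock proxy and lock server in the distributed system.
After a lock server receives a lock request while in the normal working state (that is, not in the silent state), it processes the request in the same way as in the prior art, for example by assigning a lock permission to the host according to the request; this is not described further here.
The distributed system in the embodiments of the present invention may also be a virtualized distributed system in which the lock servers run in virtual machines. Lock proxies and protocol servers may also run in virtual machines. Because their functions are identical to those in a non-virtualized environment, they are not described separately.
For the lock server management scope and lock server takeover relationship in the distributed system, see FIG. 2.
As shown in FIG. 2, the lock servers logically form a ring. The management scope of each lock server is determined in the counterclockwise direction of a consistent hash ring (in another implementation, the clockwise direction may be used); the ring is obtained by hashing the IDs of the lock servers in the distributed system. For example, lock servers 1, 2, 3, and 4 have IDs 1, 2, 3, and 4; each lock server hashes the IDs with a consistent hashing algorithm and arranges the results clockwise in ascending order to form the consistent hash ring, so every lock server obtains the same ring. As shown in FIG. 2, the ring spans 0 to 2^32, and hashing the lock server IDs gives hash(1)=5000, hash(2)=8000, hash(3)=1024, and hash(4)=512. Clockwise from 0, the lock servers therefore appear on the ring in the order lock server 4, lock server 3, lock server 1, lock server 2. The management scope of lock server 4 is (8000, 2^32] and [0, 512], that of lock server 3 is (512, 1024], that of lock server 1 is (1024, 5000], and that of lock server 2 is (5000, 8000]. In the logical ring of FIG. 2, the takeover relationship is determined in the clockwise direction of the consistent hash ring: the takeover lock server of lock server 1 is lock server 2, that of lock server 2 is lock server 4, that of lock server 4 is lock server 3, and that of lock server 3 is lock server 1.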
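The ring of FIG. 2 can be reproduced with a short sketch. The four hash values are copied from the example above rather than computed, because the text does not name a concrete hash function; the function names `ring_order`, `owner_of`, and `takeover_of` are hypothetical helpers introduced for illustration.

```python
from bisect import bisect_left

RING_SIZE = 2 ** 32

# Ring positions copied from the example in the text (not computed by a real hash).
SERVER_HASHES = {1: 5000, 2: 8000, 3: 1024, 4: 512}


def ring_order(server_hashes):
    """Lock server IDs in clockwise order, i.e. ascending ring position."""
    return sorted(server_hashes, key=server_hashes.get)


def owner_of(value, server_hashes):
    """Lock server whose management scope (previous position, own position] holds value."""
    order = ring_order(server_hashes)
    positions = [server_hashes[s] for s in order]
    idx = bisect_left(positions, value % RING_SIZE)
    # Values above the last position wrap around to the first server on the ring.
    return order[idx % len(order)]


def takeover_of(server_id, server_hashes):
    """Takeover lock server: the next lock server clockwise on the ring."""
    order = ring_order(server_hashes)
    return order[(order.index(server_id) + 1) % len(order)]
```

With these values, a hash result of 4500 falls to lock server 1 and a result of 9000 wraps around to lock server 4, which matches the management scopes listed above.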
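Of course, this is only one method of determining a lock server's takeover server. The takeover relationship is not unique; it suffices that every lock server has a takeover server. For example, an administrator may configure the takeover server of each lock server: lock server 2 for lock server 1, lock server 3 for lock server 2, lock server 4 for lock server 3, and lock server 1 for lock server 4.
One lock server can take over multiple lock servers. For example, if lock server 3 and lock server 1 fail at the same time, lock server 4 is the takeover lock server for both.
In the embodiments of the present invention, after receiving a lock request (for example a lock reclaim request or an acquire lock request), the lock proxy determines from the stored lock server management scope which lock server should process it. If that lock server has failed (when a lock server fails, the management node broadcasts a notification message to the lock proxies in the distributed system), the lock proxy determines the takeover lock server from the lock server takeover relationship and sends the lock request to it.
The lock server management scope and takeover relationship may be configured centrally by the management node and sent to all lock proxies for storage; the management node may instead compute the consistent hash ring and send it to each lock proxy; or the management node may configure the lock proxies in advance so that each computes the same consistent hash ring on its own.
After receiving a lock request, the lock proxy hashes the file identifier carried in the request with the consistent hashing algorithm and checks which range the result falls in; the corresponding lock server is responsible for the request. For example, if the lock request is an acquire lock request carrying the file identifier (for example, file name) foo1.txt, the lock proxy hashes foo1.txt and obtains 4500, which is managed by lock server 1, so it sends the acquire lock request to lock server 1. Likewise, if the lock request is a lock reclaim request carrying the file information foo8.txt, the lock proxy hashes foo8.txt and obtains 9000, which is managed by lock server 4, so it sends the lock reclaim request to lock server 4.
While the takeover lock server is in the silent period, a host can use lock reclaim requests to regain from the takeover lock server the permissions on resources in the silent scope that it previously obtained from the failed lock server. If lock reclaim requests remain unexecuted when the takeover lock server exits the silent period, they are no longer executed. For details on lock reclaim requests, see step 309.
After a lock server fails, the lock proxy marks it as failed in the consistent hash ring. On receiving a lock request, the lock proxy hashes the file identifier carried in the request and determines from the lock server management scope which lock server's scope the result falls in. If that lock server is marked as failed, the lock proxy determines its takeover lock server from the takeover relationship and sends the lock request to the takeover lock server. The takeover lock server hashes the file identifier, finds that the hash value falls within its takeover scope, and therefore processes the request. If a lock server that is not the takeover lock server receives the lock request and finds, after hashing the file identifier, that the hash value does not fall within its takeover scope, it does not process the request.
For example, lock server 2 in the distributed system fails, and on receiving the notification message the lock proxy marks lock server 2 as failed in the consistent hash ring. The lock proxy receives a lock reclaim request carrying the file information foo5.txt and hashes it to obtain 7000, which, according to the lock management scope, should be handled by lock server 2. Because lock server 2 is in the failed state and, according to the takeover relationship, its takeover lock server is lock server 4, the lock proxy sends the lock reclaim request to lock server 4. Lock server 4 hashes foo5.txt, obtains 7000, finds that this falls within its takeover scope, and processes the lock reclaim request.
When a host needs to operate on a file in the distributed system, an application on the host sends a lock request to the protocol server, which passes it to the corresponding lock proxy. The lock proxy hashes the identifier of the file carried in the lock request (for example an FSID or FID), determines from the result which lock server's management scope the file belongs to, and sends the lock request to that lock server for processing. The hash algorithm applied to the file identifier must be the same as the one used to generate the consistent hash ring. For example, if the lock request carries the file identifier foo2.txt and hashing it gives 6500, then, as FIG. 2 shows, the result falls between lock server 1 and lock server 2 on the ring, within the management scope of lock server 2, so lock server 2 processes the request.
When lock server 2 fails, the lock proxy marks it as failed in the consistent hash ring. If the lock proxy then receives a lock request and hashes the file information foo3.txt carried in it to obtain 7500, the result falls between lock server 1 and lock server 2 on the ring; but because lock server 2 is in the failed state and, according to the consistent hash ring, its takeover lock server is lock server 4, the request falls within the takeover scope of lock server 4, and the lock proxy sends it to lock server 4.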
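The lock proxy's routing decision in the foo1.txt to foo8.txt examples can be sketched as below. The file hash values are the ones quoted in the text, stored in a lookup table because the actual hash function is not specified; `route` is a hypothetical helper, and the sketch deliberately covers only the single-failure case that the examples describe.

```python
from bisect import bisect_left

# Ring positions and file hash results quoted in the text (not computed here).
SERVER_HASHES = {1: 5000, 2: 8000, 3: 1024, 4: 512}
FILE_HASHES = {
    "foo1.txt": 4500,
    "foo2.txt": 6500,
    "foo3.txt": 7500,
    "foo5.txt": 7000,
    "foo8.txt": 9000,
}


def route(file_id, failed=frozenset()):
    """Return the lock server a lock proxy would send the lock request to."""
    order = sorted(SERVER_HASHES, key=SERVER_HASHES.get)   # clockwise order
    positions = [SERVER_HASHES[s] for s in order]
    idx = bisect_left(positions, FILE_HASHES[file_id]) % len(order)
    owner = order[idx]
    if owner not in failed:
        return owner
    # The owner is marked as failed: use its takeover lock server (next clockwise).
    takeover = order[(idx + 1) % len(order)]
    if takeover in failed:
        raise RuntimeError("takeover lock server has failed too; not covered by this sketch")
    return takeover
```

For instance, `route("foo2.txt")` selects lock server 2, while `route("foo3.txt", failed={2})` and `route("foo5.txt", failed={2})` both select lock server 4, as in the examples above.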
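Obtaining a consistent hash ring by applying a consistent hashing algorithm to node names or lock server IDs can use existing techniques and is not described here.
Based on the lock server management scope and takeover relationship in the distributed system shown in FIG. 2, an embodiment of the present invention provides a method for processing lock permissions in a distributed system. This method embodiment applies to lock servers; the protocol server and lock proxy flows involved are the same as described above and are not described again. The specific flow is shown in FIG. 3.
The method can be applied to the distributed system shown in FIG. 1, which in this embodiment has four lock servers: lock server 1, lock server 2, lock server 3, and lock server 4. This number is only illustrative; the actual number depends on service needs, and the implementation principle is the same.
The processing of acquire lock requests is described in detail below with reference to FIG. 3, taking the first lock server as the takeover lock server and the second lock server as the failed lock server.
Step 301: When a lock server in the distributed system fails, the management node broadcasts a notification message to the lock servers in the distributed system. In this embodiment the second lock server fails, so the notification message carries the ID of the second lock server as its identification information. To distinguish it from other notification messages, the message announcing that a lock server has failed is called the first notification message.
A lock server that receives the first notification message determines, from the ID carried in the message and the locally stored lock server management scope, whether it is the takeover lock server of the second lock server. If so, it enters the silent state for the resources for which the second lock server had assigned permissions; otherwise it does not enter the silent state.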
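A minimal sketch of this decision follows, assuming the takeover relationship of FIG. 2 is available as a plain mapping. The `LockServerState` class, its fields, and the idea of keeping one backed-up identifier set per peer are illustrative assumptions, not structures defined in the text.

```python
# Takeover relationship from the FIG. 2 example: failed server -> takeover server.
TAKEOVER = {1: 2, 2: 4, 4: 3, 3: 1}


class LockServerState:
    """Hypothetical per-server state used to react to the first notification message."""

    def __init__(self, server_id, backup_tables):
        self.server_id = server_id
        # Backed-up resource identifiers, keyed by the lock server that assigned them.
        self.backup_tables = backup_tables
        self.silent_scope = None   # None means "not in the silent state"

    def on_first_notification(self, failed_id):
        """Enter the silent state only if this server takes over the failed one."""
        if TAKEOVER.get(failed_id) != self.server_id:
            return False
        # Silent scope: resources the failed server had already assigned permissions for.
        self.silent_scope = set(self.backup_tables.get(failed_id, ()))
        return True


server4 = LockServerState(4, {2: {"foo5.txt"}})
server3 = LockServerState(3, {})
entered4 = server4.on_first_notification(2)
entered3 = server3.on_first_notification(2)
```

Only lock server 4 becomes silent when lock server 2 fails; lock server 3 keeps working normally.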
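In addition, after receiving the first notification message, the first lock server may start a timer. When the timer reaches the preset time, the first lock server exits the silent state and updates its takeover relationship.
Another way to detect a failure is for the takeover lock server to periodically send detection messages to the corresponding lock server according to the takeover relationship information, and to enter the silent state when it detects that the corresponding lock server has failed.
Step 302: The first lock server receives an acquire lock request carrying the identifier of a target resource. The target resource is the resource to be locked, that is, the object of the acquire lock request, or the resource waiting to be assigned a lock permission.
To read or write a resource in the storage system, a host must send an acquire lock request to a lock server through the protocol server and lock proxy; the request carries the resource identifier of the resource to be locked. The resource identifier may be the ID of the file to be operated on or the ID of the logical unit number (LUN) to be operated on; the request asks for permission to read from or write to the resource.
The first lock server determines from the resource identifier whether the requested resource belongs to its own management scope. For example, it hashes the resource identifier: if the result falls within the hash range preset for the first lock server, the resource belongs to its management scope; otherwise it does not.
The target resource of the acquire lock request resides in the storage system, and its lock permission is managed by a lock server. This process has been described above and is not repeated here.
There are many ways to have lock requests originally destined for the second lock server sent to the first lock server instead. For example, a router may be placed between the lock servers and the hosts and record the takeover relationship; after the second lock server fails, the router sends lock requests originally destined for the second lock server to its takeover lock server. If the lock proxy in the same node as the second lock server has not failed, the scheme described above can be used, in which the lock proxy sends those lock requests to the takeover lock server of the second lock server.
In the distributed system shown in FIG. 1, suppose lock server 1 fails; that is, lock server 1 is the second lock server. According to the management scope and takeover relationship shown in FIG. 2, lock server 2 is the takeover lock server of lock server 1, so the first lock server here is lock server 2.
Step 303: The first lock server queries the first resource information record table, which records the resource identifiers of the resources for which the second lock server has assigned lock permissions.
The takeover lock server of the second lock server is in the silent state. Each lock server, including the first lock server, first checks after receiving an acquire lock request whether it is currently in the silent state. If it is, it examines the identifier carried in the request, and if the request falls within its takeover scope, it performs step 303. If the server that received the request is not the first lock server but another lock server, it does not perform step 303 and exits the flow.
After a node fails, the protocol server in that node becomes a failed protocol server. Hosts that accessed the storage system through the failed protocol server must instead access it through the takeover protocol server of the failed protocol server; that is, the takeover protocol server takes over the work of the failed protocol server. Until the takeover protocol server completes the takeover, its node is in the silent state (the silent scope being the resources for which the failed lock server has assigned permissions); once the takeover is complete, the silent node exits the silent state. In addition, once a preset time has elapsed, the silent node exits the silent state even if the takeover is not complete.
The node in which a silent lock server resides is also in the silent state; that is, if the node consists of a lock server, a protocol server, and a lock proxy, the node's protocol server and lock proxy also enter the silent state. During the silent period, the takeover protocol server takes over the work of the failed protocol server; the takeover includes hosts using lock reclaim requests to re-apply for the permissions they previously held through the failed protocol server.
In prior-art implementations, a lock server in the silent state directly returns a rejection response to any lock request. In the embodiments of the present invention, the first lock server queries the first resource information record table while in the silent state. The table stores the resource identifiers of the resources for which the second lock server has assigned lock permissions. It need not store the specific content of the lock permissions, such as whether a permission is a read or write permission, which greatly reduces the storage space it occupies. The table may be stored locally on the first lock server, on another server, or both locally and on a designated other server. After entering the silent state, the resource information record table is stored to another designated lock server. For example, if the resource information record table is stored locally on the first lock server, then after the first lock server enters the silent state it sends the table to its own takeover lock server for storage. Alternatively, whenever the first lock server's resource information records change, they are promptly synchronized to its takeover lock server.
In the embodiments of the present invention, when a lock server first receives an acquire lock request for a resource, it sends the information that a lock permission has been assigned for this resource to its takeover lock server, which stores it in the first resource information record table. The lock server may also send this information after it has assigned the lock permission for the resource. If the lock server later receives further lock requests for the same resource, it does not send this information to its takeover lock server again, regardless of whether the requested permission is the same.
Specifically, each time the lock server receives an acquire lock request, it checks whether the notification that "a lock permission has been assigned for this resource" has already been sent to its takeover lock server; if not, it sends the notification message; otherwise it does not.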
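The send-once rule can be sketched as follows. The `BackupNotifier` class and the `send_to_takeover` callback are hypothetical; the text does not say how the notification is transported, so the transport is passed in as a plain function.

```python
class BackupNotifier:
    """Sends each resource identifier to the takeover lock server at most once."""

    def __init__(self, send_to_takeover):
        self.send_to_takeover = send_to_takeover   # callable taking a resource identifier
        self.already_notified = set()

    def on_acquire_request(self, resource_id):
        """Return True if a notification was sent for this request, False otherwise."""
        if resource_id in self.already_notified:
            return False   # already reported, whatever permission is requested now
        self.send_to_takeover(resource_id)
        self.already_notified.add(resource_id)
        return True


# Usage: the takeover server's first resource information record table is modelled as a list.
first_resource_table = []
notifier = BackupNotifier(first_resource_table.append)
results = [notifier.on_acquire_request(r) for r in ("fileA", "fileA", "fileB")]
```

Because only identifiers are forwarded, and each only once, the extra traffic to the takeover lock server stays small.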
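Besides the resource identifiers of resources for which the second lock server has assigned lock permissions, the first resource information record table may also store the identifiers of resources for which the first lock server has assigned lock permissions, for the first lock server to consult when processing lock requests after it exits the silent state.
After step 303, step 304 or step 305 is performed.
Step 304: When the first resource information record table contains the target resource identifier, the first lock server returns a rejection response.
If the resource information record table contains the resource identifier, the second lock server has already assigned a lock permission for the resource. The first lock server cannot process the acquire lock request at this point, lest it cause a lock permission conflict on the same resource, and it returns a rejection response to the host through the lock proxy and protocol server.
Of course, in other embodiments, if a query for another acquire lock request shows that the first lock server itself has already assigned a permission, the first lock server also returns a rejection message. This is not described in detail.
Step 305: When the resource information record table does not contain the resource identifier, the first lock server assigns a lock permission for the resource according to the permission requested in the acquire lock request, and returns the assigned lock permission to the host through the lock proxy and protocol server.
If the first resource information record table does not contain the resource identifier, no lock permission has been assigned for the corresponding resource and no host is operating on it. The first lock server can therefore assign a lock permission for the resource. It returns the assigned lock permission to the requesting host through the corresponding lock proxy and protocol server, allowing that host to operate on the resource.
Thus, with the method for processing acquire lock requests in a distributed system provided by the embodiments of the present invention, when a lock server in the distributed system fails, the takeover lock server of the second lock server can still process some acquire lock requests; it cannot process an acquire lock request only when a lock permission has already been assigned for the requested resource. This embodiment therefore controls and narrows the scope affected by a lock server failure more precisely, improving the performance and reliability of the distributed system.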
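Steps 303 to 305 can be summarized in a small sketch. It makes several simplifying assumptions that are not in the text: every lock is treated as exclusive, management scopes are given as predicate functions, and the class name `TakeoverLockServer` and the string results are illustrative only.

```python
class TakeoverLockServer:
    """Simplified first lock server during the silent period (exclusive locks only)."""

    def __init__(self, in_own_scope, in_failed_scope, first_resource_table):
        self.in_own_scope = in_own_scope           # predicate: resource_id -> bool
        self.in_failed_scope = in_failed_scope     # predicate for the failed server's scope
        self.first_resource_table = set(first_resource_table)
        self.silent = True                         # set after the failure notification
        self.granted = {}                          # resource_id -> holder

    def handle_acquire(self, resource_id, holder):
        if self.in_own_scope(resource_id):
            # The server's own scope is outside the silent scope: normal processing.
            return self._grant_if_free(resource_id, holder)
        if self.silent and self.in_failed_scope(resource_id):
            if resource_id in self.first_resource_table:
                return "reject"                    # step 304: already assigned before the failure
            return self._grant_if_free(resource_id, holder)   # step 305
        return "reject"                            # this server is not responsible

    def _grant_if_free(self, resource_id, holder):
        if resource_id in self.granted:
            return "reject"
        self.granted[resource_id] = holder
        return "grant"


server = TakeoverLockServer(
    in_own_scope=lambda r: r.startswith("own/"),
    in_failed_scope=lambda r: r.startswith("failed/"),
    first_resource_table={"failed/a"},
)
```

Here `failed/a` is rejected because the failed server had already assigned a permission for it, while `failed/b` and `own/x` can be granted during the silent period.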
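After assigning a lock permission, the first lock server may record it in a local detailed resource information record table. This table records the specific content of permissions, such as the resource identifier, lock permission, lock permission type, and current lock permission state. The detailed resource information record table and the first resource information record table may be separate or integrated.
After the first lock server takes over the failed server, it also sends a query message to the lock proxies of the non-failed nodes. Each such lock proxy responds with a feedback message carrying the lock permissions it obtained through the second lock server, which are recorded in the lock server's detailed resource information record table. The table is thereby updated to record the specific content of the permissions assigned by the second lock server in addition to those assigned by the first lock server.
As noted above, the acquire lock requests mentioned in steps 304 and 305 are all requests that, according to the takeover scope, should originally have been processed by the second lock server. Even when the takeover lock server enters the silent state, it is not silent for resources originally within its own processing scope, and acquire lock requests for those resources are handled as if it had not entered the silent state.
After step 305, the method embodiment may further perform step 306.
Step 306: The first lock server stores the target resource identifier in a second resource information record table. The second resource information record table is similar in form to the first resource information record table and records the resource identifiers of resources for which the first lock server has assigned lock permissions, so that if the first lock server fails, its takeover lock server can take over from it; the steps are similar to steps 302 to 305 and are not detailed here.
When the first lock server is not in the silent state, it records the target resource identifier in the second resource information record table after assigning a lock permission for the target resource in an acquire lock request. The second resource information record table stores the resource identifiers of resources for which the first lock server has assigned lock permissions.
With step 306, when the first lock server fails, its takeover lock server switches from the non-silent state to the silent state, with a silent scope consisting of the resources for which the first lock server has assigned permissions. For an acquire lock request received by that takeover lock server, if the target resource identifier is not recorded in the second resource information record table, the takeover lock server assigns a lock permission for the target resource as in step 305; otherwise it returns a rejection response as in step 304.
In prior-art implementations, after assigning a lock permission for a resource, a lock server stores the necessary information locally, such as the resource identifier, lock permission, lock permission type, and current lock permission state; this is not described further here. In the embodiments of the present invention, after assigning a lock permission for a resource, the lock server additionally stores the resource identifier of that resource separately. Optionally, the lock server stores the resource identifiers in a separate resource information record table and stores that table in its takeover lock server.
Optionally, after step 304 or step 306, the method embodiment may further include the following step 307.
Step 307: Exit the silent state.
Once all lock proxies have reported to the first lock server the lock permissions they obtained through the second lock server, the takeover work is complete and the silent state may be exited early.
In addition, a takeover time may be preset; when that time is reached, the first lock server exits the silent state whether or not the takeover work is complete.
After exiting the silent state, the first lock server's management scope and the lock server takeover relationship can be updated. The first lock server's new management scope is enlarged: it is the union of its old management scope and the management scope of the second lock server. Once this step is performed, the first lock server has completed its takeover of the second lock server. Correspondingly, the takeover scopes in the system must also change; for example, the takeover scope of the first lock server's takeover lock server (called the third lock server) must be updated along with the first lock server's management scope.
After receiving the first notification message, a lock server in the distributed system may start a timer; when the preset time is reached, a lock server in the silent state exits the silent state. As noted above, the first notification message is broadcast by the management node to notify the lock servers that a lock server has failed. When a lock server other than the second lock server receives the first notification message, it determines from the takeover relationship stored locally or in shared storage whether it is the takeover lock server of the second lock server. If it is, it enters the silent state and starts the timer; when the preset time is reached, it exits the silent state and updates the lock server management scope and takeover relationship. If it is not, it does not enter the silent state and continues to work normally.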
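The two exit conditions for the silent state (takeover work finished, or preset time reached) can be sketched as below. The `SilentPeriod` class is hypothetical; time is passed in explicitly as a number of seconds so that the sketch does not depend on a real clock, and "takeover work finished" is approximated here as "no resource is still waiting to be restored".

```python
class SilentPeriod:
    """Tracks when a takeover lock server may leave the silent state."""

    def __init__(self, pending_resources, started_at, timeout):
        self.pending = set(pending_resources)   # resources whose permissions are not yet restored
        self.deadline = started_at + timeout    # the preset takeover time

    def mark_restored(self, resource_id):
        """Call when the permission for a resource has been reassigned or reported."""
        self.pending.discard(resource_id)

    def should_exit(self, now):
        # Exit early once the takeover is complete, or unconditionally at the deadline.
        return not self.pending or now >= self.deadline


early = SilentPeriod({"fileA", "fileB"}, started_at=0.0, timeout=30.0)
still_silent = early.should_exit(10.0)
early.mark_restored("fileA")
early.mark_restored("fileB")
exit_early = early.should_exit(10.0)

late = SilentPeriod({"fileC"}, started_at=0.0, timeout=30.0)
exit_on_timeout = late.should_exit(30.0)
```

The timeout bounds the silent period even when some host never sends its lock reclaim request.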
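In addition, after receiving the first notification message, a non-takeover lock server may mark the second lock server as failed in its locally stored lock server management scope and takeover relationship, and update the management scope and takeover relationship.
All lock servers in the distributed system update the management scope and takeover relationship using the same algorithm, for example by hashing the lock server IDs as described above; this is not detailed here.
The update of the takeover relationship can be triggered in several ways. It may be triggered by the management node: a lock server updates the management scope and takeover relationship after receiving a notification message from the management node. In that case the management node starts a timer and, when the timer reaches the preset time, broadcasts a notification message in the distributed system. On receiving it, each normally working lock server other than the second lock server updates its locally stored management scope and takeover relationship.
Optionally, step 307 may be followed by the following step:
Step 308: After exiting the silent state, the first lock server deletes the first resource information record table.
The first resource information record table may be stored locally on the first lock server or on another server. If it is stored on another server, the first lock server can notify that server to delete it.
The first resource information record table records the resource identifiers of resources for which the second lock server has assigned lock permissions; an entry is, for example, "resource ID: permission assigned". After the first lock server takes over the lock service of the second lock server, that is, once the first lock server is in the silent state, it uses the first resource information record table to decide whether to assign a lock permission for the resource identified in each received acquire lock request. After the first lock server exits the silent state, it processes acquire lock requests by the normal flow, and the information in the first resource information record table no longer serves as the basis for processing lock requests, so the table can be deleted. This minimizes stored information, occupies as few system resources as possible, and minimizes the impact on the performance of the distributed system.
As noted above, after becoming silent the first lock server can continue to process lock reclaim requests for resources in the silent scope. Between step 301 and step 307, the first lock server may therefore also perform step 309.
Step 309: The first lock server receives a lock reclaim request carrying the identifier of another target resource and the lock permission assigned for that resource by the second lock server, where that lock permission was assigned before the second lock server failed. The first lock server then reassigns a lock permission for the other target resource according to the permission already assigned by the second lock server; the reassigned lock permission is the same as the one the second lock server assigned before it failed, and the holder of the reassigned permission is likewise the same as before. Lock reclaim requests are initiated by hosts. The first lock server can process multiple lock reclaim requests before exiting the silent state; after exiting, it no longer processes them.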
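Step 309 can be sketched as a small function. The dictionary layout of `state` and the rule that a reclaim is accepted only for identifiers present in the first resource information record table are assumptions made for this illustration; the text only states that the same permission is reassigned to the same holder while the server is silent.

```python
def handle_reclaim(state, resource_id, holder, permission):
    """Restore a permission that the failed lock server assigned before it failed.

    state is a dict with keys 'silent' (bool), 'first_resource_table' (set of
    resource identifiers) and 'detailed_table' (resource_id -> (holder, permission)).
    Returns True if the permission was restored.
    """
    if not state["silent"]:
        return False   # lock reclaim requests are no longer processed after exit
    if resource_id not in state["first_resource_table"]:
        return False   # assumption: only resources the failed server had locked qualify
    # Reassign exactly the permission and the holder carried in the reclaim request.
    state["detailed_table"][resource_id] = (holder, permission)
    return True


state = {"silent": True, "first_resource_table": {"fileA"}, "detailed_table": {}}
restored = handle_reclaim(state, "fileA", "host1", "write")
unknown = handle_reclaim(state, "fileB", "host1", "write")
state["silent"] = False
too_late = handle_reclaim(state, "fileA", "host2", "read")
```

After the silent state ends, the last call is refused and the restored entry for fileA is left untouched.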
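For example, if the second lock server fails after assigning a permission holder write permission on a resource, the first lock server, on receiving the lock reclaim request, again assigns the permission holder write permission on that resource. In the embodiments of the present invention, after a lock server assigns a lock permission for a resource for the first time, it stores the resource identifier in its takeover lock server; when the lock server fails, its takeover lock server, now in the silent state, uses the stored resource identifiers to decide whether it can process a received acquire lock request. This minimizes the range of acquire lock requests affected by a lock server failure, and because only resource identifiers are stored, few system resources are consumed and the performance impact is small, improving the stability and reliability of the whole distributed system.
In the method described above, to keep the backup lean and occupy as few system resources as possible, only the resource identifiers of resources with assigned lock permissions are backed up. Where system resources allow, the information on a lock server can instead be backed up in full; that is, a lock server's entire detailed resource information record table is backed up, for example to the first lock server's takeover lock server. The processing is similar in principle to the foregoing method, except that more information is backed up and more system resources are consumed. However, because the complete lock permissions are backed up on the first lock server, the lock proxies on all nodes need not re-report their obtained lock permissions to the takeover lock server when it takes over the second lock server. That is, the step mentioned in step 305, in which the first lock server sends a query message to the lock proxies of the non-failed nodes, can be omitted, which shortens the silent period to a minimum.
The above method can be applied in a virtualized distributed system, in which the lock servers run in virtual machines.
In a virtualized scenario, if a lock server and its takeover lock server are placed on the same physical node, then when the lock server fails, the takeover time can be shortened because data transfer within one physical node is faster.
In a virtualized scenario, when the node where the second lock server resided returns to normal, a new lock server can be deployed on that node. The other first lock server may be migrated directly to the node, that is, its address mapping is modified; alternatively, a new lock server may be created on the node and the lock service on the other first lock server migrated to it. In a virtualized distributed system, directly migrating the other first lock server to the node suffices, to simplify operation.
The lock server management scope and takeover relationship of the lock servers in the distributed system can be updated when certain conditions are met. As described above, after a lock server fails, the non-failed lock servers update the management scope and takeover relationship according to a predetermined rule. Alternatively, the management node may notify the lock servers other than the second lock server to update them, or may update the takeover relationship itself and broadcast the updated relationship to the lock servers in the distributed system. For example, when a new lock server joins, the management node notifies the lock servers to update their takeover relationships.
An update of the takeover relationship arises in two cases: a lock server fails or is otherwise no longer used, or a new lock server joins. The two cases are described separately below.
With reference to the foregoing method embodiment, after the lock servers other than the second lock server receive the first notification message from the management node (which notifies them that a lock server in the distributed system has failed), each non-failed lock server updates its own lock server management scope and takeover relationship. These lock servers may perform the update according to a preset method (such as a consistent hashing algorithm); alternatively, the management node may update the management scope and takeover relationship and then broadcast them to the lock servers in the distributed system. The management scope and takeover relationship of such a lock server may be stored locally on it or in shared storage; this method embodiment does not limit this.
When a new lock server joins the distributed system, the existing lock servers also need to update their own management scope and takeover relationship. The lock servers in the distributed system receive a second notification message carrying the identifier of the newly added lock server, and update their own management scope and takeover relationship.
When a new lock server joins the distributed system, the management node sends a second notification message to the lock servers in the distributed system, carrying the ID of the newly added lock server. After receiving it, the lock servers (including the newly added one) compute the new management scope and takeover relationship according to a predetermined rule (such as a consistent hashing algorithm). Likewise, the management node may instead update the management scope and takeover relationship and send the updated versions to every lock server in the distributed system.
After the management scope and takeover relationship are updated, each lock server determines its new backup lock server from them and sends the resource information record table or resource identifiers that need to be stored to the new backup lock server.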
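The effect of a new lock server joining can be illustrated by recomputing the clockwise takeover relationship before and after the join. The ring position 3000 for the new lock server 5 is an invented value used only for this example, and `takeover_map` and `backups_to_resend` are hypothetical helper names.

```python
def takeover_map(server_hashes):
    """Map each lock server to its takeover (backup) server: the next one clockwise."""
    order = sorted(server_hashes, key=server_hashes.get)
    return {s: order[(i + 1) % len(order)] for i, s in enumerate(order)}


def backups_to_resend(old_hashes, new_hashes):
    """Servers whose backup lock server changed, mapped to their new backup server."""
    old, new = takeover_map(old_hashes), takeover_map(new_hashes)
    return {server: target for server, target in new.items() if old.get(server) != target}


old_ring = {1: 5000, 2: 8000, 3: 1024, 4: 512}   # positions from the FIG. 2 example
new_ring = dict(old_ring)
new_ring[5] = 3000                               # hypothetical position of the new server

changed = backups_to_resend(old_ring, new_ring)
```

With these values only lock server 3 (whose backup moves from lock server 1 to lock server 5) and the new lock server 5 (whose backup is lock server 1) need to send their resource identifiers to a new backup lock server; the other relationships are unchanged.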
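An embodiment of the present invention further provides a lock request management device 4 for processing lock requests. The lock request management device 4 is, for example, a lock server, and its structure is shown in FIG. 4. It can be applied to FIG. 3 and the foregoing method embodiments; because these have been described in detail, only the functions of the modules of the lock request management device 4 are briefly described below, and the foregoing method embodiments can be consulted for details. Multiple lock request management devices 4 can form a distributed lock management system.
The lock request management device 4 includes a receiving module 41, a storage module 42, a lock request processing module 43, and a silent module 44.
The receiving module 41 is configured to receive a first acquire lock request and a notification message, where the first acquire lock request carries a first resource identifier. The storage module 42 is configured to store the lock management scope of another lock request management device and a first resource information record table, which records the resource identifiers of resources for which the other lock request management device has assigned lock permissions. The lock request processing module 43 is configured to process the acquire lock requests received by the receiving module 41. The silent module 44 is configured to set the lock request management device 4 to a silent state after learning from the notification message that the other lock request management device has failed, where the silent scope is the resources for which the other lock request management device has already assigned permissions. After the silent state is entered, the lock request processing module 43 is specifically configured to: when the acquire lock request falls within the silent scope, query the first resource information record table and, if the first resource identifier is not recorded there, assign a lock permission for the first resource according to the first acquire lock request.
Optionally, after the lock request management device 4 enters the silent state, the receiving module 41 is further configured to receive a second acquire lock request, which requests a lock on a second resource and carries an identifier of the second resource; the lock request processing module 43 is further configured to assign a lock permission for the second resource according to the second acquire lock request after detecting that the second resource belongs to the management scope of the lock request management device 4.
Optionally, after the lock request management device 4 enters the silent state, the receiving module 41 is further configured to receive a third acquire lock request, which requests a lock on a third resource and carries an identifier of the third resource; the lock request processing module 43 is further configured to query the first resource information record table after detecting that the third resource belongs to the management scope of the other lock request management device and, if the resource identifier of the resource requested by the third acquire lock request is already recorded there, to refuse to assign a lock permission for the third resource according to the third acquire lock request.
Optionally, after the lock request management device enters the silent state, the lock request processing module 43 is further configured to: receive a lock reclaim request carrying an identifier of a fourth resource and the permission assigned for the fourth resource by the other lock request management device, where the fourth resource is a resource for which the other lock request management device has already assigned a permission; and, according to the permission already assigned by the other lock request management device, reassign the same permission for the fourth resource.
Optionally, the storage module 42 is further configured to receive a first notification message carrying identification information of the other lock request management device; the receiving module 41 is further configured to determine, according to the identifier of the other lock request management device and the takeover relationship of the lock request management devices, that the lock request management device 4 is the takeover lock request management device of the other lock request management device, and then to send the lock management scope of the other lock request management device to the storage module 42. Accordingly, the storage module 42 being configured to store the lock management scope of the other lock request management device specifically means that the storage module 42 receives that lock management scope from the receiving module 41 and stores it.
Optionally, the lock request management device 4 may further include a protocol server module 45 and a lock proxy module 46. The protocol server module 45 is configured to receive a packet from the host, parse the first acquire lock request from the packet, and forward the first acquire lock request to the lock proxy module 46. The lock proxy module 46 is configured to make a determination according to the first resource identifier carried in the first acquire lock request and, when it determines that the first resource is managed by the lock request processing module 43, to send the first acquire lock request to the lock request processing module 43 through the receiving module 41.
Optionally, the silent module 44 is further configured so that the lock request management device 4 exits the silent state after permissions have been reassigned for all resources for which the other lock request management device had assigned permissions, or after a preset time is reached.
Optionally, the storage module 42 is further configured to update the management scope of the lock request management device 4 after it exits the silent state; the updated management scope includes the management scope of the lock request management device before the update and the management scope of the other lock request management device.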
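An embodiment of the present invention further provides another server 5 in a distributed system, whose structure is shown in FIG. 5. The server 5 includes an interface 51, a memory 52, and a processor 53. The server 5 can perform the methods in the method embodiments; specifically, its processor 53 performs the steps of the methods. The interface 51 provides an external data interface, and the memory 52 provides data storage space. The server is only briefly described below; for details, see the foregoing description.
The interface 51 provides an external interface, for example for receiving acquire lock requests and lock reclaim requests.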
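The memory 52 is configured to store the lock management range of another server and a first resource information record table, where the first resource information record table records the resource identifiers of resources for which the other server has allocated lock permissions. As the method embodiments show, the memory 52 may also store other information, such as a second resource information record table and a detailed resource information record table.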
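The processor 53 is configured to perform the steps in the method embodiments by running a program. For example: after learning that the other server has failed, set the server to a silent state, where the silence range of the silent state is the resources for which the other server has already allocated permissions; receive a first acquire lock request, where the first acquire lock request is used to request locking of a first resource and carries a first resource identifier; detect that the first resource belongs to the management range of the other server; and query the first resource information record table and, if the first resource identifier is not recorded in the first resource information record table, allocate a lock permission for the first resource according to the first acquire lock request.
Every operation in the method embodiments can be performed by the processor 53, for example entering the silent state, exiting the silent state, querying, determining, and allocating permissions.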
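Optionally, the server 5 may further include a protocol server module 54 and a lock proxy module 55.
The protocol server module 54 is configured to receive a packet from a host and parse the first acquire lock request out of the packet, and is further configured to forward the first acquire lock request to the lock proxy module. The lock proxy module 55 is configured to make a determination according to the first resource identifier carried in the first acquire lock request and, when it determines that the first resource is managed by the server 5, send the first acquire lock request to the interface.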
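The "tables" referred to in the embodiments of the present invention, such as the first resource information record table, are not limited to the form of a table or a form; they are defined by the content they store.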
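The aspects of the present invention, or possible implementations of the aspects, may be embodied as a system, a method, or a computer program product. Accordingly, the aspects of the present invention, or possible implementations of the aspects, may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, and so on), or an embodiment combining software and hardware aspects, all of which are collectively referred to herein as a "circuit", "module", or "system". Furthermore, the aspects of the present invention, or possible implementations of the aspects, may take the form of a computer program product, which refers to computer-readable program code stored in a computer-readable medium.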
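The computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium includes, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, device, or apparatus, or any suitable combination of the foregoing, such as random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, and portable read-only memory (CD-ROM).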
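The processor in a computer reads the computer-readable program code stored in the computer-readable medium, so that the processor can perform the functional actions specified in each step, or combination of steps, of the flowcharts, and produce an apparatus that implements the functional actions specified in each block, or combination of blocks, of the block diagrams.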
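Obviously, a person skilled in the art can make various modifications and variations to the present invention without departing from its spirit and scope. The present invention is intended to cover these modifications and variations provided that they fall within the scope of the claims of the present invention and their equivalent technologies.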

Claims (25)

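  1. A method for processing a lock request, characterized in that the method is applied to a first lock server, wherein the first lock server is the takeover lock server of a second lock server and stores the lock management range of the second lock server, the method comprising:
    entering, by the first lock server, a silent state after learning that the second lock server has failed, wherein the silence range of the silent state is the resources for which the second lock server has already allocated permissions;
    receiving, by the first lock server, a first acquire lock request, wherein the first acquire lock request is used to request locking of a first resource and carries a first resource identifier;
    detecting, by the first lock server, that the first resource belongs to the management range of the second lock server; and
    querying, by the first lock server, a first resource information record table, wherein the first resource information record table records the resource identifiers of resources for which the second lock server has allocated lock permissions, and if the first resource identifier is not recorded in the first resource information record table, allocating, by the first lock server, a lock permission for the first resource according to the first acquire lock request.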
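  2. The method for processing a lock request according to claim 1, wherein, after the first lock server enters the silent state, the method further comprises:
    receiving, by the first lock server, a second acquire lock request, wherein the second acquire lock request is used to request locking of a second resource and carries an identifier of the second resource;
    detecting, by the first lock server, that the second resource belongs to the management range of the first lock server; and
    allocating, by the first lock server, a lock permission for the second resource according to the second acquire lock request.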
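  3. The method for processing a lock request according to claim 1, wherein, after the first lock server enters the silent state, the method further comprises:
    receiving, by the first lock server, a third acquire lock request, wherein the third acquire lock request is used to request locking of a third resource and carries an identifier of the third resource;
    detecting, by the first lock server, that the third resource belongs to the management range of the second lock server; and
    querying, by the first lock server, the first resource information record table, and if the resource identifier of the resource requested by the third acquire lock request is already recorded in the first resource information record table, refusing, by the first lock server, to allocate a lock permission for the third resource according to the third acquire lock request.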
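  4. The method for processing a lock request according to any one of claims 1 to 3, wherein the method further comprises:
    recording, by the first lock server, the first resource identifier in a second resource information record table;
    wherein the second resource information record table is used to record the resource identifiers of resources for which the first lock server has allocated lock permissions, and the second resource information record table is stored in a third lock server.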
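  5. The method for processing a lock request according to any one of claims 1 to 3, wherein the step of storing, by the first lock server, the lock management range of the second lock server comprises:
    receiving, by the first lock server, a first notification message, wherein the first notification message carries identification information of the second lock server;
    determining, by the first lock server according to the identifier of the second lock server and a lock server takeover relationship, that the first lock server is the takeover lock server of the second lock server; and
    receiving and storing, by the first lock server, the lock management range of the second lock server.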
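  6. The method for processing a lock request according to any one of claims 1 to 3, wherein, before the method, the following is further included:
    receiving, by a protocol server, a packet from a host, and parsing the first acquire lock request out of the packet;
    forwarding, by the protocol server, the first acquire lock request to a lock proxy; and
    making, by the lock proxy, a determination according to the first resource identifier carried in the first acquire lock request, and when it is determined that the first resource is managed by the first lock server, sending the first acquire lock request to the first lock server.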
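  7. The method for processing a lock request according to any one of claims 1 to 3, wherein, after the first lock server enters the silent state, the method further comprises:
    receiving a lock reclaim request, wherein the lock reclaim request carries an identifier of a fourth resource and the permission allocated for the fourth resource by the second lock server, and the fourth resource is a resource for which the second lock server has already allocated a permission; and
    reallocating the same permission for the fourth resource according to the permission already allocated by the second lock server.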
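  8. The method for processing a lock request according to claim 7, wherein the method further comprises:
    exiting, by the first lock server, the silent state after permissions have been reallocated for all the resources for which the second lock server had allocated permissions; or
    exiting, by the first lock server, the silent state after a preset time is reached.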
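  9. The method for processing a lock request according to claim 8, wherein, after the first lock server exits the silent state, the method further comprises:
    updating, by the first lock server, the management range of the first lock server, wherein the updated management range of the first lock server includes the management range of the first lock server before the update and the management range of the second lock server.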
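  10. A lock request management apparatus, characterized in that it is used to take over the lock requests of another lock request management apparatus and comprises:
    a receiving module, configured to receive a first acquire lock request and a notification message, wherein the first acquire lock request carries a first resource identifier;
    a storage module, configured to store the lock management range of the other lock request management apparatus and a first resource information record table, wherein the first resource information record table records the resource identifiers of resources for which the other lock request management apparatus has allocated lock permissions;
    a lock request processing module, configured to process the acquire lock request received by the storage module; and
    a silence module, configured to set the lock request management apparatus to a silent state after learning from the notification message that the other lock request management apparatus has failed, wherein the silence range is the resources for which the other lock request management apparatus has already allocated permissions;
    wherein, after the silent state is entered, the lock request processing module is specifically configured to:
    when the acquire lock request falls within the silence range, query the first resource information record table, and if the first resource identifier is not recorded in the first resource information record table, allocate a lock permission for the first resource according to the first acquire lock request.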
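  11. The lock request management apparatus according to claim 10, wherein, after the lock request management apparatus enters the silent state:
    the receiving module is further configured to receive a second acquire lock request, wherein the second acquire lock request is used to request locking of a second resource and carries an identifier of the second resource; and
    the lock request processing module is further configured to, after detecting that the second resource belongs to the management range of the lock request management apparatus, allocate a lock permission for the second resource according to the second acquire lock request.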
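  12. The lock request management apparatus according to claim 10, wherein, after the lock request management apparatus enters the silent state:
    the receiving module is further configured to receive a third acquire lock request, wherein the third acquire lock request is used to request locking of a third resource and carries an identifier of the third resource; and
    the lock request processing module is further configured to, after detecting that the third resource belongs to the management range of the other lock request management apparatus, query the first resource information record table, and if the resource identifier of the resource requested by the third acquire lock request is already recorded in the first resource information record table, refuse to allocate a lock permission for the third resource according to the third acquire lock request.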
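  13. The lock request management apparatus according to any one of claims 10 to 12, wherein:
    the receiving module is further configured to receive a first notification message, wherein the first notification message carries identification information of the other lock request management apparatus;
    the receiving module is further configured to, after determining, according to the identifier of the other lock request management apparatus and a takeover relationship among lock request management apparatuses, that the lock request management apparatus is the takeover lock request management apparatus of the other lock request management apparatus, send the lock management range of the other lock request management apparatus to the storage module; and
    that the storage module is configured to store the lock management range of the other lock request management apparatus specifically comprises:
    the storage module is configured to receive the lock management range of the other lock request management apparatus from the receiving module and store it.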
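  14. The lock request management apparatus according to any one of claims 10 to 12, wherein the lock request management apparatus further comprises a protocol server module and a lock proxy module:
    the protocol server module is configured to receive a packet from a host, parse the first acquire lock request out of the packet, and forward the first acquire lock request to the lock proxy module; and
    the lock proxy module is configured to make a determination according to the first resource identifier carried in the first acquire lock request, and when it is determined that the first resource is managed by the lock request processing module, send the first acquire lock request to the lock request processing module through the receiving module.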
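  15. The lock request management apparatus according to any one of claims 10 to 12, wherein the lock request processing module is further configured to, after the lock request management apparatus enters the silent state:
    receive a lock reclaim request, wherein the lock reclaim request carries an identifier of a fourth resource and the permission allocated for the fourth resource by the other lock request management apparatus, and the fourth resource is a resource for which the other lock request management apparatus has already allocated a permission; and
    reallocate the same permission for the fourth resource according to the permission already allocated by the other lock request management apparatus.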
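  16. The lock request management apparatus according to claim 15, wherein the silence module is further configured to:
    make the lock request management apparatus exit the silent state after permissions have been reallocated for all the resources for which the other lock request management apparatus had allocated permissions; or
    make the lock request management apparatus exit the silent state after a preset time is reached.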
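  17. The lock request management apparatus according to claim 16, wherein the storage module is further configured to, after the lock request management apparatus exits the silent state:
    update the management range of the lock request management apparatus, wherein the updated management range of the lock request management apparatus includes the management range of the lock request management apparatus before the update and the management range of the other lock request management apparatus.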
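  18. A server, wherein the server is the lock management takeover server of another server, comprising:
    an interface, configured to receive acquire lock requests;
    a memory, configured to store the lock management range of the other server and a first resource information record table, wherein the first resource information record table records the resource identifiers of resources for which the other server has allocated lock permissions; and
    a processor, configured to perform the following steps by running a program:
    setting the server to a silent state after learning that the other server has failed, wherein the silence range of the silent state is the resources for which the other server has already allocated permissions;
    receiving a first acquire lock request, wherein the first acquire lock request is used to request locking of a first resource and carries a first resource identifier;
    detecting that the first resource belongs to the management range of the other lock server; and
    querying, by the first lock server, the first resource information record table, and if the first resource identifier is not recorded in the first resource information record table, allocating, by the first lock server, a lock permission for the first resource according to the first acquire lock request.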
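  19. The server according to claim 18, wherein, after the server enters the silent state, the method further comprises:
    receiving, by the server, a second acquire lock request, wherein the second acquire lock request is used to request locking of a second resource and carries an identifier of the second resource;
    detecting, by the server, that the second resource belongs to the management range of the server; and
    allocating, by the server, a lock permission for the second resource according to the second acquire lock request.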
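  20. The server according to claim 18, wherein, after the server enters the silent state, the processor is further configured to perform:
    receiving a third acquire lock request, wherein the third acquire lock request is used to request locking of a third resource and carries an identifier of the third resource;
    detecting that the third resource belongs to the management range of the other server; and
    querying the first resource information record table, and if the resource identifier of the resource requested by the third acquire lock request is already recorded in the first resource information record table, refusing to allocate a lock permission for the third resource according to the third acquire lock request.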
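  21. The server according to any one of claims 18 to 20, wherein that the memory is configured to store the lock management range of the other server specifically comprises:
    the processor is configured to receive a first notification message, wherein the first notification message carries identification information of the other server;
    the processor is configured to, after determining, according to the identifier of the other server and a server takeover relationship, that the server is the takeover server of the other server, send the lock management range of the other server to the memory; and
    the memory is configured to receive the lock management range of the other server and store it.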
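  22. The server according to any one of claims 18 to 20, wherein the server further comprises:
    a protocol server module, configured to receive a packet from a host and parse the first acquire lock request out of the packet;
    wherein the protocol server module is further configured to forward the first acquire lock request to a lock proxy module; and
    the lock proxy module makes a determination according to the first resource identifier carried in the first acquire lock request, and when it is determined that the first resource is managed by the server, sends the first acquire lock request to the interface.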
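  23. The server according to any one of claims 18 to 20, wherein, after the server enters the silent state, the processor is further configured to:
    receive a lock reclaim request, wherein the lock reclaim request carries an identifier of a fourth resource and the permission allocated for the fourth resource by the other server, and the fourth resource is a resource for which the other server has already allocated a permission; and
    reallocate the same permission for the fourth resource according to the permission already allocated by the other server.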
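  24. The server according to claim 23, wherein the processor is further configured to:
    make the server exit the silent state after permissions have been reallocated for all the resources for which the other server had allocated permissions; or
    make the server exit the silent state after a preset time is reached.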
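  25. The server according to claim 24, wherein, after the server exits the silent state, the processor is further configured to perform:
    updating the management range of the server, wherein the updated management range of the server includes the management range of the server before the update and the management range of the other server.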
PCT/CN2015/100006 2015-12-30 2015-12-30 Method for processing acquire lock request and server WO2017113261A1 (zh)

Priority Applications (10)

Application Number Priority Date Filing Date Title
CA2960982A CA2960982C (en) 2015-12-30 2015-12-30 Method for processing acquire lock request and server
AU2015408848A AU2015408848B2 (en) 2015-12-30 2015-12-30 Method for processing acquire lock request and server
JP2017522597A JP6357587B2 (ja) 2015-12-30 2015-12-30 Method and server for processing acquire lock request
PCT/CN2015/100006 WO2017113261A1 (zh) 2015-12-30 2015-12-30 Method for processing acquire lock request and server
SG11201703260QA SG11201703260QA (en) 2015-12-30 2015-12-30 Method for processing acquire lock request and server
KR1020177008985A KR102016702B1 (ko) 2015-12-30 2015-12-30 Method and server for processing lock acquisition request
EP15911889.2A EP3232609B1 (en) 2015-12-30 2015-12-30 Locking request processing method and server
CN201580008587.3A CN107466456B (zh) 2015-12-30 2015-12-30 Method for processing acquire lock request and server
BR112017011541-7A BR112017011541B1 (pt) 2015-12-30 2015-12-30 Method for processing a lock request, lock request management apparatus, and server
US16/013,175 US10846185B2 (en) 2015-12-30 2018-06-20 Method for processing acquire lock request and server

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/013,175 Continuation US10846185B2 (en) 2015-12-30 2018-06-20 Method for processing acquire lock request and server

Publications (1)

Publication Number Publication Date
WO2017113261A1 (zh) 2017-07-06

Family

ID=59219093

Country Status (10)

Country Link
US (1) US10846185B2 (zh)
EP (1) EP3232609B1 (zh)
JP (1) JP6357587B2 (zh)
KR (1) KR102016702B1 (zh)
CN (1) CN107466456B (zh)
AU (1) AU2015408848B2 (zh)
BR (1) BR112017011541B1 (zh)
CA (1) CA2960982C (zh)
SG (1) SG11201703260QA (zh)
WO (1) WO2017113261A1 (zh)

Also Published As

Publication number Publication date
SG11201703260QA (en) 2017-08-30
BR112017011541B1 (pt) 2023-09-26
AU2015408848A1 (en) 2017-07-13
CA2960982A1 (en) 2017-06-30
EP3232609B1 (en) 2019-09-04
EP3232609A1 (en) 2017-10-18
CA2960982C (en) 2021-02-16
JP6357587B2 (ja) 2018-07-11
EP3232609A4 (en) 2018-03-07
KR102016702B1 (ko) 2019-08-30
CN107466456B (zh) 2020-01-17
AU2015408848B2 (en) 2018-10-18
KR20180090181A (ko) 2018-08-10
US10846185B2 (en) 2020-11-24
CN107466456A (zh) 2017-12-12
JP2018503887A (ja) 2018-02-08
US20180300210A1 (en) 2018-10-18
BR112017011541A2 (zh) 2018-07-10

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase; Ref document number: 2960982; Country of ref document: CA
ENP Entry into the national phase; Ref document number: 20177008985; Country of ref document: KR; Kind code of ref document: A
ENP Entry into the national phase; Ref document number: 2017522597; Country of ref document: JP; Kind code of ref document: A
WWE Wipo information: entry into national phase; Ref document number: 11201703260Q; Country of ref document: SG
REEP Request for entry into the european phase; Ref document number: 2015911889; Country of ref document: EP
ENP Entry into the national phase; Ref document number: 2015408848; Country of ref document: AU; Date of ref document: 20151230; Kind code of ref document: A
121 Ep: the epo has been informed by wipo that ep was designated in this application; Ref document number: 15911889; Country of ref document: EP; Kind code of ref document: A1
REG Reference to national code; Ref country code: BR; Ref legal event code: B01A; Ref document number: 112017011541; Country of ref document: BR
NENP Non-entry into the national phase; Ref country code: DE
ENP Entry into the national phase; Ref document number: 112017011541; Country of ref document: BR; Kind code of ref document: A2; Effective date: 20170531