CN117632523A - Distributed lock management method and device, computing device cluster and storage medium

Info

Publication number: CN117632523A
Application number: CN202210979704.2A
Authority: CN (China)
Other languages: Chinese (zh)
Prior art keywords: locking, distributed lock, distributed, lock, expected
Legal status: Pending
Inventors: 陆基伟, 赵智杰, 戈弋
Current Assignee: Huawei Technologies Co Ltd
Original Assignee: Huawei Technologies Co Ltd
Application filed by Huawei Technologies Co Ltd

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00: Arrangements for program control, e.g. control units
    • G06F 9/06: Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46: Multiprogramming arrangements
    • G06F 9/52: Program synchronisation; mutual exclusion, e.g. by means of semaphores
    • G06F 9/526: Mutual exclusion algorithms
    • G06F 9/50: Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5083: Techniques for rebalancing the load in a distributed system
    • G06F 9/54: Interprogram communication
    • G06F 9/544: Buffers; Shared memory; Pipes

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a distributed lock management method and apparatus, a computing device cluster, and a storage medium, belonging to the field of computer technologies. In the method, when a computing node of the distributed system occupies the distributed lock corresponding to a target resource, it predicts first expected information based on a historical expected probability, and updates the first expected information according to the requests that the computing nodes of the distributed system make for the distributed lock, so that the distributed lock is released when the updated first expected information meets an unlocking condition. With this scheme, the demand of each node in the distributed system for the distributed lock can be predicted from historical experience and updated in real time according to actual requests, so the expected information is continuously refined on the basis of historical experience. Unlocking according to this expected information ensures that the moment a computing node unlocks matches the request pattern in the distributed system, effectively improving the performance of the distributed lock and stably maintaining load balance in the distributed system.

Description

Distributed lock management method and device, computing device cluster and storage medium
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a method and apparatus for managing a distributed lock, a computing device cluster, and a storage medium.
Background
A distributed lock is a mutual exclusion mechanism used in distributed systems. Since the data of a distributed system is spread across different nodes, a distributed lock is generally used to control synchronized access to shared resources so as to guarantee data consistency. A node can apply to occupy the distributed lock to obtain access rights to the shared resource it controls, a process known as locking. Other nodes that fail to occupy the distributed lock must wait for its occupied state to be released before applying again. Typically, the node currently occupying a distributed lock actively releases it after completing its access task on the shared resource, a process known as unlocking.
In high-concurrency scenarios, for example when nodes access the same shared resource frequently, the current approach forces a node that actively unlocks after each access task to repeatedly apply for locking and unlocking. If, instead, the node keeps occupying the distributed lock to satisfy its own demand, other nodes' use of the lock suffers because the lock is not released in time, making it hard to balance the load of the whole distributed system. A method is therefore needed that can effectively improve the performance of distributed locks so as to maintain load balancing in a distributed system.
Disclosure of Invention
The application provides a management method and device of a distributed lock, a computing device cluster and a storage medium, which can improve the performance of the distributed lock so as to maintain load balancing in a distributed system. The technical scheme is as follows:
in a first aspect, a method for managing a distributed lock is provided, where the method is applied to a computing node of a distributed system, and the system includes a plurality of computing nodes, and the method includes:
in a case that the computing node occupies the distributed lock corresponding to a target resource, predicting first expected information for the distributed lock based on a historical expected probability, the historical expected probability being determined based on locking requests sent by the plurality of computing nodes for the target resource, the locking requests being used to occupy the distributed lock, and the first expected information indicating a frequency at which the plurality of computing nodes send locking requests in a future time period;
updating the first expected information based on request conditions of the plurality of computing nodes of the distributed system for the distributed lock;
and if the updated first expected information meets the unlocking condition, releasing the occupation of the distributed lock by the computing node.
Wherein the plurality of computing nodes are capable of accessing shared resources stored in the distributed system. In some embodiments, the shared resource may be a hardware resource, such as a storage hard disk, or a software resource, such as a file or process, to which the present application is not limited. In some embodiments, the historical expected probability represents the share of demand for the distributed lock that comes from remote nodes in the distributed system, thereby providing an effective reference for predicting the demand of each computing node over a future time period.
With this scheme, the demand of each node in the distributed system for the distributed lock can be predicted from historical experience and updated in real time according to actual requests, so the expected information is continuously refined on the basis of historical experience. Unlocking according to this expected information ensures that the moment a computing node unlocks matches the request pattern in the distributed system, effectively improving the performance of the distributed lock and stably maintaining load balance in the distributed system.
In some embodiments, there is still a need for access to the target resource in the distributed system during the time that the computing node is occupying the distributed lock. Therefore, by counting the locking requests sent by each computing node in the distributed system for the target resource, the request condition of the distributed lock in the distributed system can be determined. In the following, the remote node refers to a computing node that does not occupy the distributed lock, and the local node refers to a computing node that is occupying the distributed lock.
In one possible implementation, the updating the first expected information based on the request of the distributed lock by the plurality of computing nodes of the distributed system includes:
In response to a locking request by the computing node for the distributed lock, the first expected information is updated such that the frequency of sending locking requests in a future time period is reduced, the locking request by the computing node for the distributed lock being used to request continued occupancy of the distributed lock.
With this technical scheme, for the case where the computing node keeps occupying the distributed lock locally, the first expected information is updated in real time according to the number of local locking requests, so that remote locking demand can be predicted from local locking demand, effectively balancing the distributed-lock demand of the local node against that of the remote nodes.
In one possible embodiment, the method further comprises:
if the updated first expected information does not meet the unlocking condition, determining, based on the updated first expected information, the number of locking requests still to be executed by the computing node;
and after the execution of the locking request to be executed is completed, releasing the occupation of the distributed lock by the computing node.
With this technical scheme, when local demand for the distributed lock is high, the local node can complete locking operations locally for the locking requests it sends itself, effectively avoiding the impact of network latency on locking efficiency. On this basis, the locking efficiency of the distributed lock is improved and its performance is effectively enhanced.
In one possible implementation, the updating the first expected information based on the request of the distributed lock by the plurality of computing nodes of the distributed system includes:
in response to the computing node receiving second expected information for the distributed lock, the first expected information is updated to increase the frequency of sending locking requests in a future time period, the second expected information indicating a number of locking failures of locking requests that the plurality of computing nodes have sent.
With this technical scheme, the first expected information is updated promptly according to the locking failures of computing nodes in the system, so that the remote locking demand it indicates is refreshed in real time. This ensures that the computing node occupying the distributed lock stays keenly aware of remote locking demand, effectively balancing the distributed-lock demand of the local node against that of the remote nodes.
In one possible embodiment, the method further comprises:
and in response to the computing node receiving the second expected information, releasing the occupation of the distributed lock by the computing node if the number of locking failures is greater than zero.
With this technical scheme, the local node can unlock when remote demand for the distributed lock is high, so that the remote nodes' demand is met in time. On this basis, the locking efficiency of the distributed lock in various situations is improved and its performance is effectively enhanced.
In one possible implementation, the unlocking condition is that the frequency of sending locking requests in a future time period is greater than the frequency that the computing node continues to occupy the distributed lock.
With this technical scheme, the demand of the local node and that of the remote nodes can be stably balanced, efficiently maintaining load balancing in the distributed system.
In one possible embodiment, the method further comprises:
and adjusting the historical expected probability every first period based on received lock load information, the lock load information indicating the proportion of successful lockings among the locking requests sent by the plurality of computing nodes for the target resource within the first period.
In one possible implementation, the adjusting the historical expected probability based on the received lock load information at intervals of a first period includes:
adjusting the historical expected probability to increase the frequency of sending locking requests in the future time period if the proportion of successful lockings indicated by the lock load information is greater than the historical expected probability;
and adjusting the historical expected probability to reduce the frequency of sending locking requests in the future time period if the proportion of successful lockings indicated by the lock load information is less than or equal to the historical expected probability.
According to the technical scheme, the computing node can periodically update the history expected probability in an iterative manner according to the request condition of the distributed lock in the distributed system so as to balance the use requirement of the remote node and the local node on the distributed lock, thereby greatly improving the performance of the distributed lock, ensuring that each node can successfully occupy the distributed lock at the required time, and effectively maintaining the load balance in the distributed system.
In one possible embodiment, the method further comprises:
and under the condition that the distributed lock is not occupied, adjusting the frequency of sending the locking request every second period based on the received lock flow information and the flow threshold value, wherein the lock flow information indicates the communication resources consumed by the system for managing the distributed lock in the second period.
With this technical scheme, each computing node can dynamically adjust its request frequency according to how much communication resource the distributed lock consumes in the distributed system, effectively optimizing the cluster's bandwidth consumption and further improving the load-balancing effect of the distributed system.
In a second aspect, there is provided an apparatus for managing a distributed lock, the apparatus comprising a plurality of functional modules for performing corresponding steps in the method for managing a distributed lock as provided in the first aspect.
In a third aspect, a cluster of computing devices is provided, comprising at least one computing device, each computing device comprising a processor and a memory;
the memory of the at least one computing device is configured to store at least one piece of program code that is loaded by the processor of the at least one computing device to cause the cluster of computing devices to perform the method of managing a distributed lock as described in the first aspect.
In a fourth aspect, a computer readable storage medium is provided for storing at least one piece of program code, which when executed by a cluster of computing devices, performs the method of managing a distributed lock according to the first aspect.
In a fifth aspect, there is provided a computer program product that, when run on a cluster of computing devices, causes the cluster of computing devices to perform the method of managing a distributed lock as in the first aspect.
Drawings
FIG. 1 is an architecture diagram of a distributed system provided in an embodiment of the present application;
FIG. 2 is a schematic hardware architecture of a computing device according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a computing device cluster provided in an embodiment of the present application;
FIG. 4 is a flow chart of a method for managing a distributed lock according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a method for managing a distributed lock according to an embodiment of the present application;
FIG. 6 is a deployment diagram of a method of distributed lock management provided by an embodiment of the present application;
FIG. 7 is a flow chart of a locking process provided by an embodiment of the present application;
FIG. 8 is a flowchart of an unlocking process provided by an embodiment of the present application;
FIG. 9 is a schematic diagram of an information management process provided in an embodiment of the present application;
fig. 10 is a schematic structural diagram of a management device of a distributed lock according to an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present application more apparent, the embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
Before introducing the technical solutions provided by the embodiments of the present application, the following describes key terms related to the present application.
Distributed file system (distributed file system, DFS): the storage resources managed by the distributed file system are not necessarily physically connected directly to nodes in the system, but may be connected to the nodes through a computer network. DFS provides a logical file system structure for resources distributed anywhere on the network, thereby enabling nodes in the distributed file system to easily access shared resources (shared files) distributed in the network.
Distributed database (distributed database, DDB): the distributed database includes a plurality of related databases that are physically dispersed but interconnected with each other by a computer network to provide distributed storage and distributed computing.
Distributed lock: a mutual exclusion mechanism applied in a distributed system is used for ensuring that a shared resource can only be accessed by one thread in one node in a distributed deployment cluster if a plurality of nodes request to carry out business operation on the same shared resource so as to avoid the concurrent problem.
The distributed system to which the technical solution provided in the embodiments of the present application is applied is described first.
Fig. 1 is an architecture diagram of a distributed system provided in an embodiment of the present application, where the method for managing a distributed lock provided in the present application can be applied to the distributed system, and referring to fig. 1, the distributed system 100 includes a plurality of computing nodes 110.
Wherein the plurality of computing nodes 110 are capable of accessing shared resources stored in the distributed system. In some embodiments, the shared resource may be a hardware resource, such as a storage hard disk; or may be a software resource, such as a file or process, to which the present application is not limited.
In the embodiment of the present application, synchronous access to the shared resources in the distributed system 100 is controlled based on the distributed lock, so as to implement mutual exclusion operation of the plurality of computing nodes 110 to the shared resources, and ensure consistency of data. Illustratively, any computing node 110 may obtain access rights to the shared resource controlled by the distributed lock by occupying the distributed lock corresponding to the target resource, which process is also referred to as locking; after completing the access task to the shared resource, the computing node occupying the distributed lock releases the occupation of the distributed lock to release the access authority to the shared resource, which is also called unlocking. In some embodiments, the distributed system 100 may be implemented as a distributed database or a distributed file system, which is not limited in this application.
The distributed lock management method provided by the application can be applied to any computing node 110 in the distributed system 100, so that the unlocking time is determined according to the request condition of a plurality of computing nodes in the distributed system for the distributed lock under the condition that any computing node 110 occupies the distributed lock corresponding to the target resource.
In some embodiments, the distributed system 100 includes a management node 120, which is a computing node in the distributed system that has management functionality. The management node 120 is used to monitor, schedule and manage shared resources in the distributed system. In some embodiments, the management node 120 performs the purpose of scheduling and managing the shared resources in the distributed system by monitoring and managing the status of the distributed locks corresponding to the respective shared resources. The plurality of computing nodes 110 in the system can implement the occupation of the distributed lock by sending the locking request for the target resource to the management node 120, and based on this, the management node 120 can determine the request condition of the distributed lock by counting the locking requests sent by the computing nodes in the distributed system. In some embodiments, a plurality of management nodes 120 may be deployed in the distributed system 100, each management node 120 being configured to manage a specified number of computing nodes 110 to improve management efficiency in a large-scale cluster, where the plurality of management nodes 120 are capable of communicating to maintain data consistency.
In some embodiments, the computing node 110 may be any form of computing device or cluster of computing devices including at least one computing device. In some embodiments, the computing device may be a terminal, which is also called an intelligent terminal or a mobile intelligent terminal, and refers to a device with rich man-machine interaction modes, internet access capability, various operating systems, and strong processing capability. In some embodiments, the types of mobile smart terminals include, but are not limited to, smartphones, tablet computers, car terminals, palm game consoles, and the like. In some embodiments, the computing device may be a server, such as a central server, an edge server, or a server in a local data center, as not limited in this application.
Wherein the plurality of computing nodes 110 are communicatively connected through a wired or wireless network. In some embodiments, the wireless or wired network uses standard communication techniques and/or protocols. The network is typically the Internet, but can be any network, including but not limited to a local area network (LAN), metropolitan area network (MAN), wide area network (WAN), mobile, wired or wireless network, private network, or any combination of virtual private networks. In some embodiments, peer-to-peer (P2P) communication between the plurality of computing nodes 110 is implemented based on the remote procedure call protocol (RPC). In some embodiments, the plurality of computing nodes 110 represent data exchanged over the network using techniques and/or formats including the hypertext markup language (HTML), the extensible markup language (XML), and the like. In addition, all or some of the links can be encrypted using conventional encryption techniques such as the secure sockets layer (SSL), transport layer security (TLS), virtual private networks (VPN), Internet protocol security (IPsec), and the like. In other embodiments, custom and/or dedicated data communication techniques can also be used in place of, or in addition to, the techniques described above.
The following describes a hardware structure of a computing device according to an embodiment of the present application.
Embodiments of the present application provide a computing device that can be configured as any of the forms of computing devices referred to in the distributed systems described above, e.g., a server or a terminal. Referring to fig. 2 schematically, fig. 2 is a schematic hardware structure of a computing device according to an embodiment of the present application. As shown in fig. 2, the computing device 200 includes a memory 201, a processor 202, a communication interface 203, and a bus 204. The memory 201, the processor 202, and the communication interface 203 are connected to each other by a bus 204.
The memory 201 may be, but is not limited to, a read-only memory (ROM) or another type of static storage device capable of storing static information and instructions, a random access memory (RAM) or another type of dynamic storage device capable of storing information and instructions, an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, digital versatile discs, Blu-ray discs, etc.), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. The processor 202 implements the methods in the embodiments above or below by reading program code stored in the memory 201, or through internally stored program code. Where the processor 202 reads program code stored in the memory 201, that program code implements the distributed lock management method provided in the embodiments of the present application. The memory 201 may also store status data, event logs, and the like, which are not limited in this embodiment.
Processor 202 may be a network processor (network processor, NP), a central processing unit (central processing unit, CPU), an application-specific integrated circuit (ASIC) or an integrated circuit for controlling the execution of programs in accordance with aspects of the present application. The processor 202 may be a single-core (single-CPU) processor or a multi-core (multi-CPU) processor. The number of the processors 202 may be one or a plurality. Communication interface 203 enables communication between computing device 200 and other devices or communication networks using a transceiver module, such as a transceiver. For example, data may be acquired through the communication interface 203.
The memory 201 and the processor 202 may be separately provided or may be integrated.
Bus 204 may be a peripheral component interconnect (PCI) bus or an extended industry standard architecture (EISA) bus, among others. The buses may be divided into address buses, data buses, control buses, etc. For ease of illustration, only one double arrow is shown in FIG. 2, but this does not mean there is only one bus or one type of bus. Bus 204 may include a path for transferring information between the components of computing device 200 (e.g., memory 201, processor 202, communication interface 203).
The embodiment of the application also provides a computing device cluster. The cluster of computing devices comprises at least one computing device, which can be implemented as a computing node or a management node in the distributed system described above. The computing device may be a server, such as a central server, an edge server, or a local server in a local data center. In some embodiments, the computing device may also be a terminal device such as a desktop, notebook, or smart phone. Fig. 3 is a schematic diagram of a computing device cluster provided in an embodiment of the present application, and referring to fig. 3, the computing device cluster 300 includes at least one computing device 310. The same instructions for performing the distributed lock management methods provided herein may be stored in memory in one or more computing devices 310 in computing device cluster 300. The hardware structure of at least one computing device 310 in the computing device cluster 300 may refer to the description in fig. 2, which is not described herein.
Next, a detailed description will be given of a method for managing a distributed lock according to an embodiment of the present application based on the distributed system described in fig. 1, and fig. 4 is a flowchart of a method for managing a distributed lock according to an embodiment of the present application, where the method can be executed by any computing node in fig. 1, and the method includes the following steps 401 to 403.
401. In a case that the computing node occupies the distributed lock corresponding to the target resource, it predicts first expected information for the distributed lock based on a historical expected probability.
The target resource is a shared resource that can be accessed by a plurality of computing nodes in the distributed system, and the description of the shared resource in the content corresponding to fig. 1 is omitted herein.
The locking request is used for occupying the distributed lock to acquire the access right to the target resource. In some embodiments, the locking request carries identification information of the target resource and identification information of the computing node to request allocation of access rights to the target resource to the computing node. In some embodiments, the access rights include read rights and write rights. In some embodiments, a compute node that occupies the distributed lock can obtain read rights for the target resource, in which case a compute node that obtains read rights for the target resource can read the target resource, and other compute nodes can read but cannot write to the target resource. In other embodiments, the computing node occupying the distributed lock may obtain the write permission to the target resource, in which case, only the computing node occupying the distributed lock may perform the write operation to the target resource, and other computing nodes may not perform the read operation to the target resource or the write operation to the target resource.
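As an illustration of the read/write semantics just described, here is a minimal Python sketch of the access-compatibility rules; the enum, function, and operation names are hypothetical, not part of the patent.

```python
from enum import Enum


class LockMode(Enum):
    READ = "read"    # holder reads; other nodes may still read, but not write
    WRITE = "write"  # holder writes; other nodes may neither read nor write


def remote_access_allowed(holder_mode: LockMode, operation: str) -> bool:
    """Access check for a node that does NOT occupy the distributed lock."""
    if holder_mode is LockMode.READ:
        return operation == "read"
    return False  # WRITE mode excludes all remote reads and writes


assert remote_access_allowed(LockMode.READ, "read")
assert not remote_access_allowed(LockMode.READ, "write")
assert not remote_access_allowed(LockMode.WRITE, "read")
```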
Wherein the historical expected probability is determined based on locking requests sent by a plurality of computing nodes in the distributed system for a target resource. In some embodiments, the historical expected probability can be determined from locking requests from multiple sources over a historical period of time. Illustratively, the historical expected probability can be obtained based on the following equation (1).
P=number of remote locking requests/(number of remote locking requests+number of local locking requests) (1)
In formula (1), P is the historical expected probability, a non-negative number. If a locking request comes from a computing node that does not occupy the distributed lock, it is a remote locking request; if it comes from the computing node that is occupying the distributed lock, it is a local locking request. For convenience of description, in the following, the remote node refers to a computing node that does not occupy the distributed lock, and the local node refers to the computing node that is occupying it; it will be understood that steps 401 to 403 in the embodiments of the present application are performed by the local node. In such an example, the historical expected probability represents the share of demand for the distributed lock that comes from the remote nodes in the distributed system, thereby providing an effective reference for predicting the demand of each computing node over a future time period.
Wherein the first expected information indicates a frequency of the plurality of computing nodes sending locking requests in a future time period, the first expected information being a result of a local node predicting a demand for the distributed lock by a plurality of remote nodes. In some embodiments, the first expected information can embody the size of the demand in the form of a score. In this example, the first expected information may be determined based on the set score upper limit value and the historical expected probability, referring to the following formula (2).
K=N×P (2)
In formula (2), K is the first expected information, N is the score upper limit, and P is the historical expected probability, where K, N and P are all non-negative numbers. Optionally, the score upper limit N may be set to 100.
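To make formulas (1) and (2) concrete, the following is a minimal Python sketch of computing the historical expected probability P and the initial score K; the function names and example counts are illustrative assumptions.

```python
def historical_expected_probability(remote_requests: int, local_requests: int) -> float:
    """Formula (1): the share of locking requests that came from remote nodes."""
    total = remote_requests + local_requests
    return remote_requests / total if total > 0 else 0.0


def initial_score(n_upper: int, p_history: float) -> float:
    """Formula (2): K = N x P, the not-yet-updated first expected information."""
    return n_upper * p_history


# Example: 30 remote and 10 local locking requests in the history window,
# with the score upper limit N set to 100 as suggested above.
P = historical_expected_probability(30, 10)  # 0.75
K = initial_score(100, P)                    # 75.0
```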
Through the above process, historical experience in managing the distributed lock can be extracted from the locking requests in the historical time period and used as the basis for predicting the request frequency for the distributed lock in a future time period. This gives the lock-holding local node guidance grounded in historical experience when deciding its unlocking time, providing a good starting point for load balancing in the distributed system.
402. The computing node updates the first expected information based on request conditions of the distributed lock by a plurality of computing nodes of the distributed system.
In some embodiments, there is still a need for access to the target resource in the distributed system during the time that the computing node is occupying the distributed lock. Therefore, by counting the locking requests sent by each computing node in the distributed system for the target resource, the request condition of the distributed lock in the distributed system can be determined. The locking requests of different sources correspond to different request cases for which the computing node updates the first expected information in different ways. A number of possible implementations of this step 402 are described below taking case 1 and case 2 as examples.
Case 1: the computing node updates the first expected information in response to a locking request it makes for the distributed lock, to reduce the frequency of sending locking requests in a future time period; such a locking request is used to request continued occupation of the distributed lock.
Wherein a locking request made by the computing node for the distributed lock is a locking request from the local node. In some embodiments, while a computing node is occupying the distributed lock, a locking request it sends may be delivered directly to the main process (or to a designated thread under the main process that is holding the lock), thereby requesting continued occupation of the distributed lock before it is unlocked, a process also referred to as local reuse of the distributed lock.
In some embodiments, while accessing the target resource, the local node runs multiple access tasks for the shared resource on multiple threads under its main process. To ensure that the access tasks executed by these threads follow the mutual exclusion principle, the local node allows only a designated thread to occupy the distributed lock within the running memory space of the main process, and after the designated thread completes its access task, the lock is released within that memory space. At that point, the distributed lock can be occupied by other threads under the main process without being unlocked; that is, from the perspective of the remote nodes, the distributed lock is still occupied by the local node. The state the distributed lock is in may be described as: locally released but not unlocked. In this example, a locking request sent by the computing node, i.e., a locking request for the distributed lock by another thread under the local node's main process, is equivalent, from the perspective of the distributed system, to the local node requesting to continue occupying the distributed lock. Thus, locking requests from the local node represent the local node's demand for the distributed lock, and by updating the first expected information accordingly, the predicted frequency of locking requests in the future time period changes as the number of local locking requests grows.
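The "locally released but not unlocked" behavior can be sketched roughly as follows, assuming a simple in-process mutex under the main process; the class, the dlm_client interface, and all method names are hypothetical.

```python
import threading


class LocallyReusableLock:
    """Wraps a distributed lock so threads under the main process take turns
    occupying it via an in-process mutex; releasing it locally does not
    unlock it in the distributed system ("locally released, not unlocked")."""

    def __init__(self, dlm_client, resource_id):
        self._dlm = dlm_client            # hypothetical distributed-lock client
        self._resource = resource_id
        self._thread_mutex = threading.Lock()
        self._held = False                # True while this node occupies the lock

    def lock(self):
        self._thread_mutex.acquire()      # mutual exclusion among local threads
        if not self._held:
            self._dlm.lock(self._resource)  # first thread locks remotely...
            self._held = True
        # ...subsequent threads reuse the already-held distributed lock locally

    def release_locally(self):
        # Frees the lock for other local threads; remote nodes still see it held.
        self._thread_mutex.release()

    def unlock(self):
        # Actual distributed unlocking, performed when step 403's condition is met.
        self._dlm.unlock(self._resource)
        self._held = False
```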
In some embodiments, the number of local node locking requests for a distributed lock increases, and the probability of sending locking requests on behalf of the remote node decreases to some extent. Based on this, the value of the first expected information can be updated based on the number of locking requests of the distributed lock by the local node, thereby indicating a change in demand of the distributed lock by the remote node by a change in the value of the first expected information. In some embodiments, the process of updating the first expected information each time a locking request sent by a local node is received may be implemented by the following formula (3).
S_i = S_{i-1} - 1 (3)
In formula (3), S_i is the updated first expected information, with 0 ≤ S_i ≤ N; i is an integer greater than 1, and when i = 2, S_{i-1} = S_1, where S_1 is the K obtained in formula (2), i.e., the not-yet-updated first expected information derived from the score upper limit N and the historical expected probability P. According to formula (3), each time a locking request from the local node is received, the value of the first expected information is decremented by 1, thereby indicating a decrease in the frequency at which the remote nodes send locking requests in the future time period.
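A one-function sketch of the case-1 update in formula (3), with the score floored at 0 per the constraint 0 ≤ S_i ≤ N; names are illustrative.

```python
def on_local_lock_request(s: int) -> int:
    """Formula (3): a locking request from the local node decrements the
    first expected information by 1 (keeping 0 <= S_i <= N)."""
    return max(0, s - 1)


s = 75                         # current score S, initialised to K
s = on_local_lock_request(s)   # -> 74: predicted remote demand decreases
```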
With this technical scheme, for the case where the computing node locally reuses the distributed lock, the first expected information is updated in real time according to the number of locking requests sent by the local node, so that remote locking demand can be predicted from local locking demand and the distributed-lock demand of the local node and the remote nodes is effectively balanced.
Case 2: in response to receiving second expected information for the distributed lock, the computing node updates the first expected information to increase the frequency of sending locking requests in a future time period; the second expected information indicates the number of locking failures among the locking requests that the plurality of computing nodes have sent.
The number of locking failures of the locking requests sent by the plurality of computing nodes is the number of locking failures of the locking requests from the remote node. In some embodiments, locking requests sent by multiple remote nodes in a distributed system may fail because the distributed lock is being occupied by a local node.
In some embodiments, the computing node is capable of receiving the second expected information from a management node (see description in fig. 1). In some embodiments, a wait queue of distributed locks is maintained in the management node. In the case where the distributed lock is being occupied by a local node, the remote node requests a failed locking request sent for the distributed lock, the management node returns failure information to the remote node, and the locking requests that failed requests are queued in the wait queue to be executed. Based on this, the management node can determine, based on the length of the wait queue of the distributed lock, a number of lock failures for the distributed lock by the remote node, thereby sending the second expected information to the computing node. In some embodiments, the value of the second expected information indicates a number of lock failures. It will be appreciated that an increase in the number of locking failures of the distributed lock by the remote node represents an increase in the probability of the remote node sending a locking request. Based on this, the computing node is able to update the value of the first expected information based on the number of lock failures, thereby indicating a change in demand for the distributed lock by the remote node by a change in the value of the first expected information. In some embodiments, the management node obtains the length of the waiting queue every specified time period, thereby sending the second expected information to the local node; in some embodiments, the management node clears the length of the wait queue after the distributed unlocking. In some embodiments, for the case where the same computing node repeatedly sends multiple locking requests to the management node, the management node increases the length of the waiting queue for only the first locking request sent by the computing node, i.e., the length of the waiting queue can, in some embodiments, indicate the number of nodes waiting to lock the distributed lock. In other embodiments, the management node sends the second expected information to the local node if the length of the waiting queue reaches a target threshold, which may be, for example, 50% of the total number of computing nodes in the distributed system, without limitation.
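A rough sketch of the wait-queue bookkeeping attributed to the management node above: only a node's first failed request lengthens the queue, the queue length M is reported as the second expected information, and the queue is cleared on unlock. All names are hypothetical.

```python
class LockWaitQueue:
    """Per-lock bookkeeping on the management node (hypothetical sketch)."""

    def __init__(self):
        self.waiting_nodes = []   # node ids, in arrival order

    def on_failed_request(self, node_id: str):
        # Repeated retries from the same node do not lengthen the queue,
        # so len(waiting_nodes) counts distinct waiting nodes.
        if node_id not in self.waiting_nodes:
            self.waiting_nodes.append(node_id)

    def second_expected_info(self) -> int:
        """M: the number of locking failures to report to the local node."""
        return len(self.waiting_nodes)

    def on_unlock(self):
        # The queue length is cleared after the distributed lock is released.
        self.waiting_nodes.clear()
```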
In some embodiments, in response to receiving the second expected information, the process of updating the first expected information may be implemented by the following equation (4).
S_i = S_{i-1} + M (4)
In formula (4), S_i is the updated first expected information, with 0 ≤ S_i ≤ N; i is an integer greater than 1, and when i = 2, S_{i-1} = S_1, where S_1 is the K obtained in formula (2), i.e., the not-yet-updated first expected information derived from the score upper limit N and the historical expected probability P; M is the length of the waiting queue, a non-negative integer. According to formula (4), in response to receiving the second expected information, the value of the first expected information is increased by M, thereby indicating an increase in the frequency at which the remote nodes send locking requests in the future time period.
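A matching sketch of the case-2 update in formula (4), capping the score at N; names are illustrative.

```python
def on_second_expected_info(s: int, m: int, n_upper: int = 100) -> int:
    """Formula (4): on receiving second expected information, add the waiting
    queue length M to the first expected information (capped at N)."""
    return min(n_upper, s + m)


s = on_second_expected_info(74, m=3)  # -> 77: predicted remote demand increases
```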
With this technical scheme, the first expected information is updated promptly according to the locking failures of computing nodes in the system, so that the remote locking demand it indicates is refreshed in real time. This ensures that the computing node occupying the distributed lock stays keenly aware of remote locking demand, effectively balancing the distributed-lock demand of the local node against that of the remote nodes.
With the above technical scheme, the first expected information is updated in real time based on the requests for the distributed lock in the distributed system. This ensures that the first expected information matches the actual situation in the system and accurately reflects changes in each node's demand for the distributed lock, providing an effective reference for the subsequent judgment of the unlocking time. The transfer of ownership of the distributed lock can thus track demand changes in the system to the greatest extent, greatly improving the performance of the distributed lock.
403. If the updated first expected information meets the unlocking condition, the computing node releases its occupation of the distributed lock.
In some embodiments, the unlocking condition is: the frequency of sending locking requests in the future time period is greater than the frequency with which the computing node continues to occupy the distributed lock.
In some embodiments, the computing node performs the step 403 after performing the access task for the target resource, that is, the computing node can release the computing node from occupying the distributed lock when the frequency of sending the locking request is greater than the frequency of the computing node continuing to occupy the distributed lock in the future time period after performing the access task for the target resource, so as to timely satisfy the requirement of the remote node for the distributed lock, and stably maintain the load balance in the distributed system. In other embodiments, the distributed lock may be automatically released to unlock after a lease expires, wherein the lease indicates that the computing node occupies the effective length of the distributed lock, and the lease specifies a length of time that is generally greater than a length of time required for the access task to be performed.
In some embodiments, referring to case 1 and case 2 above, the updated first expected information reflects both the number of locking requests sent by the local node and the number of locking failures of the remote nodes. Illustratively, the not-yet-updated first expected information is K; after a first update under case 1, the first expected information is S = K - 1; after a second update under case 2, it is S = K - 1 + M. The difference between the updated and the not-yet-updated first expected information can therefore be used as an unlock value to indicate whether the frequency of sending locking requests is greater than the frequency at which the computing node continues to occupy the distributed lock. On this basis, the unlocking condition may be that the unlock value exceeds an unlocking threshold. In some embodiments, the unlock value may be derived based on the following formula (5).
Z=S-K (5)
In formula (5), S is the updated first expected information; K is the not-yet-updated first expected information obtained from the score upper limit and the historical expected probability; Z is the unlock value, an integer. On this basis, the unlocking condition may be: Z ≥ 0. It will be appreciated that the unlock value Z represents the difference between the number of locking failures at the remote nodes and the number of locking requests accumulated at the local node.
With this scheme, the demand of each node in the distributed system for the distributed lock can be predicted from historical experience and updated in real time according to actual requests, so the expected information is continuously refined on the basis of historical experience. Unlocking according to this expected information ensures that the moment a computing node unlocks matches the request pattern in the distributed system, effectively improving the performance of the distributed lock and stably maintaining load balance in the distributed system.
In other embodiments, if the updated first expected information does not meet the unlocking condition, the computing node can determine, based on the updated first expected information, the number of locking requests still to be executed, and release its occupation of the distributed lock after those locking requests have been executed. In this example, the number of locking requests sent by the local node is greater than the number of locking failures at the remote nodes, so the local node can determine the number of pending locking requests from the first expected information and preferentially satisfy the local reuse demand. In some embodiments, the updated first expected information failing to meet the unlocking condition means that the unlock value Z is less than 0, and the number of locking requests to be executed may be the absolute value |Z| of the unlock value Z.
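The unlocking decision of step 403, combining formula (5) with the pending-request count |Z| described above, might be sketched as follows; the function name and example values are assumptions.

```python
def unlock_decision(s: int, k: int) -> tuple[bool, int]:
    """Formula (5): Z = S - K. Unlock when Z >= 0; otherwise serve |Z| more
    local locking requests before releasing the distributed lock."""
    z = s - k
    if z >= 0:
        return True, 0     # unlocking condition met
    return False, abs(z)   # pending local locking requests to execute first


print(unlock_decision(77, 75))  # (True, 0): remote demand dominates, unlock
print(unlock_decision(70, 75))  # (False, 5): execute 5 local requests first
```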
In other embodiments, in response to receiving the second expected information, the computing node releases its occupation of the distributed lock if the number of locking failures is greater than zero. Referring to formula (4) above, in response to receiving the second expected information, if M is greater than 0, the computing node unlocks after completing its access task on the target resource, so as to satisfy the remote nodes' demand for the distributed lock in time and stably maintain load balance in the distributed system.
With the above technical scheme, when local demand for the distributed lock is high, the local node can complete locking operations locally for the locking requests it sends itself, effectively avoiding the impact of network latency on locking efficiency; and when remote demand for the distributed lock is high, the local node can unlock so that the remote nodes' demand is met in time. On this basis, the locking efficiency of the distributed lock in various situations is improved, its performance is effectively enhanced, and the demand of the local node and that of the remote nodes is stably balanced, efficiently maintaining load balancing in the distributed system.
In some embodiments, the historical expected probability used in the above process can be persisted as a kind of configuration information in a global configuration file of the distributed system and thus provided to each computing node. In some embodiments, the initial value of the historical expected probability is set according to engineering experience, and every first period each computing node adjusts the historical expected probability based on lock load information received from a management node (see the introduction in fig. 1), the lock load information indicating the proportion of successful lockings among the locking requests sent by the plurality of computing nodes for the target resource within the first period.
In some embodiments, each computing node records the number of failed requests V and the number of successful requests U among the locking requests it has sent, and periodically sends its own V and U to the management node. The management node determines the proportion of successful lockings within the first period based on the V and U received from each computing node during that period, thereby determining the lock load information for the first period. In some embodiments, a computing node sends V and U at a first time interval that is shorter than the duration of the first period.
In some embodiments, the proportion W of successful lockings can be calculated by the following formula (6).
W = Σ_{i=1}^{n} U_i / Σ_{i=1}^{n} (U_i + V_i) (6)
In formula (6), V_i is the number of failed requests in the i-th report received within the first period, and U_i is the number of successful requests in the i-th report; n is the number of computing nodes; W is the proportion of successful lockings, also called the effective request ratio; i and n are positive integers.
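A minimal sketch of formula (6), aggregating the (U, V) reports received in a first period; the function name and the example reports are assumptions.

```python
def effective_request_ratio(reports: list[tuple[int, int]]) -> float:
    """Formula (6): W = sum(U_i) / sum(U_i + V_i) over the (U, V) pairs
    reported within the first period."""
    successes = sum(u for u, _ in reports)
    total = sum(u + v for u, v in reports)
    return successes / total if total > 0 else 0.0


# Three nodes report (U successes, V failures) for the period:
W = effective_request_ratio([(8, 2), (5, 5), (7, 3)])  # 20 / 30, about 0.67
```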
In some embodiments, the computing node may implement the process of adjusting the historical expected probability based on lock load information in two ways.
In the first mode, when the proportion of successful lockings indicated by the lock load information is greater than the historical expected probability, the historical expected probability is adjusted so that the frequency of sending locking requests in the future time period increases.
In some embodiments, when the proportion of successful lockings is greater than the historical expected probability, i.e., W > P, the remote nodes' demand for the distributed lock is placing a heavy load on the distributed system. By increasing the historical expected probability P, the frequency of locking requests predicted for the future time period based on the adjusted P increases, raising the probability that the local node unlocks directly after completing its access task on the target resource.
In the second mode, when the proportion of successful lockings indicated by the lock load information is less than or equal to the historical expected probability, the historical expected probability is adjusted so that the frequency of sending locking requests in the future time period decreases.
In some embodiments, when the proportion of successful lockings is less than or equal to the historical expected probability, i.e., W ≤ P, the load caused by the remote nodes' demand for the distributed lock is relatively small. By reducing the historical expected probability P, the frequency of locking requests predicted for the future time period based on the adjusted P decreases, lowering the probability that the local node unlocks directly after completing its access task on the target resource.
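The two adjustment modes might be sketched as follows; note that the patent does not specify the adjustment magnitude, so the step size here is purely an assumption.

```python
def adjust_history_probability(p: float, w: float, step: float = 0.05) -> float:
    """Raise P when the effective request ratio W exceeds it (mode 1),
    lower it otherwise (mode 2); the step size is an assumption."""
    if w > p:
        return min(1.0, p + step)   # more remote demand: unlock more readily
    return max(0.0, p - step)       # less remote demand: favor local reuse


p = adjust_history_probability(0.75, w=0.67)  # W <= P, so P decreases to 0.70
```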
In some embodiments, any computing node in the distributed system that has access to the target resource may store the historical expected probability locally, and may then perform the process of adjusting the historical expected probability.
According to the technical scheme, the computing node can periodically update the history expected probability in an iterative manner according to the request condition of the distributed lock in the distributed system so as to balance the use requirement of the remote node and the local node on the distributed lock, thereby greatly improving the performance of the distributed lock, ensuring that each node can successfully occupy the distributed lock at the required time, and effectively maintaining the load balance in the distributed system.
The above process of adjusting the historical expected probability based on lock load information is in effect a machine learning process that is continuously iterated and updated to improve the performance of the distributed lock. In some embodiments, a model for iteratively adjusting the historical expected probability is constructed based on the technical solution provided above, with the goals of improving the performance of the distributed lock and maintaining load balance in the distributed system. Based on this model, the historical expected probability can be iteratively adjusted as the request condition of the distributed lock in the distributed system changes, and can be persisted, as a staged result of the machine learning, into a global configuration file of the distributed system. Historical data for managing the distributed lock is thus continuously accumulated, which improves the maintainability of the distributed lock management method provided in the present application.
For ease of understanding, based on the foregoing, the present application provides a schematic diagram of the method for managing a distributed lock; see fig. 5. The management node maintains the length M of the waiting queue of the distributed lock and the effective request duty ratio W. Each of the plurality of computing nodes maintains an initial score K (the value of the first expected information before updating), a current score S (the value of the first expected information after updating), an unlock value Z (the difference between S and K), a request failure count V, and a request success count U. The configuration file (config file) stores the historical expected probability P and the scoring upper limit N (set to 100). The dotted line between a computing node and the configuration file indicates that the computing node can read and write the configuration file. The management node and the plurality of computing nodes interact based on the process described in the embodiment corresponding to fig. 4, so that the computing nodes implement the method for managing a distributed lock provided in the present application.
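A compact sketch of the state in fig. 5, under the assumption that it is held as plain in-memory records. All class and field names here are ours; only the symbols K, S, Z, V, U, M, W, P, and N, and the upper limit N = 100, come from the text:

```python
from dataclasses import dataclass


@dataclass
class ManagementNodeState:
    wait_queue_len: int = 0       # M: length of the distributed lock's waiting queue
    effective_ratio: float = 0.0  # W: duty ratio of the locking success times


@dataclass
class ComputeNodeState:
    initial_score: float = 0.0  # K: value of the first expected information before updating
    current_score: float = 0.0  # S: value of the first expected information after updating
    failures: int = 0           # V: request failure count reported each first time length
    successes: int = 0          # U: request success count reported each first time length

    @property
    def unlock_value(self) -> float:
        return self.current_score - self.initial_score  # Z: difference between S and K


# Global configuration file contents (readable and writable by computing nodes per fig. 5).
# The initial value of P below is illustrative; only N = 100 is given by the text.
CONFIG = {"historical_expected_probability": 0.5,  # P
          "score_upper_limit": 100}                # N
```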
The above embodiments describe in detail how a computing node in a distributed system executes the method for managing a distributed lock provided in the present application. In some embodiments, the method for managing a distributed lock provided herein may be implemented in the distributed system in the form of a computing instance, for example, running in a computing node as a virtual machine, a container, or a process, which is not limited in this application.
For ease of understanding, the present application provides a deployment diagram of the method for managing a distributed lock; see fig. 6. The distributed system comprises a management node 610 and a plurality of computing nodes 620. The management node 610 runs a lock management process 611 on a background thread of the primary process (Primary) of the distributed system, and each computing node 620 runs a lock request process 621 on a background thread of the standby process (Replica) of the distributed system. The functions of the lock management process and the lock request process are described in the following embodiments.
In some embodiments, the lock management process and the lock request process may be implemented as different modes of the same process. The lock management and lock request functions can then be deployed in a lightweight manner in each computing node, and each computing node can switch modes as required to implement the corresponding function, which further improves the applicability of the distributed lock management method provided in the present application.
Next, taking the interaction between the computing node 110 and the management node 120 in the above distributed system as an example, the technical solution provided in the present application is further described with reference to the deployment manner provided in fig. 6. The method for managing a distributed lock provided in the embodiment of the present application can be applied, according to the deployment manner shown in fig. 6, to the distributed system 100 shown in fig. 1. The method includes the following three parts: a locking part, an unlocking part, and an information management part.
In this embodiment, the locking part includes the following steps 1A to 3A. To facilitate understanding of these steps, the present application provides a flowchart of the locking process; see fig. 7, where a dashed arrow indicates cross-process information interaction. A code sketch of the complete flow is given at the end of this part.
Step 1A: the lock request process in the computing node responds to a locking request of the computing node for the distributed lock, and updates the first expected information if the distributed lock is already occupied by the computing node.
For the process of determining the first expected information, refer to step 401; details are not repeated here.
For this step, refer to case 1 above; details are not repeated here.
In some embodiments, the lock request process (see fig. 7) further determines whether the distributed lock has been released when the distributed lock is occupied (held) by the computing node. If the distributed lock has been released, the lock request process updates the first expected information based on formula (3) and the locking succeeds, achieving local reuse of the distributed lock. If the distributed lock has not been released, the distributed lock is being occupied by another thread, and the locking fails.
Step 2A: the lock request process sends a locking request to the management node if the distributed lock is not occupied by the computing node.
Step 3A: the lock management process in the management node responds to the locking request and sends second expected information to the computing node if the distributed lock is occupied.
For step 3A, refer to the description in case 2 above; details are not repeated here.
In this embodiment of the present application, the second expected information indicates the number of locking failures. In some embodiments, the lock management process determines, in response to the locking request, whether the distributed lock is occupied; if the distributed lock is occupied, the lock management process updates the length M of the waiting queue corresponding to the distributed lock, where the length of the waiting queue is the number of locking failures.
In some embodiments, in the event that the distributed lock is occupied, the management node returns failure information to the lock request process of the computing node to indicate that the locking request failed.
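As referenced above, the following is a compressed Python sketch of steps 1A to 3A. All class, function, and field names are hypothetical, the management node is reduced to an in-process stub, and the formula (3) update is replaced by a placeholder, since that formula is defined in an earlier part of the document:

```python
class ManagementNodeStub:
    """In-process stand-in for the management node's lock management process."""

    def __init__(self):
        self.holder = None
        self.wait_queue_len = 0  # M: the number of locking failures

    def request_lock(self, node_id):
        if self.holder is not None and self.holder != node_id:
            self.wait_queue_len += 1  # step 3A: update M while the lock is occupied
            return {"occupied": True, "wait_queue_len": self.wait_queue_len}
        self.holder = node_id
        return {"occupied": False, "wait_queue_len": self.wait_queue_len}


def try_lock(node_id, local, mgmt):
    """Steps 1A to 3A (fig. 7); `local` is this node's view of the lock."""
    if local.get("holder") == node_id:      # step 1A: lock already occupied by this node
        if local.get("released", False):    # lock idle inside the node: reuse it locally
            local["released"] = False       # and update the first expected information
            return True                     # per formula (3) (placeholder: not reproduced)
        return False                        # held by another local thread: locking fails
    reply = mgmt.request_lock(node_id)      # step 2A: send a locking request
    if reply["occupied"]:                   # step 3A: lock occupied remotely
        local["M"] = reply["wait_queue_len"]  # record the second expected information
        return False
    local.update(holder=node_id, released=False)
    return True


mgmt = ManagementNodeStub()
local_a, local_b = {}, {}
print(try_lock("node-a", local_a, mgmt))  # True: the lock is granted to node-a
print(try_lock("node-b", local_b, mgmt))  # False: occupied; M is recorded in local_b
```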
In the embodiment of the present application, the unlocking part includes the following steps 1B to 3B. To facilitate understanding of these steps, the present application provides a flowchart of the unlocking process; see fig. 8, where a dashed arrow indicates cross-process information interaction. A code sketch of the flow is given at the end of this part.
Step 1B: the lock request process in the computing node updates the first expected information of the distributed lock based on the second expected information.
For step 1B, refer to the description in case 2 above; details are not repeated here.
In some embodiments, the lock request process initiates the unlocking flow upon completing the execution of an access task on the target resource in the computing node. During unlocking, the lock request process updates the first expected information according to formula (4), based on the number of locking failures indicated by the second expected information acquired from the management node.
Step 2B: the lock request process in the computing node sends an unlocking request for the distributed lock to the management node if the updated first expected information meets the unlocking condition.
For this step, refer to step 403 above; details are not repeated here. Referring to fig. 8, the unlocking condition may be: the length M of the waiting queue is greater than or equal to 0, or the unlock value Z is less than or equal to 0, where "||" denotes logical OR.
Step 3B: the lock management process in the management node receives the unlocking request for the distributed lock and sets the distributed lock to an unoccupied state.
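As referenced above, a sketch of steps 1B to 3B under the same assumptions as the locking sketch. Here `mgmt` is assumed to expose a release_lock() method (for example, the earlier stub extended accordingly), and the formula (4) update is again only a placeholder:

```python
def try_unlock(node_id, local, mgmt):
    """Steps 1B to 3B (fig. 8), invoked after the access task on the target
    resource completes; `mgmt` must expose release_lock()."""
    # Step 1B: update the first expected information (current score S) based on
    # the number of locking failures M, per formula (4). The subtraction below
    # is only a placeholder, as formula (4) is defined in an earlier part.
    local["S"] = local.get("S", 0) - local.get("M", 0)
    z = local["S"] - local.get("K", 0)    # unlock value Z = S - K
    if local.get("M", 0) >= 0 or z <= 0:  # unlock condition, as stated for fig. 8
        mgmt.release_lock(node_id)        # step 2B: send the unlocking request
        local["holder"] = None            # step 3B: the lock is set unoccupied
        return True
    return False                          # keep occupying the distributed lock
```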
In this embodiment of the present application, the information management part includes the following steps 1C to 5C. To facilitate understanding of these steps, the present application provides a schematic diagram of the information management process; see fig. 9, where a dashed arrow indicates cross-process information interaction. A code sketch of the traffic-based frequency adjustment follows step 5C below.
Step 1C: the lock request process in the computing node sends the request failure count V and the request success count U to the management node every first time length.
For step 1C, refer to the description of step 403; details are not repeated here.
Step 2C: the lock management process in the management node receives the request failure count V and the request success count U from each computing node, determines the lock load information W every first period, and returns the lock load information W to the computing nodes.
For step 2C, refer to the description of step 403; details are not repeated here.
In the embodiment of the present application, the lock management process can update the lock load information W on a background thread based on the received V and U, and return W to the computing nodes every first period.
In other embodiments, the lock management process can periodically send both the lock load information W and the length M of the waiting queue of the distributed lock. Alternatively, the lock management process can return W and the length M of the waiting queue of the distributed lock to the computing nodes on different periods, respectively, which is not limited in this application.
Step 3C: the lock request process in the computing node adjusts the historical expected probability based on the lock load information received from the management node every first period.
For step 3C, refer to the description of step 403; details are not repeated here. In some embodiments, referring to fig. 9, the lock request process also periodically receives the length M of the waiting queue of the distributed lock.
Step 4C: the lock management process in the management node sends the lock traffic information of the distributed lock to each computing node every first period, where the lock traffic information indicates the communication resources consumed by the system for managing the distributed lock in the second period.
In some embodiments, the lock management process determines the lock traffic information based on the amount of data of the locking and unlocking requests for the distributed lock in the second period. The higher the sending frequency of the locking and unlocking requests, the more communication resources are consumed for managing the distributed lock.
In some embodiments, the lock traffic information may indicate the bandwidth consumed by requests related to the distributed lock during the second period; that is, the communication resource refers to communication bandwidth.
Step 5C: the lock request process in the computing node adjusts the frequency of sending locking requests every second period based on the received lock traffic information and a traffic threshold.
In some embodiments, when the distributed lock is not occupied, the lock request process can adjust its frequency of sending locking requests based on the lock traffic information and the traffic threshold received every second period.
In some embodiments, the communication resource refers to a communication bandwidth, and the traffic threshold may be a bandwidth threshold, for example, 100 megabytes per second (MB/s).
In some embodiments, the lock request process reduces the frequency of sending locking requests when the communication resources indicated by the lock traffic information exceed the traffic threshold, so as to avoid placing an excessive burden on the distributed system by occupying too many communication resources, thereby further ensuring load balancing in the distributed system.
In other embodiments, the lock management process in the management node can directly determine, based on the lock traffic information and the traffic threshold, whether the communication resources currently occupied by the distributed lock exceed the traffic threshold. When the threshold is exceeded, the lock management process sends high-traffic warning information to each computing node, and each computing node can directly adjust its frequency of sending locking requests upon receiving the high-traffic warning information, without making the determination itself, which further improves the load balancing effect.
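As referenced above, a sketch of the traffic-based adjustment in steps 4C and 5C. The 100 MB/s threshold follows the example in the text; the multiplicative back-off factor is an assumption, since the text states only that the frequency is reduced:

```python
BANDWIDTH_THRESHOLD_MBPS = 100.0  # traffic threshold from the text (100 MB/s)


def adjust_request_frequency(freq_hz, lock_traffic_mbps, backoff=0.5):
    """Steps 4C to 5C: every second period, a computing node that does not
    occupy the distributed lock lowers its locking-request frequency when
    the bandwidth consumed by lock management exceeds the threshold.

    `backoff` is a hypothetical scaling factor; the text specifies only
    that the frequency is reduced, not by how much.
    """
    if lock_traffic_mbps > BANDWIDTH_THRESHOLD_MBPS:
        return freq_hz * backoff  # reduce request frequency to relieve the network
    return freq_hz                # within budget: keep the current frequency


print(adjust_request_frequency(10.0, 120.0))  # 5.0: over threshold, backed off
```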
According to the scheme, the requirements of each node in the distributed system on the distributed lock can be predicted by referring to the historical experience, and the requirements are updated in real time according to the request condition, so that expected information can be continuously updated on the basis of the historical experience, unlocking is performed according to the expected information, the condition that the computing node can be matched with the request condition in the distributed system when the computing node is unlocked is ensured, the performance of the distributed lock is effectively improved, and further, the load balance in the distributed system is stably maintained.
Further, each computing node dynamically adjusts its request frequency according to the communication resources occupied by the distributed lock in the distributed system, so that the cluster's bandwidth consumption is effectively optimized and the load balancing effect of the distributed system is further improved.
Fig. 10 is a schematic structural diagram of a management device of a distributed lock provided in an embodiment of the present application. Referring to fig. 10, the management device of a distributed lock can be applied to a distributed system including a plurality of computing nodes, and the device includes:
a prediction module 1001, configured to predict first expected information for a distributed lock based on a historical expected probability when the computing node occupies the distributed lock corresponding to a target resource, the historical expected probability being determined based on locking requests sent by the plurality of computing nodes for the target resource, the locking requests being used to occupy the distributed lock, and the first expected information indicating a frequency with which the plurality of computing nodes send locking requests in a future time period;
An updating module 1002, configured to update the first expected information based on request conditions of the plurality of computing nodes of the distributed system for the distributed lock;
and an unlocking module 1003, configured to release the computing node from occupying the distributed lock if the updated first expected information meets an unlocking condition.
In one possible implementation, the updating module 1002 is configured to:
in response to a locking request by the computing node for the distributed lock, the first expected information is updated such that the frequency of sending locking requests in a future time period is reduced, the locking request by the computing node for the distributed lock being used to request continued occupancy of the distributed lock.
In one possible embodiment, the apparatus further comprises:
the target module is used for determining the number of times of locking requests to be executed by the computing node based on the updated first expected information if the updated first expected information does not meet the unlocking condition;
the unlocking module 1003 is further configured to release the computing node from occupying the distributed lock after the execution of the to-be-executed locking request is completed.
In one possible implementation, the updating module 1002 is configured to:
In response to the computing node receiving second expected information for the distributed lock, the first expected information is updated to increase the frequency of sending locking requests in a future time period, the second expected information indicating a number of locking failures of locking requests that the plurality of computing nodes have sent.
In one possible embodiment, the apparatus further comprises:
and the target unlocking module is used for responding to the second expected information received by the computing node and releasing the occupation of the computing node on the distributed lock under the condition that the locking failure times are greater than zero.
In one possible implementation, the unlocking condition is that the frequency of sending locking requests in a future time period is greater than the frequency that the computing node continues to occupy the distributed lock.
In one possible embodiment, the apparatus further comprises:
the load adjustment module is used for adjusting the historical expected probability based on the received lock load information every other first period, wherein the lock load information indicates the duty ratio of the successful locking times in locking requests sent by the plurality of computing nodes aiming at the target resource in the first period.
In one possible embodiment, the load adjustment module is configured to:
Adjusting the historical expected probability to increase the frequency of sending locking requests in the future time period if the duty ratio of the locking success times indicated by the locking load information is greater than the historical expected probability;
and adjusting the historical expected probability to reduce the frequency of sending locking requests in the future time period when the duty ratio of the locking success times indicated by the lock load information is less than or equal to the historical expected probability.
In one possible embodiment, the apparatus further comprises:
and the traffic adjustment module is configured to adjust, every second period, the frequency of sending locking requests based on the received lock traffic information and a traffic threshold when the distributed lock is not occupied, where the lock traffic information indicates the communication resources consumed by the system for managing the distributed lock in the second period.
According to the scheme, the requirements of each node in the distributed system on the distributed lock can be predicted by referring to the historical experience, and the requirements are updated in real time according to the request condition, so that expected information can be continuously updated on the basis of the historical experience, unlocking is performed according to the expected information, the condition that the computing node can be matched with the request condition in the distributed system when the computing node is unlocked is ensured, the performance of the distributed lock is effectively improved, and further, the load balance in the distributed system is stably maintained.
Further, each computing node dynamically adjusts its request frequency according to the communication resources occupied by the distributed lock in the distributed system, so that the cluster's bandwidth consumption is effectively optimized and the load balancing effect of the distributed system is further improved.
It should be noted that, when the management device of a distributed lock provided in the foregoing embodiment implements the corresponding functions, the division into the foregoing functional modules is used only as an example. In practical applications, the foregoing functions may be allocated to different functional modules as needed; that is, the internal structure of the device may be divided into different functional modules to implement all or part of the functions described above. In addition, the management device of a distributed lock provided in the foregoing embodiment and the embodiments of the method for managing a distributed lock belong to the same concept; for the detailed implementation process of the device, refer to the method embodiments, which are not repeated here.
It should be noted that, information (including but not limited to user equipment information, user personal information, etc.), data (including but not limited to data for analysis, stored data, presented data, etc.), and signals referred to in this application are all authorized by the user or are fully authorized by the parties, and the collection, use, and processing of relevant data is required to comply with relevant laws and regulations and standards of relevant countries and regions. For example, the target resources referred to in this application are all acquired with sufficient authorization.
The terms "first," "second," and the like in this application are used to distinguish between identical or similar items that have substantially the same function and function, and it should be understood that there is no logical or chronological dependency between the "first," "second," and "nth" terms, nor is it limited to the number or order of execution. It will be further understood that, although the following description uses the terms first, second, etc. to describe various elements, these elements should not be limited by the terms. These terms are only used to distinguish one element from another element. For example, the first period may be referred to as the second period, and similarly, the second period may be referred to as the first period, without departing from the scope of the various described examples. The first period and the second period may both be periods, and in some cases may be separate and distinct periods.
The term "at least one" in this application means one or more, the term "plurality" in this application means two or more, for example, a plurality of nodes means two or more.
The foregoing description is merely a specific embodiment of the present application, but the protection scope of the present application is not limited thereto, and any person skilled in the art can easily think about various equivalent modifications or substitutions within the technical scope of the present application, and these modifications or substitutions are all covered by the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
In the above embodiments, the implementation may be carried out in whole or in part by software, hardware, firmware, or any combination thereof. When software is used, the implementation may take the form, in whole or in part, of a program product. The program product includes one or more program instructions. When the program instructions are loaded and executed on a computing device, the flows or functions according to the embodiments of the present application are produced in whole or in part.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program for instructing relevant hardware, where the program may be stored in a computer readable storage medium, and the above storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above embodiments are merely intended to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that the technical solutions described in the foregoing embodiments can still be modified, or some technical features thereof can be replaced by equivalents, and such modifications or replacements do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present application.

Claims (21)

1. A method of managing a distributed lock for use in a computing node of a distributed system, the system including a plurality of computing nodes, the method comprising:
predicting first expected information for a distributed lock based on a historical expected probability when the computing node occupies the distributed lock corresponding to a target resource, wherein the historical expected probability is determined based on locking requests sent by the computing nodes for the target resource, the locking requests are used for occupying the distributed lock, and the first expected information indicates the frequency of the locking requests sent by the computing nodes in a future time period;
updating the first expected information based on request conditions of the distributed locks by the plurality of computing nodes of the distributed system;
and if the updated first expected information meets an unlocking condition, releasing the occupation of the distributed lock by the computing node.
2. The method of claim 1, wherein the updating the first expected information based on the request for the distributed lock by the plurality of computing nodes of the distributed system comprises:
In response to a locking request by the computing node for the distributed lock, updating the first expected information to reduce the frequency of sending locking requests in a future time period, the locking request by the computing node for the distributed lock being used to request continued occupancy of the distributed lock.
3. The method according to claim 1 or 2, characterized in that the method further comprises:
if the updated first expected information does not meet the unlocking condition, determining the number of times of locking requests to be executed by the computing node based on the updated first expected information;
and after the execution of the locking request to be executed is completed, releasing the occupation of the distributed lock by the computing node.
4. A method according to any one of claims 1 to 3, wherein the updating the first expected information based on the request for the distributed lock by the plurality of computing nodes of the distributed system comprises:
in response to the computing node receiving second expected information for the distributed lock, updating the first expected information to increase the frequency of sending locking requests in a future time period, the second expected information indicating a number of locking failures of locking requests that the plurality of computing nodes have sent.
5. The method according to claim 4, wherein the method further comprises:
and in response to the computing node receiving the second expected information, releasing the occupation of the distributed lock by the computing node under the condition that the locking failure times are greater than zero.
6. The method of any of claims 1-5, wherein the unlocking condition is that the frequency of sending locking requests in a future time period is greater than the frequency at which the computing node continues to occupy the distributed lock.
7. The method according to any one of claims 1 to 6, further comprising:
and adjusting the historical expected probability based on the received lock load information every first period, wherein the lock load information indicates the duty ratio of the locking success times in locking requests sent by the plurality of computing nodes aiming at target resources in the first period.
8. The method of claim 7, wherein adjusting the historical expected probability based on the received lock load information every first period comprises:
adjusting the historical expected probability to increase the frequency of sending locking requests in the future time period when the duty ratio of the locking success times indicated by the locking load information is greater than the historical expected probability;
And in the case that the duty ratio of the locking success times indicated by the lock load information is less than or equal to the historical expected probability, adjusting the historical expected probability so as to reduce the frequency of sending locking requests in the future time period.
9. The method according to any one of claims 1 to 8, further comprising:
and under the condition that the distributed lock is not occupied, adjusting the frequency of sending locking requests every second period based on the received lock traffic information and a traffic threshold, wherein the lock traffic information indicates the communication resources consumed by the system for managing the distributed lock in the second period.
10. A distributed lock management apparatus for use in a distributed system including a plurality of computing nodes, the apparatus comprising:
a prediction module, configured to predict first expected information for a distributed lock based on a historical expected probability when the computing node occupies the distributed lock corresponding to a target resource, the historical expected probability being determined based on locking requests sent by the plurality of computing nodes for the target resource, the locking requests being used to occupy the distributed lock, the first expected information indicating a frequency with which the plurality of computing nodes send locking requests in a future time period;
An updating module, configured to update the first expected information based on request conditions of the plurality of computing nodes of the distributed system for the distributed lock;
and the unlocking module is used for releasing the occupation of the computing node on the distributed lock if the updated first expected information accords with an unlocking condition.
11. The apparatus of claim 10, wherein the update module is configured to:
in response to a locking request by the computing node for the distributed lock, updating the first expected information to reduce the frequency of sending locking requests in a future time period, the locking request by the computing node for the distributed lock being used to request continued occupancy of the distributed lock.
12. The apparatus according to claim 10 or 11, characterized in that the apparatus further comprises:
the target module is used for determining the number of times of locking requests to be executed by the computing node based on the updated first expected information if the updated first expected information does not meet the unlocking condition;
and the unlocking module is also used for releasing the occupation of the computing node on the distributed lock after the execution of the locking request to be executed is completed.
13. The apparatus according to any one of claims 10 to 12, wherein the updating module is configured to:
in response to the computing node receiving second expected information for the distributed lock, updating the first expected information to increase the frequency of sending locking requests in a future time period, the second expected information indicating a number of locking failures of locking requests that the plurality of computing nodes have sent.
14. The apparatus of claim 13, wherein the apparatus further comprises:
and the target unlocking module is used for responding to the second expected information received by the computing node, and releasing the occupation of the computing node on the distributed lock under the condition that the locking failure times are greater than zero.
15. The apparatus of any of claims 10 to 14, wherein the unlocking condition is that the frequency of sending locking requests in a future time period is greater than the frequency at which the computing node continues to occupy the distributed lock.
16. The apparatus according to any one of claims 10 to 15, further comprising:
the load adjustment module is used for adjusting the historical expected probability based on the received lock load information every other first period, wherein the lock load information indicates the duty ratio of the locking success times in locking requests sent by the plurality of computing nodes aiming at the target resource in the first period.
17. The apparatus of claim 16, wherein the load adjustment module is configured to:
adjusting the historical expected probability to increase the frequency of sending locking requests in the future time period when the duty ratio of the locking success times indicated by the locking load information is greater than the historical expected probability;
and in the case that the duty ratio of the locking success times indicated by the lock load information is less than or equal to the historical expected probability, adjusting the historical expected probability so as to reduce the frequency of sending locking requests in the future time period.
18. The apparatus according to any one of claims 10 to 17, further comprising:
and the traffic adjustment module is configured to adjust, every second period, the frequency of sending locking requests based on the received lock traffic information and a traffic threshold under the condition that the distributed lock is not occupied, wherein the lock traffic information indicates the communication resources consumed by the system for managing the distributed lock in the second period.
19. A cluster of computing devices, comprising at least one computing device, each computing device comprising a processor and a memory;
The memory of the at least one computing device is configured to store at least one piece of program code that is loaded by the processor of the at least one computing device to cause the cluster of computing devices to perform the method of managing a distributed lock according to any one of claims 1 to 9.
20. A computer-readable storage medium for storing at least one piece of program code, wherein, when the program code is executed by a cluster of computing devices, the cluster of computing devices performs the method of managing a distributed lock according to any one of claims 1 to 9.
21. A computer program product, which, when run on a cluster of computing devices, causes the cluster of computing devices to perform the method of managing a distributed lock according to any one of claims 1 to 9.