CN111737063B - Disk lock arbitration method, apparatus, device, and medium for dual-control split-brain

Info

Publication number
CN111737063B
Authority
CN
China
Prior art keywords
disk
node
service
disk array
lock
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010429475.8A
Other languages
Chinese (zh)
Other versions
CN111737063A (en
Inventor
王晓强
岳亚丰
朱明胜
贾德明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Whale Shark Information Technology Co ltd
Original Assignee
Shandong Whale Shark Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Whale Shark Information Technology Co ltd
Priority to CN202010429475.8A
Publication of CN111737063A
Application granted
Publication of CN111737063B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/07: Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16: Error detection or correction of the data by redundancy in hardware
    • G06F11/20: Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G06F11/2053: Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where persistent mass storage functionality or persistent mass storage control functionality is redundant
    • G06F11/2056: Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where persistent mass storage functionality or persistent mass storage control functionality is redundant by mirroring
    • G06F11/2071: Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where persistent mass storage functionality or persistent mass storage control functionality is redundant by mirroring using a plurality of controllers
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/07: Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16: Error detection or correction of the data by redundancy in hardware
    • G06F11/20: Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G06F11/2053: Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where persistent mass storage functionality or persistent mass storage control functionality is redundant
    • G06F11/2094: Redundant storage or storage space

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Hardware Redundancy (AREA)

Abstract

One or more embodiments of the present disclosure provide a disk lock arbitration method, apparatus, device, and medium for dual-control split-brain. The method is applied after a communication failure occurs on the heartbeat line between dual-control nodes, which include a first node and a second node, and includes: determining the disk array that is to provide a service, and sending, by the dual-control nodes, disk locks to the disk array in a non-blocking manner; detecting whether each disk in the disk array has been successfully locked by the first node; when every disk of the disk array has been successfully locked, taking over the service by the first node; when at least one disk of the disk array has not been successfully locked, releasing the disk locks by the dual-control nodes; when none of the disks of the disk array has been successfully locked, taking over the service by the second node; and sending, by the dual-control nodes, disk locks to the disk array in a blocking manner. The invention performs disk lock arbitration in units of services, which effectively prevents dual-control split-brain and avoids data inconsistency at the I/O level.

Description

Disk lock arbitration method, apparatus, device, and medium for dual-control split-brain
Technical Field
One or more embodiments of the present disclosure relate to the field of dual-control storage technology, and in particular to a disk lock arbitration method, apparatus, device, and medium for dual-control split-brain.
Background
Split-brain is a phenomenon in a High Availability (HA) system: when the "heartbeat line" connecting the 2 nodes (also called controllers or control nodes) is disconnected, the HA system, originally a single coordinated whole, splits into 2 independent individuals. Having lost contact with each other, each node assumes that the other has failed. The HA software on both nodes then contends for the "shared resources" and the "application services", like a person with a split brain, with serious consequences: either the shared resources are carved up and neither node can bring up its "service"; or the "service" comes up on both nodes, which then read and write the "shared storage" at the same time, resulting in data corruption (a common example is errors in the online logs that a database writes in rotation).
Current measures for avoiding split-brain include:
1. Adding redundant communication paths, for example two communication links. The drawback is that the extra links clearly increase product cost, and the possibility that both links are damaged at the same time cannot be ruled out, so the communication still cannot be fully trusted;
2. Reference-IP arbitration: when a fault occurs, both controllers ping a reference IP at the same time, and the node that can reach the reference IP takes over the peer's service. The drawback is that IP arbitration cannot guarantee full availability: when both controllers can reach the reference IP, split-brain can still occur;
3. Disk lock arbitration: after the fault, the right to use the back-end storage is determined by locking. The drawback is that the traditional arbitration-disk mechanism requires an additional arbitration device that carries no user data, which is costly, and arbitrating on a single disk lock does not match the current mainstream scenario of RAID (Redundant Array of Independent Disks) disk arrays, which leads to service confusion.
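The weakness of reference-IP arbitration noted in item 2 can be illustrated with a minimal sketch. The function name `ip_arbitration` and the node names are hypothetical, and reachability is passed in as plain booleans rather than measured with real pings.

```python
from typing import Dict, List


def ip_arbitration(can_reach_reference_ip: Dict[str, bool]) -> List[str]:
    """Return the nodes that decide to take over the peer's service.

    Under reference-IP arbitration, every node that can reach the
    reference IP concludes that it should take over.
    """
    return [node for node, reachable in can_reach_reference_ip.items() if reachable]


# The heartbeat line is down, but both controllers still reach the reference IP.
takers = ip_arbitration({"node1": True, "node2": True})
split_brain = len(takers) > 1  # both nodes take over at once
```

When only one node can reach the reference IP the scheme picks a single owner; the failure case is precisely the one where the heartbeat line alone is broken.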
Therefore, how to reduce the after-effects of split-brain when a high-availability system fails, without sacrificing the storage capacity actually available for user data, is an urgent and still unsolved problem in the industry.
Disclosure of Invention
In view of this, one or more embodiments of the present disclosure aim to provide a disk lock arbitration method, apparatus, device, and medium for dual-control split-brain, so as to solve the problems of unreliable reads and writes of service disk data and of disordered service when a dual-control system fails.
To this end, one or more embodiments of the present disclosure provide a disk lock arbitration method for dual-control split-brain, applied after a communication failure occurs on the heartbeat line between dual-control nodes, wherein the dual-control nodes include a first node and a second node, and the method includes:
determining the disk array that is to provide a service, and sending, by the dual-control nodes, disk locks to the disk array in a non-blocking manner, wherein the disk array comprises a plurality of disks, and the plurality of disks are determined by the service provided by the dual-control nodes;
detecting whether each disk in the disk array has been successfully locked by the first node;
when every disk of the disk array has been successfully locked, taking over the service by the first node;
when at least one disk of the disk array has not been successfully locked, releasing the disk locks by the dual-control nodes; when none of the disks of the disk array has been successfully locked, taking over the service by the second node;
sending, by the dual-control nodes, disk locks to the disk array in a blocking manner;
when the first node successfully locks one disk in the disk array before the second node, the first node sends disk locks to the other disks in the disk array in sequence until all the disks in the disk array are successfully locked, at which point the first node takes over the client service request corresponding to the service and the second node gives up the service request corresponding to the service;
when the second node successfully locks one disk in the disk array before the first node, the second node sends disk locks to the other disks in the disk array in sequence until all the disks in the disk array are successfully locked, at which point the second node takes over the client service request corresponding to the service and the first node gives up the service request corresponding to the service.
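The steps above can be read as a two-phase decision, sketched below as a single-threaded simulation. This is one possible reading, not the patented implementation: the names `try_lock`, `release_all`, and `arbitrate` are hypothetical, disk ownership is kept in a plain dictionary instead of on real disks, and the `blocking_winner` argument stands in for whichever node happens to lock a disk first in the blocking phase.

```python
from typing import Dict, List, Optional

FIRST, SECOND = "node1", "node2"


def try_lock(owners: Dict[str, Optional[str]], disk: str, node: str) -> bool:
    """Non-blocking attempt: succeed only if the disk is free or already ours."""
    if owners[disk] in (None, node):
        owners[disk] = node
        return True
    return False


def release_all(owners: Dict[str, Optional[str]], node: str) -> None:
    """Release every disk lock currently held by `node`."""
    for disk, owner in list(owners.items()):
        if owner == node:
            owners[disk] = None


def arbitrate(owners: Dict[str, Optional[str]], blocking_winner: str) -> str:
    """Return the node that takes over the service."""
    # Phase 1: the first node tries every disk without blocking.
    results: List[bool] = [try_lock(owners, disk, FIRST) for disk in owners]
    if all(results):
        return FIRST            # every disk locked: first node takes over
    if not any(results):
        return SECOND           # no disk locked: second node holds them all
    # Mixed result: both nodes release, then contend in blocking mode.
    release_all(owners, FIRST)
    release_all(owners, SECOND)
    # Phase 2 (simulated): the first node to lock a disk locks the rest in turn.
    for disk in owners:
        owners[disk] = blocking_winner
    return blocking_winner


free_array = {"d0": None, "d1": None, "d2": None}
contended_array = {"d0": None, "d1": SECOND, "d2": None}
owner_free = arbitrate(free_array, blocking_winner=SECOND)
owner_contended = arbitrate(contended_array, blocking_winner=SECOND)
```

In both outcomes exactly one node ends up holding every disk of the service, which is the property the per-service arbitration is meant to guarantee.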
In combination with the foregoing description, in another possible implementation manner of the embodiment of the present invention, when the second node succeeds in locking one disk of the disk array before the first node, the method includes:
after the first node and the second node send disk locks to the disk array in a blocking mode, detecting whether the disk of the disk array has the disk locks or not respectively;
when the disk of the disk array has the disk lock of the second node, the first node releases the disk lock sent by the first node and gives up the service request corresponding to the service.
In combination with the foregoing description, in another possible implementation manner of the embodiment of the present invention, when the first node succeeds in locking one disk of the disk array before the second node, the method includes:
After the first node and the second node send disk locks to the disk array in a blocking mode, detecting whether the disk of the disk array has the disk locks or not respectively;
when the disk of the disk array has the disk lock of the first node, the second node releases the disk lock sent by the second node and gives up the service request corresponding to the service.
In combination with the foregoing description, in another possible implementation manner of the embodiment of the present invention, when the first node succeeds in locking one disk of the disk array before the second node, the method includes:
after the first node and the second node send disk locks to the disk array in a blocking mode, detecting whether the disk of the disk array has the disk locks or not respectively;
when the disk of the disk array has the disk lock of the second node, the first node releases the disk lock sent by the first node and gives up the service request corresponding to the service;
when the disk of the disk array has the disk lock of the first node, the second node releases the disk lock sent by the second node and gives up the service request corresponding to the service.
In combination with the foregoing description, in another possible implementation of the embodiment of the present invention, before determining the disk array that is to provide a service, the method further includes:
acquiring a service request from a client and creating a corresponding service according to the service request;
and, with the service as the unit, allocating a plurality of disks from the controllable disks of the dual-control nodes by mapping, and using them as the disk array corresponding to the service.
In combination with the foregoing description, in another possible implementation of the embodiment of the present invention, the method further includes:
while locking and taking over the client service request corresponding to the service, the first node or the second node determines the disk lock arbitration information of the dual-control nodes by means of a query command.
In a second aspect, the present invention further provides a disk lock arbitration apparatus for dual-control split-brain, applied after a communication failure occurs on the heartbeat line between dual-control nodes, wherein the dual-control nodes include a first node and a second node, and the apparatus includes:
a service acquisition module, configured to determine the disk array that is to provide a service, wherein the disk array comprises a plurality of disks, and the plurality of disks are determined by the service provided by the dual-control nodes;
a non-blocking locking module, configured for the dual-control nodes to send disk locks to the disk array in a non-blocking manner;
a detection module, configured to detect whether each disk in the disk array has been successfully locked by the first node;
a first take-over module, configured for the first node to take over the service when every disk of the disk array has been successfully locked;
a first release module, configured for the dual-control nodes to release the disk locks when at least one disk of the disk array has not been successfully locked, and for the second node to take over the service when none of the disks of the disk array has been successfully locked;
a blocking locking module, configured for the dual-control nodes to send disk locks to the disk array in a blocking manner;
a second take-over module, configured for the first node, when it successfully locks one disk in the disk array before the second node, to send disk locks to the other disks in the disk array in sequence until all the disks in the disk array are successfully locked, at which point the first node takes over the client service request corresponding to the service;
a second give-up module, configured for the second node to give up the service request corresponding to the service when the second take-over module takes over successfully;
a third take-over module, configured for the second node, when it successfully locks one disk in the disk array before the first node, to send disk locks to the other disks in the disk array in sequence until all the disks in the disk array are successfully locked, at which point the second node takes over the client service request corresponding to the service;
and a third give-up module, configured for the first node to give up the service request corresponding to the service when the third take-over module takes over successfully.
The apparatus further includes:
a blocking detection module, configured for the first node and the second node each to detect whether the disks of the disk array carry a disk lock after both nodes have sent disk locks to the disk array in a blocking manner;
a second release module, configured for the first node to release the disk locks it sent and give up the service request corresponding to the service when a disk of the disk array carries the disk lock of the second node.
The apparatus further includes:
a blocking detection module, configured for the first node and the second node each to detect whether the disks of the disk array carry a disk lock after both nodes have sent disk locks to the disk array in a blocking manner;
and a third release module, configured for the second node to release the disk locks it sent and give up the service request corresponding to the service when a disk of the disk array carries the disk lock of the first node.
In a third aspect, the present invention further provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the above disk lock arbitration method for dual-control split-brain when executing the program.
In a fourth aspect, the present invention also provides a non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the above disk lock arbitration method for dual-control split-brain.
As can be seen from the above, in the disk lock arbitration method, apparatus, device, and medium for dual-control split-brain provided in one or more embodiments of the present disclosure, when a communication failure occurs between the two controllers, either node sends disk locks in a non-blocking manner to the plurality of disks providing a service and, once locking succeeds, directly takes over the client service corresponding to that service; when the non-blocking attempt does not succeed, the disk locks are sent in a blocking manner. This greatly reduces the delay of service to the client, and performing disk arbitration in units of services achieves seamless service take-over, which effectively prevents split-brain and avoids data inconsistency at the I/O level.
Drawings
To describe one or more embodiments of this specification or the solutions of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are merely one or more embodiments of this specification; a person of ordinary skill in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a schematic diagram of the basic flow of a disk lock arbitration method for dual-control split-brain in one or more embodiments of the present disclosure;
FIG. 2 is a schematic diagram of a disk lock arbitration flow in accordance with one or more embodiments of the present disclosure;
FIG. 3 is a schematic diagram of a dual-control system according to one or more embodiments of the present disclosure;
FIG. 4 is a schematic diagram of a disk lock arbitration apparatus for dual-control split-brain according to one or more embodiments of the present disclosure;
FIG. 5 is a schematic structural diagram of an electronic device in the present specification.
Detailed Description
For the purposes of promoting an understanding of the principles and advantages of the disclosure, reference will now be made to the embodiments illustrated in the drawings and specific language will be used to describe the same.
It is noted that unless otherwise defined, technical or scientific terms used in one or more embodiments of the present disclosure should be taken in a general sense as understood by one of ordinary skill in the art to which the present disclosure pertains. The use of the terms "first," "second," and the like in one or more embodiments of the present description does not denote any order, quantity, or importance, but rather the terms "first," "second," and the like are used to distinguish one element from another. The word "comprising" or "comprises", and the like, means that elements or items preceding the word are included in the element or item listed after the word and equivalents thereof, but does not exclude other elements or items. The terms "connected" or "connected," and the like, are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. "upper", "lower", "left", "right", etc. are used merely to indicate relative positional relationships, which may also be changed when the absolute position of the object to be described is changed.
Node failure is a relatively common problem in a distributed environment; it refers to the server nodes that make up a distributed system going down or "hanging". When two (or more) nodes each consider themselves to be the only active server at the same time, they contend for resources; this scenario is called "split-brain" or a "partitioned cluster".
Each of the dual-control nodes corresponds to one controller, and each controller runs its own operating system; the operating systems of the nodes can be controlled through a host computer, and the method can be applied to either of the dual-control nodes.
The invention relates to a disk lock arbitration method, apparatus, device, and medium for dual-control split-brain, mainly applied where a dual hot-standby high-availability (HA) system needs to avoid split-brain. The basic idea is as follows: when the dual-control system fails, either of the dual-control nodes sends disk locks to the disk array providing a service in a non-blocking manner and, on success, directly takes over the client service corresponding to that service. When the non-blocking attempt does not succeed, the nodes each send disk locks to the disk array in a blocking manner, and no service data is issued while the service has not been taken over or has not been completely taken over. In this way the user's actual data suffers no loss, client service is taken over seamlessly, split-brain in the dual-control system is effectively prevented, data inconsistency at the I/O level is avoided, and the delay of external service is greatly reduced.
This embodiment is applicable to performing dual-control split-brain disk lock arbitration in an intelligent terminal that has a disk lock arbitration module. The method may be performed by a disk lock arbitration apparatus, which may be implemented in software and/or hardware and is generally integrated in the intelligent terminal or controlled by a central control module in the terminal. FIG. 1 is a basic flow diagram of the dual-control split-brain disk lock arbitration method of the present invention. The method is applied after a communication failure occurs on the heartbeat line between the dual-control nodes, which include a first node and a second node, and specifically includes the following steps:
in step 110, determining a disk array to be served, and sending, by a dual-control node, a disk lock to the disk array in a non-blocking manner, wherein the disk array includes a plurality of disks, and the plurality of disks are determined by the service provided by the dual-control node;
The disk array is a RAID. The SCSI subsystem in the system of each dual-control node can determine whether a disk lock has succeeded. The SCSI subsystem implements a client/server-style communication architecture: an initiator sends a command request to a target device, and the target processes the request and returns a response to the initiator. The initiator may be a SCSI device in the host computer; a SCSI target may be a disk, an optical disc, a tape device, or a special device (such as an enclosure device). In the embodiments of the present invention, the SCSI target may also be a RAID disk array organized per service, and each of the dual-control nodes may include a plurality of disk arrays.
The service is created by the dual-control system side according to a service request from the client. Each service corresponds to one client service request, whereas one client service request may create one or more services on the dual-control system side, so requests and services are not in one-to-one correspondence.
The disk array that is to provide the service consists of the disks that the dual-control system side allocates from its controllable disks according to the service request, at a point when the service request entered by the client has not yet been executed.
Sending disk locks to the disk array in a non-blocking manner means using non-blocking locks. When several nodes each call a thread to invoke the locking method and one thread acquires the lock first, the other threads are determined not to have acquired it and return immediately. Only after the thread that first acquired the disk lock releases it can the other threads try again to acquire the lock; until then they cannot acquire it.
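The non-blocking behaviour described above matches the try-lock pattern. A minimal sketch, assuming an in-process `threading.Lock` as a stand-in for the lock on one disk (the patent's locks live on the disks themselves, not in process memory):

```python
import threading

disk_lock = threading.Lock()  # hypothetical stand-in for the lock on one disk

# The first caller acquires the lock.
first_attempt = disk_lock.acquire(blocking=False)

# A second caller does not wait: the call returns False immediately.
second_attempt = disk_lock.acquire(blocking=False)

# Only after the holder releases the lock can another attempt succeed.
disk_lock.release()
retry_attempt = disk_lock.acquire(blocking=False)
```

Because a failed attempt returns at once, a node learns quickly whether it can take over directly or must fall back to the blocking phase.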
FIG. 2 is a schematic diagram of the disk lock arbitration flow in an embodiment of the present invention. Each arrow represents a locking operation, and the number of arrows equals the number of disks in the service. After the first node (controller 1) of the dual-control nodes detects a communication fault between the dual-control nodes, the first node and the second node each call a thread to invoke the disk lock arbitration mechanism. Once controller 1 has acquired the locks in non-blocking mode, the second node's call returns and the second node does not take part in disk lock arbitration for the disk array corresponding to that service.
In step 120, it is detected whether each disk included in the disk array is successfully locked by the first node;
The disk array may also be a JBOD (Just a Bunch Of Disks), a storage device with multiple disk drives mounted on a backplane; it may include N disks, typically N ≥ 2.
Whether all the disks in the disk array belonging to one service have been successfully locked by one node is determined from return values: after the lock command is sent, a return value of 1, for example, indicates that locking succeeded, and a return value of 0 indicates that locking failed. Both the first node and the second node can make this determination from the return values when detecting. If locking fails on at least one disk, both nodes are performing lock operations, and the non-blocking locking has failed.
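The return-value check can be sketched as follows. The values 1 and 0 are the examples given in the text; the function names and disk names are hypothetical.

```python
from typing import Dict, List

LOCK_OK = 1      # example value from the text: locking succeeded
LOCK_FAILED = 0  # example value from the text: locking failed


def failed_disks(return_values: Dict[str, int]) -> List[str]:
    """List the disks whose lock command did not report success."""
    return [disk for disk, value in return_values.items() if value != LOCK_OK]


def non_blocking_succeeded(return_values: Dict[str, int]) -> bool:
    """Non-blocking locking succeeds only if every disk reports success."""
    return not failed_disks(return_values)


clean = {"disk0": LOCK_OK, "disk1": LOCK_OK, "disk2": LOCK_OK}
contended = {"disk0": LOCK_OK, "disk1": LOCK_FAILED, "disk2": LOCK_OK}
```

A single failed disk is enough to conclude that the peer node is locking as well.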
In step 130, taking over the service by the first node when each disk of the disk array is successfully locked;
when N disks are successfully locked, the first node takes over the client service request corresponding to the service, and performs corresponding operations such as data issuing according to the specific content of the client service request information.
In step 140, when at least one disk of the disk array is not successfully locked, releasing the disk lock by the dual control node; when all the disks of the disk array are not successfully locked, the second node takes over the service;
When locking fails on at least one of the N disks, the second node of the dual-control nodes is also locking; all disk locks of the first node and the second node are therefore released, and locking in the blocking manner begins.
In step 150, the dual control node sends a disk lock to the disk array in a blocking manner;
The blocking manner is as follows: when several threads invoke the same disk lock arbitration mechanism at the same time, all the threads queue and wait, that is, they enter the blocked state. When a thread receives the corresponding signal (a wake-up or a timeout), it enters the ready state, and the threads in the ready state compete to enter the running state.
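By contrast with a try-lock, a blocking acquire makes the caller wait until the lock is free. A minimal sketch, again assuming an in-process `threading.Lock` as a stand-in for one disk lock; the names `lock_disk` and `acquired_by` are hypothetical.

```python
import threading

disk_lock = threading.Lock()   # hypothetical stand-in for the lock on one disk
acquired_by = []


def lock_disk(node: str) -> None:
    with disk_lock:            # blocking acquire: waits instead of failing
        acquired_by.append(node)


disk_lock.acquire()            # the lock is already held by someone else
waiter = threading.Thread(target=lock_disk, args=("node2",))
waiter.start()
waiter.join(timeout=0.1)
still_blocked = waiter.is_alive()  # the thread is queued behind the holder

disk_lock.release()            # wake-up: the lock becomes free
waiter.join()                  # the waiting thread now runs to completion
```

The waiting thread neither fails nor returns early; it resumes only once the holder releases the lock.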
Sending disk locks to the disk array in the blocking manner means that arbitration determines which of the dual-control nodes performs the locking operation: when one of the dual-control nodes successfully locks a disk in the disk array before the other node, that node wins arbitration and the other node exits arbitration.
In step 160, when the first node locks one disk in the disk array before the second node, the first node sequentially sends the disk locks to other disks in the disk array until all the disks included in the disk array are successfully locked, at this time, the first node takes over the client service request corresponding to the service, and the second node gives up the service request corresponding to the service;
in step 170, when the second node locks one disk in the disk array before the first node, the second node sequentially sends the disk locks to other disks in the disk array until all the disks included in the disk array are successfully locked, at this time, the second node takes over the client service request corresponding to the service, and the first node gives up the service request corresponding to the service.
Either the first node or the second node may win. After one node wins arbitration, the other node exits arbitration; it can take part in the next round of arbitration or move directly on to the next service.
FIG. 3 is a schematic diagram of a dual-control system, a disk array, and a client. With reference to FIG. 2 and FIG. 3, after the first node (controller 1) determines that all the disks of the JBOD have been successfully locked, it can take over the service that originally belonged to the second node (controller 2), so that the two controllers hand over service seamlessly. When a locking failure occurs, controller 1 and controller 2 lock the JBOD in the blocking manner. If controller 1 locks later than controller 2, controller 1 loses arbitration: controller 2 locks all the disks of the disk array and controller 1 gives up the service. If controller 1 locks before controller 2, controller 2 loses arbitration: controller 1 locks all the disks of the disk array and controller 2 gives up the service.
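The blocking-phase race can be sketched with two threads standing in for the first node and the second node. This is an illustration under stated assumptions, not the patented implementation: in-process `threading.Lock` objects replace on-disk locks, both nodes are assumed to start with the same disk so that it acts as the tiebreaker, and a timed wait (`wait_s`) models a node detecting the peer's lock and giving up. All names are hypothetical.

```python
import threading
from typing import Dict, List


def run_blocking_arbitration(num_disks: int = 4, wait_s: float = 0.2) -> Dict[str, object]:
    disk_locks = [threading.Lock() for _ in range(num_disks)]
    owners: List[str] = [""] * num_disks
    outcome: Dict[str, str] = {}

    def contend(node: str) -> None:
        # Whoever locks the first disk first wins arbitration.
        if not disk_locks[0].acquire(timeout=wait_s):
            outcome[node] = "gave_up"      # the peer holds the disk: exit
            return
        owners[0] = node
        # The winner locks the remaining disks in sequence.
        for i in range(1, num_disks):
            disk_locks[i].acquire()
            owners[i] = node
        outcome[node] = "took_over"

    threads = [threading.Thread(target=contend, args=(n,)) for n in ("node1", "node2")]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return {"outcome": outcome, "owners": owners}


result = run_blocking_arbitration()
```

Which node wins depends on thread scheduling, but every run ends with exactly one node holding all disks of the array and the other node giving up.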
According to the method, when a communication failure occurs between the two controllers, either node sends disk locks in a non-blocking manner to the plurality of disks providing the service and, once locking succeeds, directly takes over the client service corresponding to that service; when the non-blocking attempt does not succeed, the disk locks are sent in a blocking manner. This greatly reduces the delay of client service, and performing disk arbitration in units of services achieves seamless service take-over, which effectively prevents split-brain and avoids data inconsistency at the I/O level.
With this disk lock arbitration, the disks can continue to provide service through controllers that cannot communicate with each other, according to attributes such as RAID group ownership. That is, neither of the dual-control nodes is discarded entirely, so both nodes can keep providing service externally in an orderly way after split-brain occurs, which avoids the situation where only one controller works or where the two controllers provide conflicting service. Moreover, arbitration is performed directly on the disks that provide the service, without introducing an additional third-node arbitration disk, which lowers cost while greatly improving the reliability of the dual-control system after a split-brain fault.
In a possible implementation manner of the exemplary embodiment of the present invention, before determining a disk array to be serviced, the method further includes: acquiring a service request of a client and creating a corresponding service according to the service request; and dividing a plurality of disks from the controllable disks of the double-control node in a mapping mode by taking the service as a unit, and taking the disks as a disk array corresponding to the service.
In the dual-control system, a plurality of service units may exist. Corresponding services are created on the controllable disks of the dual-control system according to the service requests of the client. The mapping is the relation between each service and the disks included in its service unit: one service request corresponds to one service, and one service corresponds to a plurality of disks, which can form a disk array and correspond to one raid.
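As an illustration of this mapping, the following Python sketch keeps a table from each service to the disks carved out for it. The names `Service`, `create_service`, `free_disks`, and the device names used with them are hypothetical and serve only to make the one-request, one-service, many-disks relation concrete.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Service:
    """One service and the disks that form its disk array (one raid)."""
    service_id: str
    disks: Tuple[str, ...]


def create_service(service_id: str, disk_count: int,
                   free_disks: List[str],
                   table: Dict[str, Service]) -> Service:
    """Create a service for a client request and map disks to it."""
    if service_id in table:
        raise ValueError(f"service {service_id} already exists")
    if disk_count > len(free_disks):
        raise ValueError("not enough controllable disks left")
    chosen = tuple(free_disks[:disk_count])
    del free_disks[:disk_count]          # a disk belongs to one service only
    table[service_id] = Service(service_id, chosen)
    return table[service_id]
```

Because each disk is removed from the free pool when it is assigned, arbitration for one service never touches the disks of another service.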
For the client, the method allows a service request to be executed on the dual-control system side as a whole, in units of services, without being split, which avoids confusion in data being issued downward or reported upward on the dual-control system side.
In a possible implementation manner of the exemplary embodiment of the present invention, when the first node locks one disk in the disk array before the second node, the method includes:
After the first node and the second node send disk locks to the disk array in a blocking mode, detecting whether the disk of the disk array has the disk locks or not respectively;
when the disk of the disk array has the disk lock of the first node, the second node releases the disk lock sent by the second node and gives up the service request corresponding to the service.
Similarly, when the second node succeeds in locking one disk of the disk array before the first node, the method includes:
after the first node and the second node send disk locks to the disk array in a blocking mode, detecting whether the disk of the disk array has the disk locks or not respectively;
when the disk of the disk array has the disk lock of the second node, the first node releases the disk lock sent by the first node and gives up the service request corresponding to the service.
When the locking thread is called in the blocking mode, whether a disk is locked can be determined through the SCSI subsystem. When one of the dual-control nodes locks successfully before the other, the other node releases the locks that its locking thread has already completed, and all disk locks included in the service are taken over by the node that locked first. This effectively blocks local IO from flushing writes to disks that are temporarily unlocked and avoids IO-level data inconsistency.
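The decision each node makes after its blocking lock request returns can be sketched from that node's own point of view. This is a simplified Python illustration under assumptions: `holders` is a hypothetical snapshot of which node holds the lock on each disk (in a real system this would come from the SCSI subsystem), and `contested_disk` is the disk on which both nodes raced.

```python
from typing import Dict, Optional


def settle_blocking_lock(me: str, peer: str, contested_disk: str,
                         holders: Dict[str, Optional[str]]) -> str:
    """Return this node's next action after a blocking lock attempt."""
    if holders.get(contested_disk) == peer:
        # The peer locked first: release every lock we already hold
        # and give up the service request for this service.
        for disk, holder in holders.items():
            if holder == me:
                holders[disk] = None
        return "give_up"
    # We hold the contested disk: go on to lock the remaining disks.
    return "lock_remaining"
```

For example, with `holders = {"d0": "ctrl1", "d1": "ctrl2"}` and `"d0"` as the contested disk, the second controller releases `"d1"` and gives up, while the first controller continues locking.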
In a possible implementation manner of the embodiment of the present invention, the method further includes:
and the first node or the second node determines the disk lock arbitration information of the dual-control node through a query command in the process of locking and taking over the client service request corresponding to the service. IO issuing of data is suspended during locking and takeover, so that the dual-control system side provides service externally only after the takeover has succeeded, and no data is issued while the takeover has not started, is in progress, or is incomplete. This effectively avoids disordered issuing of service data and further ensures data consistency. Although the service is interrupted for the client, the whole arbitration and takeover process takes little time, so there is no need to send service suspension information to the client for it to judge whether the arbitration has finished. During the arbitration and takeover stage the storage temporarily stops data transmission; the client perceives this as a brief stall, and its normal business is not affected.
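The suspension of IO during locking and takeover can be pictured with a small gate that holds writes until the takeover has finished. The sketch below is an assumption-laden illustration: the text only says that IO issuing is suspended, so queueing the held writes and replaying them on resume is a design choice of this example, and `IOGate` and `sink` are hypothetical names.

```python
from typing import Callable, List


class IOGate:
    """Holds writes while arbitration and takeover are in progress."""

    def __init__(self, sink: Callable[[bytes], None]) -> None:
        self._sink = sink                  # where released writes are sent
        self._suspended = False
        self._pending: List[bytes] = []

    def suspend(self) -> None:
        """Called when locking and takeover start."""
        self._suspended = True

    def submit(self, payload: bytes) -> bool:
        """Issue a write now, or hold it if IO is suspended."""
        if self._suspended:
            self._pending.append(payload)
            return False
        self._sink(payload)
        return True

    def resume(self) -> int:
        """Called after a successful takeover; replays held writes in order."""
        self._suspended = False
        held, self._pending = self._pending, []
        for payload in held:
            self._sink(payload)
        return len(held)
```

From the client's side a held write simply completes later, which matches the brief stall described above.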
In a feasible implementation manner of the exemplary embodiment of the present invention, in the disk lock arbitration process after the communication of the dual-control node is interrupted, the client determines the disk lock arbitration information of the dual-control node through a query command, or the disk lock arbitration information is reported to the client.
For the client, because communication between the dual-control nodes has failed, the client may still experience an occasional stall even though the disk lock arbitration mechanism is started promptly to allocate the disks corresponding to the service. The client can therefore learn the disk lock arbitration information of the dual-control node side in time through query and report operations, including whether locking succeeded, which disks are allocated to the service, the progress of the service, and so on.
Through a dynamic command entered by the user at the client side, the arbitration result and execution progress of the services provided in the dual-control system can be queried, and operations such as promptly stopping data transmission to the disks related to the service can be fed back to the client in time, which effectively avoids disordered data transmission from the client.
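One possible shape for the arbitration information returned to a client query is sketched below in Python. The field names, the `query` function, and the dictionary layout are hypothetical; the text only states that the information includes whether locking succeeded, the disks allocated to the service, and the progress.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class ArbitrationInfo:
    """Arbitration state kept for one service on the dual-control side."""
    service_id: str
    disks: Tuple[str, ...]
    owner: Optional[str] = None      # node that won arbitration, if any
    locked_count: int = 0            # disks locked so far by the owner

    def report(self) -> Dict[str, object]:
        total = len(self.disks)
        done = self.owner is not None and self.locked_count == total
        return {
            "service": self.service_id,
            "disks": list(self.disks),
            "owner": self.owner,
            "lock_succeeded": done,
            "progress": f"{self.locked_count}/{total}",
        }


def query(table: Dict[str, ArbitrationInfo], service_id: str) -> Dict[str, object]:
    """Answer a client query command for one service."""
    info = table.get(service_id)
    if info is None:
        return {"service": service_id, "error": "unknown service"}
    return info.report()
```

A client that sees `lock_succeeded` set to `False` can hold back data for the listed disks until a later query, or a report from the storage side, shows that the takeover is complete.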
FIG. 4 is a schematic structural diagram of a disk lock arbitration device for dual-control split-brain according to an embodiment of the present invention. The device may be implemented by software and/or hardware, is generally integrated in an intelligent terminal, and can carry out the disk lock arbitration method for dual-control split-brain. As shown in the figure, based on the above embodiment, the present embodiment provides a disk lock arbitration device for dual-control split-brain, applied to the case in which a communication failure occurs on the heartbeat line between the dual-control nodes, where the dual-control nodes include a first node and a second node. The device mainly includes a service acquisition module 410, a non-blocking locking module 420, a detection module 430, a first takeover module 440, a first release module 450, a blocking locking module 460, a second takeover module 470, a second discarding module 480, a third takeover module 490, and a third discarding module 4110.
The service acquisition module 410 is configured to determine a disk array to be provided with a service, where the disk array includes a plurality of disks, and the plurality of disks are determined by the service provided by the dual control node;
the non-blocking locking module 420 is configured to send, by the dual control node, a disk lock to the disk array in a non-blocking manner;
the detection module 430 is configured to detect whether each disk included in the disk array is successfully locked by the first node;
a first takeover module 440, configured to take over, by a first node, the service when each disk of the disk array is successfully locked;
the first releasing module 450 is configured to release, when at least one disk of the disk array is not successfully locked, the disk lock by the dual control node; when all the disks of the disk array are not successfully locked, the second node takes over the service;
the blocking locking module 460 is configured to send a disk lock to the disk array in a blocking manner by using the dual control node;
the second take-over module 470 is configured to, when the first node succeeds in locking one disk in the disk array before the second node, sequentially send, by the first node, a disk lock to other disks in the disk array until all the disks included in the disk array are successfully locked, where the first node takes over a client service request corresponding to the service;
The second discard module 480 is configured to give up, by the second node, the service request corresponding to the service when the second takeover module takes over successfully;
the third takeover module 490 is configured to, when the second node succeeds in locking one disk in the disk array before the first node, sequentially send, by the second node, a disk lock to other disks in the disk array until all the disks included in the disk array are successfully locked, where the second node takes over a client service request corresponding to the service;
the third discard module 4110 is configured to give up, by the first node, the service request corresponding to the service when the third takeover module takes over successfully.
The device further comprises:
the blocking detection module is used for respectively detecting whether the disk of the disk array has a disk lock or not after the first node and the second node send the disk lock to the disk array in a blocking mode;
the second releasing module is used for releasing the disk lock sent by the first node and giving up the service request corresponding to the service when the disk lock of the second node exists on the disk of the disk array;
And the third releasing module is used for releasing the disk lock sent by the second node and giving up the service request corresponding to the service when the disk lock of the first node exists on the disk of the disk array.
The device further comprises:
the service creation module is used for acquiring a service request of the client and creating a corresponding service according to the service request;
and the mapping module is used for dividing a plurality of disks from the controllable disks of the double-control node in a mapping mode by taking the service of the service creation module as a unit, and the disks are used as disk arrays corresponding to the service.
In a possible implementation manner of the exemplary embodiment of the present invention, the apparatus further includes:
and the arbitration judging module is used for determining disk lock arbitration information of the double-control node through the query command in the process of locking the first node or the second node and taking over the service request of the client corresponding to the service.
The above modules may be modules included in the first node or the second node, or may be modules independent of the first node and the second node, for example related modules disposed in a cluster system. The disk lock arbitration device for dual-control split-brain provided in the above embodiment can execute the disk lock arbitration method for dual-control split-brain provided in any embodiment of the present invention, and has the functional modules and beneficial effects corresponding to that method. For technical details not described in the above embodiment, refer to the disk lock arbitration method for dual-control split-brain provided in any embodiment of the present invention.
It is understood that the method may be performed by any apparatus, device, platform, cluster of devices having computing, processing capabilities.
The technical carriers involved in payment in the embodiments of the present disclosure may include, for example, near field communication (Near Field Communication, NFC), WIFI, 3G/4G/5G, POS machine card swiping technology, two-dimensional code scanning technology, bar code scanning technology, bluetooth, infrared, short message (Short Message Service, SMS), multimedia message (Multimedia Message Service, MMS), and the like.
It should be noted that the methods of one or more embodiments of the present description may be performed by a single device, such as a computer or server. The method of the embodiment can also be applied to a distributed scene, and is completed by mutually matching a plurality of devices. In the case of such a distributed scenario, one device of the plurality of devices may perform only one or more steps of the method of one or more embodiments of the present disclosure, where the plurality of devices interact with each other to complete the dual-control split disk lock arbitration method.
The foregoing describes specific embodiments of the present disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
For convenience of description, the above devices are described as being functionally divided into various modules, respectively. Of course, the functions of each module may be implemented in one or more pieces of software and/or hardware when implementing one or more embodiments of the present description.
The device of the foregoing embodiment is configured to implement the corresponding method in the foregoing embodiment, and has the beneficial effects of the corresponding method embodiment, which is not described herein.
Fig. 5 shows a more specific hardware architecture of an electronic device according to this embodiment, where the device may include: a processor 1010, a memory 1020, an input/output interface 1030, a communication interface 1040, and a bus 1050. Wherein processor 1010, memory 1020, input/output interface 1030, and communication interface 1040 implement communication connections therebetween within the device via a bus 1050.
The processor 1010 may be implemented by a general-purpose CPU (Central Processing Unit), a microprocessor, an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), or one or more integrated circuits, etc., for executing relevant programs to implement the technical solutions provided in the embodiments of the present disclosure.
The memory 1020 may be implemented in the form of ROM (Read Only Memory), RAM (Random Access Memory), a static storage device, a dynamic storage device, or the like. Memory 1020 may store an operating system and other application programs, and when the embodiments of the present specification are implemented in software or firmware, the associated program code is stored in memory 1020 and executed by processor 1010.
The input/output interface 1030 is used to connect with an input/output module for inputting and outputting information. The input/output module may be configured as a component in a device (not shown) or may be external to the device to provide corresponding functionality. Wherein the input devices may include a keyboard, mouse, touch screen, microphone, various types of sensors, etc., and the output devices may include a display, speaker, vibrator, indicator lights, etc.
Communication interface 1040 is used to connect communication modules (not shown) to enable communication interactions of the present device with other devices. The communication module may implement communication through a wired manner (such as USB, network cable, etc.), or may implement communication through a wireless manner (such as mobile network, WIFI, bluetooth, etc.).
Bus 1050 includes a path for transferring information between components of the device (e.g., processor 1010, memory 1020, input/output interface 1030, and communication interface 1040).
It should be noted that although the above-described device only shows processor 1010, memory 1020, input/output interface 1030, communication interface 1040, and bus 1050, in an implementation, the device may include other components necessary to achieve proper operation. Furthermore, it will be understood by those skilled in the art that the above-described apparatus may include only the components necessary to implement the embodiments of the present description, and not all the components shown in the drawings.
The computer readable media of the present embodiments, including both permanent and non-permanent, removable and non-removable media, may be used to implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of storage media for a computer include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read only memory (ROM), electrically erasable programmable read only memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium, which can be used to store information that can be accessed by a computing device.
Those of ordinary skill in the art will appreciate that: the discussion of any of the embodiments above is merely exemplary and is not intended to suggest that the scope of the disclosure, including the claims, is limited to these examples; combinations of features of the above embodiments or in different embodiments are also possible within the spirit of the present disclosure, steps may be implemented in any order, and there are many other variations of the different aspects of one or more embodiments described above which are not provided in detail for the sake of brevity.
Additionally, well-known power/ground connections to Integrated Circuit (IC) chips and other components may or may not be shown within the provided figures, in order to simplify the illustration and discussion, and so as not to obscure one or more embodiments of the present description. Furthermore, the apparatus may be shown in block diagram form in order to avoid obscuring the one or more embodiments of the present description, and also in view of the fact that specifics with respect to implementation of such block diagram apparatus are highly dependent upon the platform within which the one or more embodiments of the present description are to be implemented (i.e., such specifics should be well within purview of one skilled in the art). Where specific details (e.g., circuits) are set forth in order to describe example embodiments of the disclosure, it should be apparent to one skilled in the art that one or more embodiments of the disclosure can be practiced without, or with variation of, these specific details. Accordingly, the description is to be regarded as illustrative in nature and not as restrictive.
While the present disclosure has been described in conjunction with specific embodiments thereof, many alternatives, modifications, and variations of those embodiments will be apparent to those skilled in the art in light of the foregoing description. For example, other memory architectures (e.g., dynamic RAM (DRAM)) may use the embodiments discussed.
The present disclosure is intended to embrace all such alternatives, modifications and variances which fall within the broad scope of the appended claims. Any omissions, modifications, equivalents, improvements, and the like, which are within the spirit and principles of the one or more embodiments of the disclosure, are therefore intended to be included within the scope of the disclosure.

Claims (10)

1. A disk lock arbitration method for dual-control split-brain, applied after a communication failure occurs on the heartbeat line between dual-control nodes, the dual-control nodes comprising a first node and a second node, characterized in that the method comprises:
determining a disk array to be provided with services, and transmitting a disk lock to the disk array in a non-blocking mode by a double-control node, wherein the disk array comprises a plurality of disks, and the plurality of disks are determined by the services provided by the double-control node;
Detecting whether each disk included in the disk array is successfully locked by a first node;
when each disk of the disk array is successfully locked, taking over the service by a first node;
when at least one disk of the disk array is not successfully locked, the dual-control node releases a disk lock; when all the disks of the disk array are not successfully locked, the second node takes over the service;
the double control node sends a disk lock to the disk array in a blocking mode;
when the first node locks one disk in the disk array successfully before the second node, the first node sequentially sends the disk locks to other disks in the disk array until all the disks included in the disk array are successfully locked, at this time, the first node takes over the client service request corresponding to the service, and the second node gives up the service request corresponding to the service;
when the second node locks one disk in the disk array successfully before the first node, the second node sequentially sends the disk locks to other disks in the disk array until all the disks included in the disk array are successfully locked, at this time, the second node takes over the client service request corresponding to the service, and the first node gives up the service request corresponding to the service.
2. The method of claim 1, wherein when the second node successfully locks one of the disks of the disk array prior to the first node, comprising:
after the first node and the second node send disk locks to the disk array in a blocking mode, detecting whether the disk of the disk array has the disk locks or not respectively;
when the disk of the disk array has the disk lock of the second node, the first node releases the disk lock sent by the first node and gives up the service request corresponding to the service.
3. The method of claim 1, wherein when the first node successfully locks one of the disks of the disk array prior to the second node, comprising:
after the first node and the second node send disk locks to the disk array in a blocking mode, detecting whether the disk of the disk array has the disk locks or not respectively;
when the disk of the disk array has the disk lock of the first node, the second node releases the disk lock sent by the second node and gives up the service request corresponding to the service.
4. The method of claim 1, wherein prior to determining the disk array to be serviced, the method further comprises:
Acquiring a service request of a client and creating a corresponding service according to the service request;
and dividing a plurality of disks from the controllable disks of the double-control node in a mapping mode by taking the service as a unit, and taking the disks as a disk array corresponding to the service.
5. The method according to claim 1, wherein the method further comprises:
the first node or the second node determines disk lock arbitration information of the double-control node through a query command in the process of locking and taking over the service request of the client corresponding to the service; or reporting the disk lock arbitration information to the client.
6. A disk lock arbitration device for dual-control split-brain, applied after a communication failure occurs on the heartbeat line between dual-control nodes, the dual-control nodes comprising a first node and a second node, characterized in that the device comprises:
the service acquisition module is used for determining a disk array to be provided with a service, wherein the disk array comprises a plurality of disks, and the plurality of disks are determined by the service provided by the double-control node;
the non-blocking locking module is used for sending a disk lock to the disk array in a non-blocking mode by the double control node;
the detection module is used for detecting whether each disk included in the disk array is successfully locked by the first node;
The first take-over module is used for taking over the service by the first node when each disk of the disk array is successfully locked;
the first releasing module is used for releasing the disk lock by the double-control node when at least one disk of the disk array is not successfully locked; when all the disks of the disk array are not successfully locked, the second node takes over the service;
the blocking locking module is used for sending a disk lock to the disk array in a blocking mode by the double control node;
the second take-over module is used for sequentially sending disk locks to other disks in the disk array by the first node when the first node locks one disk in the disk array before the second node, until all disks in the disk array are successfully locked, and at the moment, the first node takes over the client service request corresponding to the service;
the second giving up module is used for giving up the service request corresponding to the service by the second node when the second taking over module takes over successfully;
the third take-over module is used for sequentially sending disk locks to other disks in the disk array by the second node when the second node locks one disk in the disk array before the first node, until all disks in the disk array are successfully locked, and at the moment, the second node takes over the client service request corresponding to the service;
And the third discarding module is used for discarding the service request corresponding to the service by the first node when the third taking over module takes over successfully.
7. The apparatus of claim 6, wherein the apparatus further comprises:
the blocking detection module is used for respectively detecting whether the disk of the disk array has a disk lock or not after the first node and the second node send the disk lock to the disk array in a blocking mode;
and the second releasing module is used for releasing the disk lock sent by the first node and giving up the service request corresponding to the service when the disk lock of the second node exists on the disk of the disk array.
8. The apparatus of claim 6, wherein the apparatus further comprises:
the blocking detection module is used for respectively detecting whether the disk of the disk array has a disk lock or not after the first node and the second node send the disk lock to the disk array in a blocking mode;
and the third releasing module is used for releasing the disk lock sent by the second node and giving up the service request corresponding to the service when the disk lock of the first node exists on the disk of the disk array.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the method of disk lock arbitration for dual control brain splits as claimed in any one of claims 1 to 5 when executing the program.
10. A non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the method of disk lock arbitration for dual control brain splits according to any one of claims 1 to 5.
CN202010429475.8A 2020-05-20 2020-05-20 Disk lock arbitration method, device, equipment and medium for double-control brain fracture Active CN111737063B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010429475.8A CN111737063B (en) 2020-05-20 2020-05-20 Disk lock arbitration method, device, equipment and medium for double-control brain fracture

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010429475.8A CN111737063B (en) 2020-05-20 2020-05-20 Disk lock arbitration method, device, equipment and medium for double-control brain fracture

Publications (2)

Publication Number Publication Date
CN111737063A CN111737063A (en) 2020-10-02
CN111737063B true CN111737063B (en) 2023-11-14

Family

ID=72647430

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010429475.8A Active CN111737063B (en) 2020-05-20 2020-05-20 Disk lock arbitration method, device, equipment and medium for double-control brain fracture

Country Status (1)

Country Link
CN (1) CN111737063B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115811461B (en) * 2023-02-08 2023-04-28 湖南国科亿存信息科技有限公司 SAN shared storage cluster brain crack prevention processing method and device and electronic equipment
CN116737634A (en) * 2023-07-12 2023-09-12 北京鲸鲨软件科技有限公司 Arbitration-based rapid cerebral cleavage processing method and device in DRBD double-master mode

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2005275813A (en) * 2004-03-24 2005-10-06 Canon Inc Disk array system, its control method, program and storage medium
CN103209095A (en) * 2013-03-13 2013-07-17 广东新支点技术服务有限公司 Method and device for preventing split brain on basis of disk service lock
CN105095125A (en) * 2015-07-08 2015-11-25 北京飞杰信息技术有限公司 Highly available double-control storage system and operation method thereof based on quorum disc
CN106648909A (en) * 2016-10-13 2017-05-10 华为技术有限公司 Management method and device for dish lock and system
CN110096231A (en) * 2019-04-25 2019-08-06 新华三云计算技术有限公司 The processing method and processing device of disk lock

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2005275813A (en) * 2004-03-24 2005-10-06 Canon Inc Disk array system, its control method, program and storage medium
CN103209095A (en) * 2013-03-13 2013-07-17 广东新支点技术服务有限公司 Method and device for preventing split brain on basis of disk service lock
CN105095125A (en) * 2015-07-08 2015-11-25 北京飞杰信息技术有限公司 Highly available double-control storage system and operation method thereof based on quorum disc
CN106648909A (en) * 2016-10-13 2017-05-10 华为技术有限公司 Management method and device for dish lock and system
EP3470984A1 (en) * 2016-10-13 2019-04-17 Huawei Technologies Co., Ltd. Method, device, and system for managing disk lock
CN110096231A (en) * 2019-04-25 2019-08-06 新华三云计算技术有限公司 The processing method and processing device of disk lock

Also Published As

Publication number Publication date
CN111737063A (en) 2020-10-02

Similar Documents

Publication Publication Date Title
CN106843749B (en) Write request processing method, device and equipment
CN108829350B (en) Data migration method and device based on block chain
US11163479B2 (en) Replicated state cluster with standby node state assessment during leadership transition
US9201742B2 (en) Method and system of self-managing nodes of a distributed database cluster with a consensus algorithm
US7631066B1 (en) System and method for preventing data corruption in computer system clusters
US10372384B2 (en) Method and system for managing storage system using first and second communication areas
US10884623B2 (en) Method and apparatus for upgrading a distributed storage system
WO2016150066A1 (en) Master node election method and apparatus, and storage system
US10127124B1 (en) Performing fencing operations in multi-node distributed storage systems
CN111737063B (en) Disk lock arbitration method, device, equipment and medium for double-control brain fracture
US9798639B2 (en) Failover system and method replicating client message to backup server from primary server
CN109684048B (en) Method and device for processing transaction in transaction submitting system
CN105511987A (en) Distributed task management system with high consistency and availability
CN104854845B (en) Use the method and apparatus of efficient atomic operation
CN108418859B (en) Method and device for writing data
JPH04271453A (en) Composite electronic computer
CN106170013B (en) A kind of Kafka message uniqueness method based on Redis
US20100306432A1 (en) Computer-implemented multi-resource shared lock
CN105205160A (en) Data write-in method and device
JP2021168123A (en) Systems and method for distributed read/write locking with network key values for storage devices
CN100440191C (en) Method and system for processing complexes to access shared devices
CN115098528B (en) Service processing method, device, electronic equipment and computer readable storage medium
US10970177B2 (en) Methods and systems of managing consistency and availability tradeoffs in a real-time operational DBMS
CN112596801A (en) Transaction processing method, device, equipment, storage medium and database
CN110413686B (en) Data writing method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant