CN113703669A - Management method, system, equipment and storage medium for cache partition - Google Patents

Info

Publication number
CN113703669A
Authority
CN
China
Prior art keywords
event notification
node
cache partition
service end
task corresponding
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110807418.3A
Other languages
Chinese (zh)
Other versions
CN113703669B (en)
Inventor
侯红生
刘文志
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN202110807418.3A
Publication of CN113703669A
Application granted
Publication of CN113703669B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0614 Improving the reliability of storage systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/0644 Management of space entities, e.g. partitions, extents, pools
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0655 Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
    • G06F3/0656 Data buffering arrangements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System Of A Hierarchy Structure (AREA)
  • Hardware Redundancy (AREA)

Abstract

The application discloses a method for managing a cache partition, comprising: receiving an event notification sent by a cluster; after receiving information fed back by the service end indicating that the task corresponding to the previous event notification has been executed, updating node information of the cache partition according to the previous event notification, and sending the task corresponding to the current event notification to the service end; and after receiving information fed back by the service end indicating that the task corresponding to the current event notification has been executed, updating the node information of the cache partition according to the current event notification. Applying this scheme avoids abnormal node information for the cache partition. The application also discloses a management system, a device, and a storage medium for a cache partition, which have corresponding technical effects.

Description

Management method, system, equipment and storage medium for cache partition
Technical Field
The present invention relates to the field of storage technologies, and in particular, to a method, a system, a device, and a storage medium for managing a cache partition.
Background
As storage requirements grow, high-performance clusters composed of multiple nodes are used more and more widely.
In practice, particularly in single partition mode, the node information of a cache partition sometimes becomes abnormal. For example, a cache partition that should exist on only one node may exist on both nodes of an IO group; when the cache partition is deleted, only the copy on one node is removed while the copy on the other node remains, which causes problems in service configuration. This situation mostly occurs during recovery from a T2 failure of the cluster. A T2 failure means that all nodes in an IO group exit the cluster at the same time because of a failure, and recovery from a T2 failure means that all nodes in the IO group rejoin the cluster at the same time.
In summary, how to effectively avoid abnormal node information for the cache partition is a technical problem that those skilled in the art need to solve.
Disclosure of Invention
The invention aims to provide a management method, a management system, a management device, and a storage medium for a cache partition, so as to effectively avoid abnormal node information for the cache partition.
In order to solve the technical problems, the invention provides the following technical scheme:
a management method of a cache partition comprises the following steps:
receiving an event notification sent by a cluster;
after receiving information fed back by the service end indicating that the task corresponding to the previous event notification has been executed, updating node information of the cache partition according to the previous event notification, and sending the task corresponding to the current event notification to the service end;
and after receiving information fed back by the service end indicating that the task corresponding to the current event notification has been executed, updating the node information of the cache partition according to the current event notification.
Preferably, the receiving the event notification sent by the cluster includes:
receiving an event notification sent by a cluster and putting the event notification into a preset notification queue;
correspondingly, after the node information of the cache partition is updated according to the current event notification, the method further includes:
deleting the current event notification from the notification queue.
Preferably, when the received event notification is an event notification indicating a failure of the first node, the sending of the task corresponding to the current event notification to the service end includes:
sending the task corresponding to the current event notification to the service end so that a surviving node takes over the cache partition of the first node;
when the received event notification is an event notification indicating failure recovery of the first node, the sending of the task corresponding to the current event notification to the service end includes:
sending the task corresponding to the current event notification to the service end so that the taken-over cache partition of the first node is restored to the first node.
Preferably, the cache partition of the cluster is in a single partition mode.
Preferably, the method further comprises the following steps:
recording information when an event notification indicating failure recovery of the first node is received within a first time period after an event notification indicating a failure of the first node is received.
A management system for a cache partition, comprising:
an event notification receiving unit, configured to receive an event notification sent by a cluster;
an execution unit, configured to update node information of the cache partition according to the previous event notification after receiving information fed back by the service end indicating that the task corresponding to the previous event notification has been executed, and to send the task corresponding to the current event notification to the service end; and, after receiving information fed back by the service end indicating that the task corresponding to the current event notification has been executed, to update the node information of the cache partition according to the current event notification.
Preferably, the event notification receiving unit is specifically configured to:
receiving an event notification sent by a cluster and putting the event notification into a preset notification queue;
correspondingly, the system further comprises:
a queue updating unit, configured to delete the current event notification from the notification queue after the execution unit updates the node information of the cache partition according to the current event notification.
Preferably, when the received event notification is an event notification indicating a failure of the first node, the execution unit sending the task corresponding to the current event notification to the service end includes:
sending the task corresponding to the current event notification to the service end so that a surviving node takes over the cache partition of the first node;
when the received event notification is an event notification indicating failure recovery of the first node, the execution unit sending the task corresponding to the current event notification to the service end includes:
sending the task corresponding to the current event notification to the service end so that the taken-over cache partition of the first node is restored to the first node.
A management device of a cache partition, comprising:
a memory for storing a computer program;
a processor for executing said computer program to implement the steps of the method for managing cache partitions of any of the above.
A computer-readable storage medium having stored thereon a computer program which, when executed by a processor, carries out the steps of the method for managing a cache partition of any of the above.
The applicant has observed that, in the conventional recovery flow for a T2 failure, the cache partition module restores its own service configuration immediately upon each event notification sent by the cluster. During recovery from a T2 failure, a node sometimes fails again, for example because of a warm restart, and then recovers and rejoins the cluster within a short time. In this special scenario, when the cache partition is in single partition mode, the cache partition module starts a takeover of the failed node's cache partition according to the current node information. While the takeover is still in progress, the failed node recovers and rejoins the cluster; because the interval is short, the cluster notifies the cache partition module that the node has joined, and the cache partition node information is updated. The cache partition module then creates a cache partition on the recovered node according to the new node information, but the earlier failure takeover has not been executed correctly, so the cache partition may end up being created on both nodes.
According to the scheme of the application, receiving an event notification from the cluster does not immediately trigger an update of the cache partition's node information or the execution of a new task. Instead, only once the service end feeds back that the task corresponding to the previous event notification has been executed is the node information of the cache partition updated according to the previous event notification, after which the task corresponding to the current event notification is sent to the service end. Likewise, only once the service end feeds back that the task corresponding to the current event notification has been executed is the node information updated according to the current event notification. In other words, the node information of the cache partition is updated each time only after the task corresponding to an event notification has finished, rather than when the notification arrives, so the abnormal node information seen in the conventional scheme does not occur.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
FIG. 1 is a flowchart illustrating an embodiment of a method for managing a cache partition according to the present invention;
FIG. 2 is a schematic structural diagram of a management system of a cache partition according to the present invention.
Detailed Description
The core of the invention is to provide a management method of the cache partition, which can avoid the condition that the node information of the cache partition is abnormal.
In order that those skilled in the art will better understand the disclosure, the invention will be described in further detail with reference to the accompanying drawings and specific embodiments. It is to be understood that the described embodiments are merely exemplary of the invention, and not restrictive of the full scope of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, fig. 1 is a flowchart illustrating an implementation of a method for managing a cache partition according to the present invention, where the method for managing the cache partition may include the following steps:
Step S101: receiving an event notification sent by the cluster.
Specifically, the cluster sends information such as node joining and node exiting in the form of event notifications, and the cache partition module receives the event notifications sent by the cluster. Because SSDs (Solid State Disks) are widely used, the cache partition of the present application may specifically be an SSD cache partition.
Step S102: after receiving information fed back by the service end indicating that the task corresponding to the previous event notification has been executed, updating the node information of the cache partition according to the previous event notification, and sending the task corresponding to the current event notification to the service end.
After receiving an event notification sent by the cluster, the present application does not immediately update the node information according to that notification. The reason is as follows. In the conventional recovery flow for a T2 failure, the cache partition module restores its own service configuration immediately upon each event notification sent by the cluster. During recovery from a T2 failure, a node sometimes fails again, for example because of a warm restart, and then recovers and rejoins the cluster within a short time. In this special scenario, when the cache partition is in single partition mode, the cache partition module starts a takeover of the failed node's cache partition according to the current node information. While the takeover is still in progress, the failed node recovers and rejoins the cluster; because the interval is short, the cluster notifies the cache partition module that the node has joined, and the cache partition node information is updated. The cache partition module then creates a cache partition on the recovered node according to the new node information, but the earlier failure takeover has not been executed correctly, so the cache partition may end up being created on both nodes.
Moreover, the above analysis shows that the cache partition becomes abnormal only when a node fails and returns to normal within a short time, so that the previous failure takeover has not yet been executed correctly when the node information is already updated by a newly sent event notification. Abnormal node information occurs mainly in the recovery flow for a T2 failure because, in that flow, nodes frequently exit the cluster due to a warm restart or a similar cause and rejoin within a short time, whereas in other situations a node rarely exits and recovers so quickly. Therefore, the scheme of the application avoids abnormal node information of the cache partition not only in the recovery flow for a T2 failure, but also whenever a node fails for any other reason and returns to normal within a short time.
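The interleaving analysed above can be replayed with a toy model. The Python sketch below is not taken from the patent: the class name `ConventionalModule`, the node names, and the rule that a takeover retires the failed node's copy only while the node is still recorded as offline are all assumptions, chosen only to make the race visible.

```python
class ConventionalModule:
    """Toy model (hypothetical) of the conventional flow: node information
    is updated as soon as an event notification arrives."""

    def __init__(self):
        self.online = {"node1", "node2"}   # node information
        self.partition_on = {"node1"}      # single partition mode: one owner
        self.in_flight = None              # failed node whose takeover is running

    def on_event(self, kind, node):
        if kind == "fail":
            self.online.discard(node)      # immediate update
            self.in_flight = node          # takeover starts but has not finished
        elif kind == "recover":
            self.online.add(node)          # immediate update, takeover still running
            self.partition_on.add(node)    # partition (re)created on the recovered node

    def finish_takeover(self):
        failed = self.in_flight
        survivor = sorted(self.online - {failed})[0]
        self.partition_on.add(survivor)    # survivor takes over the partition
        # Assumed rule: the failed node's copy is retired only if the node
        # information still says the node is offline.
        if failed not in self.online:
            self.partition_on.discard(failed)
        self.in_flight = None


m = ConventionalModule()
m.on_event("fail", "node1")
m.on_event("recover", "node1")   # arrives before the takeover has finished
m.finish_takeover()
print(sorted(m.partition_on))    # ['node1', 'node2']
```

If `finish_takeover()` is called before the recovery notification arrives, the same model leaves the partition on `node2` only, which is the outcome that the serialized scheme of the application enforces.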
After receiving an event notification sent by the cluster, if the service end feeds back that the task corresponding to the previous event notification has been executed, that task is complete, and the node information of the cache partition is therefore updated according to the previous event notification. It follows that each event notification must be retained from the time it is received at least until its corresponding task has been executed correctly, after which the node information of the cache partition is updated according to it. Event notifications can be retained in various ways, as long as this purpose is achieved. For example, in the embodiment described later, the event notifications are placed in a queue; alternatively, in some cases, each event notification is stored in a preset storage space that is cleaned up once it is full.
After the node information of the cache partition is updated according to the previous event notification, the task corresponding to the current event notification is sent to the service end. The service end, commonly referred to as the agent, executes the tasks.
Step S103: after receiving information fed back by the service end indicating that the task corresponding to the current event notification has been executed, updating the node information of the cache partition according to the current event notification.
After the task corresponding to the current event notification is sent to the service end, the service end executes the task. Regardless of whether the cache partition module receives a new event notification during execution, it must wait for the information fed back by the service end indicating that the task corresponding to the current event notification has been executed, that is, that the task has been executed correctly, before it updates the node information of the cache partition according to the current event notification.
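The ordering imposed by steps S101 to S103 can be reduced to a short synchronous sketch. This is only an illustration under stated assumptions: the function name `process_in_order`, the `(kind, node)` tuple format, and a `run_task` callback that returns only once the service end has reported completion are all hypothetical and do not come from the patent.

```python
def process_in_order(events, run_task, node_info):
    """Apply each event notification to node_info only after its task is done.

    events    -- iterable of (kind, node) tuples, kind is "fail" or "recover"
    run_task  -- hypothetical callback; assumed to block until the service
                 end feeds back that the task has been executed
    node_info -- set of nodes currently recorded as members (mutated in place)
    """
    for kind, node in events:
        run_task(kind, node)            # task first, feedback awaited
        if kind == "fail":
            node_info.discard(node)     # update only after the task finished
        else:
            node_info.add(node)
    return node_info


# Usage: record what node_info looked like when each task was sent.
info = {"node1", "node2"}
trace = []
process_in_order(
    [("fail", "node1"), ("recover", "node1")],
    lambda kind, node: trace.append((kind, node, frozenset(info))),
    info,
)
```

In this trace the recovery task is sent only after the node information already reflects the completed takeover, which is the property that the conventional flow lacks.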
In a specific embodiment of the present invention, step S101 may specifically include:
receiving an event notification sent by a cluster and putting the event notification into a preset notification queue;
correspondingly, after the node information of the cache partition is updated according to the current event notification in step S103, the method may further include:
deleting the current event notification from the notification queue.
As described above, after receiving an event notification sent by the cluster, the present application does not immediately update the node information of the cache partition according to it; the update happens only after the task corresponding to the event notification has been executed. The event notification must therefore be retained at least until its task has been executed correctly, and during that time new event notifications may continue to arrive at the cache partition module. For this reason, this embodiment simply places each received event notification into a preset notification queue. A newly received event notification is placed at the tail of the queue, and the head of the queue is the event notification whose task is currently being executed.
In this embodiment, because the preset notification queue keeps the event notifications in order, the current event notification no longer needs to be retained once the node information of the cache partition has been updated according to it, and it can therefore be deleted from the notification queue.
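The queue discipline of this embodiment (new notifications at the tail, the head being the notification whose task is executing, deletion after the node information update) might look like the following Python sketch. The class `NotificationQueue` and its callbacks `send_task` and `apply_to_node_info` are hypothetical names introduced for illustration; the patent does not specify an interface.

```python
from collections import deque


class NotificationQueue:
    """Head of the queue = event notification whose task is being executed."""

    def __init__(self, send_task, apply_to_node_info):
        self._queue = deque()
        self._send_task = send_task      # hypothetical: hands a task to the service end
        self._apply = apply_to_node_info # hypothetical: updates cache partition node info

    def on_event(self, event):
        """Step S101: enqueue at the tail; start at once only if idle."""
        self._queue.append(event)
        if len(self._queue) == 1:
            self._send_task(event)

    def on_task_done(self):
        """Feedback from the service end for the task at the head of the queue."""
        if not self._queue:
            return
        self._apply(self._queue[0])      # update node info only now
        self._queue.popleft()            # then delete the notification
        if self._queue:
            self._send_task(self._queue[0])  # next notification becomes current

    def __len__(self):
        return len(self._queue)
```

A notification that arrives while a task is in flight simply waits in the queue, so a fast fail-then-recover sequence is processed strictly one task at a time.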
In a specific embodiment of the present invention, when the received event notification is an event notification indicating a failure of the first node, the sending of the task corresponding to the current event notification to the service end described in step S102 includes:
sending the task corresponding to the current event notification to the service end so that a surviving node takes over the cache partition of the first node;
when the received event notification is an event notification indicating failure recovery of the first node, the sending of the task corresponding to the current event notification to the service end described in step S102 includes:
sending the task corresponding to the current event notification to the service end so that the taken-over cache partition of the first node is restored to the first node.
In this embodiment, when the received event notification is an event notification indicating that the first node has failed, a task corresponding to the event notification may be sent to the service end, so as to take over the cache partition of the first node by using the surviving node. The first node may have 1 or more cache partitions, and may implement takeover of these cache partitions by 1 or more surviving nodes, and a specific takeover rule may be set according to an actual need, which is not described in this application.
When the received event notification indicates failure recovery of the first node, the first node has returned to normal, and a task corresponding to the event notification can be sent to the service end so that the taken-over cache partition of the first node is restored to the first node. The first node may be any node in the cluster.
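As a rough illustration of how an event notification could be mapped to the task sent to the service end, consider the sketch below. The enum `EventKind`, the function `build_task`, the dictionary layout, and in particular the rule of picking the first surviving node in sorted order are assumptions; the source leaves the takeover rule to be set according to actual need.

```python
from enum import Enum


class EventKind(Enum):
    NODE_FAILED = "node_failed"
    NODE_RECOVERED = "node_recovered"


def build_task(kind, first_node, cluster_nodes):
    """Return a task description for the service end (hypothetical format)."""
    if kind is EventKind.NODE_FAILED:
        survivors = sorted(n for n in cluster_nodes if n != first_node)
        if not survivors:
            raise ValueError("no surviving node available for takeover")
        # Assumed rule: the first survivor in sorted order takes over.
        return {"action": "takeover", "partition_owner": first_node,
                "target": survivors[0]}
    if kind is EventKind.NODE_RECOVERED:
        # Restore the taken-over cache partition to the recovered node.
        return {"action": "restore", "partition_owner": first_node,
                "target": first_node}
    raise ValueError(f"unsupported event kind: {kind!r}")
```

Keeping the mapping in one function means the takeover rule can be replaced without touching the code that serializes the notifications.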
As described above, in the conventional scheme the node information of the cache partition becomes abnormal mainly in single partition mode, so the cache partition of the cluster in the present application may be in single partition mode. It should be noted, however, that although abnormal node information is unlikely in other partition modes, the scheme of the present application can still be adopted in those modes without affecting its implementation.
In an embodiment of the present invention, the method may further include:
when an event notification indicating failure recovery of the first node is received within a first time period after the event notification indicating failure of the first node is received, information is recorded.
If an event notification indicating failure recovery of the first node is received within the first time period after an event notification indicating a failure of the first node is received, this embodiment records information such as the number of the first node and the date on which the abnormal condition occurred. This helps staff count and handle such special cases, that is, it makes operation and maintenance more convenient.
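The recording condition of this embodiment can be sketched as follows. The class `FlapRecorder`, the 30-second default for the first time period, the injectable `clock`, and the record fields are illustrative assumptions; the patent specifies neither a duration nor a record format.

```python
import time


class FlapRecorder:
    """Records nodes that recover within `first_period_s` seconds of failing."""

    def __init__(self, first_period_s=30.0, clock=time.monotonic):
        self._period = first_period_s   # assumed value for the first time period
        self._clock = clock             # injectable so the logic can be tested
        self._failed_at = {}            # node -> time of the failure notification
        self.records = []               # information kept for operations staff

    def on_event(self, kind, node):
        now = self._clock()
        if kind == "fail":
            self._failed_at[node] = now
        elif kind == "recover":
            failed_at = self._failed_at.pop(node, None)
            if failed_at is not None and now - failed_at <= self._period:
                self.records.append(
                    {"node": node, "failed_at": failed_at, "recovered_at": now}
                )
```

Recording is independent of the serialization logic, so it can observe the same notifications without delaying them.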
By applying the technical scheme provided by the embodiment of the invention, receiving an event notification from the cluster does not immediately trigger an update of the cache partition's node information or the execution of a new task. Instead, only once the service end feeds back that the task corresponding to the previous event notification has been executed is the node information of the cache partition updated according to the previous event notification, after which the task corresponding to the current event notification is sent to the service end. Likewise, only once the service end feeds back that the task corresponding to the current event notification has been executed is the node information updated according to the current event notification. In other words, the node information of the cache partition is updated each time only after the task corresponding to an event notification has finished, rather than when the notification arrives, so the abnormal node information seen in the conventional scheme does not occur.
Corresponding to the above method embodiments, the embodiments of the present invention further provide a management system for cache partitions, which can be referred to in correspondence with the above.
Referring to fig. 2, a schematic structural diagram of a management system of a cache partition in the present invention is shown, including:
an event notification receiving unit 201, configured to receive an event notification sent by a cluster;
an execution unit 202, configured to update node information of a cache partition according to a previous event notification after receiving information indicating that a task corresponding to the previous event notification is executed and fed back by a service end, and send a task corresponding to the current event notification to the service end; and after receiving information which is fed back by the service end and represents that the task corresponding to the current event notification is executed, updating the node information of the cache partition according to the current event notification.
In an embodiment of the present invention, the event notification receiving unit 201 is specifically configured to:
receiving an event notification sent by a cluster and putting the event notification into a preset notification queue;
correspondingly, the method also comprises the following steps:
further comprising: and the queue updating unit is used for deleting the current event notification in the notification queue after the execution unit updates the node information of the cache partition according to the current event notification.
In an embodiment of the present invention, when the received event notification indicates a fault of the first node, the execution unit 202 sending the task corresponding to the current event notification to the service end includes:
sending the task corresponding to the current event notification to the service end, so that a surviving node takes over the cache partition of the first node;
when the received event notification indicates fault recovery of the first node, the execution unit 202 sending the task corresponding to the current event notification to the service end includes:
sending the task corresponding to the current event notification to the service end, so that the taken-over cache partition of the first node is restored to the first node.
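The takeover and restore tasks can be illustrated with a small ownership table. The sketch below is hypothetical: the functions `takeover` and `restore`, the `home` and `owner` dictionaries, and the round-robin choice of surviving node are assumptions made for illustration; the embodiments do not specify how a surviving node is selected.

```python
def takeover(owner, failed_node, survivors):
    """Reassign every partition owned by failed_node to a surviving node."""
    if not survivors:
        raise ValueError("no surviving node available to take over")
    orphaned = sorted(p for p, n in owner.items() if n == failed_node)
    for i, partition in enumerate(orphaned):
        # Round-robin placement is an assumption, not taken from the embodiments.
        owner[partition] = survivors[i % len(survivors)]


def restore(home, owner, recovered_node):
    """Give back to recovered_node every partition whose home node it is."""
    for partition, node in home.items():
        if node == recovered_node:
            owner[partition] = recovered_node
```

For example, with `home = {"p0": "A", "p1": "B"}`, a fault of node A followed by `takeover(owner, "A", ["B"])` leaves both partitions owned by B, and `restore(home, owner, "A")` after recovery returns `p0` to A.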
In a specific embodiment of the present invention, the cache partition of the cluster is in a single partition mode.
In one embodiment of the present invention, the system further comprises:
an information recording unit, configured to record information when an event notification indicating fault recovery of the first node is received within a first time period after an event notification indicating a fault of the first node is received.
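The condition checked by the information recording unit can be sketched as a simple time-window test. The class name `RecoveryRecorder`, the use of seconds as the unit, the inclusive boundary, and the shape of the stored record are all assumptions; the embodiments state only that information is recorded when the recovery notification arrives within the first time period after the fault notification.

```python
class RecoveryRecorder:
    """Records fault/recovery pairs that occur within a configured window."""

    def __init__(self, first_time_period):
        self.first_time_period = first_time_period  # window length, in seconds (assumed unit)
        self.failed_at = {}                          # node -> time of fault notification
        self.records = []                            # (node, fault time, recovery time)

    def on_failure(self, node, now):
        self.failed_at[node] = now

    def on_recovery(self, node, now):
        """Return True if a record was written for this recovery."""
        fault_time = self.failed_at.pop(node, None)
        if fault_time is None:
            return False
        # Inclusive comparison is an assumption about what "within" means.
        if now - fault_time <= self.first_time_period:
            self.records.append((node, fault_time, now))
            return True
        return False
```

Timestamps are passed in explicitly rather than read from a clock, which keeps the sketch deterministic and easy to check.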
Corresponding to the above method and system embodiments, the embodiments of the present invention further provide a management device for a cache partition and a computer-readable storage medium; their description and the description above may be read with reference to each other. The computer-readable storage medium has a computer program stored thereon, and the computer program, when executed by a processor, implements the steps of the method for managing a cache partition in any of the above embodiments. A computer-readable storage medium as referred to herein may include Random Access Memory (RAM), memory, Read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The management device of the cache partition may include:
a memory for storing a computer program;
a processor for executing a computer program to implement the steps of the method for managing cache partitions in any of the above embodiments.
It is further noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative components and steps have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The principle and the implementation of the present invention are explained in the present application by using specific examples, and the above description of the embodiments is only used to help understanding the technical solution and the core idea of the present invention. It should be noted that, for those skilled in the art, it is possible to make various improvements and modifications to the present invention without departing from the principle of the present invention, and those improvements and modifications also fall within the scope of the claims of the present invention.

Claims (10)

1. A method for managing a cache partition, comprising:
receiving an event notification sent by a cluster;
after receiving information which is fed back by the service end and represents that the task corresponding to the last event notification is executed, updating node information of the cache partition according to the last event notification, and sending the task corresponding to the current event notification to the service end;
and after receiving the information which is fed back by the service end and represents that the task corresponding to the event notification is executed, updating the node information of the cache partition according to the event notification.
2. The method for managing the cache partition according to claim 1, wherein the receiving the event notification sent by the cluster includes:
receiving an event notification sent by a cluster and putting the event notification into a preset notification queue;
correspondingly, after the node information of the cache partition is updated according to the event notification, the method further includes:
and deleting the event notification in the notification queue.
3. The method according to claim 1, wherein when the received event notification is an event notification indicating a failure of the first node, the sending a task corresponding to the current event notification to the service end includes:
sending a task corresponding to the event notification to the service end so that a surviving node takes over the cache partition of the first node;
when the received event notification is an event notification indicating fault recovery of the first node, the sending a task corresponding to the event notification to the service end includes:
and sending a task corresponding to the event notification to a service end so as to restore the taken over cache partition of the first node to the first node.
4. The method for managing the cache partition according to claim 1, wherein the cache partition of the cluster is in a single partition mode.
5. The method for managing the cache partition according to any one of claims 1 to 4, further comprising:
and recording information when receiving the event notice which shows the fault recovery of the first node within a first time period after receiving the event notice which shows the fault of the first node.
6. A management system for a cache partition, comprising:
an event notification receiving unit, configured to receive an event notification sent by a cluster;
the execution unit is used for updating the node information of the cache partition according to the last event notification after receiving the information which is fed back by the service end and indicates that the task corresponding to the last event notification is executed, and sending the task corresponding to the current event notification to the service end; and after receiving the information which is fed back by the service end and represents that the task corresponding to the event notification is executed, updating the node information of the cache partition according to the event notification.
7. The management system of a cache partition according to claim 6, wherein the event notification receiving unit is specifically configured to:
receiving an event notification sent by a cluster and putting the event notification into a preset notification queue;
correspondingly, the system further comprises:
a queue updating unit, configured to delete the current event notification from the notification queue after the execution unit updates the node information of the cache partition according to the current event notification.
8. The system according to claim 6, wherein when the received event notification is an event notification indicating a failure of the first node, the execution unit sends a task corresponding to the current event notification to the service end, and the task includes:
sending a task corresponding to the event notification to the service end so that a surviving node takes over the cache partition of the first node;
when the received event notification is an event notification indicating fault recovery of the first node, the execution unit sending a task corresponding to the event notification to the service end includes:
and sending a task corresponding to the event notification to a service end so as to restore the taken over cache partition of the first node to the first node.
9. A management apparatus for a cache partition, comprising:
a memory for storing a computer program;
a processor for executing said computer program to implement the steps of the method for managing cache partitions according to any one of claims 1 to 5.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method for managing cache partitions according to any one of claims 1 to 5.
CN202110807418.3A 2021-07-16 2021-07-16 Cache partition management method, system, equipment and storage medium Active CN113703669B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110807418.3A CN113703669B (en) 2021-07-16 2021-07-16 Cache partition management method, system, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110807418.3A CN113703669B (en) 2021-07-16 2021-07-16 Cache partition management method, system, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113703669A true CN113703669A (en) 2021-11-26
CN113703669B CN113703669B (en) 2023-08-04

Family

ID=78648778

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110807418.3A Active CN113703669B (en) 2021-07-16 2021-07-16 Cache partition management method, system, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113703669B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024113534A1 (en) * 2022-11-30 2024-06-06 苏州元脑智能科技有限公司 Method and apparatus for controlling storage resources in storage node, and storage node

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050160303A1 (en) * 2004-01-15 2005-07-21 International Business Machines Corporation Method, apparatus, and program for minimizing invalid cache notification events in a distributed caching environment
CN111813348A (en) * 2020-09-08 2020-10-23 苏州浪潮智能科技有限公司 Node event processing device, method, equipment and medium in unified storage equipment
CN112463437A (en) * 2020-11-05 2021-03-09 苏州浪潮智能科技有限公司 Service recovery method, system and related components of storage cluster system offline node

Also Published As

Publication number Publication date
CN113703669B (en) 2023-08-04


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant