CN113703669A - Management method, system, equipment and storage medium for cache partition - Google Patents
- Publication number
- CN113703669A (application CN202110807418.3A)
- Authority
- CN
- China
- Prior art keywords
- event notification
- node
- cache partition
- service end
- task corresponding
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0602—Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
- G06F3/0614—Improving the reliability of storage systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0638—Organizing or formatting or addressing of data
- G06F3/0644—Management of space entities, e.g. partitions, extents, pools
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0655—Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
- G06F3/0656—Data buffering arrangements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0668—Interfaces specially adapted for storage systems adopting a particular infrastructure
- G06F3/067—Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Memory System Of A Hierarchy Structure (AREA)
- Hardware Redundancy (AREA)
Abstract
The application discloses a management method for a cache partition, comprising: receiving an event notification sent by a cluster; after receiving information fed back by the service end indicating that the task corresponding to the last event notification has been executed, updating node information of the cache partition according to the last event notification, and sending the task corresponding to the current event notification to the service end; and after receiving information fed back by the service end indicating that the task corresponding to the current event notification has been executed, updating the node information of the cache partition according to the current event notification. Applying this scheme avoids abnormal node information of the cache partition. The application also discloses a management system, a device and a storage medium for the cache partition, which have corresponding technical effects.
Description
Technical Field
The present invention relates to the field of storage technologies, and in particular, to a method, a system, a device, and a storage medium for managing a cache partition.
Background
As storage requirements grow, high-performance clusters formed from multiple nodes are being applied more and more widely.
In practice, particularly in single partition mode, the node information of a cache partition sometimes becomes abnormal. For example, a cache partition that should exist on only one node may exist on both nodes of an IO group, so that deleting the cache partition removes it from only one node while the copy on the other node remains, which causes problems in service configuration. Such situations mostly occur during the T2 failure recovery process of the cluster. A T2 failure means that all nodes in an IO group exit the cluster at the same time because of a failure, and recovery from a T2 failure means that all nodes in the IO group rejoin the cluster at the same time during the recovery process.
In summary, how to effectively avoid abnormal node information of the cache partition is a technical problem that those skilled in the art need to solve.
Disclosure of Invention
The invention aims to provide a management method, a system, a device and a storage medium for a cache partition, so as to effectively avoid abnormal node information of the cache partition.
In order to solve the technical problems, the invention provides the following technical scheme:
a management method of a cache partition comprises the following steps:
receiving an event notification sent by a cluster;
after receiving information which is fed back by the service end and represents that the task corresponding to the last event notification is executed, updating node information of the cache partition according to the last event notification, and sending the task corresponding to the current event notification to the service end;
and after receiving information which is fed back by the service end and represents that the task corresponding to the current event notification is executed, updating the node information of the cache partition according to the current event notification.
Preferably, the receiving the event notification sent by the cluster includes:
receiving an event notification sent by a cluster and putting the event notification into a preset notification queue;
correspondingly, after the node information of the cache partition is updated according to the current event notification, the method further includes:
deleting the current event notification from the notification queue.
Preferably, when the received event notification is an event notification indicating a failure of the first node, the sending a task corresponding to the current event notification to the service end includes:
sending a task corresponding to the current event notification to the service end so that a surviving node takes over the cache partition of the first node;
when the received event notification is an event notification indicating failure recovery of the first node, the sending a task corresponding to the current event notification to the service end includes:
sending a task corresponding to the current event notification to the service end so that the taken-over cache partition of the first node is restored to the first node.
Preferably, the cache partition of the cluster is in a single partition mode.
Preferably, the method further comprises the following steps:
recording information when an event notification indicating failure recovery of the first node is received within a first time period after an event notification indicating failure of the first node is received.
A management system for a cache partition, comprising:
an event notification receiving unit, configured to receive an event notification sent by a cluster;
an execution unit, configured to update the node information of the cache partition according to the last event notification after receiving information which is fed back by the service end and indicates that the task corresponding to the last event notification is executed, and to send the task corresponding to the current event notification to the service end; and, after receiving information which is fed back by the service end and indicates that the task corresponding to the current event notification is executed, to update the node information of the cache partition according to the current event notification.
Preferably, the event notification receiving unit is specifically configured to:
receiving an event notification sent by a cluster and putting the event notification into a preset notification queue;
correspondingly, the system further comprises:
a queue updating unit, configured to delete the current event notification from the notification queue after the execution unit updates the node information of the cache partition according to the current event notification.
Preferably, when the received event notification is an event notification indicating a failure of the first node, the sending, by the execution unit, of a task corresponding to the current event notification to the service end includes:
sending a task corresponding to the current event notification to the service end so that a surviving node takes over the cache partition of the first node;
when the received event notification is an event notification indicating failure recovery of the first node, the sending, by the execution unit, of a task corresponding to the current event notification to the service end includes:
sending a task corresponding to the current event notification to the service end so that the taken-over cache partition of the first node is restored to the first node.
A management device of a cache partition, comprising:
a memory for storing a computer program;
a processor for executing said computer program to implement the steps of the method for managing cache partitions of any of the above.
A computer-readable storage medium, having stored thereon a computer program which, when executed by a processor, carries out the steps of the method for managing a cache partition according to any one of the above.
The applicant observes that, in the conventional recovery flow for a T2 failure, the cache partition module restores its own service configuration immediately, relying on the event notifications sent by the cluster. During that recovery flow, a node sometimes fails again, for example because of a warm restart, and then recovers and rejoins the cluster within a short time. In this special scenario, when the cache partition is in single partition mode, the cache partition module starts a takeover of the failed node's cache partition according to the current node information. Because the failed node recovers and rejoins the cluster after only a short interval, the cluster notifies the cache partition module that the node has joined the cluster while the takeover is still being executed, and the node information of the cache partition is updated. The cache partition module then creates a cache partition on the recovered node according to the new node information. However, because the interval between the failure and the rejoining is so short, the earlier failure takeover has not been executed correctly, so the cache partition may end up being created on both nodes.
According to the scheme of the application, after an event notification sent by the cluster is received, the node information of the cache partition is not updated immediately and no new task is executed immediately. Instead, only after information fed back by the service end indicates that the task corresponding to the last event notification has been executed is the node information of the cache partition updated according to the last event notification, and only then is the task corresponding to the current event notification sent to the service end. Likewise, the node information of the cache partition is updated according to the current event notification only after the service end feeds back that the corresponding task has been executed. In other words, the node information of the cache partition is updated each time only once the task corresponding to the event notification has been executed, rather than as soon as the notification is received, so the abnormal node information seen in the conventional scheme does not occur.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flowchart illustrating an embodiment of a method for managing a cache partition according to the present invention;
FIG. 2 is a schematic structural diagram of a management system of a cache partition according to the present invention.
Detailed Description
The core of the invention is to provide a management method of the cache partition, which can avoid the condition that the node information of the cache partition is abnormal.
In order that those skilled in the art will better understand the disclosure, the invention will be described in further detail with reference to the accompanying drawings and specific embodiments. It is to be understood that the described embodiments are merely exemplary of the invention, and not restrictive of the full scope of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
FIG. 1 is a flowchart of an implementation of a method for managing a cache partition according to the present invention. The method may include the following steps:
Step S101: receiving an event notification sent by the cluster.
Specifically, the cluster sends information such as node joining and node exiting in the form of event notifications, and the cache partition module (also called the partition module) receives the event notifications sent by the cluster. Because SSDs (Solid State Disks) are widely used, the cache partition of the present application may specifically be an SSD cache partition.
Step S102: after receiving information which is fed back by the service end and represents that the task corresponding to the last event notification is executed, updating the node information of the cache partition according to the last event notification, and sending the task corresponding to the current event notification to the service end.
The present application does not update the node information immediately upon receiving an event notification sent by the cluster. This is because the applicant observes that, in the conventional recovery flow for a T2 failure, the cache partition module restores its own service configuration immediately, relying on the event notifications sent by the cluster. During that recovery flow, a node sometimes fails again, for example because of a warm restart, and then recovers and rejoins the cluster within a short time. In this special scenario, when the cache partition is in single partition mode, the cache partition module starts a takeover of the failed node's cache partition according to the current node information. Because the failed node recovers and rejoins the cluster after only a short interval, the cluster notifies the cache partition module that the node has joined the cluster while the takeover is still being executed, and the node information of the cache partition is updated. The cache partition module then creates a cache partition on the recovered node according to the new node information. However, because the interval between the failure and the rejoining is so short, the earlier failure takeover has not been executed correctly, so the cache partition may end up being created on both nodes.
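The race described above can be reproduced with a toy model. The following Python sketch is an illustration only, not the conventional implementation itself: the class `ConventionalModule`, its fields and the event names are hypothetical, and the model assumes that a takeover which finishes late simply creates the partition on the surviving node.

```python
# Toy model of the conventional (immediate-update) scheme.
# All names here are hypothetical and chosen only for illustration.
class ConventionalModule:
    def __init__(self, owner):
        self.node_info = {"owner": owner}   # node information of the cache partition
        self.partitions_on = {owner}        # nodes that actually hold the partition
        self.in_flight = []                 # takeover targets whose task is still running

    def on_event(self, kind, node, survivor=None):
        if kind == "node_failed":
            self.partitions_on.discard(node)
            self.node_info["owner"] = survivor   # updated immediately
            self.in_flight.append(survivor)      # takeover starts, not yet finished
        elif kind == "node_recovered":
            self.node_info["owner"] = node       # updated immediately again
            self.partitions_on.add(node)         # partition created on the recovered node

    def finish_in_flight(self):
        # The stale takeover completes late and creates the partition on the survivor.
        self.partitions_on.update(self.in_flight)
        self.in_flight.clear()


module = ConventionalModule(owner="node1")
module.on_event("node_failed", "node1", survivor="node2")
module.on_event("node_recovered", "node1")   # arrives before the takeover finishes
module.finish_in_flight()
print(sorted(module.partitions_on))          # ['node1', 'node2']
```

In this model the node information names a single owner while the partition exists on both nodes of the IO group, which is the abnormal state the background section describes.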
Through the above analysis, the applicant finds that the cache partition becomes abnormal only when a node fails and returns to normal within a short time, so that the node information has already been updated by a newly sent event notification while the earlier failure takeover has not yet been executed correctly. Abnormal node information of the cache partition occurs mainly in the recovery flow of a T2 failure because, in that flow, nodes relatively often exit the cluster due to a warm restart or the like and rejoin it within a short time, whereas in other situations a node rarely exits the cluster and recovers so quickly. Therefore, the scheme of the application avoids abnormal node information of the cache partition not only in the recovery flow of a T2 failure, but also when a node fails for other reasons and returns to normal within a short time.
After an event notification sent by the cluster has been received, receiving information fed back by the service end that the task corresponding to the last event notification has been executed means that this task is complete, so the application then updates the node information of the cache partition according to the last event notification. It follows that the last event notification must be retained from the time it is received at least until its task has been executed correctly, after which the node information of the cache partition is updated according to it. Event notifications can be retained in various ways, as long as the purpose of the application is achieved. For example, in the embodiment below the event notifications are placed in a queue; in other cases, each event notification may be stored in a preset storage space that is cleaned up once it is full.
After the node information of the cache partition has been updated according to the last event notification, the application sends the task corresponding to the current event notification to the service end. The service end, generally referred to as the agent, executes the tasks.
Step S103: after receiving information which is fed back by the service end and represents that the task corresponding to the current event notification is executed, updating the node information of the cache partition according to the current event notification.
After the task corresponding to the current event notification has been sent to the service end, the service end executes the task. Regardless of whether the cache partition module receives a new event notification during execution, the module updates the node information of the cache partition according to the current event notification only after it has received information fed back by the service end indicating that the task has been executed, that is, that the task corresponding to the current event notification has been executed correctly.
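The ordering imposed by steps S102 and S103 can be stated as a small checkable rule. The sketch below is a hypothetical helper, not part of the application: it assumes that event notifications are numbered 1, 2, 3, ... in the order they are received and that a trace is a list of `(action, n)` pairs using the made-up action names `"recv"`, `"send"`, `"done"` and `"update"`.

```python
def check_trace(trace):
    """Return True if the trace respects the ordering of steps S102 and S103.

    trace is a list of (action, n) pairs, where n is the sequence number of an
    event notification and action is one of "recv", "send", "done", "update".
    """
    seen = set()
    for action, n in trace:
        if action == "send":
            # Task n may be sent only after notification n was received and
            # node information was updated for notification n - 1 (if any).
            if ("recv", n) not in seen:
                return False
            if n > 1 and ("update", n - 1) not in seen:
                return False
        elif action == "done" and ("send", n) not in seen:
            return False
        elif action == "update" and ("done", n) not in seen:
            # Node information is updated only after the service end reports completion.
            return False
        seen.add((action, n))
    return True


# Notification 2 arrives while task 1 is still running and has to wait.
good = [("recv", 1), ("send", 1), ("recv", 2), ("done", 1),
        ("update", 1), ("send", 2), ("done", 2), ("update", 2)]
# Immediate update on receipt, as in the conventional scheme.
bad = [("recv", 1), ("send", 1), ("recv", 2), ("update", 2)]
print(check_trace(good), check_trace(bad))   # True False
```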
In a specific embodiment of the present invention, step S101 may specifically include:
receiving an event notification sent by a cluster and putting the event notification into a preset notification queue;
correspondingly, after the updating of the node information of the cache partition according to the current event notification in step S103, the method may further include:
deleting the current event notification from the notification queue.
As described above, after receiving an event notification sent by the cluster, the application does not immediately update the node information of the cache partition according to that event notification; it does so only after the corresponding task has been executed. The event notification must therefore be retained at least until its task has been executed correctly, and new event notifications may keep arriving at the cache partition module in the meantime. Accordingly, in this embodiment each event notification is simply placed into a preset notification queue when it is received. A newly received event notification is placed at the tail of the queue, and the head of the queue is the event notification whose task is currently being executed.
In this embodiment, because the preset notification queue keeps the event notifications in order, the current event notification no longer needs to be retained once the node information of the cache partition has been updated according to it, and it can therefore be deleted from the notification queue.
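A minimal sketch of this queue-based embodiment follows. It is written under stated assumptions rather than taken from the application: the class `CachePartitionModule`, the callback `send_task` that stands in for the service end, and the representation of an event notification as a `(node, state)` tuple are all hypothetical.

```python
from collections import deque


class CachePartitionModule:
    def __init__(self, send_task):
        self.queue = deque()        # preset notification queue
        self.node_info = {}         # node -> last confirmed state
        self.send_task = send_task  # hypothetical callable that hands a task to the service end
        self.busy = False           # True while the task of the head notification is running

    def on_event(self, event):
        """Step S101: enqueue only; node_info is left untouched."""
        self.queue.append(event)
        self._dispatch()

    def on_task_done(self):
        """Steps S102/S103: the service end reports that the head task has finished."""
        node, state = self.queue[0]
        self.node_info[node] = state   # update node information only now
        self.queue.popleft()           # then delete the handled notification
        self.busy = False
        self._dispatch()               # send the task of the next notification, if any

    def _dispatch(self):
        if not self.busy and self.queue:
            self.busy = True
            self.send_task(self.queue[0])   # head of queue = task currently being executed


sent = []
module = CachePartitionModule(send_task=sent.append)
module.on_event(("node1", "failed"))
module.on_event(("node1", "recovered"))   # queued behind the running takeover
print(module.node_info)                   # {}
```

Here the second notification does not touch the node information and its task is not sent until `on_task_done()` has been called for the first one.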
In a specific embodiment of the present invention, when the received event notification is an event notification indicating a failure of the first node, the sending of the task corresponding to the current event notification to the service end described in step S102 includes:
sending a task corresponding to the current event notification to the service end so that a surviving node takes over the cache partition of the first node;
when the received event notification is an event notification indicating failure recovery of the first node, the sending of the task corresponding to the current event notification to the service end described in step S102 includes:
sending a task corresponding to the current event notification to the service end so that the taken-over cache partition of the first node is restored to the first node.
In this embodiment, when the received event notification is an event notification indicating that the first node has failed, a task corresponding to the event notification may be sent to the service end, so as to take over the cache partition of the first node by using the surviving node. The first node may have 1 or more cache partitions, and may implement takeover of these cache partitions by 1 or more surviving nodes, and a specific takeover rule may be set according to an actual need, which is not described in this application.
When the received event notification indicates failure recovery of the first node, the first node has returned to normal, and a task corresponding to the event notification can be sent to the service end so that the taken-over cache partition of the first node is restored to the first node. The first node may be any node in the cluster.
As described above, abnormal node information of the cache partition in the conventional scheme occurs mainly in single partition mode, so the cache partition of the cluster in the present application may be in single partition mode. It should be noted, however, that although abnormal node information is less likely in other partition modes, the scheme of the application may still be adopted there without affecting its implementation.
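The mapping from the two kinds of event notification to tasks can be sketched as follows. The function `task_for`, the event names and the task dictionaries are hypothetical. Since the application leaves the takeover rule open, the sketch assumes the simplest rule, in which the first surviving node takes over.

```python
def task_for(event_kind, first_node, surviving_nodes):
    """Build the task sent to the service end for one event notification."""
    if event_kind == "node_failed":
        if not surviving_nodes:
            raise ValueError("no surviving node available for takeover")
        # Assumed takeover rule: the first surviving node takes over.
        return {"op": "takeover", "from": first_node, "to": surviving_nodes[0]}
    if event_kind == "node_recovered":
        # Restore the taken-over cache partition to the recovered node.
        return {"op": "restore", "to": first_node}
    raise ValueError(f"unknown event kind: {event_kind}")


print(task_for("node_failed", "node1", ["node2"]))
print(task_for("node_recovered", "node1", ["node2"]))
```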
In an embodiment of the present invention, the method may further include:
when an event notification indicating failure recovery of the first node is received within a first time period after the event notification indicating failure of the first node is received, information is recorded.
In this embodiment, if an event notification indicating failure recovery of the first node is received within the first time period after the event notification indicating failure of the first node is received, information is recorded, such as the number of the first node and the date on which the abnormal situation occurred. This helps staff to count and handle this special situation, and thus makes operation and maintenance more convenient.
By applying the technical scheme provided by the embodiments of the invention, after an event notification sent by the cluster is received, the node information of the cache partition is not updated immediately and no new task is executed immediately. Instead, only after information fed back by the service end indicates that the task corresponding to the last event notification has been executed is the node information of the cache partition updated according to the last event notification, and only then is the task corresponding to the current event notification sent to the service end. Likewise, the node information of the cache partition is updated according to the current event notification only after the service end feeds back that the corresponding task has been executed. In other words, the node information of the cache partition is updated each time only once the task corresponding to the event notification has been executed, rather than as soon as the notification is received, so the abnormal node information seen in the conventional scheme does not occur.
Corresponding to the above method embodiments, the embodiments of the present invention further provide a management system for a cache partition; the description of the system and the description of the method above may be read in conjunction with each other.
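The recording step can be sketched as follows. The class `FlapRecorder`, the field names and the use of plain numeric timestamps in seconds are assumptions made for illustration, since the application does not specify how the first time period is measured or what the record format is.

```python
class FlapRecorder:
    def __init__(self, first_period):
        self.first_period = first_period   # length of the first time period, in seconds
        self.failed_at = {}                # node -> time of its failure notification
        self.records = []                  # recorded information

    def on_event(self, kind, node, now):
        if kind == "node_failed":
            self.failed_at[node] = now
        elif kind == "node_recovered":
            failed = self.failed_at.pop(node, None)
            # Record only a recovery that follows the failure within the first time period.
            if failed is not None and now - failed <= self.first_period:
                self.records.append(
                    {"node": node, "failed_at": failed, "recovered_at": now}
                )


recorder = FlapRecorder(first_period=5.0)
recorder.on_event("node_failed", "node1", now=100.0)
recorder.on_event("node_recovered", "node1", now=102.0)   # within 5 s: recorded
recorder.on_event("node_failed", "node2", now=200.0)
recorder.on_event("node_recovered", "node2", now=260.0)   # after 60 s: not recorded
print(recorder.records)
```

Passing the timestamp in explicitly keeps the sketch deterministic; a real module would more likely read a clock when the notification arrives.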
Referring to fig. 2, a schematic structural diagram of a management system of a cache partition in the present invention is shown, including:
an event notification receiving unit 201, configured to receive an event notification sent by a cluster;
an execution unit 202, configured to update node information of a cache partition according to a previous event notification after receiving information indicating that a task corresponding to the previous event notification is executed and fed back by a service end, and send a task corresponding to the current event notification to the service end; and after receiving information which is fed back by the service end and represents that the task corresponding to the current event notification is executed, updating the node information of the cache partition according to the current event notification.
In an embodiment of the present invention, the event notification receiving unit 201 is specifically configured to:
receiving an event notification sent by a cluster and putting the event notification into a preset notification queue;
correspondingly, the method also comprises the following steps:
further comprising: and the queue updating unit is used for deleting the current event notification in the notification queue after the execution unit updates the node information of the cache partition according to the current event notification.
In an embodiment of the present invention, when the received event notification is an event notification indicating a failure of the first node, the execution unit 202 sending the task corresponding to the current event notification to the service end includes:
sending the task corresponding to the event notification to the service end, so that a surviving node takes over the cache partition of the first node;
when the received event notification is an event notification indicating fault recovery of the first node, the execution unit 202 sending the task corresponding to the current event notification to the service end includes:
sending the task corresponding to the event notification to the service end, so that the taken-over cache partition of the first node is restored to the first node.
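The two task types can be sketched as a small planning function. The function `plan_task`, the `"home"`/`"owner"` fields, and the policy of picking the first surviving node are assumptions made purely for illustration; the embodiments do not specify how the surviving node is chosen or how a task is encoded.

```python
def plan_task(event, first_node, partitions, surviving_nodes):
    """Return a list of (partition, new_owner) moves for the service end.

    partitions maps a partition name to {"home": node, "owner": node};
    this layout is hypothetical.
    """
    moves = []
    if event == "failure":
        if not surviving_nodes:
            raise ValueError("no surviving node available for takeover")
        # Assumption: the selection policy is not specified, so take the first.
        target = surviving_nodes[0]
        for name, part in partitions.items():
            if part["owner"] == first_node:
                moves.append((name, target))
    elif event == "recovery":
        # Restore only partitions that belong to first_node but were taken over.
        for name, part in partitions.items():
            if part["home"] == first_node and part["owner"] != first_node:
                moves.append((name, first_node))
    else:
        raise ValueError(f"unknown event type: {event}")
    return moves
```

A failure of node `A` thus yields moves of `A`'s partitions to a surviving node, and a later recovery of `A` yields the reverse moves for exactly those partitions.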
In a specific embodiment of the present invention, the cache partition of the cluster is in a single partition mode.
In one embodiment of the present invention, the system further includes:
an information recording unit, configured to record information when an event notification indicating fault recovery of the first node is received within a first time period after an event notification indicating a failure of the first node is received.
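One possible reading of the information recording unit is sketched below. The class `FlapRecorder`, the 30-second default for the first time period, and the content of each record are assumptions; the embodiments state neither the length of the period nor what information is recorded.

```python
class FlapRecorder:
    """Hypothetical recorder for failure/recovery pairs that occur close together."""

    def __init__(self, first_period_s: float = 30.0) -> None:
        self.first_period_s = first_period_s   # assumed length of the first time period
        self._failed_at = {}                   # node -> timestamp of its failure event
        self.records = []                      # (node, failed_at, recovered_at)

    def on_failure(self, node: str, now: float) -> None:
        self._failed_at[node] = now

    def on_recovery(self, node: str, now: float) -> bool:
        """Record the pair if recovery falls within the first time period."""
        failed_at = self._failed_at.pop(node, None)
        if failed_at is not None and now - failed_at <= self.first_period_s:
            self.records.append((node, failed_at, now))
            return True
        return False
```

Timestamps are passed in explicitly so that the behaviour can be checked without relying on a real clock.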
Corresponding to the above method and system embodiments, the embodiments of the present invention further provide a management device for a cache partition and a computer-readable storage medium, which may be read in cross-reference with the above. The computer-readable storage medium has a computer program stored thereon, and the computer program, when executed by a processor, implements the steps of the method for managing a cache partition in any of the above embodiments. A computer-readable storage medium as referred to herein may include random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The management device of the cache partition may include:
a memory for storing a computer program;
a processor for executing a computer program to implement the steps of the method for managing cache partitions in any of the above embodiments.
It is further noted that, herein, relational terms such as first and second are used solely to distinguish one entity or action from another, without necessarily requiring or implying any actual relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative components and steps have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The principle and the implementation of the present invention are explained in the present application by using specific examples, and the above description of the embodiments is only used to help understanding the technical solution and the core idea of the present invention. It should be noted that, for those skilled in the art, it is possible to make various improvements and modifications to the present invention without departing from the principle of the present invention, and those improvements and modifications also fall within the scope of the claims of the present invention.
Claims (10)
1. A method for managing a cache partition, comprising:
receiving an event notification sent by a cluster;
after receiving information, fed back by a service end, indicating that the task corresponding to the previous event notification has been executed, updating node information of the cache partition according to the previous event notification, and sending the task corresponding to the current event notification to the service end;
and after receiving information, fed back by the service end, indicating that the task corresponding to the current event notification has been executed, updating the node information of the cache partition according to the current event notification.
2. The method for managing the cache partition according to claim 1, wherein the receiving the event notification sent by the cluster includes:
receiving an event notification sent by a cluster and putting the event notification into a preset notification queue;
correspondingly, after the node information of the cache partition is updated according to the current event notification, the method further includes:
deleting the current event notification from the notification queue.
3. The method according to claim 1, wherein when the received event notification is an event notification indicating a failure of the first node, the sending the task corresponding to the current event notification to the service end includes:
sending the task corresponding to the event notification to the service end, so that a surviving node takes over the cache partition of the first node;
when the received event notification is an event notification indicating fault recovery of the first node, the sending the task corresponding to the current event notification to the service end includes:
sending the task corresponding to the event notification to the service end, so that the taken-over cache partition of the first node is restored to the first node.
4. The method for managing the cache partition according to claim 1, wherein the cache partition of the cluster is in a single partition mode.
5. The method for managing the cache partition according to any one of claims 1 to 4, further comprising:
recording information when an event notification indicating fault recovery of the first node is received within a first time period after an event notification indicating a failure of the first node is received.
6. A management system for a cache partition, comprising:
an event notification receiving unit, configured to receive an event notification sent by a cluster;
an execution unit, configured to: after receiving information, fed back by a service end, indicating that the task corresponding to the previous event notification has been executed, update node information of the cache partition according to the previous event notification and send the task corresponding to the current event notification to the service end; and, after receiving information, fed back by the service end, indicating that the task corresponding to the current event notification has been executed, update the node information of the cache partition according to the current event notification.
7. The management system of a cache partition according to claim 6, wherein the event notification receiving unit is specifically configured to:
receive an event notification sent by the cluster and put the event notification into a preset notification queue;
correspondingly, the system further includes:
a queue updating unit, configured to delete the current event notification from the notification queue after the execution unit updates the node information of the cache partition according to the current event notification.
8. The system according to claim 6, wherein when the received event notification is an event notification indicating a failure of the first node, the execution unit sending the task corresponding to the current event notification to the service end includes:
sending the task corresponding to the event notification to the service end, so that a surviving node takes over the cache partition of the first node;
when the received event notification is an event notification indicating fault recovery of the first node, the execution unit sending the task corresponding to the current event notification to the service end includes:
sending the task corresponding to the event notification to the service end, so that the taken-over cache partition of the first node is restored to the first node.
9. A management apparatus for a cache partition, comprising:
a memory for storing a computer program;
a processor for executing said computer program to implement the steps of the method for managing cache partitions according to any one of claims 1 to 5.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method for managing cache partitions according to any one of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110807418.3A CN113703669B (en) | 2021-07-16 | 2021-07-16 | Cache partition management method, system, equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113703669A (en) | 2021-11-26 |
CN113703669B (en) | 2023-08-04 |
Family
ID=78648778
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110807418.3A (Active, CN113703669B) | Cache partition management method, system, equipment and storage medium | 2021-07-16 | 2021-07-16 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113703669B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2024113534A1 (en) * | 2022-11-30 | 2024-06-06 | 苏州元脑智能科技有限公司 | Method and apparatus for controlling storage resources in storage node, and storage node |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20050160303A1 (en) * | 2004-01-15 | 2005-07-21 | International Business Machines Corporation | Method, apparatus, and program for minimizing invalid cache notification events in a distributed caching environment |
CN111813348A (en) * | 2020-09-08 | 2020-10-23 | 苏州浪潮智能科技有限公司 | Node event processing device, method, equipment and medium in unified storage equipment |
CN112463437A (en) * | 2020-11-05 | 2021-03-09 | 苏州浪潮智能科技有限公司 | Service recovery method, system and related components of storage cluster system offline node |
Also Published As
Publication number | Publication date |
---|---|
CN113703669B (en) | 2023-08-04 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||