CN108874594B - Disk replacement method, device, equipment and storage medium in storage cluster - Google Patents

Info

Publication number
CN108874594B
Authority
CN
China
Prior art keywords
disk
data processing
common
storage cluster
ssd
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810661713.0A
Other languages
Chinese (zh)
Other versions
CN108874594A (en)
Inventor
史宗华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhengzhou Yunhai Information Technology Co Ltd
Original Assignee
Zhengzhou Yunhai Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhengzhou Yunhai Information Technology Co Ltd
Priority to CN201810661713.0A
Publication of CN108874594A
Application granted
Publication of CN108874594B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16 Error detection or correction of the data by redundancy in hardware
    • G06F11/20 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G06F11/202 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant
    • G06F11/2023 Failover techniques
    • G06F11/2033 Failover techniques switching over of hardware resources
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16 Error detection or correction of the data by redundancy in hardware
    • G06F11/20 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G06F11/2053 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where persistent mass storage functionality or persistent mass storage control functionality is redundant
    • G06F11/2094 Redundant storage or storage space

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The invention discloses a disk replacement method in a storage cluster, comprising the following steps: when a first common disk in the storage cluster fails and needs to be replaced, stopping the data processing daemon process of the first common disk; cleaning up information related to the data processing daemon process of the first common disk; determining a first cache partition allocated for use by the first common disk; clearing residual data of the data processing daemon process of the first common disk from the first cache partition; after determining that the first common disk has been replaced by a second common disk, reallocating the first cache partition to the data processing daemon process of the second common disk; and starting the data processing daemon process of the second common disk and performing reliability configuration. With the technical scheme provided by the embodiments of the invention, a disk can be replaced quickly and the impact on the storage cluster is reduced. The invention also discloses a disk replacement device, equipment and storage medium in a storage cluster, which have corresponding technical effects.

Description

Disk replacement method, device, equipment and storage medium in storage cluster
Technical Field
The invention relates to the technical field of computer application, in particular to a method, a device, equipment and a storage medium for replacing a disk in a storage cluster.
Background
With the rapid development of computer technology, distributed storage clusters for mass data storage are widely used in various industries.
In practical applications, storage clusters keep growing in scale and users demand ever higher read-write speeds from them, so the use of SSD (Solid State Drive) disks, which offer higher read-write speeds, is considered. However, because of the cost of SSD disks, most storage clusters do not consist entirely of SSD disks but combine low-speed ordinary disks with SSD disks. A cache partition is set up on an SSD disk, and the files that the data processing daemon process of an ordinary disk needs to read and write frequently, together with the data that is frequently read from and written to the ordinary disk, are placed in the corresponding cache partition, which improves the read-write speed of the whole storage cluster.
This improves the read-write speed and efficiency of the storage cluster. However, when a disk in the storage cluster fails and data can no longer be read and written normally, the disk needs to be replaced, and no fast replacement method currently exists.
In summary, how to replace a disk in a storage cluster effectively is a technical problem that those skilled in the art urgently need to solve.
Disclosure of Invention
The invention aims to provide a method, a device and equipment for replacing a disk in a storage cluster and a storage medium, so that the disk in the storage cluster can be quickly replaced, and the influence on the storage cluster is reduced.
In order to solve the technical problems, the invention provides the following technical scheme:
A disk replacement method in a storage cluster, wherein the storage cluster comprises a plurality of ordinary disks and at least one SSD disk, at least one SSD disk is provided with at least one cache partition, and each cache partition is allocated to a corresponding ordinary disk for cache acceleration, the method comprising the following steps:
stopping the data processing daemon process of a first common disk when the first common disk in the storage cluster fails and needs to be replaced;
cleaning relevant information of a data processing daemon process of the first common disk in the storage cluster;
determining a first cache partition allocated for use by the first common disk;
clearing residual data of a data processing daemon process of the first common disk in the first cache partition;
after the first common disk is determined to be replaced by a second common disk, reallocating the first cache partition to a data processing daemon process of the second common disk;
and starting a data processing daemon process of the second common disk and performing reliability configuration.
In a specific embodiment of the present invention, cleaning the relevant information of the data processing daemon process of the first common disk in the storage cluster includes:
clearing the reliability configuration related to the data processing daemon process of the first common disk;
and clearing the monitoring of the data processing daemon of the first common disk from the monitoring service of the storage cluster.
In a specific embodiment of the present invention, determining the first cache partition allocated for use by the first common disk includes:
acquiring a slot number of the first common disk;
and determining the first cache partition allocated to the first common disk for use according to the slot number of the first common disk and a pre-recorded correspondence between slot numbers of common disks and identification numbers of cache partitions.
In one embodiment of the present invention, the method further comprises:
when a first SSD disk in the storage cluster fails and needs to be replaced, determining all third ordinary disks of which cache partitions are located on the first SSD disk;
stopping the data processing daemon process of each third common disk;
cleaning related information of a data processing daemon process of each third common disk in the storage cluster;
after the first SSD disk is determined to be replaced by a second SSD disk, dividing the cache partition of the second SSD disk according to the pre-recorded cache partition information of the first SSD disk;
distributing a cache partition for the data processing daemon process of each third common disk from the newly divided cache partitions;
and starting the data processing daemon process of each third common disk and performing reliability configuration.
In a specific embodiment of the present invention, cleaning the related information of the data processing daemon process of each third common disk in the storage cluster includes:
clearing the reliability configuration related to the data processing daemon process of each third common disk;
and clearing the monitoring of the data processing daemon of each third common disk from the monitoring service of the storage cluster.
In a specific embodiment of the present invention, determining all third common disks whose cache partitions are located on the first SSD disk includes:
acquiring a slot number of the first SSD disk;
and determining all third common disks whose cache partitions are located on the first SSD disk according to the slot number of the first SSD disk, a pre-recorded correspondence between slot numbers of SSD disks and identification numbers of the cache partitions set on them, and a pre-recorded correspondence between slot numbers of common disks and identification numbers of cache partitions.
In a specific embodiment of the present invention, after allocating a cache partition from the newly divided cache partitions to the data processing daemon process of each third common disk, the method further includes:
updating the correspondence between the slot number of the SSD disk and the identification numbers of the cache partitions set on it, and the correspondence between the slot numbers of the common disks and the identification numbers of the cache partitions.
A disk replacement device in a storage cluster, wherein the storage cluster comprises a plurality of ordinary disks and at least one SSD disk, at least one SSD disk is provided with at least one cache partition, and each cache partition is used for being allocated to the corresponding ordinary disk for cache acceleration, and the device comprises:
the first process stopping module is used for stopping the data processing daemon process of the first common disk when the first common disk in the storage cluster fails and needs to be replaced;
the first information cleaning module is used for cleaning the relevant information of the data processing daemon process of the first common disk in the storage cluster;
a partition determination module for determining a first cache partition allocated for use by the first common disk;
a data clearing module, configured to clear residual data of the data processing daemon of the first ordinary disk in the first cache partition;
the first partition allocation module is used for reallocating the first cache partition to a data processing daemon process of a second common disk after the first common disk is determined to be replaced by the second common disk;
and the first process configuration module is used for starting the data processing daemon process of the second common disk and performing reliability configuration.
In one embodiment of the present invention, the device further comprises:
the disk determining module is used for determining all third common disks of which cache partitions are located on the first SSD disk when the first SSD disk in the storage cluster fails and needs to be replaced;
the second process stopping module is used for stopping the data processing daemon process of each third common disk;
the second information cleaning module is used for cleaning the relevant information of the data processing daemon of each third common disk in the storage cluster;
the partition dividing module is used for dividing the cache partition of the second SSD disk according to the pre-recorded cache partition information of the first SSD disk after the first SSD disk is determined to be replaced by the second SSD disk;
the second partition allocation module is used for allocating the cache partitions to the data processing daemon process of each third common disk from the newly divided cache partitions;
and the second process configuration module is used for starting the data processing daemon process of each third common disk and performing reliability configuration.
Disk replacement equipment in a storage cluster, comprising:
a memory for storing a computer program;
a processor, configured to implement the steps of any one of the above disk replacement methods in a storage cluster when executing the computer program.
A computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the disk replacement method in a storage cluster according to any one of the preceding claims.
With the technical scheme provided by the embodiments of the invention, at least one SSD disk in a storage cluster is provided with at least one cache partition. When a first common disk in the storage cluster fails and needs to be replaced, the data processing daemon process of the first common disk is stopped, the relevant information of that daemon process in the storage cluster is cleaned, the first cache partition allocated for use by the first common disk is determined, and the residual data of the daemon process in the first cache partition is cleared. After it is determined that the first common disk has been replaced by a second common disk, the first cache partition is reallocated to the data processing daemon process of the second common disk, and that daemon process is started and given its reliability configuration. The second common disk thus joins the storage cluster quickly, reducing the impact on the storage cluster.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. Obviously, the drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flowchart illustrating an implementation of a method for replacing a disk in a storage cluster according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating another embodiment of a method for replacing a disk in a storage cluster according to the present invention;
FIG. 3 is a schematic structural diagram of a disk replacement device in a storage cluster according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of another disk replacement device in a storage cluster according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of disk replacement equipment in a storage cluster according to an embodiment of the present invention.
Detailed Description
The core of the invention is to provide a disk replacement method in a storage cluster. In the embodiment of the present invention, the storage cluster is composed of a plurality of storage nodes and includes a plurality of ordinary disks and at least one SSD disk, and at least one cache partition is set on at least one SSD disk. For performance balancing, each storage node in the storage cluster allocates, for each of its low-speed ordinary disks, a partition on its SSD disk for cache acceleration; such a partition is a cache partition. That is, each cache partition is allocated to a corresponding ordinary disk for cache acceleration. Relatively speaking, an ordinary disk is a low-speed device and an SSD disk is a high-speed device.
In order that those skilled in the art will better understand the disclosure, the invention will be described in further detail with reference to the accompanying drawings and specific embodiments. It is to be understood that the described embodiments are merely exemplary of the invention, and not restrictive of the full scope of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, an implementation flowchart of a method for replacing a disk in a storage cluster according to an embodiment of the present invention is shown, where the method includes the following steps:
S110: When a first common disk in the storage cluster fails and needs to be replaced, stop the data processing daemon process of the first common disk.
As previously described, a storage cluster includes a plurality of normal disks and at least one SSD disk. As the storage cluster continues to be used, a disk in the storage cluster may fail, thereby affecting the normal operation of the storage cluster, which requires replacement of the failed disk.
In practical applications, each ordinary disk performs data reading, writing and other work through its own data processing daemon process. When the first ordinary disk in the storage cluster fails and needs to be replaced, the data processing daemon process of the first ordinary disk can be stopped first. The first ordinary disk is any ordinary disk in the storage cluster that uses a cache partition.
S120: Clean the relevant information of the data processing daemon process of the first common disk in the storage cluster.
After stopping the data processing daemon of the first ordinary disk, the relevant information of the data processing daemon in the storage cluster can be cleaned.
Specifically, the reliability configuration related to the data processing daemon of the first ordinary disk may be cleared, and the monitoring of the data processing daemon of the first ordinary disk may be cleared from the monitoring service of the storage cluster.
S130: a first cache partition allocated for use by a first common disk is determined.
In a storage cluster, each ordinary disk may be assigned a cache partition to increase the read and write speed. When a first ordinary disk in the storage cluster fails and needs to be replaced, a first cache partition allocated to the first ordinary disk for use can be determined, and the first cache partition is arranged on the SSD disk.
In one embodiment of the present invention, step S130 may include the following steps:
Step one: acquiring the slot number of the first common disk;
Step two: determining the first cache partition allocated to the first common disk for use according to the slot number of the first common disk and the pre-recorded correspondence between slot numbers of common disks and identification numbers of cache partitions.
For convenience of description, the above two steps are combined for illustration.
In the embodiment of the present invention, the correspondence between the slot number of each ordinary disk and the identification number of its cache partition may be recorded in advance. Specifically, the correspondence may be recorded at the storage cluster deployment stage and updated in real time thereafter. The identification number of a cache partition is a unique system number, so different cache partitions have different identification numbers.
The data structure of the pre-recorded correspondence between the slot number of the ordinary disk and the identification number of the cache partition may refer to Table 1:
Field                           Type
Slot number of ordinary disk    int
Cache partition used            varchar
TABLE 1
After the slot number of the first ordinary disk is obtained, the identification number of the cache partition corresponding to the first ordinary disk can be looked up from that slot number in the pre-recorded correspondence between ordinary disk slot numbers and cache partition identification numbers, which determines the first cache partition allocated for use by the first ordinary disk.
Using the slot number prevents the correspondence from changing when drive letters (device names) drift after a storage node restarts.
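The lookup described in step S130 can be sketched as follows. This is a minimal illustration only, assuming the correspondence of Table 1 is held as an in-memory dictionary; the name `find_cache_partition`, the sample slot numbers and the partition identifiers are hypothetical and do not come from the embodiment.

```python
from typing import Dict, Optional

# Hypothetical in-memory form of Table 1: ordinary-disk slot number (int)
# mapped to the identification number of the cache partition it uses (str).
SLOT_TO_CACHE_PARTITION: Dict[int, str] = {
    3: "part-0a1f",
    4: "part-7c22",
}


def find_cache_partition(slot_number: int, table: Dict[int, str]) -> Optional[str]:
    """Return the cache partition ID recorded for a slot, or None if none is recorded."""
    return table.get(slot_number)


# The failed disk sits in slot 3, so its cache partition is looked up by slot.
first_cache_partition = find_cache_partition(3, SLOT_TO_CACHE_PARTITION)
```

Keying the table on the slot number rather than on a device name reflects the reason given above: the slot is stable across node restarts.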
S140: Clear the residual data of the data processing daemon process of the first common disk from the first cache partition.
When the first ordinary disk works normally, its data processing daemon process uses the first cache partition for data reading, writing and other operations, and the first cache partition stores the corresponding operation data. After the first cache partition allocated to the first ordinary disk is determined, the residual data of the daemon process in the first cache partition may be cleared so that it does not interfere with the new daemon process after the new disk is installed.
S150: After it is determined that the first common disk has been replaced by the second common disk, reallocate the first cache partition to the data processing daemon process of the second common disk.
Replacing the first ordinary disk with the second ordinary disk is a physical disk replacement. In particular, the replacement may be performed by a technician.
After the first ordinary disk is replaced with the second ordinary disk, the data processing daemon process of the second ordinary disk can be preprocessed. The slot number of the second ordinary disk is the slot number of the first ordinary disk, and the second ordinary disk takes over the work of the first ordinary disk. At this point, the first cache partition may be reallocated to the data processing daemon process of the second ordinary disk.
S160: Start the data processing daemon process of the second common disk and perform reliability configuration.
After the first cache partition is reallocated to the data processing daemon process of the second ordinary disk, the data processing daemon process of the second ordinary disk can be started and reliability configuration is carried out, so that the replacement work of the ordinary disk is completed, and the storage cluster can normally run.
With the method provided by the embodiment of the invention, at least one SSD disk in a storage cluster is provided with at least one cache partition. When a first common disk in the storage cluster fails and needs to be replaced, the data processing daemon process of the first common disk is stopped, the relevant information of that daemon process in the storage cluster is cleaned, the first cache partition allocated for use by the first common disk is determined, and the residual data of the daemon process in the first cache partition is cleared. After it is determined that the first common disk has been replaced by a second common disk, the first cache partition is reallocated to the data processing daemon process of the second common disk, and that daemon process is started and given its reliability configuration. The second common disk thus joins the storage cluster quickly, reducing the impact on the storage cluster.
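The order of steps S110 to S160 can be sketched as below. This is an illustration under stated assumptions rather than the embodiment's implementation: `RecordingCluster` is a hypothetical stand-in that only records each action as a string, and `wait_for_swap` is a hypothetical callback representing the physical replacement performed by a technician.

```python
from typing import Callable, Dict, List


class RecordingCluster:
    """Hypothetical stand-in for cluster management; it only records actions."""

    def __init__(self, slot_to_partition: Dict[int, str]) -> None:
        self.slot_to_partition = slot_to_partition
        self.log: List[str] = []

    def do(self, action: str) -> None:
        self.log.append(action)


def replace_ordinary_disk(
    cluster: RecordingCluster, slot: int, wait_for_swap: Callable[[int], None]
) -> str:
    """Run the S110-S160 sequence for the ordinary disk in the given slot."""
    cluster.do(f"stop daemon slot={slot}")                    # S110
    cluster.do(f"clear reliability config slot={slot}")       # S120
    cluster.do(f"remove monitoring slot={slot}")              # S120
    partition = cluster.slot_to_partition[slot]               # S130
    cluster.do(f"wipe residual data partition={partition}")   # S140
    wait_for_swap(slot)                                       # physical replacement
    cluster.do(f"assign partition={partition} slot={slot}")   # S150
    cluster.do(f"start daemon slot={slot}")                   # S160
    cluster.do(f"apply reliability config slot={slot}")       # S160
    return partition
```

Because the replacement disk occupies the same slot as the failed one, the same slot number is used before and after the swap, and the cache partition found in S130 is the one reassigned in S150.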
Referring to fig. 2, in one embodiment of the present invention, the method may further comprise the steps of:
S210: When a first SSD disk in the storage cluster fails and needs to be replaced, determine all third ordinary disks whose cache partitions are located on the first SSD disk.
In the storage cluster, at least one cache partition is arranged in the SSD disk, each cache partition is used for being allocated to the corresponding ordinary disk for cache acceleration, and once the SSD disk fails and needs to be replaced, the normal work of the corresponding ordinary disk is influenced.
Therefore, when the first SSD disk in the storage cluster fails and needs to be replaced, all the third ordinary disks whose cache partitions are located on the first SSD disk may be determined first. The first SSD disk is any SSD disk in the storage cluster on which a cache partition is set. There may be one or more third ordinary disks.
In one embodiment of the present invention, step S210 may include the following steps:
Step one: acquiring the slot number of the first SSD disk;
Step two: determining all third common disks whose cache partitions are located on the first SSD disk according to the slot number of the first SSD disk, the pre-recorded correspondence between slot numbers of SSD disks and identification numbers of the cache partitions set on them, and the pre-recorded correspondence between slot numbers of common disks and identification numbers of cache partitions.
For convenience of description, the above two steps are combined for illustration.
In the embodiment of the present invention, in addition to the correspondence between the slot number of the ordinary disk and the identification number of the cache partition, the correspondence between the slot number of the SSD disk and the identification numbers of the cache partitions set on it may be recorded in advance. Specifically, the correspondence may be recorded at the storage cluster deployment stage and updated in real time thereafter.
The data structure of the pre-recorded correspondence between the slot number of the SSD disk and the identification numbers of the cache partitions set on it may refer to Table 2:
Field                           Type
Slot number of SSD disk         int
Cache partitions set on disk    varchar, with partitions separated by commas
TABLE 2
After the slot number of the first SSD disk is obtained, the identification number of each cache partition set on the first SSD disk may be determined from that slot number and the pre-recorded correspondence between SSD disk slot numbers and the identification numbers of the cache partitions set on them. The ordinary disk corresponding to each of these cache partitions is then determined from the pre-recorded correspondence between ordinary disk slot numbers and cache partition identification numbers. These are all the third ordinary disks whose cache partitions are located on the first SSD disk.
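The two-table lookup of step S210 can be sketched as follows, assuming both correspondences are held as dictionaries and that Table 2 stores partition identifiers as one comma-separated string, as its varchar description suggests. The function name `find_third_disks` and the sample data are hypothetical.

```python
from typing import Dict, List


def find_third_disks(
    ssd_slot: int, ssd_table: Dict[int, str], disk_table: Dict[int, str]
) -> List[int]:
    """Return the slots of ordinary disks whose cache partition lives on the given SSD.

    ssd_table follows Table 2: SSD slot -> comma-separated cache partition IDs.
    disk_table follows Table 1: ordinary-disk slot -> cache partition ID.
    """
    raw = ssd_table.get(ssd_slot, "")
    partitions = {p.strip() for p in raw.split(",") if p.strip()}
    return sorted(slot for slot, part in disk_table.items() if part in partitions)


# Hypothetical sample data: SSD in slot 10 hosts two cache partitions.
ssd_table = {10: "part-a,part-b", 11: "part-c"}
disk_table = {1: "part-a", 2: "part-b", 3: "part-c"}
third_disks = find_third_disks(10, ssd_table, disk_table)  # slots 1 and 2
```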
S220: Stop the data processing daemon process of each third ordinary disk.
In practical applications, each ordinary disk performs data reading, writing and other work through its own data processing daemon process. After all the third ordinary disks whose cache partitions are located on the failed first SSD disk are determined, the data processing daemon process of each third ordinary disk may be stopped first.
S230: Clean the related information of the data processing daemon process of each third common disk in the storage cluster.
For each third ordinary disk, after its data processing daemon process is stopped, the relevant information of that daemon process may then be cleaned.
Specifically, the reliability configuration related to the data processing daemon of each third ordinary disk can be cleared; and clearing the monitoring of the data processing daemon of each third ordinary disk from the monitoring service of the storage cluster.
S240: After it is determined that the first SSD disk has been replaced by the second SSD disk, divide the second SSD disk into cache partitions according to the pre-recorded cache partition information of the first SSD disk.
Because the failed first SSD disk has to be replaced, a technician can perform the physical replacement operation to replace the first SSD disk with the second SSD disk.
After the first SSD disk is replaced by the second SSD disk, the slot number of the second SSD disk is the slot number of the first SSD disk, and the second SSD disk takes over the work of the first SSD disk. At this time, the second SSD disk may be divided into cache partitions according to the pre-recorded cache partition information of the first SSD disk.
The cache partition information of the first SSD disk may include number information of cache partitions set in the first SSD disk and size information of each cache partition.
S250: Allocate a cache partition from the newly divided cache partitions to the data processing daemon process of each third common disk.
Before the SSD disk is replaced, each cache partition set on the first SSD disk corresponds to one third ordinary disk. After the first SSD disk is replaced by the second SSD disk, the second SSD disk is divided into cache partitions according to the cache partition information of the first SSD disk, and a cache partition is allocated from the newly divided cache partitions to the data processing daemon process of each third ordinary disk, so that each daemon process can continue to use a corresponding cache partition for data reading, writing and other operations.
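Steps S240 and S250 can be sketched as below. The embodiment does not state how a new partition is matched to a third ordinary disk, so this sketch assumes each third ordinary disk receives the new partition that takes the place of its old one, with the same recorded size. `PartitionRecord`, `plan_new_partitions`, `allocate`, the GiB unit and the use of `uuid` for new identification numbers are all assumptions made for illustration.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class PartitionRecord:
    partition_id: str  # unique identification number of the cache partition
    size_gib: int      # recorded partition size (unit assumed for illustration)


def plan_new_partitions(old: List[PartitionRecord]) -> Dict[str, PartitionRecord]:
    """Map each old partition ID to a new record of the same size with a fresh ID."""
    return {
        rec.partition_id: PartitionRecord(uuid.uuid4().hex, rec.size_gib)
        for rec in old
    }


def allocate(
    disk_table: Dict[int, str], plan: Dict[str, PartitionRecord]
) -> Dict[int, str]:
    """Return ordinary-disk slot -> new partition ID for disks cached on the old SSD."""
    return {
        slot: plan[old_id].partition_id
        for slot, old_id in disk_table.items()
        if old_id in plan
    }
```

Ordinary disks whose cache partitions sit on other SSD disks do not appear in the plan and are left untouched by `allocate`.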
S260: Start the data processing daemon process of each third common disk and perform reliability configuration.
Starting the data processing daemon process of each third common disk and performing reliability configuration completes the replacement of the SSD disk, and the storage cluster can run normally.
By the technical scheme provided by the embodiment of the invention, the replaced second SSD disk can be quickly added into the storage cluster to work, and the influence on the storage cluster is reduced.
In an embodiment of the present invention, after step S250, the method may further include the steps of:
updating the correspondence between the slot number of the SSD disk and the identification numbers of the cache partitions set on it, and the correspondence between the slot numbers of the ordinary disks and the identification numbers of the cache partitions.
It can be understood that, after the first SSD disk is replaced with the second SSD disk, the second SSD disk is divided into the cache partitions, and the newly divided cache partitions have the unique system number, that is, the identification numbers of the newly divided cache partitions are different from the identification number of the cache partition of the first SSD disk. Therefore, the corresponding relationship between the slot number of the SSD disk and the identification number of the cache partition set thereon, and the corresponding relationship between the slot number of the normal disk and the identification number of the cache partition need to be updated, so as to be used when the disk is replaced again, thereby avoiding that the storage cluster cannot work normally due to disorder of the corresponding relationship.
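A minimal sketch of this update, assuming the two correspondences are held as in-memory dictionaries keyed by slot number (the embodiment does not specify how they are stored), is given below. The function and parameter names are hypothetical.

```python
def update_correspondences(ssd_to_partitions, disk_to_partition, ssd_slot, allocation):
    """Refresh both recorded correspondences after the SSD disk is replaced.

    ssd_to_partitions: SSD slot number -> list of cache partition identification numbers.
    disk_to_partition: ordinary-disk slot number -> cache partition identification number.
    allocation: ordinary-disk slot number -> newly allocated partition identification number.
    """
    # The SSD slot number is unchanged, but its partitions have new identification numbers.
    ssd_to_partitions[ssd_slot] = sorted(allocation.values())
    # Only the third ordinary disks change; other ordinary disks keep their entries.
    disk_to_partition.update(allocation)
```

Entries for ordinary disks whose cache partitions are on other SSD disks are left untouched, so a later replacement of any disk still resolves to the correct partition.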
Corresponding to the above method embodiment, an embodiment of the present invention further provides a disk replacement device in a storage cluster, where the storage cluster includes a plurality of ordinary disks and at least one SSD disk, at least one cache partition is set on the at least one SSD disk, and each cache partition is allocated to a corresponding ordinary disk for cache acceleration.
Referring to Fig. 3, the device includes the following modules:
a first process stopping module 310, configured to stop the data processing daemon of a first ordinary disk in the storage cluster when the first ordinary disk fails and needs to be replaced;
a first information cleaning module 320, configured to clean up the information related to the data processing daemon of the first ordinary disk in the storage cluster;
a partition determination module 330, configured to determine the first cache partition allocated to the first ordinary disk;
a data clearing module 340, configured to clear the residual data of the data processing daemon of the first ordinary disk from the first cache partition;
a first partition allocation module 350, configured to reallocate the first cache partition to the data processing daemon of a second ordinary disk after determining that the first ordinary disk has been replaced by the second ordinary disk;
and a first process configuration module 360, configured to start the data processing daemon of the second ordinary disk and perform reliability configuration.
With the device provided by this embodiment of the present invention, at least one cache partition is set on at least one SSD disk in the storage cluster. When a first ordinary disk in the storage cluster fails and needs to be replaced, the data processing daemon of the first ordinary disk is stopped, the information related to that daemon in the storage cluster is cleaned up, the first cache partition allocated to the first ordinary disk is determined, and the residual data of the daemon in the first cache partition is cleared. After it is determined that the first ordinary disk has been replaced by a second ordinary disk, the first cache partition is reallocated to the data processing daemon of the second ordinary disk, and that daemon is started and its reliability configuration performed. The replacement second ordinary disk thus quickly joins the storage cluster and starts working, which reduces the impact on the storage cluster.
In an embodiment of the present invention, the first information cleaning module 320 is specifically configured to:
clear the reliability configuration related to the data processing daemon of the first ordinary disk;
and remove the monitoring of the data processing daemon of the first ordinary disk from the monitoring service of the storage cluster.
In an embodiment of the present invention, the partition determination module 330 is specifically configured to:
acquire the slot number of the first ordinary disk;
and determine the first cache partition allocated to the first ordinary disk according to the slot number of the first ordinary disk and the pre-recorded correspondence between the slot numbers of the ordinary disks and the identification numbers of the cache partitions.
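The lookup performed by the partition determination module can be sketched as follows, assuming the pre-recorded correspondence is available as a dictionary from ordinary-disk slot number to cache partition identification number. The function name and the error handling are assumptions for the example.

```python
def find_cache_partition(slot_number, disk_slot_to_partition):
    """Return the identification number of the cache partition recorded for a slot."""
    try:
        return disk_slot_to_partition[slot_number]
    except KeyError:
        # Assumed behaviour: a missing record is reported instead of guessed.
        raise LookupError(f"no cache partition recorded for slot {slot_number}") from None
```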
Referring to Fig. 4, in an embodiment of the present invention, the device further includes:
a disk determining module 410, configured to determine all third ordinary disks whose cache partitions are located on a first SSD disk when the first SSD disk in the storage cluster fails and needs to be replaced;
a second process stopping module 420, configured to stop the data processing daemon of each third ordinary disk;
a second information cleaning module 430, configured to clean up the information related to the data processing daemon of each third ordinary disk in the storage cluster;
a partition dividing module 440, configured to divide a second SSD disk into cache partitions according to the pre-recorded cache partition information of the first SSD disk after determining that the first SSD disk has been replaced by the second SSD disk;
a second partition allocation module 450, configured to allocate a cache partition to the data processing daemon of each third ordinary disk from the newly divided cache partitions;
and a second process configuration module 460, configured to start the data processing daemon of each third ordinary disk and perform reliability configuration.
In an embodiment of the present invention, the second information cleaning module 430 is specifically configured to:
clear the reliability configuration related to the data processing daemon of each third ordinary disk;
and remove the monitoring of the data processing daemon of each third ordinary disk from the monitoring service of the storage cluster.
In an embodiment of the present invention, the disk determining module 410 is specifically configured to:
acquire the slot number of the first SSD disk;
and determine all third ordinary disks whose cache partitions are located on the first SSD disk according to the slot number of the first SSD disk, the pre-recorded correspondence between the slot numbers of the SSD disks and the identification numbers of the cache partitions set on them, and the pre-recorded correspondence between the slot numbers of the ordinary disks and the identification numbers of the cache partitions.
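This two-step lookup (SSD slot number to partition identification numbers, then partition identification numbers to ordinary-disk slot numbers) can be sketched as below. Holding both correspondences as dictionaries and returning the slots in sorted order are assumptions made for the example.

```python
def find_third_disks(ssd_slot, ssd_to_partitions, disk_to_partition):
    """Return slot numbers of all ordinary disks whose cache partition is on the given SSD.

    ssd_to_partitions: SSD slot number -> cache partition identification numbers on that SSD.
    disk_to_partition: ordinary-disk slot number -> cache partition identification number.
    """
    partitions_on_ssd = set(ssd_to_partitions.get(ssd_slot, ()))
    return sorted(
        disk_slot
        for disk_slot, partition_id in disk_to_partition.items()
        if partition_id in partitions_on_ssd
    )
```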
In a specific embodiment of the present invention, the device further includes a relationship updating module, configured to:
after a cache partition is allocated to the data processing daemon of each third ordinary disk from the newly divided cache partitions, update the correspondence between the slot number of the SSD disk and the identification numbers of the cache partitions set on it, and the correspondence between the slot numbers of the ordinary disks and the identification numbers of the cache partitions.
Referring to Fig. 5, corresponding to the above method embodiment, an embodiment of the present invention further provides a disk replacement device in a storage cluster, including:
a memory 510, configured to store a computer program;
and a processor 520, configured to implement the steps of the disk replacement method in the storage cluster when executing the computer program.
Corresponding to the above method embodiment, an embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when being executed by a processor, the computer program implements the steps of the disk replacement method in the storage cluster.
The embodiments in this specification are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be understood by reference to one another. Because the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for the relevant details, refer to the description of the method.
Those skilled in the art will further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative components and steps have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and the design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in random access memory (RAM), main memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
Specific examples are used in this application to explain the principle and implementation of the present invention, and the above description of the embodiments is intended only to help in understanding the technical solution and core idea of the present invention. It should be noted that those skilled in the art may make various improvements and modifications to the present invention without departing from its principle, and such improvements and modifications also fall within the scope of the claims of the present invention.

Claims (11)

1. A disk replacement method in a storage cluster, characterized in that the storage cluster comprises a plurality of ordinary disks and at least one SSD disk, at least one cache partition is set on the at least one SSD disk, and each cache partition is allocated to a corresponding ordinary disk for cache acceleration, the method comprising:
stopping the data processing daemon of a first ordinary disk in the storage cluster when the first ordinary disk fails and needs to be replaced;
cleaning up the information related to the data processing daemon of the first ordinary disk in the storage cluster;
determining a first cache partition allocated to the first ordinary disk;
clearing the residual data of the data processing daemon of the first ordinary disk from the first cache partition;
after determining that the first ordinary disk has been replaced by a second ordinary disk, reallocating the first cache partition to the data processing daemon of the second ordinary disk;
and starting the data processing daemon of the second ordinary disk and performing reliability configuration.
2. The method according to claim 1, wherein the cleaning up of the information related to the data processing daemon of the first ordinary disk in the storage cluster comprises:
clearing the reliability configuration related to the data processing daemon of the first ordinary disk;
and removing the monitoring of the data processing daemon of the first ordinary disk from the monitoring service of the storage cluster.
3. The method according to claim 1, wherein the determining of the first cache partition allocated to the first ordinary disk comprises:
acquiring the slot number of the first ordinary disk;
and determining the first cache partition allocated to the first ordinary disk according to the slot number of the first ordinary disk and the pre-recorded correspondence between the slot numbers of the ordinary disks and the identification numbers of the cache partitions.
4. The disk replacement method in a storage cluster according to any one of claims 1 to 3, further comprising:
when a first SSD disk in the storage cluster fails and needs to be replaced, determining all third ordinary disks whose cache partitions are located on the first SSD disk;
stopping the data processing daemon of each third ordinary disk;
cleaning up the information related to the data processing daemon of each third ordinary disk in the storage cluster;
after determining that the first SSD disk has been replaced by a second SSD disk, dividing the second SSD disk into cache partitions according to the pre-recorded cache partition information of the first SSD disk;
allocating a cache partition to the data processing daemon of each third ordinary disk from the newly divided cache partitions;
and starting the data processing daemon of each third ordinary disk and performing reliability configuration.
5. The method according to claim 4, wherein the cleaning up of the information related to the data processing daemon of each third ordinary disk in the storage cluster comprises:
clearing the reliability configuration related to the data processing daemon of each third ordinary disk;
and removing the monitoring of the data processing daemon of each third ordinary disk from the monitoring service of the storage cluster.
6. The method according to claim 4, wherein the determining of all third ordinary disks whose cache partitions are located on the first SSD disk comprises:
acquiring the slot number of the first SSD disk;
and determining all third ordinary disks whose cache partitions are located on the first SSD disk according to the slot number of the first SSD disk, the pre-recorded correspondence between the slot numbers of the SSD disks and the identification numbers of the cache partitions set on them, and the pre-recorded correspondence between the slot numbers of the ordinary disks and the identification numbers of the cache partitions.
7. The method according to claim 6, wherein after the allocating of a cache partition to the data processing daemon of each third ordinary disk from the newly divided cache partitions, the method further comprises:
updating the correspondence between the slot number of the SSD disk and the identification numbers of the cache partitions set on it, and the correspondence between the slot numbers of the ordinary disks and the identification numbers of the cache partitions.
8. A disk replacement device in a storage cluster, characterized in that the storage cluster comprises a plurality of ordinary disks and at least one SSD disk, at least one cache partition is set on the at least one SSD disk, and each cache partition is allocated to a corresponding ordinary disk for cache acceleration, the device comprising:
a first process stopping module, configured to stop the data processing daemon of a first ordinary disk in the storage cluster when the first ordinary disk fails and needs to be replaced;
a first information cleaning module, configured to clean up the information related to the data processing daemon of the first ordinary disk in the storage cluster;
a partition determination module, configured to determine a first cache partition allocated to the first ordinary disk;
a data clearing module, configured to clear the residual data of the data processing daemon of the first ordinary disk from the first cache partition;
a first partition allocation module, configured to reallocate the first cache partition to the data processing daemon of a second ordinary disk after determining that the first ordinary disk has been replaced by the second ordinary disk;
and a first process configuration module, configured to start the data processing daemon of the second ordinary disk and perform reliability configuration.
9. The disk replacement device in a storage cluster according to claim 8, further comprising:
a disk determining module, configured to determine all third ordinary disks whose cache partitions are located on a first SSD disk when the first SSD disk in the storage cluster fails and needs to be replaced;
a second process stopping module, configured to stop the data processing daemon of each third ordinary disk;
a second information cleaning module, configured to clean up the information related to the data processing daemon of each third ordinary disk in the storage cluster;
a partition dividing module, configured to divide a second SSD disk into cache partitions according to the pre-recorded cache partition information of the first SSD disk after determining that the first SSD disk has been replaced by the second SSD disk;
a second partition allocation module, configured to allocate a cache partition to the data processing daemon of each third ordinary disk from the newly divided cache partitions;
and a second process configuration module, configured to start the data processing daemon of each third ordinary disk and perform reliability configuration.
10. A disk replacement device in a storage cluster, comprising:
a memory for storing a computer program;
a processor for implementing the steps of the disk replacement method in the storage cluster according to any one of claims 1 to 7 when executing the computer program.
11. A computer-readable storage medium, having stored thereon a computer program which, when executed by a processor, implements the steps of the disk replacement method in a storage cluster according to any one of claims 1 to 7.
CN201810661713.0A 2018-06-25 2018-06-25 Disk replacement method, device, equipment and storage medium in storage cluster Active CN108874594B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810661713.0A CN108874594B (en) 2018-06-25 2018-06-25 Disk replacement method, device, equipment and storage medium in storage cluster

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810661713.0A CN108874594B (en) 2018-06-25 2018-06-25 Disk replacement method, device, equipment and storage medium in storage cluster

Publications (2)

Publication Number Publication Date
CN108874594A (en) 2018-11-23
CN108874594B (en) 2021-10-22

Family

ID=64295463

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810661713.0A Active CN108874594B (en) 2018-06-25 2018-06-25 Disk replacement method, device, equipment and storage medium in storage cluster

Country Status (1)

Country Link
CN (1) CN108874594B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110764949A (en) * 2019-09-29 2020-02-07 北京浪潮数据技术有限公司 Hard disk replacement method, hard disk replacement device, and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103713861A (en) * 2014-01-09 2014-04-09 浪潮(北京)电子信息产业有限公司 File processing method and system based on hierarchical division
CN103713973A (en) * 2014-01-08 2014-04-09 浪潮(北京)电子信息产业有限公司 Mixed storage backup method and system based on HDD and SSD
CN103729149A (en) * 2013-12-31 2014-04-16 创新科存储技术有限公司 Data storage method
CN105446665A (en) * 2015-12-18 2016-03-30 长城信息产业股份有限公司 Computer storage acceleration system and optimization method thereof
CN106844052A (en) * 2017-01-22 2017-06-13 郑州云海信息技术有限公司 A kind of method and device that fusion cluster is built based on Windows Server

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2012053026A1 (en) * 2010-10-18 2012-04-26 Hitachi, Ltd. Data storage apparatus and power control method therefor

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant