CN109669822B - Electronic device, method for creating backup storage pool, and computer-readable storage medium - Google Patents

Electronic device, method for creating backup storage pool, and computer-readable storage medium

Info

Publication number
CN109669822B
Authority
CN
China
Prior art keywords
storage
standby
pool
osd
object storage
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811438576.0A
Other languages
Chinese (zh)
Other versions
CN109669822A (en
Inventor
宋小兵
姜文峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd
Priority to CN201811438576.0A
Publication of CN109669822A
Application granted
Publication of CN109669822B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16 Error detection or correction of the data by redundancy in hardware
    • G06F11/20 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G06F11/202 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant
    • G06F11/2023 Failover techniques
    • G06F11/2033 Failover techniques switching over of hardware resources
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/062 Securing storage systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/0643 Management of files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention relates to distributed storage technology and discloses an electronic device, a method for creating a standby storage pool, and a computer-readable storage medium. According to configuration data sent by a user, a plurality of OSDs are selected from each storage node to serve as standby OSDs, a standby Pool is created, and a mapping relationship between the standby Pool and the standby OSDs is created and then stored. Compared with the prior art, the invention provides standby OSDs and a standby Pool in the distributed storage system, so that when an OSD in the distributed storage system fails, the standby OSDs and the standby Pool can be activated, thereby reducing the impact of the OSD failure on the performance of the distributed storage system.

Description

Electronic device, method for creating backup storage pool, and computer-readable storage medium
Technical Field
The present invention relates to the field of distributed storage technologies, and in particular, to an electronic device, a method for creating a backup storage pool, and a computer readable storage medium.
Background
The CEPH distributed file system is a distributed storage system with large capacity, high performance, and high reliability. The core component of CEPH is the OSD (Object Storage Device), which manages a separate hard disk and provides a read-write access interface for object-based storage. A CEPH cluster is composed of a plurality of independent OSDs, and the number of OSDs can be dynamically increased or decreased. The CEPH client distributes object data (objects) to different OSDs for storage via the CRUSH algorithm. CRUSH is a pseudo-random distribution algorithm: it first assigns the object data, by means of a hash value (HASH), to a placement group (PG) in a storage pool (Pool), and then calculates the OSDs on which that PG is stored, so that object data belonging to the same PG is stored in the target OSDs corresponding to the PG.
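The two-stage placement described above (hash an object to a PG, then compute the OSDs for that PG) can be illustrated with a minimal Python sketch. This is not the actual CRUSH implementation: the names `object_to_pg` and `pg_to_osds` are hypothetical, and the second stage uses simple highest-random-weight hashing as a stand-in for CRUSH.

```python
import hashlib


def _h(text: str) -> int:
    """Stable 64-bit hash of a string (hypothetical helper)."""
    return int.from_bytes(hashlib.sha256(text.encode()).digest()[:8], "big")


def object_to_pg(object_name: str, pg_count: int) -> int:
    """Stage 1: hash the object name to a placement group in the pool."""
    return _h(object_name) % pg_count


def pg_to_osds(pg_id: int, osd_ids: list, copies: int) -> list:
    """Stage 2: deterministically pick `copies` distinct OSDs for a PG.

    Stand-in for CRUSH: rank OSDs by a per-(PG, OSD) hash and keep the top ones.
    """
    ranked = sorted(osd_ids, key=lambda osd: _h(f"{pg_id}:{osd}"), reverse=True)
    return ranked[:copies]


osds = list(range(12))
pg = object_to_pg("volume-1/block-42", pg_count=128)
targets = pg_to_osds(pg, osds, copies=3)
```

Because both stages are pure functions of their inputs, every client computes the same target OSDs for a given object without consulting a central lookup table.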
Currently, no spare OSD or spare Pool is set in the CEPH distributed file system, so when an OSD in the CEPH fails, no spare OSD or spare Pool can be activated, resulting in a decrease in the storage performance of the CEPH.
Disclosure of Invention
It is a primary object of the present invention to provide an electronic device, a method of creating a backup storage Pool, and a computer-readable storage medium, which aim to set a backup OSD and a backup Pool for CEPH.
In order to achieve the above object, the present invention provides an electronic device, which is communicatively connected to a plurality of storage nodes, each of the storage nodes being provided with a plurality of object storage devices, the electronic device including a memory and a processor, the memory storing a creation program of a standby storage pool, the creation program of the standby storage pool implementing the following steps when executed by the processor:
a receiving step: receiving configuration data sent by a user;
a determining step: selecting, according to the configuration data, a plurality of object storage devices from each storage node as standby object storage devices;
a creating step: creating a standby storage pool according to the configuration data, creating a mapping relationship between the standby storage pool and the standby object storage devices, and storing the mapping relationship between the standby storage pool and the standby object storage devices.
Preferably, the determining step includes:
acquiring fault thresholds of all storage nodes from the configuration data, respectively determining the configuration quantity of all the storage nodes according to the fault thresholds of all the storage nodes, and randomly selecting object storage devices with the configuration quantity from all the storage nodes as standby object storage devices, wherein the configuration quantity of the storage nodes is larger than or equal to the fault thresholds;
or, obtaining object storage device identification information corresponding to each storage node from the configuration data, and taking the object storage device corresponding to each object storage device identification information as a standby object storage device.
Preferably, when the processor executes the creation program of the standby storage pool, the following steps are further implemented after the creating step:
a detecting step: detecting, in real time or at regular intervals, whether a preset event is triggered;
a redirecting step: when the preset event is detected, intercepting write requests sent to a preset storage location within a preset time interval, and redirecting the write requests to the standby storage pool.
Preferably, the preset event includes failure of one or more primary object storage devices, where the primary object storage devices are object storage devices in the storage node except for standby object storage devices;
when the processor executes the creation program of the standby storage pool, the following step is further implemented after the detecting step:
replacing the failed primary object storage devices one by one with the standby object storage devices.
In addition, in order to achieve the above object, the present invention further provides a method for creating a backup storage pool, which is applicable to an electronic device, where the electronic device is communicatively connected to a plurality of storage nodes, and a plurality of object storage devices are disposed in each storage node, and the method includes the steps of:
a receiving step: receiving configuration data sent by a user;
a determining step: selecting, according to the configuration data, a plurality of object storage devices from each storage node as standby object storage devices;
a creating step: creating a standby storage pool according to the configuration data, creating a mapping relationship between the standby storage pool and the standby object storage devices, and storing the mapping relationship between the standby storage pool and the standby object storage devices.
Preferably, the determining step includes:
acquiring fault thresholds of all storage nodes from the configuration data, respectively determining the configuration quantity of all the storage nodes according to the fault thresholds of all the storage nodes, and randomly selecting object storage devices with the configuration quantity from all the storage nodes as standby object storage devices, wherein the configuration quantity of the storage nodes is larger than or equal to the fault thresholds;
or, obtaining object storage device identification information corresponding to each storage node from the configuration data, and taking the object storage device corresponding to each object storage device identification information as a standby object storage device.
Preferably, after the creating step, the method further comprises:
a detecting step: detecting, in real time or at regular intervals, whether a preset event is triggered;
a redirecting step: when the preset event is detected, intercepting write requests sent to a preset storage location within a preset time interval, and redirecting the write requests to the standby storage pool.
Preferably, the preset event includes failure of one or more primary object storage devices, where the primary object storage devices are object storage devices in the storage node except for standby object storage devices;
after the detecting step, the method further comprises:
replacing the failed primary object storage devices one by one with the standby object storage devices.
In addition, in order to achieve the above object, the present invention provides a distributed storage system, where the distributed storage system includes an electronic device and a plurality of storage nodes, the electronic device is respectively connected to the storage nodes in a communication manner, the electronic device includes a memory and a processor, the memory stores a creating program of a standby storage pool, and when the creating program of the standby storage pool is executed by the processor, the following steps are implemented:
a receiving step: receiving configuration data sent by a user;
a determining step: selecting, according to the configuration data, a plurality of object storage devices from each storage node as standby object storage devices;
a creating step: creating a standby storage pool according to the configuration data, creating a mapping relationship between the standby storage pool and the standby object storage devices, and storing the mapping relationship between the standby storage pool and the standby object storage devices.
Preferably, the determining step includes:
acquiring fault thresholds of all storage nodes from the configuration data, respectively determining the configuration quantity of all the storage nodes according to the fault thresholds of all the storage nodes, and randomly selecting object storage devices with the configuration quantity from all the storage nodes as standby object storage devices, wherein the configuration quantity of the storage nodes is larger than or equal to the fault thresholds;
or, obtaining object storage device identification information corresponding to each storage node from the configuration data, and taking the object storage device corresponding to each object storage device identification information as a standby object storage device.
Preferably, when the processor executes the creation program of the standby storage pool, the following steps are further implemented after the creating step:
a detecting step: detecting, in real time or at regular intervals, whether a preset event is triggered;
a redirecting step: when the preset event is detected, intercepting write requests sent to a preset storage location within a preset time interval, and redirecting the write requests to the standby storage pool.
Preferably, the preset event includes failure of one or more primary object storage devices, where the primary object storage devices are object storage devices in the storage node except for standby object storage devices;
when the processor executes the creation program of the standby storage pool, the following step is further implemented after the detecting step:
replacing the failed primary object storage devices one by one with the standby object storage devices.
In addition, to achieve the above object, the present invention also proposes a computer-readable storage medium storing a creation program of a backup storage pool, the creation program of the backup storage pool being executable by at least one processor to cause the at least one processor to perform the steps of the creation method of the backup storage pool as set forth in any one of the above.
According to configuration data sent by a user, the invention selects a plurality of OSDs from each storage node to serve as standby OSDs, creates a standby Pool, creates a mapping relationship between the standby Pool and the standby OSDs, and then stores that mapping relationship. Compared with the prior art, the invention provides standby OSDs and a standby Pool in the distributed storage system, so that when an OSD in the distributed storage system fails, the standby OSDs and the standby Pool can be activated, thereby reducing the impact of the OSD failure on the performance of the distributed storage system.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to the structures shown in these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of a system architecture of a first embodiment of a distributed storage system according to the present invention;
FIG. 2 is a schematic diagram of an operating environment of a first embodiment of a creation program of a spare storage pool according to the present invention;
FIG. 3 is a block diagram of a first embodiment of a creation program of a spare storage pool according to the present invention;
FIG. 4 is a block diagram of a second embodiment of a creation program of a spare storage pool according to the present invention;
FIG. 5 is a flowchart illustrating a first embodiment of a method for creating a backup storage pool according to the present invention;
FIG. 6 is a flowchart of a second embodiment of a method for creating a backup storage pool according to the present invention.
The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
The principles and features of the present invention are described below with reference to the drawings. The examples are given only to illustrate the invention and are not to be construed as limiting its scope.
Referring to fig. 1, a system architecture diagram of a first embodiment of a distributed storage system according to the present invention is shown.
In the present embodiment, the distributed storage system includes a plurality of storage nodes 3 (for example, one host is one storage node), and a plurality of OSDs are provided in each storage node 3.
In some application scenarios, an electronic device 1 is further provided in the distributed storage system, and the electronic device 1 is communicatively connected to each storage node 3 through a network 2.
In some application scenarios, the electronic device 1 described above is provided independently of the distributed storage system and is communicatively connected to it (e.g., via the network 2).
In this embodiment, the minimum storage unit in the above-mentioned distributed storage system is object data (object), one object data is a data block with a size not exceeding a predetermined value (for example, 4 MB), each object data is mapped into a corresponding PG, and the distributed storage system does not directly operate the object data, but performs data processing (for example, data addressing, data migration, etc.) in the PG as a basic unit.
The distributed storage system supports a multi-copy policy. For example, if the number of copies of the distributed storage system is preset to three, then three copies exist of all object data corresponding to one PG in the distributed storage system, and the copies of the object data corresponding to that PG are stored in three corresponding OSDs.
In the following, various embodiments of the present invention will be presented based on the above-described distributed system and related devices.
The invention provides a creation program of a standby storage pool.
Referring to FIG. 2, a schematic diagram of an operating environment of a first embodiment of a creation program 10 of a spare storage pool according to the present invention is shown.
In the present embodiment, the creation program 10 of the spare storage pool is installed and run in the electronic device 1. The electronic device 1 may be a computing device such as a desktop computer, a notebook computer, a palmtop computer, or a server. The electronic device 1 may include, but is not limited to, a memory 11 and a processor 12, which communicate with each other via a system bus. FIG. 2 shows only the electronic device 1 with components 11 and 12, but it should be understood that not all of the illustrated components are required, and that more or fewer components may be implemented instead.
The memory 11 may in some embodiments be an internal storage unit of the electronic device 1, such as a hard disk or a memory of the electronic device 1. The memory 11 may in other embodiments also be an external storage device of the electronic apparatus 1, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card) or the like, which are provided on the electronic apparatus 1. Further, the memory 11 may also include both an internal storage unit and an external storage device of the electronic apparatus 1. The memory 11 is used for storing application software installed in the electronic device 1 and various data, such as program codes of the creation program 10 of the spare memory pool. The memory 11 may also be used to temporarily store data that has been output or is to be output.
The processor 12 may in some embodiments be a central processing unit (Central Processing Unit, CPU), microprocessor or other data processing chip for running program code or processing data stored in the memory 11, such as executing the creation program 10 of a spare memory pool, etc.
Referring to FIG. 3, a block diagram of a first embodiment of the creation program 10 of the spare storage pool according to the present invention is shown. In this embodiment, the creation program 10 of the spare storage pool may be divided into one or more modules, which are stored in the memory 11 and executed by one or more processors (the processor 12 in this embodiment) to implement the present invention. For example, in FIG. 3, the creation program 10 of the spare storage pool may be divided into a receiving module 101, a determining module 102, and a creating module 103. A module, as referred to in the present invention, is a series of computer program instruction segments capable of performing a specific function, and is better suited than the program as a whole to describing the execution of the creation program 10 of the spare storage pool in the electronic device 1, wherein:
a receiving module 101, configured to receive configuration data sent by a user.
The receiving module 101 provides a user setting interface, and generates configuration data according to input data of a user. The configuration data includes fault thresholds of the respective storage nodes or OSD identification information corresponding to the respective storage nodes.
The failure threshold of the storage node refers to a threshold of the number of OSDs that fail in the storage node.
In addition, in some embodiments, the configuration data further includes a number of copies of the backup Pool to be created and a number of PGs.
A determining module 102, configured to select, according to the configuration data, a plurality of OSDs from each storage node as standby OSDs.
The determining module 102 can select the standby OSDs from each storage node according to the configuration data in at least two modes:
Mode one:
The failure thresholds of the storage nodes are acquired from the configuration data, the configuration quantity of each storage node is determined according to its failure threshold, and that number of OSDs is randomly selected from each storage node as standby OSDs.
Wherein the number of configurations of the storage nodes is greater than or equal to the failure threshold. For example, if the failure threshold of a storage node is 2, 2 spare OSDs may be configured for the storage node, or more than two spare OSDs may be configured for the storage node. It should be noted that, in this embodiment, although the number of storage nodes configured is limited to be greater than or equal to the failure threshold, the number of storage nodes configured should not be too large, that is, too many spare OSDs should not be configured for a storage node, so as to avoid wasting resources.
Mode two:
The OSD identification information corresponding to each storage node is acquired from the configuration data, and the OSD corresponding to each piece of OSD identification information is taken as a standby OSD.
The two modes differ in that, in mode one, the user only needs to set the configuration quantity of each storage node, and the electronic device automatically selects OSDs from each storage node as standby OSDs according to that quantity; in mode two, the user directly selects which OSDs of each storage node serve as standby OSDs.
It should be noted that the total number of standby OSDs in this embodiment should be greater than or equal to a preset threshold, which may be set according to the number of copies of the standby Pool to be created. For example, the preset threshold is set equal to the number of copies of the standby Pool to be created.
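The two selection modes and the total-count check can be sketched in Python as follows. This is an illustrative sketch only: the function name, the dictionary shapes, and the choice to configure exactly the failure threshold (the smallest permitted quantity) in mode one are assumptions, not details taken from the embodiment.

```python
import random


def select_standby_osds(node_osds, fault_thresholds=None, explicit_ids=None,
                        pool_copies=1, rng=None):
    """Return {node: [standby OSD ids]} using mode one or mode two."""
    rng = rng or random.Random()
    standby = {}
    if explicit_ids is not None:
        # Mode two: the user names the standby OSDs of each node directly.
        for node, ids in explicit_ids.items():
            unknown = set(ids) - set(node_osds[node])
            if unknown:
                raise ValueError(f"unknown OSDs on {node}: {sorted(unknown)}")
            standby[node] = list(ids)
    else:
        # Mode one: derive the configuration quantity from the failure threshold.
        # Assumption: use exactly the threshold, the smallest allowed quantity.
        for node, threshold in fault_thresholds.items():
            standby[node] = rng.sample(node_osds[node], threshold)
    # The total must reach the preset threshold (here: the pool's copy count).
    total = sum(len(ids) for ids in standby.values())
    if total < pool_copies:
        raise ValueError("total standby OSDs is below the preset threshold")
    return standby


nodes = {"node-a": [0, 1, 2, 3], "node-b": [4, 5, 6, 7]}
chosen = select_standby_osds(nodes, fault_thresholds={"node-a": 2, "node-b": 1},
                             pool_copies=3, rng=random.Random(0))
named = select_standby_osds(nodes, explicit_ids={"node-a": [3]}, pool_copies=1)
```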
A creating module 103, configured to create a standby Pool according to the configuration data, create a mapping relationship between the standby Pool and the standby OSDs, and save the mapping relationship between the standby Pool and the standby OSDs.
After the mapping relationship between the standby Pool and the standby OSDs is created, the standby OSDs are used only to store the object data of the PGs in the standby Pool.
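As a rough illustration of the creating module, the sketch below builds a pool record from the configuration data, attaches the standby OSDs to it, and serializes the mapping. The `StandbyPool` record, the configuration keys, and the JSON storage format are all hypothetical; the embodiment does not specify how the mapping is persisted.

```python
import io
import json
from dataclasses import asdict, dataclass, field


@dataclass
class StandbyPool:
    """Hypothetical record of a standby pool and the OSDs mapped to it."""
    name: str
    copies: int
    pg_count: int
    osd_ids: list = field(default_factory=list)


def create_standby_pool(config, standby_osds):
    """Create the pool and its pool-to-OSD mapping from the configuration data."""
    osd_ids = sorted(osd for ids in standby_osds.values() for osd in ids)
    if len(osd_ids) < config["copies"]:
        raise ValueError("not enough standby OSDs for the requested copy count")
    return StandbyPool(config["name"], config["copies"], config["pg_count"], osd_ids)


def save_mapping(pool, stream):
    """Persist the mapping as JSON to any writable text stream."""
    json.dump(asdict(pool), stream)


config = {"name": "standby-pool", "copies": 3, "pg_count": 64}
pool = create_standby_pool(config, {"node-a": [3, 1], "node-b": [6]})
buffer = io.StringIO()
save_mapping(pool, buffer)
```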
According to the configuration data sent by the user, this embodiment selects a plurality of OSDs from each storage node as standby OSDs, creates a standby Pool, creates a mapping relationship between the standby Pool and the standby OSDs, and stores that mapping relationship. Compared with the prior art, this embodiment provides standby OSDs and a standby Pool in the distributed storage system, so that when an OSD in the distributed storage system fails, the standby OSDs and the standby Pool can be activated, reducing the impact of the OSD failure on the performance of the distributed storage system.
As shown in fig. 4, fig. 4 is a program block diagram of a second embodiment of the creation program of the spare storage pool of the present invention.
The present embodiment further includes a detection module 104 and a redirection module 105 based on the first embodiment, wherein:
A detecting module 104, configured to detect, in real time or at regular intervals, whether a preset event is triggered.
In this embodiment, the preset event includes the failure of one or more active OSDs, where the active OSDs are the OSDs in a storage node other than the standby OSDs. Whether this preset event is triggered is detected as follows: a heartbeat mechanism is used to detect whether an active OSD has failed. Detection messages are sent to each active OSD in real time or at regular intervals, and an active OSD is determined to have failed if it does not return a reply message within a preset time length.
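The heartbeat check can be sketched as a small monitor that records the time of each OSD's last reply and reports those whose reply is overdue. The class and method names are hypothetical, and the injectable `clock` parameter is an assumption added so the example can run without real waiting.

```python
import time


class HeartbeatMonitor:
    """Tracks the last reply time of each active OSD (illustrative sketch)."""

    def __init__(self, osd_ids, timeout_s, clock=time.monotonic):
        self._timeout_s = timeout_s
        self._clock = clock
        now = clock()
        self._last_reply = {osd: now for osd in osd_ids}

    def record_reply(self, osd_id):
        """Call when an OSD answers a detection message."""
        self._last_reply[osd_id] = self._clock()

    def failed_osds(self):
        """Return OSDs that have not replied within the preset time length."""
        now = self._clock()
        return sorted(osd for osd, seen in self._last_reply.items()
                      if now - seen > self._timeout_s)


fake_now = [0.0]  # simulated clock so the example is deterministic
monitor = HeartbeatMonitor([1, 2, 3], timeout_s=5.0, clock=lambda: fake_now[0])
fake_now[0] = 4.0
monitor.record_reply(1)
monitor.record_reply(3)
fake_now[0] = 6.0
failed = monitor.failed_osds()
```

In this run, OSD 2 last replied at time 0 and the clock now reads 6, so it exceeds the 5-second limit and is the only OSD reported as failed.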
In some application scenarios, the preset event further includes a write-request execution timeout on one or more active OSDs. Whether this preset event is triggered is detected as follows: timing starts when an active OSD receives a write request for object data. If the active OSD completes the write operation of the object data while the recorded duration is still less than a second preset duration, timing stops and it is determined that the active OSD has not timed out. If the active OSD has not completed the write operation of the object data when the recorded duration reaches the second preset duration, timing stops and it is determined that the active OSD has timed out.
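The timing rule for write requests can be sketched with a tracker that starts a timer per request and reports OSDs whose pending writes have reached the limit. The names below are hypothetical, and the simulated clock is an assumption made to keep the example deterministic.

```python
import time


class WriteTimeoutTracker:
    """Times write requests per active OSD (illustrative sketch)."""

    def __init__(self, limit_s, clock=time.monotonic):
        self._limit_s = limit_s          # the "second preset duration"
        self._clock = clock
        self._started = {}               # (osd_id, request_id) -> start time

    def start(self, osd_id, request_id):
        """Begin timing when the OSD receives the write request."""
        self._started[(osd_id, request_id)] = self._clock()

    def complete(self, osd_id, request_id):
        """Stop timing; return True if the write finished within the limit."""
        began = self._started.pop((osd_id, request_id))
        return self._clock() - began < self._limit_s

    def timed_out_osds(self):
        """Return OSDs with a pending write whose duration reached the limit."""
        now = self._clock()
        return sorted({osd for (osd, _), began in self._started.items()
                       if now - began >= self._limit_s})


now = [0.0]
tracker = WriteTimeoutTracker(limit_s=2.0, clock=lambda: now[0])
tracker.start(1, "w1")
tracker.start(2, "w2")
now[0] = 1.0
in_time = tracker.complete(1, "w1")
now[0] = 2.0
suspicious = tracker.timed_out_osds()
```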
In some application scenarios, the preset event further includes receiving a redirection instruction.
A redirection module 105, configured to intercept, when the detecting module 104 detects that the preset event is triggered, write requests sent to a preset storage location within a preset time interval, and to redirect the write requests to the standby Pool.
When one or more active OSDs fail, the PG corresponding to each piece of object data stored in a failed active OSD is determined according to the predetermined mapping relationship between object data and PGs, and each PG so determined is treated as a failed PG. The number of copies of all object data corresponding to all failed PGs is reduced from a first preset number to a second preset number (for example, from three copies to one copy). The failed active OSDs are then replaced one by one with standby OSDs: for example, a standby OSD is selected as a new active OSD, and the new active OSD replaces the failed active OSD. Next, the number of copies of all object data corresponding to all failed PGs is increased from the second preset number back to the first preset number (for example, from one copy to three copies). Then, according to the predetermined mapping relationship between PGs and active OSDs, the first preset number of active OSDs corresponding to each failed PG are taken as a failed OSD group; when a failed OSD group receives a write request for object data, the write request is redirected to the standby Pool, and the standby Pool executes the write request.
The new active OSD is selected as follows: a search is made for a standby OSD located in the same storage node as the failed active OSD. If one is found, that standby OSD is used as the new active OSD. If none is found, a standby OSD is randomly selected as the new active OSD. The failed active OSD is replaced with the new active OSD as follows: the predetermined mapping relationship between the device identification information of the failed active OSD and the location information (such as a network port value) of the failed active OSD is released; the device identification information of the failed active OSD is assigned to the new active OSD as its device identification information; and the mapping relationship between the device identification information of the new active OSD and the location information of the new active OSD is re-established and saved, thereby replacing the failed active OSD with the new active OSD.
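The selection rule and the identity handover can be sketched as follows. All names and data shapes here are hypothetical: `id_to_location` stands for the mapping from device identification information to location information, and locations are written as illustrative "node:port" strings.

```python
import random


def pick_replacement(failed_node, standby_by_node, rng=None):
    """Prefer a standby OSD on the failed OSD's node; otherwise pick at random."""
    rng = rng or random.Random()
    local = standby_by_node.get(failed_node, [])
    if local:
        return failed_node, local[0]
    candidates = [(node, osd) for node, ids in standby_by_node.items() for osd in ids]
    if not candidates:
        raise LookupError("no standby OSD is available")
    return rng.choice(candidates)


def replace_failed_osd(failed_id, failed_node, id_to_location,
                       standby_by_node, standby_locations, rng=None):
    """Hand the failed OSD's identity over to a standby OSD."""
    node, standby_id = pick_replacement(failed_node, standby_by_node, rng)
    standby_by_node[node].remove(standby_id)      # no longer a standby
    id_to_location.pop(failed_id, None)           # release the old mapping
    # Re-establish the mapping: same device ID, location of the new OSD.
    id_to_location[failed_id] = standby_locations[standby_id]
    return standby_id


id_to_location = {"osd.3": "node-a:6800"}
standby_by_node = {"node-a": ["osd.9"], "node-b": ["osd.10"]}
standby_locations = {"osd.9": "node-a:6812", "osd.10": "node-b:6812"}
used = replace_failed_osd("osd.3", "node-a", id_to_location,
                          standby_by_node, standby_locations)
```

Keeping the device identification unchanged means that any existing mapping that refers to the failed OSD by its ID resolves to the new OSD's location after the handover.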
When the preset event is a write-request execution timeout on one or more active OSDs, each active OSD that has timed out is marked as a suspicious OSD. Then, according to the predetermined mapping relationship between active OSDs and active OSD groups, all active OSD groups corresponding to each suspicious OSD are determined and marked as suspicious OSD groups. When a suspicious OSD receives a write request, the write request is redirected to the standby Pool, and the standby Pool executes the write request.
In this embodiment, when a preset event is triggered, a write request sent to a preset storage location is intercepted, and the write request is redirected to the standby Pool. The method can reduce the influence of main OSD faults or execution overtime on the write performance of the distributed storage system, and effectively ensures the execution efficiency of the write request.
Further, in other embodiments, the program further includes a prompt module (not shown).
The prompt module is configured to detect, in real time or at regular intervals, the number of standby OSDs in each storage node and the total number of standby OSDs, and to issue prompt information when the number of standby OSDs in a storage node is less than or equal to the failure threshold, or when the total number of standby OSDs is less than or equal to a preset threshold (for example, the number of copies of the standby Pool).
In addition, the invention provides a method for creating the standby storage pool.
Fig. 5 is a flowchart of a first embodiment of the method for creating a backup storage pool according to the present invention.
In this embodiment, the method is applicable to an electronic device, where the electronic device is communicatively connected to a plurality of storage nodes, and each storage node is provided with a plurality of OSDs, and the method includes the steps of:
Step S10: receiving configuration data sent by a user.
A user setting interface is provided, and the configuration data is generated from the data input by the user. The configuration data includes the failure threshold of each storage node, or the OSD identification information corresponding to each storage node.
The failure threshold of a storage node is a threshold on the number of OSDs that fail in that storage node.
In addition, in some embodiments, the configuration data further includes the number of copies and the number of PGs of the standby Pool to be created.
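As an illustration of what such configuration data might look like, the following Python sketch models it as a small record. All field names are assumptions made for the example; the source only states which kinds of information the configuration data carries.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class StandbyPoolConfig:
    # Failure threshold per storage node (hypothetical field name).
    fault_thresholds: Dict[str, int] = field(default_factory=dict)
    # OSD identification information per storage node (hypothetical field name).
    standby_osd_ids: Dict[str, List[str]] = field(default_factory=dict)
    # Optional settings of the standby Pool to be created.
    replica_count: Optional[int] = None
    pg_count: Optional[int] = None

    def validate(self) -> None:
        """Require thresholds or explicit OSD ids, and reject negative thresholds."""
        if not self.fault_thresholds and not self.standby_osd_ids:
            raise ValueError("need failure thresholds or OSD identification information")
        if any(t < 0 for t in self.fault_thresholds.values()):
            raise ValueError("failure thresholds must be non-negative")
```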
Step S20: selecting, according to the configuration data, a plurality of OSDs from each storage node as standby OSDs.
Step S20 can be implemented in at least two modes:
Mode one:
The failure threshold of each storage node is obtained from the configuration data, the configured quantity for each storage node is determined according to its failure threshold, and that quantity of OSDs is selected at random from the storage node as standby OSDs.
The configured quantity for a storage node is greater than or equal to its failure threshold. For example, if the failure threshold of a storage node is 2, two standby OSDs, or more than two, may be configured for that storage node. It should be noted that, although this embodiment requires the configured quantity to be greater than or equal to the failure threshold, the configured quantity should not be too large; that is, too many standby OSDs should not be configured for a storage node, so as to avoid wasting resources.
Mode two:
The OSD identification information corresponding to each storage node is obtained from the configuration data, and the OSD corresponding to each piece of OSD identification information is taken as a standby OSD.
The two modes differ as follows. In mode one, the user only needs to set the configured quantity for each storage node, and the electronic device automatically selects OSDs from each storage node as standby OSDs according to that quantity. In mode two, the user personally selects, from the OSDs of each storage node, the OSDs that are to serve as standby OSDs.
It should be noted that the total number of standby OSDs in this embodiment should be greater than or equal to a preset threshold, which may be set according to the number of copies of the standby Pool to be created. For example, the preset threshold is set equal to the number of copies of the standby Pool to be created.
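The two selection modes and the check on the total number of standby OSDs can be sketched as follows. The function names, the "extra" parameter, and the dictionary layout (storage node name mapped to a list of OSD identifiers) are assumptions made for this example.

```python
import random
from typing import Dict, List, Optional


def select_by_threshold(node_osds: Dict[str, List[str]],
                        fault_thresholds: Dict[str, int],
                        extra: int = 0,
                        rng: Optional[random.Random] = None) -> Dict[str, List[str]]:
    """Mode one: randomly pick a configured quantity (>= failure threshold) per node."""
    rng = rng or random.Random()
    chosen = {}
    for node, osds in node_osds.items():
        quantity = fault_thresholds[node] + extra  # configured quantity
        if quantity > len(osds):
            raise ValueError(f"node {node} has too few OSDs")
        chosen[node] = rng.sample(osds, quantity)
    return chosen


def select_by_ids(node_osds: Dict[str, List[str]],
                  standby_ids: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Mode two: use exactly the OSDs the user identified for each node."""
    chosen = {}
    for node, ids in standby_ids.items():
        unknown = set(ids) - set(node_osds[node])
        if unknown:
            raise ValueError(f"unknown OSDs on node {node}: {sorted(unknown)}")
        chosen[node] = list(ids)
    return chosen


def check_total(chosen: Dict[str, List[str]], preset_threshold: int) -> int:
    """The total number of standby OSDs must reach the preset threshold."""
    total = sum(len(osds) for osds in chosen.values())
    if total < preset_threshold:
        raise ValueError("not enough standby OSDs for the standby Pool")
    return total
```

Keeping "extra" small reflects the remark above that configuring many more standby OSDs than the failure threshold wastes resources.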
Step S30: creating a standby Pool according to the configuration data, creating a mapping relationship between the standby Pool and the standby OSDs, and storing the mapping relationship between the standby Pool and the standby OSDs.
After the mapping relationship between the standby Pool and the standby OSDs is created, the standby OSDs are used only to store the object data of the PGs in the standby Pool.
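One way to picture step S30 is a small in-memory registry that records the Pool-to-OSD mapping and enforces that standby OSDs hold data only for the standby Pool. The class and method names below are hypothetical; a real system would persist the mapping rather than keep it in a dictionary.

```python
from typing import Dict, Iterable, Set


class PoolRegistry:
    """Hypothetical registry of standby Pools and the OSDs reserved for them."""

    def __init__(self) -> None:
        self.pools: Dict[str, Dict[str, int]] = {}
        self.pool_to_osds: Dict[str, Set[str]] = {}
        self.reserved: Set[str] = set()

    def create_standby_pool(self, name: str, replica_count: int, pg_count: int,
                            standby_osds: Iterable[str]) -> None:
        osds = set(standby_osds)
        if len(osds) < replica_count:
            raise ValueError("fewer standby OSDs than copies of the standby Pool")
        self.pools[name] = {"replica_count": replica_count, "pg_count": pg_count}
        self.pool_to_osds[name] = osds  # stored mapping: standby Pool -> standby OSDs
        self.reserved |= osds

    def may_store(self, osd_id: str, pool_name: str) -> bool:
        """Standby OSDs accept data only for their standby Pool; other OSDs
        accept data only for Pools that are not standby Pools."""
        if osd_id in self.reserved:
            return osd_id in self.pool_to_osds.get(pool_name, set())
        return pool_name not in self.pool_to_osds
```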
In this embodiment, a plurality of OSDs are selected from each storage node as standby OSDs according to the configuration data sent by the user, a standby Pool is created together with the mapping relationship between the standby Pool and the standby OSDs, and that mapping relationship is then stored. Compared with the prior art, this embodiment provides standby OSDs and a standby Pool in the distributed storage system, so that when an OSD in the distributed storage system fails, the standby OSDs and the standby Pool can be put into use, reducing the influence of the OSD failure on the performance of the distributed storage system.
Fig. 6 is a flowchart of a second embodiment of the method for creating a backup storage pool according to the present invention.
This embodiment builds on the first embodiment. After step S30, the method further includes:
Step S40: detecting, in real time or at regular intervals, whether a preset event is triggered.
In this embodiment, the preset event includes the failure of one or more active OSDs, where the active OSDs are the OSDs in the storage nodes other than the standby OSDs. Whether this preset event is triggered is detected with a heartbeat mechanism: detection messages are sent to each active OSD in real time or at regular intervals, and an active OSD is determined to have failed if it does not return a reply message within a preset duration.
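The heartbeat check can be illustrated with a pure function over timestamps. The function name and the two dictionaries (time each detection message was sent, time each reply arrived) are assumptions for this sketch; sending the messages themselves is left out.

```python
from typing import Dict, List


def detect_failed(probe_sent: Dict[str, float],
                  reply_received: Dict[str, float],
                  now: float,
                  preset_duration: float) -> List[str]:
    """Return the active OSDs that did not reply within the preset duration."""
    failed = []
    for osd, sent_at in probe_sent.items():
        replied_at = reply_received.get(osd)
        answered = (replied_at is not None
                    and sent_at <= replied_at <= sent_at + preset_duration)
        # Only judge an OSD once its reply window has fully elapsed.
        if not answered and now - sent_at >= preset_duration:
            failed.append(osd)
    return failed
```

In this sketch a reply that arrives after the window has closed still counts as a failure, which is one possible reading of "within a preset duration".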
In some application scenarios, the preset event further includes one or more active OSDs timing out while executing a write request. Whether this preset event is triggered is detected as follows: timing starts when a write request for object data is received from an active OSD. If the active OSD completes the write operation for the object data while the recorded duration is still less than a second preset duration, timing stops and the active OSD is determined not to have timed out. If the active OSD has not completed the write operation when the recorded duration reaches the second preset duration, timing stops and the active OSD is determined to have timed out.
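The timing rule can be written as a small decision function. The name write_timed_out and its parameters are hypothetical; times are plain numbers in seconds, and "limit" stands for the second preset duration.

```python
from typing import Optional


def write_timed_out(started_at: float, completed_at: Optional[float],
                    now: float, limit: float) -> bool:
    """Return True once the write is known to have timed out.

    A write that completed in less than `limit` never times out; a write
    still pending (or completed too late) times out when `limit` is reached.
    """
    if completed_at is not None and completed_at - started_at < limit:
        return False
    return now - started_at >= limit
```

A False result for a pending write only means the limit has not been reached yet, so a caller would re-evaluate it until the write completes or the limit elapses.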
In some application scenarios, the preset event further includes receiving a redirection instruction.
Step S50: when a triggered preset event is detected, intercepting the write requests sent to a preset storage location within a preset time interval, and redirecting the write requests to the standby Pool.
When one or more active OSDs fail, the PG corresponding to each piece of object data stored in a failed active OSD is determined according to the predetermined mapping relationship between object data and PGs, and each PG so determined is taken as a failed PG. The number of copies of all object data corresponding to all failed PGs is reduced from a first preset number to a second preset number (for example, from three copies to one copy). The failed active OSDs are then replaced one by one with standby OSDs: for example, a standby OSD is selected as a new active OSD, and the new active OSD replaces the failed active OSD. The number of copies of all object data corresponding to all failed PGs is then increased from the second preset number back to the first preset number (for example, from one copy to three copies). Next, according to the predetermined mapping relationship between PGs and active OSDs, the first preset number of active OSDs corresponding to each failed PG are taken as a failed OSD group; when a failed OSD group receives a write request for object data, the write request is redirected to the standby Pool and executed by the standby Pool.
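The order of operations in this failure handling can be sketched as below. Every name is an assumption for illustration: the mappings are plain dictionaries, "replace" is a caller-supplied callback that swaps in a standby OSD, and the Pools are represented only by the labels "primary" and "standby".

```python
from typing import Callable, Dict, List


def handle_failure(failed_id: str,
                   osd_objects: Dict[str, List[str]],
                   object_to_pg: Dict[str, str],
                   pg_to_osds: Dict[str, List[str]],
                   pg_replicas: Dict[str, int],
                   replace: Callable[[str], None],
                   full: int = 3, reduced: int = 1) -> Dict[str, List[str]]:
    """Run the described steps in order and return the failed OSD groups."""
    # 1. PGs of the objects stored on the failed OSD become failed PGs.
    failed_pgs = {object_to_pg[obj] for obj in osd_objects.get(failed_id, [])}
    # 2. Reduce the copy count of the failed PGs.
    for pg in failed_pgs:
        pg_replicas[pg] = reduced
    # 3. Replace the failed active OSD with a standby OSD.
    replace(failed_id)
    # 4. Restore the copy count.
    for pg in failed_pgs:
        pg_replicas[pg] = full
    # 5. The active OSDs mapped to each failed PG form a failed OSD group.
    return {pg: list(pg_to_osds[pg]) for pg in failed_pgs}


def pool_for_write(pg: str, failed_groups: Dict[str, List[str]]) -> str:
    """Writes aimed at a failed OSD group go to the standby Pool."""
    return "standby" if pg in failed_groups else "primary"
```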
The new active OSD is selected as follows: search for a standby OSD located on the same storage node as the failed active OSD. If one is found, it is used as the new active OSD; if not, a standby OSD is selected at random as the new active OSD. Replacing the failed active OSD with the new active OSD is implemented as follows: release the preset mapping relationship between the device identification information of the failed active OSD and the position information (for example, a network port value) of the failed active OSD, assign the device identification information of the failed active OSD to the new active OSD as its device identification information, and re-establish and store the mapping relationship between the device identification information of the new active OSD and the position information of the new active OSD, thereby replacing the failed active OSD with the new active OSD.
When the preset event is that one or more active OSDs time out while executing a write request, each active OSD that timed out is marked as a suspicious OSD. Then, according to the predetermined mapping relationship between active OSDs and active OSD groups, all active OSD groups corresponding to each suspicious OSD are determined and marked as suspicious OSD groups. When a suspicious OSD receives a write request, the write request is redirected to the standby Pool and executed by the standby Pool.
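A short sketch of the marking and redirection for the timeout case follows. The function names and the osd_to_groups dictionary are hypothetical, and the Pools are again represented only by labels.

```python
from typing import Dict, Iterable, List, Set, Tuple


def mark_suspicious(timed_out: Iterable[str],
                    osd_to_groups: Dict[str, List[str]]) -> Tuple[Set[str], Set[str]]:
    """Mark timed-out active OSDs, and every group containing them, as suspicious."""
    suspicious_osds = set(timed_out)
    suspicious_groups: Set[str] = set()
    for osd in suspicious_osds:
        suspicious_groups.update(osd_to_groups.get(osd, []))
    return suspicious_osds, suspicious_groups


def pool_for_osd_write(target_osd: str, suspicious_osds: Set[str]) -> str:
    """A write request that reaches a suspicious OSD goes to the standby Pool."""
    return "standby" if target_osd in suspicious_osds else "primary"
```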
In this embodiment, when a preset event is triggered, a write request sent to a preset storage location is intercepted, and the write request is redirected to the standby Pool. The method can reduce the influence of main OSD faults or execution overtime on the write performance of the distributed storage system, and effectively ensures the execution efficiency of the write request.
Further, in other embodiments, after step S30, the method further includes:
detecting, in real time or at regular intervals, the number of standby OSDs in each storage node and the total number of standby OSDs, and issuing prompt information when the number of standby OSDs in a storage node is less than or equal to the failure threshold, or when the total number of standby OSDs is less than or equal to a preset threshold (for example, the number of copies of the standby Pool).
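The two prompt conditions can be checked with a function such as the following. The name standby_alerts, the message wording, and the dictionary inputs are assumptions made for this example.

```python
from typing import Dict, List


def standby_alerts(standby_per_node: Dict[str, int],
                   fault_threshold: Dict[str, int],
                   total_threshold: int) -> List[str]:
    """Return one prompt message per condition that is met."""
    alerts = []
    for node, count in standby_per_node.items():
        # Per-node condition: standby count <= failure threshold of that node.
        if count <= fault_threshold[node]:
            alerts.append(f"node {node}: {count} standby OSD(s) left")
    total = sum(standby_per_node.values())
    # Global condition: total standby count <= preset threshold.
    if total <= total_threshold:
        alerts.append(f"total standby OSDs: {total}")
    return alerts
```

Running this check periodically, or whenever a standby OSD is consumed as a replacement, would correspond to the "at regular intervals" and "in real time" options respectively.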
Further, the present invention also proposes a computer readable storage medium storing a program for creating a backup storage pool, where the program for creating a backup storage pool is executable by at least one processor, so that the at least one processor performs the method for creating a backup storage pool in any of the foregoing embodiments.
The foregoing description is only of the preferred embodiments of the present invention and is not intended to limit the scope of the invention, and all equivalent structural changes made by the description of the present invention and the accompanying drawings or direct/indirect application in other related technical fields are included in the scope of the invention.

Claims (8)

1. An electronic device, wherein the electronic device is communicatively connected to a plurality of storage nodes, each storage node is provided with a plurality of object storage devices, the electronic device includes a memory and a processor, the memory stores a creation program of a standby storage pool, and the creation program of the standby storage pool is executed by the processor to implement the following steps:
a receiving step: receiving configuration data sent by a user;
a determining step: according to the configuration data, selecting a plurality of object storage devices from each storage node as standby object storage devices;
a creating step: creating a standby storage pool according to the configuration data, creating a mapping relationship between the standby storage pool and the standby object storage devices, and storing the mapping relationship between the standby storage pool and the standby object storage devices;
wherein the determining step includes:
acquiring fault thresholds of all storage nodes from the configuration data, respectively determining the configuration quantity of all the storage nodes according to the fault thresholds of all the storage nodes, and randomly selecting object storage devices with the configuration quantity from all the storage nodes as standby object storage devices, wherein the configuration quantity of the storage nodes is larger than or equal to the fault thresholds;
or, obtaining object storage device identification information corresponding to each storage node from the configuration data, and taking the object storage device corresponding to each object storage device identification information as a standby object storage device.
2. The electronic device of claim 1, wherein the creation program of the standby storage pool, when executed by the processor, further implements the following steps after the creating step:
a detecting step: detecting, in real time or at regular intervals, whether a preset event is triggered;
a redirecting step: when the preset event is detected to be triggered, intercepting a write request sent to a preset storage location within a preset time interval, and redirecting the write request to the standby storage pool.
3. The electronic device of claim 2, wherein the preset event comprises a failure of one or more primary object storage devices, the primary object storage devices being the object storage devices in the storage nodes other than the standby object storage devices;
the creation program of the standby storage pool, when executed by the processor, further implements the following step after the detecting step:
replacing the failed primary object storage devices one by one with the standby object storage devices.
4. A method of creating a standby storage pool, applicable to an electronic device, wherein the electronic device is communicatively connected to a plurality of storage nodes, each of the storage nodes being provided with a plurality of object storage devices, the method comprising the steps of:
a receiving step: receiving configuration data sent by a user;
a determining step: according to the configuration data, selecting a plurality of object storage devices from each storage node as standby object storage devices;
a creating step: creating a standby storage pool according to the configuration data, creating a mapping relationship between the standby storage pool and the standby object storage devices, and storing the mapping relationship between the standby storage pool and the standby object storage devices;
wherein the determining step includes:
acquiring fault thresholds of all storage nodes from the configuration data, respectively determining the configuration quantity of all the storage nodes according to the fault thresholds of all the storage nodes, and randomly selecting object storage devices with the configuration quantity from all the storage nodes as standby object storage devices, wherein the configuration quantity of the storage nodes is larger than or equal to the fault thresholds;
or, obtaining object storage device identification information corresponding to each storage node from the configuration data, and taking the object storage device corresponding to each object storage device identification information as a standby object storage device.
5. The method of creating a standby storage pool of claim 4, wherein after the creating step, the method further comprises:
a detecting step: detecting, in real time or at regular intervals, whether a preset event is triggered;
a redirecting step: when the preset event is detected to be triggered, intercepting a write request sent to a preset storage location within a preset time interval, and redirecting the write request to the standby storage pool.
6. The method of claim 5, wherein the preset event comprises a failure of one or more primary object storage devices, the primary object storage devices being the object storage devices in the storage nodes other than the standby object storage devices;
after the detecting step, the method further comprises:
replacing the failed primary object storage devices one by one with the standby object storage devices.
7. A distributed storage system, comprising an electronic device and a plurality of storage nodes, wherein the electronic device is communicatively connected to each storage node, the electronic device comprises a memory and a processor, the memory stores a creation program of a standby storage pool, and the creation program of the standby storage pool, when executed by the processor, implements the following steps:
a receiving step: receiving configuration data sent by a user;
a determining step: according to the configuration data, selecting a plurality of object storage devices from each storage node as standby object storage devices;
a creating step: creating a standby storage pool according to the configuration data, creating a mapping relationship between the standby storage pool and the standby object storage devices, and storing the mapping relationship between the standby storage pool and the standby object storage devices;
wherein the determining step includes:
acquiring fault thresholds of all storage nodes from the configuration data, respectively determining the configuration quantity of all the storage nodes according to the fault thresholds of all the storage nodes, and randomly selecting object storage devices with the configuration quantity from all the storage nodes as standby object storage devices, wherein the configuration quantity of the storage nodes is larger than or equal to the fault thresholds;
or, obtaining object storage device identification information corresponding to each storage node from the configuration data, and taking the object storage device corresponding to each object storage device identification information as a standby object storage device.
8. A computer-readable storage medium storing a creation program of a standby storage pool, the creation program being executable by at least one processor to cause the at least one processor to perform the steps of the method of creating a standby storage pool according to any one of claims 4 to 6.
CN201811438576.0A 2018-11-28 2018-11-28 Electronic device, method for creating backup storage pool, and computer-readable storage medium Active CN109669822B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811438576.0A CN109669822B (en) 2018-11-28 2018-11-28 Electronic device, method for creating backup storage pool, and computer-readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811438576.0A CN109669822B (en) 2018-11-28 2018-11-28 Electronic device, method for creating backup storage pool, and computer-readable storage medium

Publications (2)

Publication Number Publication Date
CN109669822A CN109669822A (en) 2019-04-23
CN109669822B true CN109669822B (en) 2023-06-06

Family

ID=66143303

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811438576.0A Active CN109669822B (en) 2018-11-28 2018-11-28 Electronic device, method for creating backup storage pool, and computer-readable storage medium

Country Status (1)

Country Link
CN (1) CN109669822B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110287151B (en) * 2019-05-20 2023-08-22 平安科技(深圳)有限公司 Distributed storage system, data writing method, device and storage medium
CN112835511B (en) * 2019-11-25 2022-09-20 浙江宇视科技有限公司 Data writing method, device, equipment and medium of distributed storage cluster
CN111290909A (en) * 2020-01-19 2020-06-16 山东汇贸电子口岸有限公司 System and method for monitoring and alarming ceph cluster
CN112363980A (en) * 2020-11-03 2021-02-12 网宿科技股份有限公司 Data processing method and device for distributed system
CN114510379B (en) * 2022-04-21 2022-11-01 山东百盟信息技术有限公司 Distributed array video data storage device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107678918A (en) * 2017-09-26 2018-02-09 郑州云海信息技术有限公司 The OSD heartbeat mechanisms method to set up and device of a kind of distributed file system
CN107729185A (en) * 2017-10-26 2018-02-23 新华三技术有限公司 A kind of fault handling method and device
CN108287660A (en) * 2017-01-09 2018-07-17 中国移动通信集团河北有限公司 Date storage method and equipment
CN108694209A (en) * 2017-04-11 2018-10-23 华为技术有限公司 Object-based distributed index method and client

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9774678B2 (en) * 2009-10-29 2017-09-26 International Business Machines Corporation Temporarily storing data in a dispersed storage network

Also Published As

Publication number Publication date
CN109669822A (en) 2019-04-23

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant