CN109669822B - Electronic device, method for creating backup storage pool, and computer-readable storage medium - Google Patents
- Publication number
- CN109669822B CN109669822B CN201811438576.0A CN201811438576A CN109669822B CN 109669822 B CN109669822 B CN 109669822B CN 201811438576 A CN201811438576 A CN 201811438576A CN 109669822 B CN109669822 B CN 109669822B
- Authority
- CN
- China
- Prior art keywords
- storage
- standby
- pool
- osd
- object storage
- Prior art date
- Legal status (the status listed is an assumption and is not a legal conclusion)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/20—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
- G06F11/202—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant
- G06F11/2023—Failover techniques
- G06F11/2033—Failover techniques switching over of hardware resources
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0602—Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
- G06F3/062—Securing storage systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0638—Organizing or formatting or addressing of data
- G06F3/0643—Management of files
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0668—Interfaces specially adapted for storage systems adopting a particular infrastructure
- G06F3/067—Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention relates to distributed storage technology, and discloses an electronic device, a method for creating a standby storage pool, and a computer-readable storage medium. According to configuration data sent by a user, a plurality of OSDs are selected from the storage nodes to serve as standby OSDs, a standby Pool is created, a mapping relation between the standby Pool and the standby OSDs is created, and that mapping relation is then stored. Compared with the prior art, the invention provides standby OSDs and a standby Pool in the distributed storage system, so that when an OSD in the system fails, a standby OSD and the standby Pool can be activated, reducing the impact of the OSD failure on the performance of the distributed storage system.
Description
Technical Field
The present invention relates to the field of distributed storage technologies, and in particular, to an electronic device, a method for creating a backup storage pool, and a computer readable storage medium.
Background
The CEPH distributed file system is a distributed storage system offering large capacity, high performance, and high reliability. The core component of CEPH is the OSD (Object Storage Device), which manages a separate hard disk and provides a read-write access interface for object-based storage. A CEPH cluster is composed of a number of independent OSDs, and the number of OSDs can be increased or decreased dynamically. The CEPH client distributes object data (objects) to different OSDs for storage via the CRUSH algorithm. CRUSH is a pseudo-random distribution algorithm: it first assigns object data to a Placement Group (PG) in a storage Pool (Pool) by a hash of the object, then computes the OSDs that store the PG, so that all object data belonging to the same PG is stored in the target OSDs corresponding to that PG.
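The two-stage placement just described (object to PG by hash, PG to OSDs) can be sketched as follows. The hash and OSD-selection steps here are simplified stand-ins for illustration only: MD5 is used merely as a convenient deterministic hash, and the consecutive-OSD pick is not CEPH's actual rjenkins/CRUSH implementation.

```python
import hashlib

def object_to_pg(object_name: str, pg_count: int) -> int:
    """Map an object to a placement group (PG) by hashing its name.
    MD5 is used here only as a deterministic hash; real CEPH uses
    the rjenkins hash, not this scheme."""
    digest = hashlib.md5(object_name.encode()).hexdigest()
    return int(digest, 16) % pg_count

def pg_to_osds(pg_id: int, osd_ids: list, replica_count: int) -> list:
    """Pick the OSDs holding a PG's replicas: a deterministic stand-in
    for the CRUSH calculation (consecutive OSDs starting from the PG id)."""
    start = pg_id % len(osd_ids)
    return [osd_ids[(start + i) % len(osd_ids)] for i in range(replica_count)]

# An object is first assigned to a PG, then the PG to its target OSDs.
pg = object_to_pg("volume1/object42", pg_count=128)
osds = pg_to_osds(pg, osd_ids=[0, 1, 2, 3, 4, 5], replica_count=3)
```

Because both steps are deterministic, any client can recompute the same placement without consulting a central table, which is the property the CRUSH step provides in CEPH.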
Currently, no standby OSD or standby Pool is provided in the CEPH distributed file system, so when an OSD in CEPH fails, no standby OSD or standby Pool can be activated, which degrades the storage performance of CEPH.
Disclosure of Invention
It is a primary object of the present invention to provide an electronic device, a method for creating a backup storage Pool, and a computer-readable storage medium, which aim to provide a standby OSD and a standby Pool for CEPH.
In order to achieve the above object, the present invention provides an electronic device, which is communicatively connected to a plurality of storage nodes, each of the storage nodes being provided with a plurality of object storage devices, the electronic device including a memory and a processor, the memory storing a creation program of a standby storage pool, the creation program of the standby storage pool implementing the following steps when executed by the processor:
a receiving step: receiving configuration data sent by a user;
a determining step: selecting, according to the configuration data, a plurality of object storage devices from each storage node as standby object storage devices;
a creating step: creating a standby storage pool according to the configuration data, creating a mapping relation between the standby storage pool and the standby object storage devices, and storing the mapping relation.
Preferably, the determining step includes:
acquiring the failure threshold of each storage node from the configuration data, determining the configured number for each storage node according to its failure threshold, and randomly selecting the configured number of object storage devices from each storage node as standby object storage devices, wherein the configured number for a storage node is greater than or equal to its failure threshold;
or, obtaining object storage device identification information corresponding to each storage node from the configuration data, and taking the object storage device corresponding to each object storage device identification information as a standby object storage device.
Preferably, when executed by the processor, the creation program of the standby storage pool further implements, after the creating step, the steps of:
a detecting step: detecting, in real time or periodically, whether a preset event is triggered;
a redirecting step: when the preset event is detected, intercepting a write request sent to a preset storage location within a preset time interval, and redirecting the write request to the standby storage pool.
Preferably, the preset event includes failure of one or more primary object storage devices, where the primary object storage devices are object storage devices in the storage node except for standby object storage devices;
the processor executes the creation program of the spare storage pool, and after the detecting step, further implements the steps of:
the standby object storage devices are used for replacing the failed main object storage devices one by one.
In addition, in order to achieve the above object, the present invention further provides a method for creating a backup storage pool, which is applicable to an electronic device, where the electronic device is communicatively connected to a plurality of storage nodes, and a plurality of object storage devices are disposed in each storage node, and the method includes the steps of:
a receiving step: receiving configuration data sent by a user;
a determining step: selecting, according to the configuration data, a plurality of object storage devices from each storage node as standby object storage devices;
a creating step: creating a standby storage pool according to the configuration data, creating a mapping relation between the standby storage pool and the standby object storage devices, and storing the mapping relation.
Preferably, the determining step includes:
acquiring the failure threshold of each storage node from the configuration data, determining the configured number for each storage node according to its failure threshold, and randomly selecting the configured number of object storage devices from each storage node as standby object storage devices, wherein the configured number for a storage node is greater than or equal to its failure threshold;
or, obtaining object storage device identification information corresponding to each storage node from the configuration data, and taking the object storage device corresponding to each object storage device identification information as a standby object storage device.
Preferably, after the creating step, the method further comprises:
a detecting step: detecting, in real time or periodically, whether a preset event is triggered;
a redirecting step: when the preset event is detected, intercepting a write request sent to a preset storage location within a preset time interval, and redirecting the write request to the standby storage pool.
Preferably, the preset event includes failure of one or more primary object storage devices, where the primary object storage devices are object storage devices in the storage node except for standby object storage devices;
after the detecting step, the method further comprises:
the standby object storage devices are used for replacing the failed main object storage devices one by one.
In addition, in order to achieve the above object, the present invention provides a distributed storage system, where the distributed storage system includes an electronic device and a plurality of storage nodes, the electronic device is respectively connected to the storage nodes in a communication manner, the electronic device includes a memory and a processor, the memory stores a creating program of a standby storage pool, and when the creating program of the standby storage pool is executed by the processor, the following steps are implemented:
a receiving step: receiving configuration data sent by a user;
a determining step: selecting, according to the configuration data, a plurality of object storage devices from each storage node as standby object storage devices;
a creating step: creating a standby storage pool according to the configuration data, creating a mapping relation between the standby storage pool and the standby object storage devices, and storing the mapping relation.
Preferably, the determining step includes:
acquiring the failure threshold of each storage node from the configuration data, determining the configured number for each storage node according to its failure threshold, and randomly selecting the configured number of object storage devices from each storage node as standby object storage devices, wherein the configured number for a storage node is greater than or equal to its failure threshold;
or, obtaining object storage device identification information corresponding to each storage node from the configuration data, and taking the object storage device corresponding to each object storage device identification information as a standby object storage device.
Preferably, when executed by the processor, the creation program of the standby storage pool further implements, after the creating step, the steps of:
a detecting step: detecting, in real time or periodically, whether a preset event is triggered;
a redirecting step: when the preset event is detected, intercepting a write request sent to a preset storage location within a preset time interval, and redirecting the write request to the standby storage pool.
Preferably, the preset event includes failure of one or more primary object storage devices, where the primary object storage devices are object storage devices in the storage node except for standby object storage devices;
the processor executes the creation program of the spare storage pool, and after the detecting step, further implements the steps of:
the standby object storage devices are used for replacing the failed main object storage devices one by one.
In addition, to achieve the above object, the present invention also proposes a computer-readable storage medium storing a creation program of a backup storage pool, the creation program of the backup storage pool being executable by at least one processor to cause the at least one processor to perform the steps of the creation method of the backup storage pool as set forth in any one of the above.
According to the configuration data sent by a user, a plurality of OSDs are selected from the storage nodes to serve as standby OSDs, a standby Pool is created, a mapping relation between the standby Pool and the standby OSDs is created, and that mapping relation is then stored. Compared with the prior art, the invention provides standby OSDs and a standby Pool in the distributed storage system, so that when an OSD in the system fails, a standby OSD and the standby Pool can be activated, reducing the impact of the OSD failure on the performance of the distributed storage system.
Drawings
To more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings required in the description of the embodiments or the prior art are briefly introduced below. It is apparent that the drawings in the following description show only some embodiments of the present invention, and a person skilled in the art may obtain other drawings from them without inventive effort.
FIG. 1 is a schematic diagram of a system architecture of a first embodiment of a distributed storage system according to the present invention;
FIG. 2 is a schematic diagram of an operating environment of a first embodiment of a creator of a spare storage pool of the present invention;
FIG. 3 is a block diagram of a first embodiment of a creation process for a spare storage pool according to the present invention;
FIG. 4 is a block diagram of a second embodiment of a creation process for a spare storage pool according to the present invention;
FIG. 5 is a flowchart illustrating a first embodiment of a method for creating a backup storage pool according to the present invention;
fig. 6 is a flowchart of a second embodiment of a method for creating a backup storage pool according to the present invention.
The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
The principles and features of the present invention are described below with reference to the drawings. The examples are provided only to illustrate the invention and are not to be construed as limiting its scope.
Referring to fig. 1, a system architecture diagram of a first embodiment of a distributed storage system according to the present invention is shown.
In the present embodiment, the distributed storage system includes a plurality of storage nodes 3 (for example, one host is one storage node), and a plurality of OSDs are provided in each storage node 3.
In some application scenarios, an electronic device 1 is further provided in the distributed storage system, and the electronic device 1 is communicatively connected to each storage node 3 through a network 2.
In some application scenarios, the electronic device 1 described above is provided independently of the distributed storage system and is communicatively connected (e.g., via the network 2).
In this embodiment, the minimum storage unit in the distributed storage system is object data (an object). One piece of object data is a data block whose size does not exceed a predetermined value (for example, 4 MB). Each piece of object data is mapped into a corresponding PG, and the distributed storage system does not operate on object data directly but performs data processing (for example, data addressing and data migration) with the PG as the basic unit.
The distributed storage system supports a multi-copy policy. For example, if the number of copies is preset to three, then all object data corresponding to a PG exists in three copies, and each copy of that data is stored on one of three OSDs.
Various embodiments of the present invention are presented below based on the distributed storage system and related devices described above.
The invention provides a creation program of a standby storage pool.
Referring to FIG. 2, a schematic diagram of an operating environment of a first embodiment of a creator 10 of a spare storage pool according to the present invention is shown.
In the present embodiment, the creation program 10 of the spare storage pool is installed and runs in the electronic device 1. The electronic device 1 may be a computing device such as a desktop computer, a notebook computer, a palmtop computer, or a server. The electronic device 1 may include, but is not limited to, a memory 11 and a processor 12, which communicate with each other via a communication bus. Fig. 2 shows only the electronic device 1 with components 11 and 12, but it should be understood that not all of the illustrated components are required; more or fewer components may be implemented instead.
The memory 11 may in some embodiments be an internal storage unit of the electronic device 1, such as a hard disk or a memory of the electronic device 1. The memory 11 may in other embodiments also be an external storage device of the electronic apparatus 1, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card) or the like, which are provided on the electronic apparatus 1. Further, the memory 11 may also include both an internal storage unit and an external storage device of the electronic apparatus 1. The memory 11 is used for storing application software installed in the electronic device 1 and various data, such as program codes of the creation program 10 of the spare memory pool. The memory 11 may also be used to temporarily store data that has been output or is to be output.
The processor 12 may in some embodiments be a central processing unit (Central Processing Unit, CPU), microprocessor or other data processing chip for running program code or processing data stored in the memory 11, such as executing the creation program 10 of a spare memory pool, etc.
Referring to FIG. 3, a block diagram of a first embodiment of the creation program 10 of a spare storage pool according to the present invention is shown. In this embodiment, the creation program 10 of the spare storage pool may be divided into one or more modules, which are stored in the memory 11 and executed by one or more processors (the processor 12 in this embodiment) to carry out the present invention. For example, in Fig. 3, the creation program 10 of the spare storage pool may be divided into a receiving module 101, a determining module 102, and a creating module 103. A module in the present invention refers to a series of computer program instruction segments capable of performing a specific function, and is better suited than "program" to describing how the creation program 10 of the spare storage pool executes in the electronic device 1, wherein:
a receiving module 101, configured to receive configuration data sent by a user.
The receiving module 101 provides a user setting interface and generates the configuration data from the user's input. The configuration data includes either the failure threshold of each storage node or the OSD identification information corresponding to each storage node.
The failure threshold of a storage node is a threshold on the number of OSDs in that node that may fail.
In addition, in some embodiments, the configuration data further includes the number of copies and the number of PGs of the standby Pool to be created.
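For concreteness, the configuration data described above might look like the following, with one example per selection mode. Every field name here is an illustrative assumption, not taken from the patent.

```python
# Hypothetical shape of the user-supplied configuration data; all field
# names are illustrative assumptions, not taken from the patent.
config_mode_one = {
    # Mode one: per-node failure thresholds; the device picks standby OSDs.
    "failure_thresholds": {"node-a": 2, "node-b": 1, "node-c": 2},
    "standby_pool": {"copy_count": 3, "pg_count": 128},
}
config_mode_two = {
    # Mode two: the user names the standby OSDs per node directly.
    "standby_osd_ids": {"node-a": ["osd.3", "osd.7"], "node-b": ["osd.12"]},
    "standby_pool": {"copy_count": 3, "pg_count": 128},
}
```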
And the determining module 102 is configured to select, according to the configuration data, a plurality of OSDs from the storage nodes as standby OSDs.
The determining module 102 can select a plurality of OSDs from the storage nodes as standby OSDs according to the configuration data in at least two ways:
mode one:
and acquiring fault thresholds of all the storage nodes from the configuration data, respectively determining the configuration quantity of all the storage nodes according to the fault thresholds of all the storage nodes, and randomly selecting the OSD of the configuration quantity from all the storage nodes as standby OSD.
Wherein the number of configurations of the storage nodes is greater than or equal to the failure threshold. For example, if the failure threshold of a storage node is 2, 2 spare OSDs may be configured for the storage node, or more than two spare OSDs may be configured for the storage node. It should be noted that, in this embodiment, although the number of storage nodes configured is limited to be greater than or equal to the failure threshold, the number of storage nodes configured should not be too large, that is, too many spare OSDs should not be configured for a storage node, so as to avoid wasting resources.
Mode two:
The OSD identification information corresponding to each storage node is acquired from the configuration data, and the OSD identified by each piece of identification information is taken as a standby OSD.
The two modes differ as follows: in mode one, the user only sets the configured number for each storage node, and the electronic device automatically selects that many OSDs from each storage node as standby OSDs; in mode two, the user directly chooses which OSDs of each storage node serve as standby OSDs.
It should be noted that the total number of standby OSDs in this embodiment should be greater than or equal to a preset threshold, which may be set according to the number of copies of the standby Pool to be created, for example, equal to that number of copies.
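Mode one and the total-count check above can be sketched as follows. The function and parameter names are illustrative assumptions for this sketch, not from the patent.

```python
import random

def select_standby_osds(node_osds, failure_thresholds, replica_count, extra=0):
    """Mode-one sketch: per node, randomly pick at least `failure_threshold`
    OSDs as standby OSDs, then enforce the pool-wide minimum.

    node_osds: {node: [osd ids]}; failure_thresholds: {node: int}.
    """
    standby = {}
    for node, osds in node_osds.items():
        count = failure_thresholds[node] + extra  # configured count >= threshold
        standby[node] = random.sample(osds, count)
    total = sum(len(chosen) for chosen in standby.values())
    if total < replica_count:  # preset threshold, e.g. the standby Pool's copy count
        raise ValueError("not enough standby OSDs for the standby Pool")
    return standby
```

Setting `extra` above zero over-provisions standby OSDs beyond the threshold, which the text cautions against when taken too far.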
And a creating module 103, configured to create a backup Pool according to the configuration data, create a mapping relationship between the backup Pool and the backup OSD, and save the mapping relationship between the backup Pool and the backup OSD.
After the mapping relation between the standby Pool and the standby OSDs is created, the standby OSDs are used only to store the object data of the PGs in the standby Pool.
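A minimal sketch of the creating step follows, recording the standby Pool and its mapping to the standby OSDs in a dict-like store. This models only the bookkeeping described above, not real cluster state; the function name and record fields are assumptions.

```python
def create_standby_pool(name, pg_count, copy_count, standby_osds, store):
    """Creating-step sketch: record the standby Pool and save the
    pool-to-standby-OSD mapping in `store` (any dict-like persistence
    layer)."""
    store[name] = {
        "pg_count": pg_count,
        "copy_count": copy_count,
        # Once this mapping is saved, these OSDs serve only this pool's PGs.
        "standby_osds": sorted(standby_osds),
    }
    return store[name]
```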
In this embodiment, a plurality of OSDs are selected from the storage nodes as standby OSDs according to the configuration data sent by the user, a standby Pool is created, and the mapping relation between the standby Pool and the standby OSDs is created and stored. Compared with the prior art, this embodiment provides standby OSDs and a standby Pool in the distributed storage system; when an OSD in the distributed storage system fails, the standby OSDs and standby Pool can be activated, reducing the impact of the OSD failure on the performance of the distributed storage system.
As shown in fig. 4, fig. 4 is a program block diagram of a second embodiment of the creation program of the spare storage pool of the present invention.
The present embodiment further includes a detection module 104 and a redirection module 105 based on the first embodiment, wherein:
the detecting module 104 is configured to detect whether a preset event is triggered in real time or at a fixed time.
In this embodiment, the preset event includes the failure of one or more active OSDs, where the active OSDs are the OSDs in the storage nodes other than the standby OSDs. Whether the preset event is triggered is detected with a heartbeat mechanism: detection messages are sent to each active OSD in real time or periodically, and an active OSD is determined to have failed if it does not return a reply within a preset duration.
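The heartbeat check described above might look like the following sketch, where `send_probe` stands in for sending a detection message to one active OSD and waiting for its reply (an assumed interface, not a real CEPH call).

```python
import time

def osd_alive(send_probe, timeout_s=5.0, poll_s=0.05):
    """Heartbeat-check sketch: send detection messages and wait for a
    reply; an active OSD that has not replied within the preset duration
    is treated as failed. `send_probe` returns True once a reply arrives."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if send_probe():
            return True   # reply received within the preset duration
        time.sleep(poll_s)
    return False          # no reply within the timeout: consider it failed
```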
In some application scenarios, the preset event further includes one or more active OSDs timing out while executing a write request. Whether this event is triggered is detected as follows: timing starts when an active OSD receives a write request for object data. If the active OSD completes the write operation while the recorded duration is still less than a second preset duration, timing stops and the active OSD is determined not to have timed out. If the recorded duration reaches the second preset duration before the write operation completes, timing stops and the active OSD is determined to have timed out.
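The write-timeout determination can be sketched as below. A real implementation would watch the timer concurrently while the write is in flight, whereas this simplified sketch measures the completed operation against the second preset duration; the names are assumptions.

```python
import time

def write_timed_out(write_op, second_preset_duration_s):
    """Write-timeout sketch: timing starts when the write request is
    received, and the active OSD is flagged as timed out if the operation
    is not finished within the second preset duration."""
    start = time.monotonic()
    write_op()  # stand-in for the active OSD executing the write
    return (time.monotonic() - start) >= second_preset_duration_s
```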
In some application scenarios, the preset event further includes receiving a redirection instruction.
The redirecting module 105 is configured to, when the detecting module 104 detects that the preset event is triggered, intercept write requests sent to a preset storage location within a preset time interval and redirect them to the standby Pool.
When one or more active OSDs fail, the PG corresponding to each piece of object data stored on the failed OSDs is determined according to a predetermined mapping relation between object data and PGs, and each such PG is taken as a failed PG. The number of copies of all object data corresponding to the failed PGs is reduced from a first preset number to a second preset number (for example, from three copies to one copy). The failed active OSDs are then replaced one by one with standby OSDs; for example, a standby OSD is selected as a new active OSD and used to replace a failed active OSD. Afterwards, the number of copies of all object data corresponding to the failed PGs is increased from the second preset number back to the first preset number (for example, from one copy back to three copies). Then, according to a predetermined mapping relation between PGs and active OSDs, the first preset number of active OSDs corresponding to each failed PG is taken as a failed OSD group; when a failed OSD group receives a write request for object data, the write request is redirected to the standby Pool, which executes it.
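The failure-handling sequence above (determine the failed PGs, reduce their copy count while degraded, replace the failed OSD one-for-one, restore the copy count) can be sketched as follows; all data structures are illustrative assumptions, not real CEPH state.

```python
def handle_active_osd_failure(failed_osd, pg_map, spare_osds,
                              full_copies=3, degraded_copies=1):
    """Failover sketch: find the failed PGs, drop their copy count while
    degraded, swap in a spare OSD one-for-one, restore the copy count.

    pg_map: {pg name: {"osds": [osd ids], "copies": int}}.
    """
    failed_pgs = [pg for pg, info in pg_map.items() if failed_osd in info["osds"]]
    for pg in failed_pgs:
        pg_map[pg]["copies"] = degraded_copies   # e.g. three copies -> one copy
    new_osd = spare_osds.pop()                   # one-for-one replacement
    for pg in failed_pgs:
        info = pg_map[pg]
        info["osds"] = [new_osd if o == failed_osd else o for o in info["osds"]]
        info["copies"] = full_copies             # e.g. one copy -> three copies
    return failed_pgs, new_osd
```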
The new active OSD is selected as follows: search for a standby OSD located on the same storage node as the failed active OSD; if one exists, use it as the new active OSD, otherwise randomly select any standby OSD. Replacing the failed active OSD with the new one is implemented by releasing the predetermined mapping between the failed active OSD's device identification information and its location information (for example, a network port value), assigning the failed OSD's device identification information to the new active OSD as its own, and then re-establishing and storing the mapping between the new active OSD's device identification information and its location information, thereby completing the replacement.
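The same-node-first selection and device-ID re-binding described above might be sketched like this; all structures and names are assumptions for illustration.

```python
def replace_failed_osd(failed_id, osd_locations, standby_by_node, node_of):
    """Replacement sketch: prefer a standby OSD on the same storage node
    as the failed active OSD, otherwise take any standby OSD, then re-bind
    the failed OSD's device ID to the new OSD's location (for example, a
    network port value).

    osd_locations: {device id: location}; standby_by_node: {node: [locations]}.
    """
    same_node = standby_by_node.get(node_of[failed_id])
    pool = same_node if same_node else next(v for v in standby_by_node.values() if v)
    new_location = pool.pop(0)               # consume one standby OSD
    osd_locations[failed_id] = new_location  # the device ID now maps to the new OSD
    return new_location
```

Keeping the device ID stable means the rest of the cluster's PG-to-OSD mappings need not change; only the ID-to-location binding is rewritten.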
When the preset event is that one or more active OSDs have timed out while executing write requests, each timed-out active OSD is marked as a suspicious OSD. Then, according to a predetermined mapping relation between active OSDs and active OSD groups, all active OSD groups corresponding to each suspicious OSD are determined and marked as suspicious OSD groups. When a suspicious OSD group receives a write request, the write request is redirected to the standby Pool, which executes it.
In this embodiment, when a preset event is triggered, write requests sent to the preset storage location are intercepted and redirected to the standby Pool. This reduces the impact of active OSD failures or write timeouts on the write performance of the distributed storage system and effectively ensures the efficiency of write-request execution.
Further, in other embodiments, the program further includes a prompt module (not shown).
The prompting module is configured to detect, in real time or periodically, the number of standby OSDs in each storage node and the total number of standby OSDs, and to issue a prompt when the number of standby OSDs in a storage node is less than or equal to its failure threshold, or when the total number of standby OSDs is less than or equal to a preset threshold (for example, the number of copies of the standby Pool).
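The prompting module's two checks can be sketched as follows, using the per-node failure thresholds and the preset cluster-wide threshold described above; function and field names are assumptions.

```python
def standby_alerts(standby_by_node, failure_thresholds, preset_total):
    """Prompt-module sketch: warn when a node's standby-OSD count is at or
    below its failure threshold, or when the cluster-wide total is at or
    below the preset threshold (e.g. the standby Pool's copy count)."""
    alerts = []
    for node, osds in standby_by_node.items():
        if len(osds) <= failure_thresholds[node]:
            alerts.append(f"node {node}: only {len(osds)} standby OSD(s) left")
    total = sum(len(osds) for osds in standby_by_node.values())
    if total <= preset_total:
        alerts.append(f"cluster: only {total} standby OSD(s) in total")
    return alerts
```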
In addition, the invention provides a method for creating the standby storage pool.
As shown in fig. 5, fig. 5 is a flowchart of a first embodiment of a method for creating a backup storage pool according to the present invention.
In this embodiment, the method is applied to an electronic device communicatively connected to a plurality of storage nodes, each of which is provided with a plurality of OSDs. The method includes the following steps:
step S10, receiving configuration data sent by a user.
A user setting interface is provided, and the configuration data is generated from the user's input. The configuration data includes the failure threshold of each storage node, or the OSD identification information corresponding to each storage node.
The failure threshold of a storage node is a threshold on the number of OSDs that may fail in that node.
In addition, in some embodiments, the configuration data further includes a number of copies of the backup Pool to be created and a number of PGs.
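As a rough illustration, the configuration data for the two selection modes described below might take the following shape. All field names here are assumptions for the sketch, not identifiers from the patent.

```python
# Mode one: per-node failure thresholds; the device picks standby OSDs itself.
config_mode_one = {
    "fault_thresholds": {"node1": 2, "node2": 1},  # failed-OSD threshold per node
    "replicas": 3,   # number of copies of the standby Pool to be created
    "pg_num": 64,    # number of PGs of the standby Pool
}

# Mode two: the user directly designates the standby OSDs of each node.
config_mode_two = {
    "standby_osd_ids": {"node1": ["osd.4", "osd.5"], "node2": ["osd.9"]},
    "replicas": 3,
    "pg_num": 64,
}
```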
Step S20, selecting a plurality of OSDs from each storage node as standby OSDs according to the configuration data.
Step S20 can be implemented in at least two ways:
mode one:
and acquiring fault thresholds of all the storage nodes from the configuration data, respectively determining the configuration quantity of all the storage nodes according to the fault thresholds of all the storage nodes, and randomly selecting the OSD of the configuration quantity from all the storage nodes as standby OSD.
The configured number of a storage node is greater than or equal to its failure threshold. For example, if a storage node's failure threshold is 2, two or more standby OSDs may be configured for that node. It should be noted that although the configured number is only required to be at least the failure threshold, it should not be made too large, that is, too many standby OSDs should not be configured for a storage node, so as to avoid wasting resources.
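Mode one can be sketched as below, taking the configured number to be exactly the failure threshold (the minimum the embodiment permits). The function name and data layout are assumptions for illustration.

```python
import random

def select_standbys(node_osds, fault_thresholds, seed=None):
    """Randomly pick each node's configured number of standby OSDs.
    The configured number is taken here as exactly the node's failure
    threshold, the minimum the embodiment permits."""
    rng = random.Random(seed)  # seed makes the sketch reproducible
    return {node: rng.sample(osds, fault_thresholds[node])
            for node, osds in node_osds.items()}

nodes = {"node1": ["osd.0", "osd.1", "osd.2", "osd.3"],
         "node2": ["osd.4", "osd.5"]}
standbys = select_standbys(nodes, {"node1": 2, "node2": 1}, seed=42)
```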
Mode two:
and acquiring OSD identification information corresponding to each storage node from the configuration data, and taking the OSD corresponding to each OSD identification information as a standby OSD.
The two modes differ as follows: in mode one, the user only sets the configured number for each storage node, and the electronic device automatically selects that number of OSDs from each node as standby OSDs; in mode two, the user directly designates which OSDs of each storage node serve as standby OSDs.
It should be noted that the total number of standby OSDs in this embodiment should be greater than or equal to a preset threshold, which may be set according to the replica count of the standby Pool to be created; for example, the preset threshold is set equal to that replica count.
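The total-count check above can be sketched as a one-line validation, with the preset threshold taken equal to the standby Pool's replica count as in the example in the text (the function name is illustrative):

```python
def standby_total_sufficient(standbys_per_node, pool_replicas):
    """True if the total number of standby OSDs meets the preset
    threshold, here taken equal to the standby Pool's replica count."""
    total = sum(len(osds) for osds in standbys_per_node.values())
    return total >= pool_replicas
```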
Step S30, creating a standby Pool according to the configuration data, creating the mapping relationship between the standby Pool and the standby OSDs, and storing that mapping relationship.
After the mapping relationship between the standby Pool and the standby OSDs is created, the standby OSDs are used only to store the object data of the PGs in the standby Pool.
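A minimal sketch of step S30, assuming the configuration-data fields from earlier examples (`replicas`, `pg_num`) and representing the stored mapping as a plain dict; none of these names come from the patent itself.

```python
def create_standby_pool(config, standby_osds):
    """Build the standby Pool record from the configuration data, plus
    the Pool-to-standby-OSD mapping to be stored; afterwards the listed
    OSDs hold only object data of this Pool's PGs."""
    pool = {"name": "standby_pool",
            "replicas": config["replicas"],
            "pg_num": config["pg_num"]}
    mapping = {pool["name"]: list(standby_osds)}
    return pool, mapping

pool, mapping = create_standby_pool({"replicas": 3, "pg_num": 64},
                                    ["osd.4", "osd.5", "osd.9"])
```

In a Ceph-like deployment the same effect would be achieved by creating a pool with the configured replica and PG counts and constraining its placement to the standby devices, but the exact mechanism is outside the scope of this sketch.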
In this embodiment, according to the configuration data sent by the user, several OSDs are selected from each storage node as standby OSDs, a standby Pool is created together with the mapping relationship between the standby Pool and the standby OSDs, and that mapping relationship is stored. Compared with the prior art, this embodiment provisions standby OSDs and a standby Pool in the distributed storage system, so that when an OSD fails, the standby OSDs and the standby Pool can be activated, reducing the impact of the OSD failure on the performance of the distributed storage system.
As shown in fig. 6, fig. 6 is a flowchart of a second embodiment of a method for creating a backup storage pool according to the present invention.
This embodiment is based on the first embodiment; after step S30, the method further includes:
step S40, detecting whether the preset event is triggered in real time or at regular time.
In this embodiment, the preset event includes the failure of one or more active OSDs, where the active OSDs are the OSDs in a storage node other than the standby OSDs. Whether the preset event is triggered is detected with a heartbeat mechanism: probe messages are sent to each active OSD in real time or periodically, and an active OSD that does not return a reply within a preset duration is determined to have failed.
In some application scenarios, the preset event further includes one or more active OSDs timing out while executing a write request. Detection works as follows: timing starts when an active OSD receives a write request for object data. If the active OSD completes the write operation while the recorded duration is still less than a second preset duration, timing stops and the active OSD is determined not to have timed out. If the recorded duration reaches the second preset duration before the active OSD completes the write operation, timing stops and the active OSD is determined to have timed out.
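Both detection mechanisms can be sketched as below. Timestamps are passed in explicitly to keep the sketch deterministic; the names and the reply-as-`None` convention are assumptions for illustration.

```python
def heartbeat_failed(sent_at, replied_at, limit_s):
    """An active OSD that returns no reply (None), or replies after the
    preset duration, is considered failed."""
    return replied_at is None or (replied_at - sent_at) > limit_s

class WriteTimeoutDetector:
    """Times each write request against the second preset duration."""
    def __init__(self, limit_s):
        self.limit_s = limit_s
        self.started = {}
    def write_received(self, req_id, now):
        self.started[req_id] = now                 # start timing
    def write_done(self, req_id, now):
        elapsed = now - self.started.pop(req_id)   # stop timing
        return elapsed < self.limit_s              # True: no timeout
    def timed_out(self, req_id, now):
        return (now - self.started[req_id]) >= self.limit_s

det = WriteTimeoutDetector(limit_s=5.0)
det.write_received("w1", now=100.0)
in_time = det.write_done("w1", now=103.0)  # 3 s < 5 s: completed in time
```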
In some application scenarios, the preset event further includes receiving a redirection instruction.
Step S50, when the preset event is detected, intercepting write requests sent to the preset storage location within a preset time interval, and redirecting them to the standby Pool.
When one or more active OSDs fail, the PGs corresponding to each piece of object data stored on the failed active OSDs are determined from the predetermined mapping relationship between object data and PGs, and each such PG is treated as a failed PG. The replica count of all object data corresponding to the failed PGs is first reduced from a first preset number to a second preset number (for example, from three copies to one copy). The failed active OSDs are then replaced one by one with standby OSDs; for example, a standby OSD is selected as a new active OSD and used to replace a failed active OSD. Afterwards, the replica count of all object data corresponding to the failed PGs is restored from the second preset number to the first preset number (for example, from one copy back to three copies). Then, according to the predetermined mapping relationship between PGs and active OSDs, the first preset number of active OSDs corresponding to each failed PG is treated as a failed OSD group; when a failed OSD group receives a write request for object data, the write request is redirected to the standby Pool and executed by the standby Pool.
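The failover sequence above can be sketched end to end as follows. This is an illustrative reduction of the described steps; the function name, the dict-based PG map, and the replica-count bookkeeping are all assumptions.

```python
def handle_active_osd_failures(failed_osds, pg_to_osds, standbys,
                               first=3, second=1):
    """Find failed PGs, drop their object-data replica count to the
    second preset number, swap standby OSDs in for the failed active
    OSDs one by one, restore the replica count to the first preset
    number, and return the failed OSD groups whose subsequent writes
    are redirected to the standby Pool."""
    failed_pgs = [pg for pg, osds in pg_to_osds.items()
                  if any(o in failed_osds for o in osds)]
    replicas = {pg: second for pg in failed_pgs}   # e.g. 3 -> 1 copy
    pool, replacement = list(standbys), {}

    def replace(osd):
        # Each failed OSD gets exactly one replacement standby.
        if osd not in replacement:
            replacement[osd] = pool.pop(0)
        return replacement[osd]

    for pg in failed_pgs:                          # replace one by one
        pg_to_osds[pg] = [replace(o) if o in failed_osds else o
                          for o in pg_to_osds[pg]]
    for pg in failed_pgs:
        replicas[pg] = first                       # restore 1 -> 3 copies
    return {pg: pg_to_osds[pg] for pg in failed_pgs}, replicas

pg_map = {"pg.1": ["osd.0", "osd.1", "osd.2"],
          "pg.2": ["osd.3", "osd.1", "osd.4"]}
groups, replicas = handle_active_osd_failures({"osd.1"}, pg_map, ["spare.a"])
```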
The new active OSD is selected as follows: search for a standby OSD located on the same storage node as the failed active OSD; if one is found, use it as the new active OSD; otherwise, randomly select a standby OSD as the new active OSD. Replacing the failed active OSD with the new active OSD is implemented by releasing the mapping relationship between the failed OSD's preset device identification information and its position information (for example, a network port value), assigning the failed OSD's device identification information to the new active OSD as its own device identification information, and re-establishing and storing the mapping relationship between the new active OSD's device identification information and its position information, thereby completing the replacement.
When the preset event is that one or more active OSDs have timed out while executing write requests, each timed-out active OSD is marked as a suspicious OSD. Then, according to the predetermined mapping relationship between active OSDs and active OSD groups, all active OSD groups corresponding to each suspicious OSD are determined and marked as suspicious OSD groups. When a suspicious OSD group receives a write request, the write request is redirected to the standby Pool and executed by the standby Pool.
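The suspicious-marking and redirection logic can be sketched as two small functions (names and the `"standby_pool"` sentinel are assumptions for illustration):

```python
def mark_suspicious_groups(timed_out_osds, osd_to_groups):
    """Expand each timed-out (suspicious) OSD to every active OSD group
    that contains it, per the predetermined OSD-to-group mapping."""
    suspicious = set()
    for osd in timed_out_osds:
        suspicious.update(osd_to_groups.get(osd, ()))
    return suspicious

def route_write(group, suspicious_groups):
    """Redirect writes addressed to a suspicious group to the standby
    Pool; other groups handle their writes normally."""
    return "standby_pool" if group in suspicious_groups else group

osd_to_groups = {"osd.1": ["g1", "g2"], "osd.2": ["g2"]}
suspicious = mark_suspicious_groups(["osd.1"], osd_to_groups)
```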
In this embodiment, when a preset event is triggered, write requests sent to the preset storage location are intercepted and redirected to the standby Pool. This reduces the impact of active OSD failures or execution timeouts on the write performance of the distributed storage system and effectively safeguards the execution efficiency of write requests.
Further, in other embodiments, after step S30, the method further includes:
Detect, in real time or periodically, the number of standby OSDs in each storage node and the total number of standby OSDs, and issue a prompt when the number of standby OSDs in a storage node is less than or equal to that node's failure threshold, or when the total number of standby OSDs is less than or equal to a preset threshold (for example, the replica count of the standby Pool).
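The prompt conditions can be sketched as a single check over the current standby inventory (function name and message strings are illustrative):

```python
def standby_alerts(standbys_per_node, fault_thresholds, preset_total):
    """Emit a prompt when a node's standby count is at or below its
    failure threshold, or the overall total is at or below the preset
    threshold (e.g. the standby Pool's replica count)."""
    alerts = []
    for node, osds in standbys_per_node.items():
        if len(osds) <= fault_thresholds[node]:
            alerts.append(f"low standby OSD count on {node}")
    if sum(len(v) for v in standbys_per_node.values()) <= preset_total:
        alerts.append("low total standby OSD count")
    return alerts

alerts = standby_alerts({"node1": ["osd.4"], "node2": ["osd.8", "osd.9"]},
                        {"node1": 1, "node2": 1}, preset_total=3)
```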
Further, the present invention also proposes a computer-readable storage medium storing a creation program of a standby storage pool, the program being executable by at least one processor to cause the at least one processor to perform the method of creating a standby storage pool in any of the foregoing embodiments.
The foregoing description is only of the preferred embodiments of the present invention and is not intended to limit the scope of the invention, and all equivalent structural changes made by the description of the present invention and the accompanying drawings or direct/indirect application in other related technical fields are included in the scope of the invention.
Claims (8)
1. An electronic device, wherein the electronic device is communicatively connected to a plurality of storage nodes, each storage node is provided with a plurality of object storage devices, the electronic device includes a memory and a processor, the memory stores a creation program of a standby storage pool, and the creation program of the standby storage pool is executed by the processor to implement the following steps:
a receiving step: receiving configuration data sent by a user;
determining: according to the configuration data, selecting a plurality of object storage devices from each storage node as standby object storage devices respectively;
the creation step: creating a standby storage pool according to the configuration data, creating a mapping relation between the standby storage pool and standby object storage equipment, and storing the mapping relation between the standby storage pool and the standby object storage equipment;
wherein the determining step includes:
acquiring fault thresholds of all storage nodes from the configuration data, respectively determining the configuration quantity of all the storage nodes according to the fault thresholds of all the storage nodes, and randomly selecting object storage devices with the configuration quantity from all the storage nodes as standby object storage devices, wherein the configuration quantity of the storage nodes is larger than or equal to the fault thresholds;
or, obtaining object storage device identification information corresponding to each storage node from the configuration data, and taking the object storage device corresponding to each object storage device identification information as a standby object storage device.
2. The electronic device of claim 1, wherein the processor executes a creation program for the spare storage pool, and after the creating step, further performs the steps of:
and a detection step: detecting whether a preset event is triggered in real time or at fixed time;
a redirection step: when the preset event is detected, intercepting a write request sent to a preset storage location within a preset time interval, and redirecting the write request to the standby storage pool.
3. The electronic apparatus of claim 2, wherein the preset event comprises a failure of one or more primary object storage devices, the primary object storage devices being object storage devices in the storage node other than standby object storage devices;
the processor executes the creation program of the spare storage pool, and after the detecting step, further implements the steps of:
the standby object storage devices are used for replacing the failed main object storage devices one by one.
4. A method of creating a backup storage pool, adapted to an electronic device, wherein the electronic device is communicatively connected to a plurality of storage nodes, each of the storage nodes having a plurality of object storage devices disposed therein, the method comprising the steps of:
a receiving step: receiving configuration data sent by a user;
determining: according to the configuration data, selecting a plurality of object storage devices from each storage node as standby object storage devices respectively;
the creation step: creating a standby storage pool according to the configuration data, creating a mapping relation between the standby storage pool and standby object storage equipment, and storing the mapping relation between the standby storage pool and the standby object storage equipment;
wherein the determining step includes:
acquiring fault thresholds of all storage nodes from the configuration data, respectively determining the configuration quantity of all the storage nodes according to the fault thresholds of all the storage nodes, and randomly selecting object storage devices with the configuration quantity from all the storage nodes as standby object storage devices, wherein the configuration quantity of the storage nodes is larger than or equal to the fault thresholds;
or, obtaining object storage device identification information corresponding to each storage node from the configuration data, and taking the object storage device corresponding to each object storage device identification information as a standby object storage device.
5. The method of creating a spare storage pool of claim 4, wherein after said creating step, the method further comprises:
and a detection step: detecting whether a preset event is triggered in real time or at fixed time;
a redirection step: when the preset event is detected, intercepting a write request sent to a preset storage location within a preset time interval, and redirecting the write request to the standby storage pool.
6. The method of claim 5, wherein the predetermined event comprises a failure of one or more primary object storage devices, the primary object storage devices being object storage devices in the storage node other than the backup object storage device;
after the detecting step, the method further comprises:
the standby object storage devices are used for replacing the failed main object storage devices one by one.
7. The distributed storage system is characterized by comprising an electronic device and a plurality of storage nodes, wherein the electronic device is respectively connected with each storage node in a communication way, the electronic device comprises a memory and a processor, the memory is stored with a creation program of a standby storage pool, and the creation program of the standby storage pool realizes the following steps when being executed by the processor:
a receiving step: receiving configuration data sent by a user;
determining: according to the configuration data, selecting a plurality of object storage devices from each storage node as standby object storage devices respectively;
the creation step: creating a standby storage pool according to the configuration data, creating a mapping relation between the standby storage pool and standby object storage equipment, and storing the mapping relation between the standby storage pool and the standby object storage equipment;
wherein the determining step includes:
acquiring fault thresholds of all storage nodes from the configuration data, respectively determining the configuration quantity of all the storage nodes according to the fault thresholds of all the storage nodes, and randomly selecting object storage devices with the configuration quantity from all the storage nodes as standby object storage devices, wherein the configuration quantity of the storage nodes is larger than or equal to the fault thresholds;
or, obtaining object storage device identification information corresponding to each storage node from the configuration data, and taking the object storage device corresponding to each object storage device identification information as a standby object storage device.
8. A computer-readable storage medium storing a creation program of a spare storage pool executable by at least one processor to cause the at least one processor to perform the steps of the method of creating a spare storage pool according to any of claims 4-6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811438576.0A CN109669822B (en) | 2018-11-28 | 2018-11-28 | Electronic device, method for creating backup storage pool, and computer-readable storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109669822A CN109669822A (en) | 2019-04-23 |
CN109669822B true CN109669822B (en) | 2023-06-06 |
Family
ID=66143303
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811438576.0A Active CN109669822B (en) | 2018-11-28 | 2018-11-28 | Electronic device, method for creating backup storage pool, and computer-readable storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109669822B (en) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110287151B (en) * | 2019-05-20 | 2023-08-22 | 平安科技(深圳)有限公司 | Distributed storage system, data writing method, device and storage medium |
CN112835511B (en) * | 2019-11-25 | 2022-09-20 | 浙江宇视科技有限公司 | Data writing method, device, equipment and medium of distributed storage cluster |
CN111290909A (en) * | 2020-01-19 | 2020-06-16 | 山东汇贸电子口岸有限公司 | System and method for monitoring and alarming ceph cluster |
CN112363980A (en) * | 2020-11-03 | 2021-02-12 | 网宿科技股份有限公司 | Data processing method and device for distributed system |
CN114510379B (en) * | 2022-04-21 | 2022-11-01 | 山东百盟信息技术有限公司 | Distributed array video data storage device |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107678918A (en) * | 2017-09-26 | 2018-02-09 | 郑州云海信息技术有限公司 | The OSD heartbeat mechanisms method to set up and device of a kind of distributed file system |
CN107729185A (en) * | 2017-10-26 | 2018-02-23 | 新华三技术有限公司 | A kind of fault handling method and device |
CN108287660A (en) * | 2017-01-09 | 2018-07-17 | 中国移动通信集团河北有限公司 | Date storage method and equipment |
CN108694209A (en) * | 2017-04-11 | 2018-10-23 | 华为技术有限公司 | Object-based distributed index method and client |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9774678B2 (en) * | 2009-10-29 | 2017-09-26 | International Business Machines Corporation | Temporarily storing data in a dispersed storage network |
- 2018-11-28: CN application CN201811438576.0A granted as patent CN109669822B (status: Active)
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |