CN108540315B - Distributed storage system, method and device - Google Patents

Distributed storage system, method and device

Info

Publication number
CN108540315B
CN108540315B (application CN201810269046.1A)
Authority
CN
China
Prior art keywords
storage
fault
domains
protection domain
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810269046.1A
Other languages
Chinese (zh)
Other versions
CN108540315A (en)
Inventor
郑乾坤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
New H3C Technologies Co Ltd Chengdu Branch
Original Assignee
New H3C Technologies Co Ltd Chengdu Branch
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by New H3C Technologies Co Ltd Chengdu Branch
Priority to CN201810269046.1A
Publication of CN108540315A
Application granted
Publication of CN108540315B
Legal status: Active

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00: Network arrangements or protocols for supporting network services or applications
    • H04L67/01: Protocols
    • H04L67/10: Protocols in which an application is distributed across nodes in the network
    • H04L67/1097: Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06: Management of faults, events, alarms or notifications
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/50: Network service management, e.g. ensuring proper service fulfilment according to agreements
    • H04L41/508: Network service management based on type of value added network service under agreement
    • H04L41/5096: Network service management wherein the managed service relates to distributed or central networked applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Retry When Errors Occur (AREA)

Abstract

The present disclosure provides a distributed storage system, method and apparatus. The system includes a plurality of protection domains; each protection domain includes a plurality of fault domains, and each fault domain includes a group of storage nodes. All copies of the same data are stored on storage nodes in different fault domains of the same protection domain, so that when a storage node fails, maintenance is confined to the protection domain to which the failed node belongs; in each protection domain, the maximum allowable number of failed fault domains is smaller than the number of copies. The present disclosure improves the security and reliability of a distributed storage system.

Description

Distributed storage system, method and device
Technical Field
The present disclosure relates to the field of data storage technologies, and in particular, to a distributed storage system, method, and apparatus.
Background
With the rapid development and widespread use of Internet technology, massive amounts of data are generated every day, making the storage systems that carry this data increasingly important. Traditional storage systems often suffer from poor scalability, security and reliability, and therefore cannot meet users' storage requirements. Distributed storage systems, with their good data processing capability and reliability, are increasingly accepted by users.
However, because a distributed storage system is built on distributed hardware, the failure of one or more storage servers often degrades the data service of the entire system and can sometimes interrupt all of its services. As the system scale grows, so does the range of services affected by a failure, lowering the overall service performance of the distributed storage system.
Disclosure of Invention
In view of the above, the present disclosure is directed to a distributed storage system, method and apparatus that improve the service performance of distributed storage systems.
In order to achieve the above purpose, the technical solutions adopted by the present disclosure are as follows:
In a first aspect, the present disclosure provides a distributed storage system comprising a plurality of protection domains. Each protection domain comprises a plurality of fault domains, and each fault domain comprises a group of storage nodes. All copies of the same data are stored on storage nodes in different fault domains of the same protection domain, so that when a storage node fails, maintenance is performed within the protection domain to which the failed node belongs; in each protection domain, the maximum allowable number of failed fault domains is smaller than the number of copies.
In a second aspect, the present disclosure provides a distributed storage method applied to a monitoring node of a distributed storage system, where the system includes a plurality of protection domains, each protection domain includes a plurality of fault domains, and each fault domain includes a group of storage nodes. The method comprises: receiving data to be stored; generating multiple copies of the data; and storing the copies on storage nodes in different fault domains of the same protection domain.
In a third aspect, the present disclosure provides a distributed storage apparatus comprising a memory and a processor, where the memory stores one or more computer instructions that are executed by the processor to implement the above distributed storage method.
According to the distributed storage system, method and apparatus of the present disclosure, the system is divided into a plurality of protection domains, and all copies of the same data are stored on storage nodes in different fault domains of the same protection domain. When a storage node fails, maintenance is performed within the protection domain to which it belongs. This increases the maximum number of failed nodes the system can tolerate, reduces both the impact range of a failed node and the probability that the system must stop service, makes the service performance of the distributed storage system more stable, and improves the security and reliability of the system.
Additional features and advantages of the disclosure are set forth in the description that follows, or may in part be learned by practice of the disclosure.
In order to make the aforementioned objects, features and advantages of the present disclosure more comprehensible, preferred embodiments are described in detail below with reference to the accompanying figures.
Drawings
In order to more clearly illustrate the embodiments of the present disclosure or the technical solutions in the prior art, the drawings needed in the description of the embodiments are briefly introduced below. The drawings show some embodiments of the present disclosure; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic structural diagram of a distributed storage system according to an embodiment of the present disclosure;
Fig. 2 is a schematic structural diagram of another distributed storage system according to an embodiment of the present disclosure;
Fig. 3 is a schematic structural diagram of another distributed storage system according to an embodiment of the present disclosure;
Fig. 4 is a schematic structural diagram of another distributed storage system according to an embodiment of the present disclosure;
Fig. 5 is a schematic structural diagram of another distributed storage system according to an embodiment of the present disclosure;
Fig. 6 is a schematic structural diagram of another distributed storage system according to an embodiment of the present disclosure;
Fig. 7 is a flowchart of a distributed storage method according to an embodiment of the present disclosure;
Fig. 8 is a schematic structural diagram of a distributed storage apparatus according to an embodiment of the present disclosure.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present disclosure more apparent, the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, embodiments of the present disclosure. All other embodiments obtained by those of ordinary skill in the art from the disclosed embodiments without creative effort fall within the protection scope of the present disclosure.
A distributed storage system stores data dispersedly on multiple independent devices and shares the storage load across multiple storage servers. Compared with traditional network storage architectures such as direct-attached storage (DAS), network-attached storage (NAS) and storage area networks (SAN), a distributed storage system offers better reliability, availability and access efficiency, and its scale is easy to expand.
For enterprise application platforms, especially Internet service providers (e.g., providers of shopping, search and map websites or applications), business keeps growing in scale and complexity, with ever more concurrent user requests and ever more data to process. As business requirements expand, a distributed storage system can enhance the overall processing capacity of the system by adding servers, supporting high concurrency and massive data processing and meeting the computing demands brought by business growth.
Referring to Fig. 1, a schematic diagram of a distributed storage system is shown. A management server (also called a monitoring node) stores data dispersedly on the storage servers (also called storage nodes) connected to it, records the storage location of each type of data, and monitors the working state of each storage server (e.g., whether a fault has occurred); clients communicate with the management server. For example, a user sends a data query request through a client; after receiving the request, the management server looks up the storage location of the data, such as the identifier of the storage server and the identifier of the hard disk within that server, reads the data from that location, and returns it to the client.
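To make that read path concrete, here is a toy sketch (illustrative only; the class, method and helper names are assumptions, not part of the disclosure) of how a monitoring node might resolve a client query to a storage location:

```python
def read_from(server_id: str, disk_id: str, data_id: str) -> str:
    """Stand-in for the actual network read from a storage server."""
    return f"<data {data_id} from {server_id}/{disk_id}>"

class ManagementServer:
    """Toy monitoring node: records where each piece of data lives."""
    def __init__(self):
        self.locations = {}   # data_id -> (server_id, disk_id)

    def query(self, data_id: str):
        """Resolve a client's query to a storage location and read from it;
        unknown data yields None, i.e. a read-failure reply to the client."""
        if data_id not in self.locations:
            return None
        server_id, disk_id = self.locations[data_id]
        return read_from(server_id, disk_id, data_id)
```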
In a distributed storage system, if one or more storage servers fail, the management server cannot provide the data stored on those servers to clients and usually returns a read-failure message. For enterprise users in particular, even a short storage server outage in a low-reliability system can cause large numbers of business service failures and economic losses.
To provide a degree of security and reliability, a storage system is usually divided into multiple fault domains; for example, as shown in Fig. 1, each storage server may form one fault domain. Multiple copies of the same data are stored in different fault domains, so that when a storage server in one fault domain fails, the same data can still be obtained from another fault domain. This gives the storage system higher fault tolerance and reliability, ensuring that data services keep running and that data security is not compromised.
The number of copies generated is usually determined by the business requirements of the enterprise or by the redundancy policy of the storage system. The more copies, the more reliable the storage system, but the less net data it can store; the fewer copies, the lower the reliability, but the more data it can store.
Alternatively, erasure coding can be used to process the data to be stored. Specifically, the data is first divided into n original data blocks, and m check data blocks are computed from them, giving n + m blocks in total. When no more than m blocks (original or check) are lost or corrupted, the n original blocks can be recovered from the remaining blocks by the corresponding reconstruction algorithm. In the storage system, the n + m blocks are distributed across different fault domains, so that when a storage server in one fault domain fails, blocks can be retrieved from the other fault domains and the original data blocks reconstructed.
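As a minimal illustration of the n + m idea (a sketch, not the coding scheme the disclosure mandates; production systems typically use stronger codes such as Reed-Solomon when m > 1), the following uses n data blocks and a single XOR parity block, from which any one lost block can be rebuilt:

```python
def encode(blocks: list[bytes]) -> bytes:
    """Compute one XOR parity block over equal-length data blocks (m = 1)."""
    parity = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            parity[i] ^= byte
    return bytes(parity)

def recover(survivors: list[bytes]) -> bytes:
    """Rebuild the single missing block: the XOR of all surviving blocks
    (remaining data blocks plus parity) equals the lost block."""
    return encode(survivors)

data = [b"AAAA", b"BBBB", b"CCCC"]              # n = 3 original blocks
parity = encode(data)                            # m = 1 check block
rebuilt = recover([data[0], data[2], parity])    # fault domain holding data[1] fails
assert rebuilt == data[1]
```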
The scope and level of fault domain division can be adjusted according to the scale of the storage system, the business requirements of the enterprise, and other factors. For example, when the storage system is small, one or more storage servers (each usually holding several storage hard disks) may form a fault domain. When the system is large, it is usually organized into racks, each rack managing multiple storage servers; in that case, one or more racks, together with the servers they manage, may form a fault domain. When dividing fault domains it is common to keep them uniform, i.e., each fault domain contains the same number of storage servers, and the servers in each fault domain have the same performance configuration.
In general, if a storage node (e.g., a storage server or a storage hard disk) in a fault domain fails, the fault domain needs to be repaired. During fault recovery the storage system can continue to provide services, but its overall efficiency and performance are affected, reducing service efficiency. If multiple fault domains contain failed nodes, and the number of such fault domains reaches the number of data copies, it is likely that all storage nodes holding the copies of some piece of data have failed, and that data is lost.
As a distributed storage system grows, both the number of fault domains and the number of storage nodes in each fault domain increase. With more storage nodes in a fault domain, the probability that the fault domain contains a failed node also rises; for example, a fault domain with 10 storage nodes is roughly 2.5 times as likely to contain a failed node as one with 4, since the probability grows approximately in proportion to the node count. Once the number of fault domains in the system increases, the number of fault domains containing failed nodes can easily reach the configured copy count, greatly increasing the probability that the storage system must interrupt service. Frequent interruptions make the storage system unstable, hurt service performance, and make it difficult to meet the business requirements of enterprise users.
To address the poor service performance of large-scale storage systems, the embodiments of the present disclosure provide a distributed storage system, method and apparatus. The technique can be applied to distributed storage clusters and storage systems, in particular to the distributed storage systems of enterprise users. As shown in Fig. 2, the distributed storage system includes a plurality of protection domains; each protection domain includes a plurality of fault domains, and each fault domain includes a group of storage nodes. All copies of the same data are stored on storage nodes in different fault domains of the same protection domain, so that when a storage node fails, maintenance is performed within the protection domain to which the failed node belongs.
In addition, in each protection domain of the disclosed embodiments, the maximum allowable number of failed fault domains is smaller than the number of copies.
This embodiment does not specifically limit the number of protection domains in the system, the number of fault domains in each protection domain, or the number of storage nodes in each fault domain. The copies of a piece of data may be stored at random in different fault domains of the same protection domain, or stored in designated fault domains of the same protection domain according to the storage condition of each fault domain. Fig. 2 shows an example in which the system contains two protection domains, each containing three fault domains. If a piece of data has two copies, they may be stored at random in two fault domains of the same protection domain, or in two designated fault domains chosen according to each fault domain's storage condition; if it has three copies, they may be stored in the three fault domains of the same protection domain, one per domain.
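The placement rule just described can be sketched as follows (a minimal model with assumed class and function names, not the patent's implementation): every copy of one piece of data goes to a distinct fault domain inside a single protection domain.

```python
import random
from dataclasses import dataclass, field

@dataclass
class FaultDomain:
    domain_id: str
    nodes: list = field(default_factory=list)        # storage node identifiers

@dataclass
class ProtectionDomain:
    domain_id: str
    fault_domains: list = field(default_factory=list)

def place_replicas(pd: ProtectionDomain, num_copies: int) -> list:
    """Pick num_copies distinct fault domains inside ONE protection domain,
    then one storage node in each; copies never span protection domains."""
    if num_copies > len(pd.fault_domains):
        raise ValueError("copy count exceeds fault domains in the protection domain")
    chosen = random.sample(pd.fault_domains, num_copies)
    return [(pd.domain_id, fd.domain_id, random.choice(fd.nodes)) for fd in chosen]

# Fig. 2 layout: a protection domain with three fault domains, two nodes each.
pd1 = ProtectionDomain("PD1", [FaultDomain(f"FD{i}", [f"node{i}a", f"node{i}b"])
                               for i in (1, 2, 3)])
print(place_replicas(pd1, 2))   # e.g. [('PD1', 'FD3', 'node3a'), ('PD1', 'FD1', 'node1b')]
```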
Because all copies of the same data are stored within one protection domain, different protection domains store completely disjoint data; there is no data intersection between protection domains, i.e., no two protection domains hold copies of the same data. When a storage node fails and must be maintained, only the operating efficiency and performance of the protection domain it belongs to are affected, while the other protection domains remain unchanged. This avoids the situation where a single failed storage node degrades the overall performance of the system, improving the stability and reliability of the system.
As described above, without protection domains the system usually requires that the maximum allowable number of failed fault domains be smaller than the number of copies: since any two fault domains may share data, if the number of fault domains containing failed nodes reaches the copy count, the storage system must normally interrupt service and repair the fault domains to avoid data loss. With protection domains configured as above, the maximum allowable number of failed fault domains within each protection domain is likewise smaller than the copy count, but if the number of fault domains containing failed nodes in some protection domain reaches the copy count, only that protection domain's data service needs to be interrupted; the other protection domains are unaffected. The system as a whole stops service only when the number of fault domains containing failed nodes reaches the copy count in every protection domain. Configuring protection domains therefore greatly reduces the probability that the system stops service and improves its reliability.
Taking the distributed storage system of Fig. 2 as an example, assume the copy count is 2. If one fault domain of one protection domain contains a failed storage node, that protection domain performs fault repair without affecting the other protection domain. If two fault domains of one protection domain contain failed storage nodes, that protection domain stops service, again without affecting the other. The distributed storage system as a whole stops service only when two or more fault domains in each of the two protection domains contain failed storage nodes, which corresponds to at least four failed storage nodes in the system.
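That failure arithmetic can be written down directly (a sketch reusing the ProtectionDomain and FaultDomain classes from the earlier sketch; helper names are assumptions): the cluster stops serving only when every protection domain has at least as many failed fault domains as there are copies.

```python
def failed_fault_domains(pd: ProtectionDomain, failed_nodes: set) -> int:
    """Count fault domains in a protection domain holding >= 1 failed node."""
    return sum(1 for fd in pd.fault_domains
               if any(n in failed_nodes for n in fd.nodes))

def pd_state(pd: ProtectionDomain, failed_nodes: set, num_copies: int) -> str:
    """Per-domain status: the maximum allowable number of failed fault
    domains is num_copies - 1 (strictly smaller than the copy count)."""
    failed = failed_fault_domains(pd, failed_nodes)
    if failed == 0:
        return "healthy"
    return "repairing" if failed < num_copies else "stopped"

def cluster_serving(pds: list, failed_nodes: set, num_copies: int) -> bool:
    """The whole system stops only when EVERY protection domain has stopped."""
    return any(pd_state(pd, failed_nodes, num_copies) != "stopped" for pd in pds)
```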
By contrast, as shown in Fig. 2, if the system has no protection domains (or the protection-domain function is disabled) and the copy count is again 2, a failed storage node in any one fault domain affects the overall performance of the system, and the system stops service as soon as any two fault domains contain failed storage nodes.
As this comparison shows, in the distributed storage system provided by the embodiments of the present disclosure, dividing the system into multiple protection domains and storing all copies of the same data on storage nodes in different fault domains of one protection domain means that when a storage node fails, maintenance is confined to the protection domain to which the failed node belongs.
Referring to Figs. 3 and 4, schematic structural diagrams of another distributed storage system are shown. In this embodiment the system includes 18 hosts (the storage nodes described above) and 6 racks in total, each rack managing 3 hosts. Each rack, together with the hosts it manages, forms a fault domain; in Fig. 4, each protection domain includes three fault domains.
Suppose the data to be stored has 2 copies, copy A and copy B. Both copies are stored in protection domain 1 or in protection domain 2. If they are stored in protection domain 1, two of rack 1, rack 2 and rack 3 are selected to hold copy A and copy B; for example, copy A on rack 1 and copy B on rack 3. In that case the system can keep running even if one rack in protection domain 1 and one rack in protection domain 2 each have a host failure; the system interrupts service only when two racks in protection domain 1 and two racks in protection domain 2 all have host failures.
By contrast, as shown in Fig. 3, if the system has no protection domains, copy A and copy B may be stored on any two of racks 1 through 6, and with a large amount of data any pair of the six racks may hold copies of the same data. A failed storage node in any one of the six fault domains then affects the overall performance of the system, and the system stops service as soon as any two or more fault domains contain failed storage nodes.
As can be seen, the number of failed storage nodes (or failed storage hard disks across different nodes) a storage system can tolerate is fixed by the copy count, so the more storage nodes or hard disks the system has, the higher the probability that a node failure affects system service, with roughly linear growth. Dividing the system into protection domains reduces this probability: each protection domain tolerates as many failed storage nodes (or failed hard disks across different nodes) as the whole undivided cluster would. Therefore, for a distributed storage system of the same scale, dividing it into protection domains multiplies the maximum number of failed nodes it can tolerate without interrupting service. Taking the system of Figs. 3 and 4 as an example: without protection domains the system tolerates at most 1 failed node and interrupts service beyond that; after dividing into two protection domains it tolerates 2 failed nodes and interrupts service only beyond that. The method thus effectively improves the reliability, stability, robustness and data security of a distributed storage system under abnormal conditions.
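The reliability gain can also be checked numerically. The Monte Carlo sketch below is an illustration under the simplifying assumption that each fault domain fails independently with a fixed probability (an assumption of this sketch, not an analysis from the disclosure); it compares the whole-system stop probability with and without protection domains for the Figs. 3/4 layout:

```python
import random

def stop_probability(num_pds: int, fds_per_pd: int, copies: int,
                     p_fd_fail: float, trials: int = 100_000) -> float:
    """Estimate the chance that the WHOLE system stops: a full stop needs
    >= `copies` failed fault domains in EVERY protection domain; num_pds=1
    models the undivided cluster."""
    stops = 0
    for _ in range(trials):
        if all(sum(random.random() < p_fd_fail for _ in range(fds_per_pd)) >= copies
               for _ in range(num_pds)):
            stops += 1
    return stops / trials

# 6 fault domains, copy count 2, 5% per-domain failure chance per period.
print(stop_probability(1, 6, 2, 0.05))   # undivided cluster: roughly 0.033
print(stop_probability(2, 3, 2, 0.05))   # two protection domains: roughly 5e-5
```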
In another embodiment, consider that data comes in multiple types and that different types of data have different service requirements. For example, data that must be read and written quickly can use a block storage service; data with high sharing requirements can use a file storage service; and data that needs both fast reads/writes and high sharing can use an object storage service, at higher storage cost.
When a distributed storage system is large, interfaces for these multiple storage services may be configured to provide different services. However, if the interfaces of the various storage services are distributed at random, the storage system, constrained by conditions such as storage locations and storage space, has difficulty selecting a matching storage service according to the data's service requirements when storing data.
For example, data requiring high sharing might land in the block storage service while data requiring fast reads and writes lands in the file storage service; such mixed storage leaves the storage performance of each kind of data short of its service requirements and lowers the overall storage efficiency of the system.
For these reasons, the embodiments of the present disclosure further provide another distributed storage system. As shown in Fig. 5, the system includes a plurality of partitions, each containing one or more protection domains of identical storage performance; within each protection domain, all storage nodes have the same storage performance.
To give the protection domains in one partition the same storage performance, the same storage service may be used throughout the partition, i.e., the partition's storage service is one of block storage, file storage and object storage. To give the storage nodes within a protection domain the same storage performance, they may use storage devices of the same or similar physical resource models; for example, SSDs (Solid State Drives) can provide a high-performance storage service and HDDs (Hard Disk Drives) a normal-performance storage service.
When the storage nodes of a system are partitioned, the division is usually made at the physical level so that physical resources are isolated between partitions: apart from the monitoring node, no physical resources are shared between partitions, and each partition stores the data of its pre-configured storage service, ensuring that different storage services use different physical resources. At the physical level, the set of all storage nodes in the storage system can be regarded as one large storage pool, and the storage nodes in each partition as a storage pool corresponding to that partition, each pool providing a more specialized storage service. Different storage pools may contain storage nodes of different storage performance, and a pool can be selected according to the requirements of the service at hand. Partitioning the storage pool thus further isolates different storage services at the level of physical resources.
The distributed storage system can first be divided into multiple partitions according to the performance of its storage nodes, and each partition then divided into protection domains of identical storage performance, with all storage nodes in a protection domain performing identically. This partitioning distributes the data of different storage services over different physical resources, provides storage services of different performance through physical resources of matching performance, aligns data types with storage performance, and improves the storage efficiency of the system while preserving its security and reliability.
The embodiments of the present disclosure further provide another distributed storage system, shown in Fig. 6, in which each partition may choose whether to configure protection domains according to factors such as its actual service requirements and scale. In partition 1, each rack and the hosts it manages are divided directly into fault domains. In partition 2, rack 1, rack 2 and rack 3 with their hosts are first divided into protection domain 1, and rack 4, rack 5 and rack 6 with their hosts into protection domain 2; within protection domain 1, for example, each rack and its hosts form a fault domain. It should be understood that each rack typically manages one or more hosts; for convenience of illustration the hosts of some racks are not shown in Fig. 6.
Partitions serve as storage-area units within the cluster (i.e., the distributed storage system); each partition contains several of the cluster's hosts, and different partitions provide users with different storage services. Partitioning satisfies both the isolation of different storage services' data and users' needs for block storage of differing performance. On top of the partitions, each partition is further divided into several protection domains to improve cluster reliability. The fault domain is the smallest unit of data distribution in the cluster; because fault domains are divided, the several copies or data blocks of a piece of data are distributed over several different fault domains of the same partition, which further improves the cluster's reliability when failures occur.
Corresponding to the above system embodiments, the present disclosure also provides a distributed storage method. The method is applied to a monitoring node of the distributed storage system; the monitoring node is connected to all storage nodes in the system and performs data storage as well as fault monitoring and fault maintenance of the storage nodes. The system includes a plurality of protection domains; each protection domain includes a plurality of fault domains, and each fault domain includes a group of storage nodes. As shown in Fig. 7, the method includes the following steps:
Step S702: receiving the data to be stored;
Step S704: generating multiple copies of the data;
Step S706: storing the copies on storage nodes in different fault domains of the same protection domain.
In the above distributed storage method, after the data to be stored is received, multiple copies of the data are generated and then stored on storage nodes in different fault domains of the same protection domain. When a storage node fails, maintenance can be performed within the protection domain to which it belongs, which increases the maximum number of failed nodes the system can tolerate, reduces the impact range of a failed node and the probability that the system stops service, and improves the security and reliability of the distributed storage system.
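Steps S702 to S706 can be sketched as a single monitoring-node method (a toy model reusing place_replicas from the earlier sketch; the protection-domain selection policy, transport and persistence are assumptions and are stubbed or omitted):

```python
class Monitor:
    """Toy monitoring node for the write path of Fig. 7."""
    def __init__(self, protection_domains: list):
        self.pds = protection_domains
        self.locations = {}      # data_id -> list of (pd, fd, node) placements
        self.node_store = {}     # (node, data_id) -> payload, stands in for real I/O

    def store(self, data_id: str, payload: bytes, num_copies: int) -> list:
        # S702: data received.
        pd = random.choice(self.pds)                 # assumed PD-selection policy
        # S704/S706: make num_copies copies, one per distinct fault domain.
        placements = place_replicas(pd, num_copies)
        for (_, _, node) in placements:
            self.node_store[(node, data_id)] = payload
        self.locations[data_id] = placements         # record storage locations
        return placements
```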
In the system, the storage nodes within each protection domain have the same storage performance, and the system further includes a plurality of partitions, each containing one or more protection domains of identical storage performance. In that case, storing the copies on storage nodes in different fault domains of the same protection domain includes: determining the storage service corresponding to the data, the storage service being block storage, object storage or file storage; selecting a partition according to that storage service; and storing the copies on storage nodes in different fault domains of the same protection domain of the selected partition.
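That partition-selection step might look like this (the service-to-partition mapping below is an assumed configuration for illustration; the disclosure only requires that each partition serve one pre-configured storage service):

```python
# Assumed mapping from storage service to a partition's protection domains.
PARTITIONS = {
    "block":  ["PD1", "PD2"],   # e.g. SSD-backed, fast reads and writes
    "file":   ["PD3"],          # e.g. HDD-backed, high sharing
    "object": ["PD4"],
}

def select_partition(storage_service: str) -> list[str]:
    """Pick the partition whose pre-configured service matches the data;
    the copies are then placed inside one protection domain of it."""
    if storage_service not in PARTITIONS:
        raise ValueError(f"unknown storage service: {storage_service}")
    return PARTITIONS[storage_service]

print(select_partition("block"))   # ['PD1', 'PD2']
```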
The method further includes: monitoring the storage nodes in each protection domain; if a failed storage node is detected and the number of failed fault domains in its protection domain has not reached the preset maximum allowable number of failures, maintaining the protection domain to which the failed node belongs; and if a failed storage node is detected and that number has reached the preset maximum allowable number of failures, interrupting the storage service of the protection domain to which the failed node belongs.
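The per-event decision reduces to one comparison (a sketch reusing failed_fault_domains from the earlier sketch; interpreting the preset maximum allowable number as copies minus one, consistent with the description above):

```python
def on_node_failure(pd: ProtectionDomain, failed_nodes: set, max_allowed: int) -> str:
    """Maintain within the protection domain while the count of failed fault
    domains stays within max_allowed; otherwise interrupt only that domain."""
    if failed_fault_domains(pd, failed_nodes) <= max_allowed:
        return "maintain"     # repair inside the protection domain it belongs to
    return "interrupt"        # stop only this protection domain's storage service
```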
This embodiment provides a distributed storage apparatus corresponding to the above method embodiments. Fig. 8 is a schematic structural diagram of the apparatus, which may be deployed on the monitoring node of a distributed storage system. As shown in Fig. 8, the apparatus includes a memory 100 and a processor 101, where the memory 100 stores one or more computer instructions that are executed by the processor to implement one or more of the distributed storage methods described above.
Further, the distributed storage apparatus shown in fig. 8 further includes a bus 102 and a communication interface 103, and the processor 101, the communication interface 103, and the memory 100 are connected by the bus 102.
The memory 100 may include a high-speed random access memory (RAM) and may further include a non-volatile memory, such as at least one disk memory. The communication connection between this system's network element and at least one other network element is realized through at least one communication interface 103 (wired or wireless), and may use the Internet, a wide area network, a local area network, a metropolitan area network, and the like. The bus 102 may be an ISA bus, a PCI bus, an EISA bus, or the like, and may be divided into an address bus, a data bus and a control bus. For ease of illustration only one double-headed arrow is shown in Fig. 8, but this does not mean there is only one bus or one type of bus.
The processor 101 may be an integrated circuit chip with signal processing capability. In implementation, the steps of the above method may be completed by integrated logic circuits of hardware in the processor 101 or by instructions in the form of software. The processor 101 may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and can implement or execute the methods, steps and logic blocks disclosed in the embodiments of the present disclosure. A general-purpose processor may be a microprocessor or any conventional processor. The steps of the methods disclosed in the embodiments may be executed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium well known in the art, such as RAM, flash memory, ROM, PROM, EPROM, or a register. The storage medium is located in the memory 100; the processor 101 reads the information in the memory 100 and completes the steps of the above method in combination with its hardware.
The embodiments of the present invention further provide a machine-readable storage medium storing machine-executable instructions which, when called and executed by a processor, cause the processor to implement the above distributed storage method; for specific implementation, reference may be made to the method embodiments, which are not repeated here.
The implementation principle and technical effects of the distributed storage apparatus provided by the embodiments of the present invention are the same as those of the foregoing method embodiments; for brevity, where the apparatus embodiment is silent, reference may be made to the corresponding content of the method embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, each functional module or unit in each embodiment of the present disclosure may be integrated together to form an independent part, or each module may exist alone, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present disclosure may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present disclosure. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
Finally, it should be noted that the above embodiments are merely specific embodiments of the present disclosure, used to illustrate its technical solutions rather than to limit them, and the protection scope of the present disclosure is not limited thereto. Although the present disclosure has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that anyone familiar with the art may, within the technical scope of the present disclosure, modify or readily conceive of changes to the technical solutions described in the foregoing embodiments, or make equivalent substitutions of some technical features; such modifications, changes or substitutions do not depart from the spirit and scope of the embodiments of the present disclosure and shall be covered by its protection scope. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.

Claims (7)

1. A distributed storage system, the system comprising a plurality of protection domains; each of the protection domains comprises a plurality of fault domains, and each of the fault domains comprises a set of storage nodes;
each copy of the same data is stored on storage nodes of different fault domains in the same protection domain, so that when a failed storage node occurs, maintenance is performed in the protection domain to which the failed storage node belongs;
in each protection domain, the maximum allowable number of failed fault domains is smaller than the number of the copies;
the storage performance of the storage nodes in each protection domain is the same;
the system further comprises a plurality of partitions, each partition being provided with one or more protection domains of the same storage performance;
each partition is used for storing data corresponding to a pre-configured storage service, the storage service comprising block storage, object storage or file storage; the system further comprises an interface corresponding to the storage service and is configured to provide the storage service through the interface.
2. The system of claim 1, wherein the physical resource models of the storage nodes within each of the protection domains are the same; the physical resource model comprises a solid state drive (SSD) or a hard disk drive (HDD).
3. The system of claim 1, wherein the plurality of copies are stored at random in different fault domains of the same protection domain; or the plurality of copies are stored in designated fault domains of the same protection domain according to the storage condition of each fault domain in the protection domain.
4. The system of claim 1, further comprising a monitoring node connected to the storage nodes for performing data storage and for fault monitoring and fault maintenance of the storage nodes.
5. A distributed storage method, applied to a monitoring node of a distributed storage system, wherein the system comprises a plurality of protection domains; each of the protection domains comprises a plurality of fault domains, and each of the fault domains comprises a set of storage nodes; the system further comprises an interface corresponding to a storage service and is configured to provide the storage service through the interface; the method comprises:
receiving data to be stored;
generating a plurality of copies of the data;
storing the plurality of copies on storage nodes of different fault domains in the same protection domain;
wherein the storage performance of the storage nodes in each protection domain is the same; the system further comprises a plurality of partitions, each partition being provided with one or more protection domains of the same storage performance; and the step of storing the plurality of copies on storage nodes of different fault domains in the same protection domain comprises:
determining the storage service corresponding to the data, the storage service comprising block storage, object storage or file storage;
selecting a partition according to the storage service corresponding to the data; and
storing the plurality of copies on storage nodes of different fault domains in the same protection domain of the selected partition.
6. The method of claim 5, further comprising:
monitoring the storage nodes in each of the protection domains;
if a failed storage node is detected and the number of failed fault domains in the protection domain has not reached the preset maximum allowable number of failures, maintaining the protection domain to which the failed storage node belongs; and
if a failed storage node is detected and the number of failed fault domains in the protection domain has reached the preset maximum allowable number of failures, interrupting the storage service of the protection domain to which the failed storage node belongs.
7. A distributed storage apparatus comprising a memory and a processor, wherein the memory is configured to store one or more computer instructions that are executed by the processor to implement the method of claim 5 or 6.
Application CN201810269046.1A, filed 2018-03-28 (priority date 2018-03-28): Distributed storage system, method and device; granted as CN108540315B, status Active.

Priority Applications (1)

Application CN201810269046.1A (priority date 2018-03-28, filing date 2018-03-28): Distributed storage system, method and device

Applications Claiming Priority (1)

Application CN201810269046.1A (priority date 2018-03-28, filing date 2018-03-28): Distributed storage system, method and device

Publications (2)

Publication Number Publication Date
CN108540315A: published 2018-09-14
CN108540315B (granted): published 2021-12-07

Family

ID=63482272

Family Applications (1)

Application CN201810269046.1A: Distributed storage system, method and device (Active; granted as CN108540315B)

Country Status (1)

Country Link
CN (1) CN108540315B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020082888A1 (en) * 2018-10-25 2020-04-30 华为技术有限公司 Method, system and apparatus for restoring data in storage system
CN111552436B (en) * 2018-10-25 2022-02-25 华为技术有限公司 Data recovery method, system and device in storage system
CN111381770B (en) * 2018-12-30 2021-07-06 浙江宇视科技有限公司 Data storage switching method, device, equipment and storage medium
US11157482B2 (en) * 2019-02-05 2021-10-26 Seagate Technology Llc Data distribution within a failure domain tree
CN110096301A (en) * 2019-05-08 2019-08-06 深信服科技股份有限公司 A kind of hot upgrade method of storage system, system and electronic equipment and storage medium
CN111625421B (en) * 2020-05-26 2021-07-16 云和恩墨(北京)信息技术有限公司 Method and device for monitoring distributed storage system, storage medium and processor
CN113821165B (en) * 2021-08-20 2023-12-22 济南浪潮数据技术有限公司 Distributed cluster fusion storage method, system and equipment
CN114466030B (en) * 2021-12-27 2024-03-12 天翼云科技有限公司 Management method and device of data distributed storage strategy and distributed storage system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2010048048A2 (en) * 2008-10-24 2010-04-29 Microsoft Corporation Configuration management in distributed data systems
CN104735107A (en) * 2013-12-20 2015-06-24 中国移动通信集团公司 Recovery method and device for data copies in distributed storage system
CN105706056A (en) * 2013-10-03 2016-06-22 微软技术许可有限责任公司 Fault domains on modern hardware
CN105912612A (en) * 2016-04-06 2016-08-31 中广天择传媒股份有限公司 Distributed file system and data equilibrium distribution method orienting same
CN107085546A (en) * 2016-02-16 2017-08-22 深圳市深信服电子科技有限公司 Data managing method and device based on failure field technique


Also Published As

Publication number Publication date
CN108540315A (en) 2018-09-14

Similar Documents

Publication Publication Date Title
CN108540315B (en) Distributed storage system, method and device
US11740826B2 (en) Policy-based hierarchical data protection in distributed storage
CN107544862B (en) Stored data reconstruction method and device based on erasure codes and storage node
CN107943421B (en) Partition division method and device based on distributed storage system
US10445197B1 (en) Detecting failover events at secondary nodes
US10423476B2 (en) Aggressive searching for missing data in a DSN memory that has had migrations
US11442827B2 (en) Policy-based hierarchical data protection in distributed storage
WO2016180049A1 (en) Storage management method and distributed file system
CN110825704B (en) Data reading method, data writing method and server
US8060773B1 (en) Systems and methods for managing sub-clusters within a multi-cluster computing system subsequent to a network-partition event
CN107133228A (en) A kind of method and device of fast resampling
CN104468150A (en) Method for realizing fault migration through virtual host and virtual host service device
WO2020263372A1 (en) Distributed object storage system with dynamic spreading
US20230137007A1 (en) Data storage method, storage system, storage device, and storage medium
CN108133034B (en) Shared storage access method and related device
US11886309B2 (en) Cell-based storage system with failure isolation
CN115756955A (en) Data backup and data recovery method and device and computer equipment
CN109840051B (en) Data storage method and device of storage system
Lee et al. Erasure coded storage systems for cloud storage—challenges and opportunities
CN111045853A (en) Method and device for improving erasure code recovery speed and background server
US9684668B1 (en) Systems and methods for performing lookups on distributed deduplicated data systems
US8621260B1 (en) Site-level sub-cluster dependencies
CN113391937B (en) Method, electronic device and computer program product for storage management
CN102904946B (en) Method and device for managing nodes in cluster
JP2018524705A (en) Method and system for processing data access requests during data transfer

Legal Events

Code: Title
PB01: Publication
SE01: Entry into force of request for substantive examination
GR01: Patent grant