CN108846009B - Copy data storage method and device in ceph - Google Patents

Publication number
CN108846009B
Authority
CN
China
Prior art keywords
fault
domains
stored
storage
domain
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810400813.8A
Other languages
Chinese (zh)
Other versions
CN108846009A (en)
Inventor
韩庆波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing QIYI Century Science and Technology Co Ltd
Original Assignee
Beijing QIYI Century Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing QIYI Century Science and Technology Co Ltd
Priority to CN201810400813.8A
Publication of CN108846009A
Application granted
Publication of CN108846009B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0706 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation, the processing taking place on a specific hardware platform or in a specific software environment
    • G06F11/0709 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation, the processing taking place on a specific hardware platform or in a specific software environment in a distributed system consisting of a plurality of standalone computer nodes, e.g. clusters, client-server systems
    • G06F11/0727 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation, the processing taking place on a specific hardware platform or in a specific software environment in a storage system, e.g. in a DASD or network based storage system

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the invention provide a method and a device for storing copy data in ceph, an electronic device, and a storage medium. The method comprises: acquiring the topological structure of an object storage device (OSD) cluster; dividing at least one physical fault domain of the OSD cluster into a plurality of virtual fault sub-domains, and taking the undivided fault domains and the virtual fault sub-domains obtained by division as fault domains to be stored; selecting, based on the number of copies to be stored, a number of fault domains to be stored equal to the number of copies as storage fault domains, wherein different virtual fault sub-domains among the selected storage fault domains belong to different physical fault domains; and storing one copy of the data in one OSD of each storage fault domain. The copy data storage method provided by the embodiments can reduce the probability that copy data in ceph is lost.

Description

Copy data storage method and device in ceph
Technical Field
The present invention relates to the field of data management technologies, and in particular, to a copy data storage method and apparatus in a ceph (distributed file system), an electronic device, and a storage medium.
Background
ceph is a distributed file system. In ceph, a PG (placement group) is a virtual node for data storage, and a PG is carried by physical hardware storage units such as OSDs (object storage devices). Each PG has R copies of its data, which are stored on R different OSDs, and the R OSDs belong to different physical fault domains. A physical fault domain is a manually defined storage area. Physical fault domains are defined so that the failure of one storage area does not affect every copy of the same data; for that reason, copies of the same data are usually stored in different physical fault domains.
If the OSDs that store the copies of a PG, each belonging to a different physical fault domain, fail at the same time, the copy data of that PG is lost. A collection containing a certain number of OSDs is usually referred to as an OSD cluster.
The known formula for the probability of losing the copy data of a PG is:
[Formula image not reproduced: Figure BDA0001645674270000011]
where R represents the number of copies, P_r represents the probability that R OSDs fail simultaneously, N represents the number of OSDs in the OSD cluster, and M represents the number of possible distributions of the R copies of one PG in the OSD cluster.
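The formula image itself is not available in this text. One form that is consistent with the four symbol definitions above, offered only as an assumption and not as the patent's exact expression, is:

```latex
P = P_r \cdot \frac{M}{\binom{N}{R}}
```

Here \binom{N}{R} is the number of ways to choose R OSDs out of the N OSDs in the cluster. Under this assumed form, and with P_r, N and R held fixed, the loss probability P grows in proportion to M, which matches the statement in the text that a larger M means a larger probability of losing the copies of a PG.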
For example, fig. 1 is a schematic diagram of a prior-art copy data storage method in ceph. As shown in fig. 1, a PG has three copies, which are stored on three OSDs located in three different fault domains. Each rack in fig. 1 is a fault domain, so the three OSDs storing the copies of the PG are located on three different racks.
In fig. 1 there are 24 OSDs on each rack, so the number of possible distributions of the 3 copies of the PG in the OSD cluster is 24 × 24 × 24. That is, in the formula for the probability of losing PG data, M = 24 × 24 × 24 = 13824. This value is relatively large, which means that the probability of losing the copies of the PG is also relatively large.
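The count of M for the prior-art layout can be checked with a few lines of Python. This is a minimal sketch; the function name `count_placements` is hypothetical and simply multiplies the number of OSDs available in each fault domain that receives one copy.

```python
from math import prod


def count_placements(osds_per_domain):
    """Number of ways to put one copy on one OSD in each listed fault domain."""
    return prod(osds_per_domain)


# Prior-art layout of fig. 1: three racks, 24 OSDs per rack, one copy per rack.
baseline_m = count_placements([24, 24, 24])
print(baseline_m)  # 13824
```
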
Disclosure of Invention
The embodiment of the invention aims to provide a method and a device for storing copy data in ceph, electronic equipment and a storage medium, so as to reduce the probability of loss of the copy data in the ceph.
The specific technical scheme is as follows:
the embodiment of the invention provides a method for storing copy data in ceph, which comprises the following steps:
acquiring a topological structure of an OSD cluster, wherein the OSD cluster comprises a plurality of OSD, and the topological structure represents the division condition of a physical fault domain of the OSD cluster;
dividing at least one physical fault domain of the OSD cluster into a plurality of virtual fault sub-domains, and taking an undivided fault domain and a virtual fault sub-domain obtained by division as fault domains to be stored;
selecting, based on the number of copies of the data to be stored, a number of fault domains to be stored equal to the number of copies as storage fault domains, wherein different virtual fault sub-domains among the selected storage fault domains belong to different physical fault domains;
and respectively storing the duplicate data in one OSD of each storage fault domain.
Optionally, the dividing at least one physical fault domain of the OSD cluster into a plurality of virtual fault sub-domains includes:
and dividing each physical fault domain of the OSD cluster into a plurality of virtual fault sub-domains.
Optionally, before selecting a number of fault domains to be stored equal to the number of copies as the storage fault domains, the method further includes:
dividing the obtained fault domains to be stored into a plurality of groups of fault domains to be stored, wherein different virtual fault sub-domains in each group belong to different physical fault domains;
the selecting of the storage fault domains comprises:
selecting a number of fault domains to be stored equal to the number of copies as storage fault domains from one of the plurality of groups of fault domains to be stored.
Optionally, the dividing at least one physical fault domain of the OSD cluster into a plurality of virtual fault sub-domains includes:
dividing each physical fault domain of the OSD cluster into two virtual fault sub-domains;
before selecting a number of fault domains to be stored equal to the number of copies as the storage fault domains, the method further includes:
dividing the obtained fault domains to be stored into two groups of fault domains to be stored, wherein different virtual fault sub-domains in each group belong to different physical fault domains;
the selecting of the storage fault domains comprises:
selecting a number of fault domains to be stored equal to the number of copies as storage fault domains from one of the two groups of fault domains to be stored.
Optionally, the number of the duplicate data to be stored is the same as the number of the physical fault domains of the OSD cluster.
The embodiment of the invention also provides a device for storing the copy data in the ceph, which comprises:
a topological structure obtaining module, configured to obtain a topological structure of an OSD cluster, wherein the OSD cluster comprises a plurality of OSDs, and the topological structure represents the division of the physical fault domains of the OSD cluster;
a fault domain dividing module, configured to divide at least one physical fault domain of the OSD cluster into a plurality of virtual fault sub-domains, and to take the undivided fault domains and the virtual fault sub-domains obtained by division as fault domains to be stored;
a storage domain selection module, configured to select, according to the number of copies of the data to be stored, a number of fault domains to be stored equal to the number of copies as storage fault domains, wherein different virtual fault sub-domains among the selected storage fault domains belong to different physical fault domains;
and a storage module, configured to store one copy of the data in one OSD of each storage fault domain.
Optionally, the fault domain dividing module is specifically configured to:
and dividing each physical fault domain of the OSD cluster into a plurality of virtual fault sub-domains.
Optionally, the apparatus further comprises:
the grouping module is used for dividing the obtained fault domains to be stored into a plurality of groups of fault domains to be stored, and different virtual fault sub domains in each group of fault domains to be stored belong to different physical fault domains;
the storage domain selection module is specifically configured to:
and selecting the number of fault domains to be stored as storage fault domains from one of the plurality of groups of fault domains to be stored.
Optionally, the fault domain dividing module is specifically configured to:
dividing each physical fault domain of the OSD cluster into two virtual fault sub-domains;
the device further comprises:
the dividing module is used for dividing the obtained fault domains to be stored into two groups of fault domains to be stored, and different virtual fault sub domains in each group of fault domains to be stored belong to different physical fault domains;
and the selecting module is used for selecting the number of fault domains to be stored as storage fault domains from one of the two groups of fault domains to be stored.
Optionally, the number of the duplicate data to be stored is the same as the number of the physical fault domains of the OSD cluster.
The embodiment of the invention also provides electronic equipment which comprises a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory finish mutual communication through the communication bus;
the memory is used for storing a computer program;
the processor is configured to implement any of the above method steps when executing the program stored in the memory.
An embodiment of the present invention further provides a computer-readable storage medium, in which a computer program is stored, and the computer program, when executed by a processor, implements any of the above method steps.
By using the copy data storage method, the device, the electronic equipment and the storage medium in ceph provided by the embodiment of the invention, at least one physical fault domain in the OSD cluster can be divided into a plurality of virtual fault sub-domains, fault domains with the same number as that of copies are selected for copy data storage, and different virtual fault sub-domains in the selected fault domains belong to different physical fault domains. The number of OSD contained in the selected fault domain is smaller than that contained in the physical fault domain before division, so that the number of distribution conditions of the copy data in the selected fault domain is reduced, and the copy data storage method provided by the embodiment of the invention can reduce the probability of the loss of the copy data in ceph according to the existing probability formula of the loss of the copy data.
Of course, not all of the advantages described above need to be achieved at the same time in the practice of any one product or method of the invention.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below.
FIG. 1 is a schematic diagram of a copy data storage method in ceph according to the prior art;
FIG. 2 is a flowchart of a copy data storage method in ceph according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a copy data storage method in ceph according to an embodiment of the present invention;
FIG. 4 is another schematic diagram of a copy data storage method in ceph according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a copy data storage device in ceph according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described below with reference to the drawings in the embodiments of the present invention.
In order to solve the problems in the prior art, embodiments of the present invention provide a method and an apparatus for storing copy data in ceph, an electronic device, and a storage medium, which can solve the problem in the prior art that the probability of losing copy data of PG is relatively high.
Referring to fig. 2, fig. 2 is a flowchart of a copy data storage method in ceph according to an embodiment of the present invention, where the method may include the following steps:
step S201: acquiring a topological structure of an OSD cluster, wherein the OSD cluster comprises a plurality of OSD, and the topological structure represents the division condition of a physical fault domain of the OSD cluster;
In ceph, copies of data are stored on OSDs. Before the data is stored, the topology of the OSD cluster may first be obtained, for example by reading the CRUSH map (the data distribution map in ceph), which describes the hierarchy of the ceph storage system: for example, how many racks the storage system includes, how many storage devices each rack includes, and how many OSDs each storage device includes.
In the embodiment of the present invention, the obtained topology structure of the OSD cluster may include a division condition of a physical fault domain, where the physical fault domain is a storage region that is artificially divided, and the division of the physical fault domain is to avoid that the same duplicate data stored in a certain storage region is affected when the certain storage region fails, so that the same duplicate data is usually stored in different physical fault domains. Referring to fig. 1, the physical fault domain level in fig. 1 is a rack level, that is, each rack is a physical fault domain, and as can be seen from fig. 1, each rack contains 24 OSDs.
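For illustration, the topology obtained in step S201 can be modelled as a mapping from each physical fault domain to the OSDs it contains. The sketch below builds the fig. 1 layout (three racks, 24 OSDs each) in memory; the function name, the rack names and the consecutive OSD ids are all assumptions, and no real ceph interface is called.

```python
def build_topology(racks, osds_per_rack):
    """Return {rack_name: [osd_id, ...]} using consecutive, hypothetical OSD ids."""
    topology = {}
    next_id = 0
    for rack in racks:
        topology[rack] = list(range(next_id, next_id + osds_per_rack))
        next_id += osds_per_rack
    return topology


# Rack-level physical fault domains as in fig. 1.
topology = build_topology(["rack1", "rack2", "rack3"], 24)
for rack, osds in topology.items():
    print(rack, len(osds))
```

In a real cluster this mapping would be derived from the cluster's own placement map rather than generated, but the later steps only need the domain-to-OSD relationship shown here.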
Step S202: dividing at least one physical fault domain of the OSD cluster into a plurality of virtual fault sub-domains, and taking an undivided fault domain and a virtual fault sub-domain obtained by division as fault domains to be stored;
In this step, the division of the physical fault domains may be changed by modifying the ceph rule (a data storage rule in ceph). The rule defines the constraints on where copy data is stored; for example, for three identical copies to be stored, the rule may require that the three copies be stored in three different physical fault domains.
In the embodiment of the invention, at least one physical fault domain of the OSD cluster can be divided into a plurality of virtual fault sub-domains by modifying the ceph rule. After the division, both the undivided fault domains and the virtual fault sub-domains obtained by division are used as fault domains to be stored.
Alternatively, every physical fault domain of the OSD cluster may be divided into a plurality of virtual fault sub-domains; the embodiment of the present invention does not limit this.
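The division in step S202 can be sketched in Python as follows. This is only an in-memory illustration under stated assumptions: four hypothetical physical fault domains a to d with 24 OSDs each, sub-domains named by appending an index, and an even split of the OSD list. It does not modify any ceph rule.

```python
def split_domain(name, osds, parts):
    """Split one physical fault domain into `parts` virtual sub-domains of near-equal size."""
    if parts < 1 or parts > len(osds):
        raise ValueError("parts must be between 1 and the number of OSDs")
    size, extra = divmod(len(osds), parts)
    subdomains = {}
    start = 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        subdomains[f"{name}{i + 1}"] = osds[start:end]
        start = end
    return subdomains


def build_candidates(topology, parts_by_domain):
    """Return {candidate: (physical_domain, osds)}; domains not listed stay undivided."""
    candidates = {}
    for phys, osds in topology.items():
        parts = parts_by_domain.get(phys, 1)
        if parts == 1:
            candidates[phys] = (phys, osds)
        else:
            for sub, sub_osds in split_domain(phys, osds, parts).items():
                candidates[sub] = (phys, sub_osds)
    return candidates


# Hypothetical cluster: domains a..d, 24 OSDs each; a stays whole, b, c, d are halved.
topology = {d: list(range(i * 24, (i + 1) * 24)) for i, d in enumerate("abcd")}
candidates = build_candidates(topology, {"b": 2, "c": 2, "d": 2})
print(sorted(candidates))  # ['a', 'b1', 'b2', 'c1', 'c2', 'd1', 'd2']
```

Each candidate keeps a reference to its physical fault domain, which is what the selection step needs in order to keep copies in different physical domains.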
Step S203: for the copy data to be stored, selecting, based on the number of copies, a number of fault domains to be stored equal to the number of copies as storage fault domains, wherein different virtual fault sub-domains among the selected storage fault domains belong to different physical fault domains;
In the embodiment of the invention, after the physical fault domains are divided, as many fault domains to be stored as there are copies can be selected to store the copy data. For example, for data with three copies, three fault domains to be stored may be selected to store the three copies respectively. The three fault domains belong to different physical fault domains, which ensures that the three copies are stored in different physical fault domains.
In the embodiment of the invention, after the physical fault domains are divided, the obtained fault domains to be stored can be divided into a plurality of groups, such that different virtual fault sub-domains in each group belong to different physical fault domains. Then, when determining the fault domains in which the copy data is to be stored, one group can be selected from the groups obtained by division, and as many fault domains to be stored as there are copies are selected from that group as the storage fault domains, that is, the fault domains in which the copy data will be stored.
For example, fig. 3 is a schematic diagram of a copy data storage method according to an embodiment of the present invention. For data to be stored with three copies, three fault domains are ultimately selected for storing the data. Before that, suppose four physical fault domains are acquired: the physical fault domains a, b, c and d shown in fig. 3. In the embodiment shown in fig. 3, the physical fault domain a is not divided, while the physical fault domains b, c and d are divided into the virtual fault sub-domains b1 and b2, c1 and c2, and d1 and d2, respectively. Then a, b1, b2, c1, c2, d1 and d2 are all used as fault domains to be stored.
After the division, the fault domains to be stored can be grouped such that different virtual fault sub-domains in each group belong to different physical fault domains. For example, a, b1, c1 and d2 may form one group, or b1, c2 and d2 may form one group. In other words, the number of fault domains to be stored in each group may be greater than or equal to the number of copies to be stored.
After the grouping is determined, one group can be selected from the groups of fault domains to be stored for storing the copy data. Because the selected group may contain more fault domains than there are copies, as many fault domains as there are copies are selected from the group, and the copies are stored in them. For example, if the selected group includes the fault domains a, b1, c1 and d2 and the number of copies is 3, then a, b1 and c1 can be selected from the group for storing the copy data.
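The grouping constraint and the selection in step S203 can be sketched as follows, using the domain names of fig. 3. The helper names are hypothetical, the second group in the example is invented for illustration, and random choice stands in for whatever selection policy an implementation would actually use.

```python
import random

# Maps each fault domain to be stored to its physical fault domain (fig. 3 names).
PHYSICAL_OF = {"a": "a", "b1": "b", "b2": "b", "c1": "c", "c2": "c", "d1": "d", "d2": "d"}


def is_valid_group(group):
    """A group is valid when no two members share a physical fault domain."""
    physical = [PHYSICAL_OF[d] for d in group]
    return len(physical) == len(set(physical))


def select_storage_domains(groups, replicas, rng=random):
    """Pick one valid group with enough members, then pick `replicas` domains from it."""
    eligible = [g for g in groups if is_valid_group(g) and len(g) >= replicas]
    if not eligible:
        raise ValueError("no group can hold the requested number of copies")
    group = rng.choice(eligible)
    return rng.sample(group, replicas)


# First group is the one named in the text; the second is a hypothetical complement.
groups = [["a", "b1", "c1", "d2"], ["b2", "c2", "d1"]]
chosen = select_storage_domains(groups, replicas=3, rng=random.Random(0))
print(chosen)
```

Because every group is checked with `is_valid_group` before use, any subset drawn from it also spans distinct physical fault domains, which is the property the method relies on.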
Step S204: storing one copy of the data in one OSD of each storage fault domain.
After as many fault domains to be stored as there are copies have been selected, they are determined to be the storage fault domains, that is, the fault domains in which the copy data is stored. Each storage fault domain includes a plurality of OSDs, and one copy is stored in one OSD of each storage fault domain.
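Step S204 amounts to choosing one OSD inside each storage fault domain. The sketch below assumes the storage fault domains a, b1 and c1 from the earlier example with hypothetical OSD ids; a seeded random choice is used purely as a stand-in, since the text does not specify how the OSD within a domain is chosen.

```python
import random


def place_replicas(storage_domains, rng=random):
    """Return {domain_name: osd_id}, choosing one OSD in each storage fault domain."""
    placement = {}
    for name, osds in storage_domains.items():
        if not osds:
            raise ValueError(f"storage fault domain {name!r} has no OSDs")
        placement[name] = rng.choice(osds)
    return placement


# Hypothetical OSD ids: a is an undivided rack, b1 and c1 are half-rack sub-domains.
storage_domains = {
    "a": list(range(0, 24)),
    "b1": list(range(24, 36)),
    "c1": list(range(48, 60)),
}
placement = place_replicas(storage_domains, rng=random.Random(1))
print(placement)
```
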
It can be seen that by using the method for storing duplicate data in ceph provided by the embodiment of the present invention, at least one physical fault domain in an OSD cluster can be divided into a plurality of virtual fault sub-domains, and then fault domains with the same number as that of duplicates are selected for storing the duplicate data, and different virtual fault sub-domains in the selected fault domains belong to different physical fault domains. The number of OSD contained in the selected fault domain is smaller than that contained in the physical fault domain before division, so that the number of distribution conditions of the copy data in the selected fault domain is reduced, and the copy data storage method provided by the embodiment of the invention can reduce the probability of the loss of the copy data in ceph according to the existing probability formula of the loss of the copy data.
In this embodiment of the present invention, dividing at least one physical fault domain of the OSD cluster into a plurality of virtual fault sub-domains may include:
dividing each physical fault domain of the OSD cluster into two virtual fault sub-domains;
before selecting the fault domains to be stored with the number of copies as the storage fault domains, the method may further include:
dividing the obtained fault domains to be stored into two groups of fault domains to be stored, wherein different virtual fault sub domains in each group of fault domains to be stored belong to different physical fault domains;
selecting the fault domains to be stored with the number of copies as storage fault domains may include:
and selecting a number of fault domains to be stored as storage fault domains from one of the two groups of fault domains to be stored.
Referring to fig. 4, fig. 4 is a schematic diagram of a copy data storage method in ceph according to an embodiment of the present invention. In the embodiment shown in fig. 4, each physical fault domain, that is, each rack in the figure, is divided into two virtual fault sub-domains: A1 and A2, B1 and B2, and C1 and C2. As shown in fig. 4, each virtual fault sub-domain includes 12 OSDs. Two groups of fault domains to be stored may be determined such that the virtual fault sub-domains in each group belong to different physical fault domains; for example, A1, B1 and C1 form one group, and A2, B2 and C2 form another group.
Then, when data is stored, one of the two groups may be selected, and each copy is stored on one OSD of one storage fault domain in the selected group. Because each storage fault domain includes 12 OSDs, storing three identical copies in one group allows 12 × 12 × 12 possible placements, so the total number of possible placements across the two groups is 12 × 12 × 12 × 2 = 3456. This is smaller than the 24 × 24 × 24 = 13824 possible placements of the prior-art copy data storage method. According to the existing probability formula for copy data loss, the copy data storage method provided in the embodiment of the present invention can therefore reduce the probability that copy data in ceph is lost.
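The comparison between fig. 1 and fig. 4 can be reproduced numerically. The sketch below assumes, as in fig. 4, that every rack is split evenly, that the number of groups equals the number of sub-domains per rack, and that each group takes exactly one sub-domain from every rack; the function name is hypothetical.

```python
def distribution_count(osds_per_rack, replicas, subdomains_per_rack=1):
    """Count placements when racks are split evenly and each group has one sub-domain per rack."""
    if osds_per_rack % subdomains_per_rack:
        raise ValueError("OSDs per rack must divide evenly among the sub-domains")
    per_subdomain = osds_per_rack // subdomains_per_rack
    # One group contributes per_subdomain ** replicas placements; there is one group
    # per sub-domain index, so the total is multiplied by subdomains_per_rack.
    return subdomains_per_rack * per_subdomain ** replicas


baseline = distribution_count(24, 3)                         # fig. 1: undivided racks
proposed = distribution_count(24, 3, subdomains_per_rack=2)  # fig. 4: two halves per rack
print(baseline, proposed, baseline // proposed)  # 13824 3456 4
```

Under these assumptions the number of distributions M falls by a factor of four for three copies and two sub-domains per rack.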
Referring to fig. 5, fig. 5 is a schematic structural diagram of a copy data storage device in ceph according to an embodiment of the present invention, where the schematic structural diagram may include:
a topology obtaining module 501, configured to obtain a topology of an OSD cluster of an object storage unit, where the OSD cluster includes a plurality of OSDs, and the topology represents a division condition of a physical fault domain of the OSD cluster;
a fault domain dividing module 502, configured to divide at least one physical fault domain of the OSD cluster into a plurality of virtual fault sub-domains, and use an undivided fault domain and a divided virtual fault sub-domain as a fault domain to be stored;
a storage domain selecting module 503, configured to select, based on the number of the copy data to be stored, a number of fault domains to be stored of the copy number as storage fault domains for the copy data to be stored, where different virtual fault sub domains in the selected storage fault domain belong to different physical fault domains;
the storage module 504 is configured to store the replica data in one OSD of each storage fault domain.
In this embodiment of the present invention, the fault domain dividing module 502 may be specifically configured to:
each physical fault domain of the OSD cluster is divided into a plurality of virtual fault sub-domains.
In this embodiment of the present invention, on the basis of the copy data storage device in ceph shown in fig. 5, the device may further include:
a grouping module, configured to divide the obtained fault domains to be stored into a plurality of groups of fault domains to be stored, wherein different virtual fault sub-domains in each group belong to different physical fault domains;
the storage domain selection module may be specifically configured to:
select a number of fault domains to be stored equal to the number of copies as storage fault domains from one of the plurality of groups of fault domains to be stored.
In this embodiment of the present invention, the fault domain dividing module may be specifically configured to:
divide each physical fault domain of the OSD cluster into two virtual fault sub-domains;
on the basis of the copy data storage device in ceph shown in fig. 5, the device may further include:
a dividing module, configured to divide the obtained fault domains to be stored into two groups of fault domains to be stored, wherein different virtual fault sub-domains in each group belong to different physical fault domains;
and a selection module, configured to select a number of fault domains to be stored equal to the number of copies as storage fault domains from one of the two groups of fault domains to be stored.
In the embodiment of the present invention, the number of the duplicate data to be stored may be the same as the number of the physical fault domains of the OSD cluster.
An embodiment of the invention further discloses an electronic device, shown in fig. 6, which comprises a processor 601, a communication interface 602, a memory 603 and a communication bus 604, wherein the processor 601, the communication interface 602 and the memory 603 communicate with each other through the communication bus 604;
a memory 603 for storing a computer program;
the processor 601 is configured to implement any of the above method steps when executing the program stored in the memory 603.
The communication bus mentioned in the electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus.
The communication interface is used for communication between the electronic equipment and other equipment.
The Memory may include a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as at least one disk Memory. Optionally, the memory may also be at least one memory device located remotely from the processor.
The Processor may be a general-purpose Processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; but also Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs) or other Programmable logic devices, discrete Gate or transistor logic devices, discrete hardware components.
The embodiment of the invention also provides a computer readable storage medium, wherein a computer program is stored in the computer readable storage medium, and the computer program is used for realizing any one of the method steps when being executed by a processor.
In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. The procedures or functions according to the embodiments of the invention are brought about in whole or in part when the computer program instructions are loaded and executed on a computer. The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable device. The computer instructions may be stored in a computer readable storage medium or transmitted from one computer readable storage medium to another, for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire (e.g., coaxial cable, fiber optic, Digital Subscriber Line (DSL)) or wirelessly (e.g., infrared, wireless, microwave, etc.). The computer-readable storage medium can be any available medium that can be accessed by a computer or a data storage device, such as a server, a data center, etc., that incorporates one or more of the available media. The usable medium may be a magnetic medium (e.g., floppy Disk, hard Disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.
It should be noted that, in this document, relational terms such as "first" and "second" are used solely to distinguish one entity or action from another, without necessarily requiring or implying any actual relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The embodiments in this specification are described in a related manner; for identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, the apparatus, electronic device, computer-readable storage medium, and computer program product embodiments are described relatively briefly because they are substantially similar to the method embodiments; for relevant details, refer to the corresponding description of the method embodiments.
The above describes only preferred embodiments of the present invention and is not intended to limit the scope of protection of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the scope of protection of the present invention.

Claims (9)

1. A method for storing replica data in the distributed file system ceph, characterized by comprising the following steps:
acquiring a topological structure of an object storage device (OSD) cluster, wherein the OSD cluster comprises a plurality of OSDs, and the topological structure represents how the OSD cluster is divided into physical fault domains;
dividing at least one physical fault domain of the OSD cluster into a plurality of virtual fault sub-domains, and taking the undivided fault domains and the virtual fault sub-domains obtained by the division as fault domains to be stored;
selecting, based on the number of copies of the replica data to be stored, that number of fault domains to be stored as storage fault domains, wherein different virtual fault sub-domains among the selected storage fault domains belong to different physical fault domains;
storing the replica data in an OSD of each storage fault domain, respectively;
wherein, before the selecting of that number of fault domains to be stored as storage fault domains, the method further comprises:
dividing the obtained fault domains to be stored into a plurality of groups of fault domains to be stored, wherein different virtual fault sub-domains in each group of fault domains to be stored belong to different physical fault domains;
and the selecting of that number of fault domains to be stored as storage fault domains comprises:
selecting that number of fault domains to be stored as storage fault domains from one of the plurality of groups of fault domains to be stored.
2. The method of claim 1, wherein the dividing of at least one physical fault domain of the OSD cluster into a plurality of virtual fault sub-domains comprises:
dividing each physical fault domain of the OSD cluster into a plurality of virtual fault sub-domains.
3. The method of claim 1, wherein the dividing of at least one physical fault domain of the OSD cluster into a plurality of virtual fault sub-domains comprises:
dividing each physical fault domain of the OSD cluster into two virtual fault sub-domains;
before the selecting of that number of fault domains to be stored as storage fault domains, the method further comprises:
dividing the obtained fault domains to be stored into two groups of fault domains to be stored, wherein different virtual fault sub-domains in each group of fault domains to be stored belong to different physical fault domains;
and the selecting of that number of fault domains to be stored as storage fault domains comprises:
selecting that number of fault domains to be stored as storage fault domains from one of the two groups of fault domains to be stored.
4. The method of claim 3, wherein the number of copies of the replica data to be stored is the same as the number of physical fault domains of the OSD cluster.
5. An apparatus for storing replica data in the distributed file system ceph, the apparatus comprising:
a topological structure acquisition module, configured to acquire a topological structure of an object storage device (OSD) cluster, wherein the OSD cluster comprises a plurality of OSDs, and the topological structure represents how the OSD cluster is divided into physical fault domains;
a fault domain dividing module, configured to divide at least one physical fault domain of the OSD cluster into a plurality of virtual fault sub-domains, and to take the undivided fault domains and the virtual fault sub-domains obtained by the division as fault domains to be stored;
a storage domain selection module, configured to select, according to the number of copies of the replica data to be stored, that number of fault domains to be stored as storage fault domains, wherein different virtual fault sub-domains among the selected storage fault domains belong to different physical fault domains;
a storage module, configured to store the replica data in one OSD of each storage fault domain, respectively;
the apparatus further comprising:
a grouping module, configured to divide the obtained fault domains to be stored into a plurality of groups of fault domains to be stored, wherein different virtual fault sub-domains in each group of fault domains to be stored belong to different physical fault domains;
wherein the storage domain selection module is specifically configured to:
select that number of fault domains to be stored as storage fault domains from one of the plurality of groups of fault domains to be stored.
6. The apparatus according to claim 5, wherein the fault domain dividing module is specifically configured to:
divide each physical fault domain of the OSD cluster into a plurality of virtual fault sub-domains.
7. The apparatus of claim 5, wherein the fault domain dividing module is specifically configured to:
divide each physical fault domain of the OSD cluster into two virtual fault sub-domains;
the apparatus further comprising:
a dividing module, configured to divide the obtained fault domains to be stored into two groups of fault domains to be stored, wherein different virtual fault sub-domains in each group of fault domains to be stored belong to different physical fault domains;
and a selecting module, configured to select that number of fault domains to be stored as storage fault domains from one of the two groups of fault domains to be stored.
8. The apparatus of claim 7, wherein the number of copies of the replica data to be stored is the same as the number of physical fault domains of the OSD cluster.
9. An electronic device, comprising a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with one another through the communication bus;
the memory is configured to store a computer program;
the processor is configured to implement the method steps of any one of claims 1-4 when executing the program stored in the memory.
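For illustration only, the flow of claims 1 and 3 can be sketched in Python. This is a minimal sketch under stated assumptions, not the patented implementation: the topology is assumed to be a plain mapping from physical fault domain name to OSD names, every physical fault domain is divided (as in claim 2), each physical fault domain has at least as many OSDs as sub-domains, and the helper names split_into_subdomains, group_subdomains, and place_replicas are hypothetical. No Ceph API is used.

```python
import random
from typing import Dict, List

# Assumed shape: physical fault domain name -> names of the OSDs it contains.
Topology = Dict[str, List[str]]


def split_into_subdomains(topology: Topology, parts: int = 2) -> Dict[str, List[List[str]]]:
    """Divide every physical fault domain into `parts` virtual fault sub-domains."""
    subdomains = {}
    for domain, osds in topology.items():
        if len(osds) < parts:
            raise ValueError(f"{domain} has fewer than {parts} OSDs")
        # Round-robin slicing keeps the sub-domains close to equal in size.
        subdomains[domain] = [osds[i::parts] for i in range(parts)]
    return subdomains


def group_subdomains(subdomains: Dict[str, List[List[str]]]) -> List[Topology]:
    """Build groups so that group k holds the k-th sub-domain of each physical domain.

    Each group therefore contains at most one sub-domain per physical fault domain.
    """
    parts = min(len(subs) for subs in subdomains.values())
    return [
        {domain: subs[k] for domain, subs in subdomains.items()}
        for k in range(parts)
    ]


def place_replicas(groups: List[Topology], replica_count: int, rng: random.Random) -> Dict[str, str]:
    """Pick one group, then `replica_count` fault domains in it, then one OSD in each."""
    group = rng.choice(groups)
    chosen = rng.sample(sorted(group), replica_count)
    return {domain: rng.choice(group[domain]) for domain in chosen}


# Hypothetical cluster: three racks acting as physical fault domains.
topology = {
    "rack-a": ["osd.0", "osd.1", "osd.2", "osd.3"],
    "rack-b": ["osd.4", "osd.5", "osd.6", "osd.7"],
    "rack-c": ["osd.8", "osd.9", "osd.10", "osd.11"],
}
groups = group_subdomains(split_into_subdomains(topology, parts=2))
placement = place_replicas(groups, replica_count=3, rng=random.Random(0))
print(placement)
```

Because all replicas are drawn from a single group, and a group holds at most one sub-domain per physical fault domain, no two replicas in this sketch share a physical fault domain.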

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810400813.8A CN108846009B (en) 2018-04-28 2018-04-28 Copy data storage method and device in ceph

Publications (2)

Publication Number Publication Date
CN108846009A CN108846009A (en) 2018-11-20
CN108846009B true CN108846009B (en) 2021-02-05

Family

ID=64212397

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810400813.8A Active CN108846009B (en) 2018-04-28 2018-04-28 Copy data storage method and device in ceph

Country Status (1)

Country Link
CN (1) CN108846009B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112068976A (en) * 2019-06-10 2020-12-11 北京京东尚科信息技术有限公司 Data backup storage method and device, electronic equipment and storage medium
CN112578992B (en) * 2019-09-27 2022-07-22 西安华为技术有限公司 Data storage method and data storage device

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103778255A (en) * 2014-02-25 2014-05-07 深圳市中博科创信息技术有限公司 Distributed file system and data distribution method thereof
CN107291594A (en) * 2017-06-30 2017-10-24 上海白虹软件科技股份有限公司 The device and method that openstack platforms are monitored and managed to ceph
CN107704212A (en) * 2017-10-31 2018-02-16 紫光华山信息技术有限公司 A kind of data processing method and device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10222986B2 (en) * 2015-05-15 2019-03-05 Cisco Technology, Inc. Tenant-level sharding of disks with tenant-specific storage modules to enable policies per tenant in a distributed storage system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
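基于Ceph的分布式存储节能技术研究与实现 [Research and Implementation of Energy-Saving Technology for Ceph-Based Distributed Storage]; 沈良好 (Shen Lianghao); 《中国优秀硕士学位论文全文数据库》 (China Master's Theses Full-text Database); 2017-03-15; pp. 3-42 *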

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant