CN109582509A - Distributed file system disaster tolerance configuration method, device, and readable storage medium

Info

Publication number
CN109582509A
CN109582509A CN201710910147.8A CN201710910147A CN109582509A CN 109582509 A CN109582509 A CN 109582509A CN 201710910147 A CN201710910147 A CN 201710910147A CN 109582509 A CN109582509 A CN 109582509A
Authority
CN
China
Prior art keywords
disaster tolerance
hard disk
performance
file system
distributed file
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710910147.8A
Other languages
Chinese (zh)
Inventor
宋柏森
许显月
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ZTE Corp
Original Assignee
ZTE Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ZTE Corp
Priority to CN201710910147.8A
Publication of CN109582509A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16 Error detection or correction of the data by redundancy in hardware
    • G06F11/20 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G06F11/2053 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where persistent mass storage functionality or persistent mass storage control functionality is redundant
    • G06F11/2094 Redundant storage or storage space
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3006 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3034 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a storage system, e.g. DASD based or network based
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3055 Monitoring arrangements for monitoring the status of the computing system or of the computing system component, e.g. monitoring if the computing system is on, off, available, not available

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Quality & Reliability (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a distributed file system disaster tolerance configuration method, a device, and a computer-readable storage medium. The method comprises: obtaining the performance of the multiple hard disks on which the object storage device nodes of a distributed file system reside; dividing the multiple hard disks into multiple capability containers according to their performance; dividing the multiple capability containers into one or more disaster tolerance containers according to preset disaster tolerance domains and generating corresponding disaster tolerance rules; generating topology information from the one or more disaster tolerance containers and the corresponding disaster tolerance rules; and sending the topology information to the cluster of the distributed file system. According to the technical solution of the invention, dividing the hard disks into capability containers by performance takes hard disk performance into account and improves data security. Disaster tolerance containers and disaster tolerance rules are then derived from the capability containers according to the disaster tolerance domains defined by the user, and the resulting topology information is delivered to the cluster, so that the disaster tolerance configuration meets the user's needs.

Description

Distributed file system disaster tolerance configuration method, device, and readable storage medium
Technical field
The present invention relates to the field of computer technology, and in particular to a distributed file system disaster tolerance configuration method, a device, and a computer-readable storage medium.
Background
CEPH (a distributed file system) is a distributed mass storage system open-sourced by the American company Red Hat; in theory, its storage capacity has no upper limit. CEPH exposes three storage modes externally: block, file, and object. It is currently a very active distributed storage system.
Unlike traditional storage systems, CEPH distributes data using a strategy based on the CRUSH algorithm (a pseudo-random data distribution algorithm), which is decentralized. A traditional storage system needs a metadata server to look up where data is stored, and the metadata server tends to become a hot spot, a single point of failure, and a performance bottleneck. In CEPH, the client only needs to run the CRUSH computation to know where data is stored, which avoids hot spots and single points of failure and yields higher performance. CEPH's CRUSH algorithm is controlled by the CRUSH MAP (topology). After CEPH is deployed, it generates a default CRUSH MAP. Users with their own requirements can define a CRUSH MAP by editing it manually. In CEPH, the CRUSH MAP resembles a description of the physical topology, and the CRUSH algorithm computes the data distribution by analyzing the CRUSH MAP.
At present, generating a CEPH CRUSH MAP requires the user to be familiar with the CRUSH algorithm, to understand the concepts behind the CRUSH MAP, and to be proficient with the relevant CEPH tools and commands, which places high demands on the user. All operations are also performed interactively on the command line, which is very inconvenient in commercial and production environments. In addition, the CRUSH rule that CEPH generates by default provides only host-level disaster tolerance and, as shown in Fig. 1, does not take the actual physical networking and disaster tolerance situation into account. This matters especially for clusters built from storage servers with hard disks of different performance. A large cluster may contain hundreds of storage servers whose hard disks differ in performance, with some servers using economical HDDs (mechanical hard disks) and others using high-performance SSDs (solid-state disks). With CEPH's default CRUSH storage rule, the actual disaster tolerance situation is not considered, so data security is not guaranteed, and hard disks of different performance are not distinguished, so performance is lost. To set up storage pools of different performance and to divide disaster tolerance domains, the user has to modify the CRUSH MAP manually. For a large cluster of hundreds or even thousands of storage servers, the administrator must log in to each machine, check the performance of each disk, and then edit the CRUSH MAP by hand. This work is time-consuming, laborious, and error-prone, and it makes subsequent upgrades and maintenance very troublesome.
Summary of the invention
The main object of the present invention is to propose a distributed file system disaster tolerance configuration method, a device, and a computer-readable storage medium that configure the disaster tolerance of a distributed file system automatically, so that the disaster tolerance configuration meets the user's disaster tolerance needs and avoids a loss of performance.
To achieve the above object, the present invention provides a distributed file system disaster tolerance configuration method comprising the following steps: obtaining the performance of the multiple hard disks on which the object storage device nodes of the distributed file system reside; dividing the multiple hard disks into multiple capability containers according to their performance; dividing the multiple capability containers into one or more disaster tolerance containers according to preset disaster tolerance domains and generating corresponding disaster tolerance rules; generating topology information from the one or more disaster tolerance containers and the corresponding disaster tolerance rules; and sending the topology information to the cluster of the distributed file system.
Optionally, in the foregoing method, dividing the multiple hard disks into multiple capability containers according to their performance specifically includes: determining the performance of the multiple hard disks according to whether each is a mechanical hard disk or a solid-state disk.
Optionally, in the foregoing method, dividing the multiple hard disks into multiple capability containers according to their performance further includes: for each mechanical hard disk among the multiple hard disks, obtaining the rotation speed of the mechanical hard disk, and determining the performance of the mechanical hard disk from its rotation speed.
Optionally, the foregoing method further includes, before the topology information is sent to the cluster of the distributed file system: determining whether the topology information would cause a disaster tolerance rule currently in use to be deleted, and sending the topology information to the cluster of the distributed file system only when the result is no.
Optionally, the foregoing method further includes, before the performance of the multiple hard disks on which the object storage device nodes reside is obtained: determining whether the hard disks on which the object storage device nodes of the distributed file system reside have been updated, and obtaining the performance of those hard disks when the result is yes.
To achieve the above object, the present invention also provides a distributed file system disaster tolerance configuration device comprising: a hard disk performance acquisition module for obtaining the performance of the multiple hard disks on which the object storage device nodes of the distributed file system reside; a capability container division module for dividing the multiple hard disks into multiple capability containers according to their performance; a disaster tolerance container division module for dividing the multiple capability containers into one or more disaster tolerance containers according to preset disaster tolerance domains and generating corresponding disaster tolerance rules; a topology information generation module for generating topology information from the one or more disaster tolerance containers and the corresponding disaster tolerance rules; and a topology information sending module for sending the topology information to the cluster of the distributed file system.
Optionally, in the foregoing device, the hard disk performance acquisition module determines the performance of the multiple hard disks according to whether each is a mechanical hard disk or a solid-state disk.
Optionally, in the foregoing device, the hard disk performance acquisition module obtains, for each mechanical hard disk among the multiple hard disks, the rotation speed of the mechanical hard disk and determines the performance of the mechanical hard disk from its rotation speed.
Optionally, the foregoing device further includes an abnormality detection module for determining whether the topology information would cause a disaster tolerance rule currently in use to be deleted; when the result is no, the topology information sending module sends the topology information to the cluster of the distributed file system.
Optionally, the foregoing device further includes a hard disk update detection module for determining whether the hard disks on which the object storage device nodes of the distributed file system reside have been updated; when the result is yes, the hard disk performance acquisition module obtains the performance of the multiple hard disks on which the object storage device nodes reside.
To achieve the above object, the present invention also provides a computer-readable storage medium that stores one or more programs, the one or more programs being executable by one or more processors to implement the steps of the foregoing distributed file system disaster tolerance configuration method.
From the above technical solution it follows that the distributed file system disaster tolerance configuration method, device, and computer-readable storage medium of the present invention have at least the following advantages:
According to the technical solution of the present invention, the performance of the hard disk behind each object storage device node of the distributed file system is detected automatically, and the hard disks are divided into multiple capability containers by performance, which takes hard disk performance into account and improves data security. Disaster tolerance containers and disaster tolerance rules are then derived from the capability containers according to the disaster tolerance domains defined by the user, and the resulting topology information is delivered to the cluster. The disaster tolerance configuration therefore meets the user's needs without requiring manual configuration by the user.
Brief description of the drawings
Fig. 1 is an architecture diagram of a prior-art distributed file system;
Fig. 2 is a flow chart of a distributed file system disaster tolerance configuration method according to an embodiment of the invention;
Fig. 3 is a flow chart of a distributed file system disaster tolerance configuration method according to an embodiment of the invention;
Fig. 4 is an architecture diagram of a distributed file system according to an embodiment of the invention;
Fig. 5 is a block diagram of a distributed file system disaster tolerance configuration device according to an embodiment of the invention;
Fig. 6 is a block diagram of a distributed file system disaster tolerance configuration device according to an embodiment of the invention.
The realization of the object, the functional features, and the advantages of the present invention are further described below in connection with the embodiments and with reference to the accompanying drawings.
Detailed description of the embodiments
It should be understood that the specific embodiments described here only illustrate the present invention and are not intended to limit it.
In the following description, suffixes such as "module", "component", or "unit" that denote elements are used only to aid the explanation of the invention and have no specific meaning of their own. "Module", "component", and "unit" can therefore be used interchangeably.
As shown in Fig. 2, one embodiment of the present invention provides a distributed file system disaster tolerance configuration method. Its purpose is to make disaster-tolerant storage resources of different performance levels available to users on top of CEPH, so that users of a CEPH cluster can use storage resources with different storage performance while the disaster tolerance domains they create ensure data security. The distributed file system disaster tolerance configuration method of this embodiment includes the following steps:
Step S210: obtain the performance of the multiple hard disks on which the object storage device nodes of the distributed file system reside.
This embodiment provides storage resources with different storage performance and data security based on the performance of the hard disks and on disaster tolerance domains created by the user. The technical solution of this embodiment can be implemented on a WEB (page) management server. After the WEB management server takes over a CEPH cluster, the WEB management system runs a client program on each CEPH node. The client program queries the performance of the hard disk on which each CEPH OSD (Object Store Device, i.e. object storage device) node is deployed and reports the data to the WEB server.
Step S220: divide the multiple hard disks into multiple capability containers according to their performance.
In this embodiment, after the WEB server has collected the hard disk performance of the OSDs on all nodes of the cluster, it creates different CRUSH BUCKETs (containers) according to the performance distribution of the hard disks. For example, all HDDs are placed in one economy CRUSH BUCKET, and all SSDs are placed in one high-performance CRUSH BUCKET.
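The grouping in step S220 can be illustrated with a minimal Python sketch. It assumes the server already holds a mapping from OSD id to reported disk type; the bucket names and the `BUCKET_FOR_TYPE` table are hypothetical names chosen for this illustration and do not come from the patent or from CEPH.

```python
from collections import defaultdict

# Hypothetical mapping from reported disk type to performance bucket name.
BUCKET_FOR_TYPE = {"hdd": "economy", "ssd": "high_performance"}


def group_osds_by_disk_type(osd_disk_types):
    """Group OSD ids into performance buckets keyed by bucket name."""
    buckets = defaultdict(list)
    for osd_id, disk_type in sorted(osd_disk_types.items()):
        buckets[BUCKET_FOR_TYPE[disk_type.lower()]].append(osd_id)
    return dict(buckets)


# Example input: disk types as they might be reported by the node clients.
reported = {"osd.0": "HDD", "osd.1": "SSD", "osd.2": "HDD"}
buckets = group_osds_by_disk_type(reported)
```

Here the two HDD-backed OSDs end up in the economy bucket and the SSD-backed OSD in the high-performance bucket, mirroring the example in the text.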
Step S230: divide the multiple capability containers into one or more disaster tolerance containers according to preset disaster tolerance domains and generate corresponding disaster tolerance rules.
In this embodiment, the preset disaster tolerance domains include, but are not limited to, disaster tolerance domains defined by the user. All hosts that initially join the cluster belong to the same default disaster tolerance domain. The user then manually creates different disaster tolerance domains in the interface and places hosts that form one safety group into the same disaster tolerance domain. Based on the disaster tolerance domains created by the user, and on top of the CRUSH BUCKETs of different performance created earlier, further CRUSH BUCKETs are created for each performance level within each disaster tolerance domain, and the corresponding CRUSH RULEs (rules) are generated.
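As a rough sketch of step S230, the function below splits performance buckets by disaster tolerance domain and derives one rule name per (domain, performance) pair. The data shapes, the `rule_<domain>_<performance>` naming scheme, and the function name are assumptions made for illustration; the patent does not specify them. Hosts that the user has not assigned stay in a default domain, as the text describes.

```python
def split_by_disaster_domain(perf_buckets, osd_host, host_domain,
                             default_domain="default"):
    """Return ({(domain, perf): [osd ids]}, {(domain, perf): rule name})."""
    containers = {}
    for perf, osds in perf_buckets.items():
        for osd in osds:
            # Hosts without a user-assigned domain fall back to the default one.
            domain = host_domain.get(osd_host[osd], default_domain)
            containers.setdefault((domain, perf), []).append(osd)
    # One hypothetical rule name per disaster tolerance container.
    rules = {key: f"rule_{key[0]}_{key[1]}" for key in containers}
    return containers, rules


perf_buckets = {"economy": ["osd.0", "osd.2"], "high_performance": ["osd.1"]}
osd_host = {"osd.0": "host-a", "osd.1": "host-a", "osd.2": "host-b"}
host_domain = {"host-a": "rack1"}  # host-b was not assigned by the user

containers, rules = split_by_disaster_domain(perf_buckets, osd_host, host_domain)
```

Keying the result by the (domain, performance) pair keeps the two dimensions explicit, so a storage pool can later be bound to exactly one performance level inside one domain.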
Step S240: generate topology information from the one or more disaster tolerance containers and the corresponding disaster tolerance rules.
Step S250: send the topology information to the cluster of the distributed file system.
In this embodiment, the CRUSH MAP (topology) text generated above is delivered to the CEPH cluster through a preset CRUSH TOOL (tool). The user thus obtains storage resources with different performance and data security.
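To make the idea of a topology text concrete, the sketch below renders containers and rules into a plain-text description. The output format is a simplified, hypothetical layout invented for this illustration; it is not the real CRUSH MAP syntax, and a real deployment would have to produce a valid map and deliver it with CEPH's own tooling, which is not shown here.

```python
def render_topology(containers, rules):
    """Render containers and rules as a simplified, illustrative text."""
    lines = []
    for (domain, perf), osds in sorted(containers.items()):
        lines.append(f"bucket {domain}_{perf} {{")
        for osd in osds:
            lines.append(f"    item {osd}")
        lines.append("}")
    for (domain, perf), rule in sorted(rules.items()):
        lines.append(f"rule {rule} {{")
        lines.append(f"    take {domain}_{perf}")
        lines.append("}")
    return "\n".join(lines)


# Example input with one disaster tolerance container and its rule.
text = render_topology(
    {("rack1", "economy"): ["osd.0"]},
    {("rack1", "economy"): "rule_rack1_economy"},
)
```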
In the technical solution of this embodiment, first, analyzing the actual performance of the hard disk behind each OSD yields the true storage performance of each OSD. Second, dividing the OSDs according to hard disk performance makes storage resources of different performance available to users, which avoids wasting storage resources and makes full use of CEPH's strengths. Third, because the user creates the disaster tolerance domains, OSDs of different performance are guaranteed to be distributed across different disaster tolerance domains. Therefore, with the technical solution of this embodiment, data is not lost when one disaster tolerance domain is damaged by a power failure or for other reasons, which ensures data security.
As shown in Fig. 3, one embodiment of the present invention provides a distributed file system disaster tolerance configuration method. The method of this embodiment includes the following steps:
Step S310: determine whether the hard disks on which the object storage device nodes of the distributed file system reside have been updated.
In this embodiment, capacity expansion, capacity reduction, and OSD hard disk replacement all change the OSDMAP in CEPH. Changes to the OSDMAP are detected automatically, and when the OSDMAP changes, the CRUSH analysis process described above is triggered again to generate a new CRUSH MAP that meets the requirements for the user. The technical solution of this embodiment thus identifies these situations automatically and handles each case separately, which keeps the process transparent to the user, simplifies subsequent operation and maintenance, and improves overall availability.
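One way to detect an OSDMAP change is to compare successive map epochs, since the CEPH OSD map carries an epoch that changes with each update. The patent does not say how the change is detected, so the epoch comparison, the `OsdMapWatcher` class, and its callback are assumptions of this sketch; how the epoch is read from the cluster is left out.

```python
class OsdMapWatcher:
    """Trigger a callback when the observed OSD map epoch changes."""

    def __init__(self, on_change):
        self._last_epoch = None
        self._on_change = on_change

    def observe(self, epoch):
        """Record an epoch; return True if it differs from the previous one."""
        # The first observation only establishes the baseline.
        changed = self._last_epoch is not None and epoch != self._last_epoch
        self._last_epoch = epoch
        if changed:
            self._on_change(epoch)
        return changed


triggered = []
watcher = OsdMapWatcher(triggered.append)
results = [watcher.observe(10), watcher.observe(10), watcher.observe(11)]
```

In this example only the third observation counts as a change, so the re-analysis callback runs once, with epoch 11.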
Step S320: when the result is yes, obtain the performance of the multiple hard disks on which the object storage device nodes of the distributed file system reside.
In this embodiment, as shown in Fig. 4, for an existing CEPH cluster, a daemon on each node of the cluster (which may be implemented according to the technical solution of this embodiment) analyzes which CEPH services are deployed on the current node. It mainly obtains information about the node's OSDs. From the OSD information it collects the type of the hard disk behind each OSD on the node, such as HDD, SSD, or NVME (a logical device interface standard); for a mechanical disk it also obtains the disk's rotation speed. The information is then returned to the server through the command channel.
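A node-side collector along these lines could be sketched as follows. The patent does not describe how the daemon obtains the disk type or rotation speed; this sketch assumes a Linux node, where the block layer exposes a `queue/rotational` flag per device under `/sys/block`, and assumes the rotation speed is parsed from a "Rotation Rate: N rpm" line such as the one printed by `smartctl -i`. The function names are hypothetical.

```python
import re
from pathlib import Path


def read_rotational(device, sys_root="/sys/block"):
    """Return True if the Linux block layer marks the device as rotational."""
    flag = Path(sys_root, device, "queue", "rotational").read_text().strip()
    return flag == "1"


def disk_type(device, rotational):
    """Classify a block device as NVME, HDD, or SSD."""
    if device.startswith("nvme"):
        return "NVME"
    return "HDD" if rotational else "SSD"


def parse_rpm(info_text):
    """Extract the rotation speed in rpm, or None if no rpm value is present."""
    match = re.search(r"Rotation Rate:\s*(\d+)\s*rpm", info_text)
    return int(match.group(1)) if match else None
```

For example, `disk_type("sda", read_rotational("sda"))` would classify the first SCSI disk on a node, and `parse_rpm` returns `None` for solid-state devices, which report no rpm value.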
Step S330: determine the performance of the multiple hard disks according to whether each is a mechanical hard disk or a solid-state disk.
Step S340: for each mechanical hard disk among the multiple hard disks, obtain its rotation speed and determine its performance from that rotation speed.
In this embodiment, after the performance data of the hard disks of all OSDs in the CEPH cluster has been collected, the hard disks are divided by performance. On the basis of the existing CRUSH MAP, three types of CRUSH BUCKET are created: economy, performance, and high-performance. The three newly created types of CRUSH BUCKET are then placed in the default disaster tolerance domain, and the newly generated CRUSH MAP is delivered to the CEPH cluster through the command channel.
Specifically, the hard disk type and rotation speed are collected for the hard disks of all OSD nodes, and a check is made that the performance data of all OSDs in the cluster has been collected. If so, each OSD is classified as follows. If the OSD's disk is not a mechanical disk, the OSD is added to the high-performance CRUSH BUCKET. If it is a mechanical disk and its rotation speed is above the performance threshold, the OSD is likewise added to the high-performance CRUSH BUCKET. If the rotation speed is below the performance threshold but above the economy threshold, the OSD is added to the performance CRUSH BUCKET. If the rotation speed is below the economy threshold, the OSD is added to the economy CRUSH BUCKET.
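The classification described above amounts to a small decision procedure, sketched here in Python. The threshold values are hypothetical placeholders, since the patent gives no numbers, and the text does not say what happens when a rotation speed equals a threshold; this sketch treats "above" as strictly greater than.

```python
PERFORMANCE_RPM = 10000  # hypothetical performance threshold
ECONOMY_RPM = 5400       # hypothetical economy threshold


def classify_osd(is_mechanical, rpm=None,
                 performance_rpm=PERFORMANCE_RPM, economy_rpm=ECONOMY_RPM):
    """Return the performance bucket for one OSD's hard disk."""
    if not is_mechanical:
        # Non-mechanical disks always go to the high-performance bucket.
        return "high_performance"
    if rpm is None:
        raise ValueError("rotation speed is required for a mechanical disk")
    if rpm > performance_rpm:
        return "high_performance"
    if rpm > economy_rpm:
        return "performance"
    return "economy"
```

With these placeholder thresholds, a 15000 rpm disk is classified as high-performance, a 7200 rpm disk as performance, and a 5400 rpm disk as economy.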
If an OSD has a specified performance level, the performance of the hard disk corresponding to that OSD is modified to the specified level, and the performance-based CRUSH BUCKETs are then generated by the process described above.
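A specified performance level can be modeled as an override applied to the detected levels before the buckets are built. The sketch below assumes both are dictionaries from OSD id to level name; ignoring overrides for OSDs that were never detected is a design choice of this sketch, not something the patent states.

```python
def apply_overrides(detected, overrides):
    """Return detected performance levels with specified levels applied."""
    merged = dict(detected)
    for osd, level in overrides.items():
        # Only OSDs that were actually detected can be overridden.
        if osd in merged:
            merged[osd] = level
    return merged


detected = {"osd.0": "economy", "osd.1": "high_performance"}
final_levels = apply_overrides(detected, {"osd.0": "performance", "osd.9": "economy"})
```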
Step S350: divide the multiple capability containers into one or more disaster tolerance containers according to preset disaster tolerance domains and generate corresponding disaster tolerance rules.
In this embodiment, a graphical client interface can be provided so that the user can define disaster tolerance domains. The user creates and divides disaster tolerance domains in the graphical client interface. Then, on top of the original performance CRUSH BUCKETs, disaster tolerance BUCKETs are generated, together with a disaster-tolerance-domain-based CRUSH RULE for each performance level, so that the user is provided with storage resources of different performance that are protected by disaster tolerance domains.
Step S360: generate topology information from the one or more disaster tolerance containers and the corresponding disaster tolerance rules.
Step S370: determine whether the topology information would cause a disaster tolerance rule currently in use to be deleted.
In this embodiment, a check is first made as to whether the user has created disaster tolerance domains. If so, the disaster tolerance domains are further divided on the basis of the performance domains according to the domains the user created. If not, a default disaster tolerance domain is created.
Step S380: when the result is no, send the topology information to the cluster of the distributed file system.
In this embodiment, the generated CRUSH MAP is analyzed for anomalies, for example whether it would cause a CRUSH RULE in use to be deleted or cause PGs (placement groups) to be degraded. If an anomaly is detected, the CRUSH MAP cannot be delivered to the CEPH cluster: all preceding operations are rolled back, nothing is delivered to the CEPH cluster, an error message is provided, and the operation terminates. A CRUSH MAP that passes anomaly detection is delivered to the CEPH cluster. If delivery fails, it is retried; once it succeeds, the user is provided with storage resources that are based on different hard disk performance and protected by disaster tolerance domains.
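The check-then-deliver flow can be sketched as below. Only the "rule in use would be deleted" check is modeled; the PG degradation check and the rollback of earlier operations are omitted. The `send` callable stands in for whatever mechanism delivers the map, and the bounded `max_attempts` is an assumption of this sketch, since the text only says that a failed delivery is retried.

```python
def rules_that_would_be_deleted(new_rules, rules_in_use):
    """Return the in-use rules that are missing from the new topology."""
    return sorted(set(rules_in_use) - set(new_rules))


def deliver(new_rules, rules_in_use, send, max_attempts=3):
    """Validate the new topology, then try to deliver it; return (ok, message)."""
    missing = rules_that_would_be_deleted(new_rules, rules_in_use)
    if missing:
        # Anomaly detected: nothing is sent to the cluster.
        return False, f"aborted: rules in use would be deleted: {missing}"
    for attempt in range(1, max_attempts + 1):
        if send():
            return True, f"delivered on attempt {attempt}"
    return False, f"delivery failed after {max_attempts} attempts"


# Example: the first delivery attempt fails, the second succeeds.
outcomes = iter([False, True])
ok, message = deliver(["r1", "r2"], ["r1"], lambda: next(outcomes))

# Example: the new topology drops rule r1, which is still in use.
blocked_ok, blocked_message = deliver(["r2"], ["r1"], lambda: True)
```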
The technical solution of this embodiment avoids the cumbersome configuration of CEPH and greatly simplifies operation and maintenance. It is a fully automatic, intelligent, safe, and visual CRUSH disaster tolerance configuration technique based on hard disk performance, and it has been widely used in production environments.
As shown in Fig. 5, one embodiment of the present invention provides a distributed file system disaster tolerance configuration device. Its purpose is to make disaster-tolerant storage resources of different performance levels available to users on top of CEPH, so that users of a CEPH cluster can use storage resources with different storage performance while the disaster tolerance domains they create ensure data security. The distributed file system disaster tolerance configuration device of this embodiment includes the following modules:
A hard disk performance acquisition module 510 obtains the performance of the multiple hard disks on which the object storage device nodes of the distributed file system reside.
This embodiment provides storage resources with different storage performance and data security based on the performance of the hard disks and on disaster tolerance domains created by the user. The technical solution of this embodiment can be implemented on a WEB (page) management server. After the WEB management server takes over a CEPH cluster, the WEB management system runs a client program on each CEPH node. The client program queries the performance of the hard disk on which each CEPH OSD (Object Store Device, i.e. object storage device) node is deployed and reports the data to the WEB server.
A capability container division module 520 divides the multiple hard disks into multiple capability containers according to their performance.
In this embodiment, after the WEB server has collected the hard disk performance of the OSDs on all nodes of the cluster, it creates different CRUSH BUCKETs (containers) according to the performance distribution of the hard disks. For example, all HDDs are placed in one economy CRUSH BUCKET, and all SSDs are placed in one high-performance CRUSH BUCKET.
A disaster tolerance container division module 530 divides the multiple capability containers into one or more disaster tolerance containers according to preset disaster tolerance domains and generates corresponding disaster tolerance rules.
In this embodiment, the preset disaster tolerance domains include, but are not limited to, disaster tolerance domains defined by the user. All hosts that initially join the cluster belong to the same default disaster tolerance domain. The user then manually creates different disaster tolerance domains in the interface and places hosts that form one safety group into the same disaster tolerance domain. Based on the disaster tolerance domains created by the user, and on top of the CRUSH BUCKETs of different performance created earlier, further CRUSH BUCKETs are created for each performance level within each disaster tolerance domain, and the corresponding CRUSH RULEs (rules) are generated.
A topology information generation module 540 generates topology information from the one or more disaster tolerance containers and the corresponding disaster tolerance rules.
A topology information sending module 550 sends the topology information to the cluster of the distributed file system.
In this embodiment, the CRUSH MAP (topology) text generated above is delivered to the CEPH cluster through a preset CRUSH TOOL (tool). The user thus obtains storage resources with different performance and data security.
In the technical solution of this embodiment, first, analyzing the actual performance of the hard disk behind each OSD yields the true storage performance of each OSD. Second, dividing the OSDs according to hard disk performance makes storage resources of different performance available to users, which avoids wasting storage resources and makes full use of CEPH's strengths. Third, because the user creates the disaster tolerance domains, OSDs of different performance are guaranteed to be distributed across different disaster tolerance domains. Therefore, with the technical solution of this embodiment, data is not lost when one disaster tolerance domain is damaged by a power failure or for other reasons, which ensures data security.
As shown in FIG. 6, one embodiment of the present invention provides a distributed file system disaster tolerance configuration device. The device of this embodiment comprises the following modules:
The hard disk update detection module 610 judges whether the hard disk where each object storage device node in the distributed file system resides has been updated.
In this embodiment, capacity expansion, capacity reduction, and replacement of an OSD hard disk all cause the OSDMAP in CEPH to change. The change of the OSDMAP is detected automatically, and when the OSDMAP changes, the CRUSH analysis process described above is triggered again to generate a new CRUSH MAP that meets the requirements for the user. The technical solution of this embodiment thus identifies these situations automatically and handles each case separately, which keeps the process transparent to the user and facilitates subsequent operation and maintenance. Overall availability is improved.
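The change detection described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `OsdMapWatcher` class, the OSD-id to disk-serial snapshot format, and the callback are all hypothetical names chosen for the example. The first snapshot only sets the baseline; any later difference (expansion, reduction, or disk replacement) triggers the re-analysis callback.

```python
from typing import Callable, Dict, Optional, Tuple


def osd_fingerprint(osd_disks: Dict[int, str]) -> Tuple[Tuple[int, str], ...]:
    """Order-independent fingerprint of a (hypothetical) OSD-id -> disk-serial snapshot."""
    return tuple(sorted(osd_disks.items()))


class OsdMapWatcher:
    """Calls `on_change` whenever a new snapshot differs from the previous one."""

    def __init__(self, on_change: Callable[[Dict[int, str]], None]) -> None:
        self._last: Optional[Tuple[Tuple[int, str], ...]] = None
        self._on_change = on_change

    def observe(self, osd_disks: Dict[int, str]) -> bool:
        fingerprint = osd_fingerprint(osd_disks)
        # The very first snapshot is the baseline and never counts as a change.
        changed = self._last is not None and fingerprint != self._last
        self._last = fingerprint
        if changed:
            # Stand-in for re-running the CRUSH analysis process.
            self._on_change(dict(osd_disks))
        return changed
```

In a real deployment the snapshot would come from the cluster's OSD map rather than from a hand-built dictionary; how that map is read is outside the scope of this sketch.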
The hard disk performance obtaining module 620 obtains, when the judgment result is yes, the performance of the multiple hard disks where the object storage device nodes in the distributed file system reside.
In this embodiment, as shown in FIG. 4, for a deployed CEPH cluster, each node of the CEPH cluster analyzes which CEPH services are deployed on the current node (this can be realized by the node's daemon according to the technical solution of this embodiment). Mainly the information of the node's OSDs is obtained. From the OSD information, the type of the hard disk where each OSD of this node resides can be collected, for example HDD, SSD, or NVME (a logical device interface standard); for a mechanical disk, the rotational speed of the hard disk is also obtained. This information is then returned to the server through the command channel.
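One way a node-side daemon could gather this information on Linux is sketched below. The patent does not say how the type and rotational speed are read, so everything here is an assumption: the sketch reads the kernel's `/sys/block/<device>/queue/rotational` flag to separate mechanical from solid state disks, treats device names starting with `nvme` as NVME, and extracts the speed from a `Rotation Rate: N rpm` line such as the one printed by `smartctl -i`. The `DiskInfo` type and function names are hypothetical.

```python
import re
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass(frozen=True)
class DiskInfo:
    device: str
    kind: str             # "HDD", "SSD" or "NVME"
    rpm: Optional[int]    # only set for mechanical disks


def detect_kind(device: str, sysfs_root: Path = Path("/sys/block")) -> str:
    """Classify a block device by name and by the kernel's rotational flag."""
    if device.startswith("nvme"):
        return "NVME"
    flag = (sysfs_root / device / "queue" / "rotational").read_text().strip()
    return "HDD" if flag == "1" else "SSD"


_RPM = re.compile(r"Rotation Rate:\s*(\d+)\s*rpm", re.IGNORECASE)


def parse_rpm(smart_info: str) -> Optional[int]:
    """Extract the rotational speed from SMART identity text, if present."""
    match = _RPM.search(smart_info)
    return int(match.group(1)) if match else None


def collect(device: str, smart_info: str,
            sysfs_root: Path = Path("/sys/block")) -> DiskInfo:
    """Build the record that the node would report back to the server."""
    kind = detect_kind(device, sysfs_root)
    rpm = parse_rpm(smart_info) if kind == "HDD" else None
    return DiskInfo(device, kind, rpm)
```

Passing `sysfs_root` as a parameter keeps the function testable against a temporary directory; the SMART text is taken as a string so that the sketch does not depend on any particular tool being installed.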
The capability container division module 630 determines the performance of the multiple hard disks according to whether each hard disk is a mechanical hard disk or a solid state hard disk; for the mechanical hard disks among the multiple hard disks, it obtains the rotational speed of the mechanical hard disk and determines the performance of the mechanical hard disk according to that rotational speed.
In this embodiment, after the performance data of the hard disks of all OSDs in the CEPH cluster has been collected, the hard disks are divided by performance. On the basis of the existing CRUSH MAP, three types of CRUSH BUCKET are created: economy, performance, and high performance. The three types of CRUSH BUCKET are then placed in the default disaster tolerance domain, and the newly generated CRUSH MAP is delivered to the CEPH cluster through the command channel.
Specifically, the hard disk type and rotational speed are collected for the hard disks of all OSD nodes. It is checked whether the performance data of all OSDs in the cluster has been collected. If so, it is checked whether the disk of each OSD is a mechanical disk. If it is not, the OSD is added to the high-performance CRUSH BUCKET. If it is, it is checked whether the rotational speed of the OSD's disk is higher than the performance threshold: if it is higher, the OSD is added to the high-performance CRUSH BUCKET; if it is lower, it is checked whether the rotational speed is higher than the economy threshold. If it is higher than the economy threshold, the OSD is added to the performance CRUSH BUCKET; if it is lower than the economy threshold, the OSD is added to the economy CRUSH BUCKET.
If an OSD has a specified performance, the performance of the hard disk corresponding to that OSD is modified according to the specified OSD performance, and the performance-based CRUSH BUCKETs are then generated according to the whole process above.
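The decision procedure above can be written as a short function. This is a sketch under stated assumptions: the two threshold values are illustrative only (the text names the thresholds but gives no numbers), a speed exactly equal to a threshold is treated as "not higher", and the `Tier` enum, `classify`, and `build_buckets` names are hypothetical.

```python
from enum import Enum
from typing import Dict, Iterable, List, Optional, Tuple


class Tier(Enum):
    ECONOMY = "economy"
    PERFORMANCE = "performance"
    HIGH_PERFORMANCE = "high-performance"


# Assumed example values; the source does not state the thresholds.
PERFORMANCE_RPM = 10000
ECONOMY_RPM = 5400


def classify(is_mechanical: bool, rpm: Optional[int],
             specified: Optional[Tier] = None) -> Tier:
    """Map one OSD's disk to a performance tier."""
    if specified is not None:
        # An explicitly specified OSD performance overrides the measured one.
        return specified
    if not is_mechanical:
        return Tier.HIGH_PERFORMANCE
    if rpm is None:
        raise ValueError("rotational speed is required for a mechanical disk")
    if rpm > PERFORMANCE_RPM:
        return Tier.HIGH_PERFORMANCE
    if rpm > ECONOMY_RPM:
        return Tier.PERFORMANCE
    return Tier.ECONOMY


def build_buckets(
    osds: Iterable[Tuple[int, bool, Optional[int], Optional[Tier]]]
) -> Dict[Tier, List[int]]:
    """Group OSD ids into the three performance buckets."""
    buckets: Dict[Tier, List[int]] = {tier: [] for tier in Tier}
    for osd_id, is_mechanical, rpm, specified in osds:
        buckets[classify(is_mechanical, rpm, specified)].append(osd_id)
    return buckets
```

Keeping the override check first mirrors the order in the text, where the specified performance replaces the measured one before the buckets are generated.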
The disaster tolerance container division module 640 divides the multiple capability containers into one or more disaster tolerance containers according to the preset disaster tolerance domains and generates the corresponding disaster tolerance rules.
In this embodiment, a graphical client interface can be provided so that the user can divide the disaster tolerance domains. The user creates and divides disaster tolerance domains in the graphical client interface. Then, on top of the original performance-based CRUSH BUCKETs, the disaster tolerance BUCKETs are generated, and a CRUSH RULE based on the disaster tolerance domains is generated for each performance level, so that storage resources based on disaster tolerance domains of different performance and safety levels are provided to the user.
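Generating one rule per combination of disaster tolerance domain and performance level might look like the sketch below. The rule text follows the general shape of a decompiled CRUSH map rule (`step take`, `step chooseleaf`, `step emit`), but the exact set of fields differs between Ceph releases, so treat the layout as an approximation. The `<domain>-<tier>` bucket naming scheme and the `host` failure type are assumptions made for the example.

```python
from typing import Iterable, List


def crush_rule(rule_id: int, tier: str, domain: str,
               failure_type: str = "host") -> str:
    """Render one replicated rule that draws from the bucket `<domain>-<tier>`."""
    root = f"{domain}-{tier}"  # hypothetical bucket naming scheme
    return "\n".join([
        f"rule {root} {{",
        f"\tid {rule_id}",
        "\ttype replicated",
        f"\tstep take {root}",
        f"\tstep chooseleaf firstn 0 type {failure_type}",
        "\tstep emit",
        "}",
    ])


def crush_rules(tiers: Iterable[str], domains: Iterable[str],
                first_id: int = 1) -> List[str]:
    """One rule per (disaster tolerance domain, performance tier) pair."""
    tier_list = list(tiers)
    rules: List[str] = []
    rule_id = first_id
    for domain in domains:
        for tier in tier_list:
            rules.append(crush_rule(rule_id, tier, domain))
            rule_id += 1
    return rules
```

For example, two tiers and two domains yield four rules, each of which a storage pool could then reference to get the desired performance and failure isolation.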
The topology information generation module 650 generates topology information according to the one or more disaster tolerance containers and the corresponding disaster tolerance rules.
The abnormality detection module 660 judges whether the topology information would cause a disaster tolerance rule currently in use to be deleted.
In this embodiment, it is first checked whether the user has created disaster tolerance domains. If so, the disaster tolerance domains are further divided on the basis of the performance domains according to the created disaster tolerance domains; if the user has not created any disaster tolerance domain, a default disaster tolerance domain is created.
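The fallback to a default domain and the further division of the performance groups by domain can be sketched as follows. The data shapes (OSD id to host, OSD id to tier, domain name to host list) and the function names are hypothetical; hosts that the user has not assigned stay in the default domain, which also covers the case where no domain was created at all.

```python
from collections import defaultdict
from typing import Dict, List, Mapping, Sequence, Tuple

DEFAULT_DOMAIN = "default"


def host_domains(hosts: Sequence[str],
                 user_domains: Mapping[str, Sequence[str]]) -> Dict[str, str]:
    """Map every host to its user-created domain, or to the default domain."""
    assigned = {host: domain
                for domain, members in user_domains.items()
                for host in members}
    return {host: assigned.get(host, DEFAULT_DOMAIN) for host in hosts}


def disaster_buckets(
    osd_host: Mapping[int, str],
    osd_tier: Mapping[int, str],
    user_domains: Mapping[str, Sequence[str]],
) -> Dict[Tuple[str, str], List[int]]:
    """Split the performance tiers further by disaster tolerance domain."""
    domains = host_domains(sorted(set(osd_host.values())), user_domains)
    buckets: Dict[Tuple[str, str], List[int]] = defaultdict(list)
    for osd in sorted(osd_host):
        buckets[(domains[osd_host[osd]], osd_tier[osd])].append(osd)
    return dict(buckets)
```

Each key of the result is a (domain, tier) pair, which corresponds to one disaster tolerance bucket layered on top of a performance bucket.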
The topology information sending module 670 sends the topology information to the cluster of the distributed file system when the judgment result is no.
In this embodiment, abnormality detection and analysis are performed on the generated CRUSH MAP: it is checked whether the map would cause a CRUSH RULE in use to be deleted, whether it would cause PGs (placement groups) to be degraded, and so on. If the abnormality detection result is abnormal, the map cannot be delivered to the CEPH cluster; all previous operations are rolled back, nothing is delivered to the CEPH cluster, an error message is provided, and the operation terminates. A CRUSH MAP that passes abnormality detection is delivered to the CEPH cluster. If delivery fails, delivery is retried; if delivery succeeds, storage resources based on different hard disk performance and with disaster tolerance domains can be provided to the user.
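The check-then-deliver flow above can be sketched as follows. The names are hypothetical, the predicted number of degraded PGs is assumed to be supplied by the caller, and the retry count is bounded here purely as a design choice (the text only says that delivery is repeated on failure). The `push` callable stands in for whatever actually installs the map in the cluster.

```python
from typing import Callable, List, Set


class CrushMapRejected(Exception):
    """Raised when the generated map fails abnormality detection."""


def find_anomalies(new_rules: Set[str], rules_in_use: Set[str],
                   predicted_degraded_pgs: int) -> List[str]:
    """Return human-readable problems; an empty list means the map is safe."""
    problems = [f"rule in use would be deleted: {name}"
                for name in sorted(rules_in_use - new_rules)]
    if predicted_degraded_pgs > 0:
        problems.append(f"{predicted_degraded_pgs} placement groups would be degraded")
    return problems


def deliver(new_rules: Set[str], rules_in_use: Set[str],
            predicted_degraded_pgs: int, push: Callable[[], bool],
            max_attempts: int = 3) -> int:
    """Deliver the map if it is safe; return the attempt number that succeeded."""
    problems = find_anomalies(new_rules, rules_in_use, predicted_degraded_pgs)
    if problems:
        # Nothing has been sent yet, so aborting here leaves the cluster untouched.
        raise CrushMapRejected("; ".join(problems))
    for attempt in range(1, max_attempts + 1):
        if push():
            return attempt
    raise RuntimeError(f"delivery failed after {max_attempts} attempts")
```

Running the detection before any delivery attempt means that a rejected map never reaches the cluster, which matches the requirement that nothing is delivered when an abnormality is found.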
According to the technical solution of this embodiment, the cumbersome configuration of CEPH is avoided and the steps of operation and maintenance are greatly simplified. It is a fully automatic, intelligent, safe, and visualized CRUSH disaster tolerance configuration technique based on hard disk performance, and it has been widely used in production environments.
One embodiment of the present invention further provides a computer-readable storage medium. The computer-readable storage medium stores one or more programs, and the one or more programs can be executed by one or more processors to realize the following steps:
Obtaining the performance of the multiple hard disks where the object storage device nodes in the distributed file system reside.
In this embodiment, a scheme that provides storage resources of different storage performance and data safety levels is realized based on the performance of the hard disks and on disaster tolerance domains created by the user. First, the technical solution of this embodiment can be realized on a WEB (page) management server. After the CEPH cluster is taken over by the WEB management server, the WEB management system runs a client program on each CEPH node. The client program queries the performance of the hard disk deployed for each CEPH OSD (Object Storage Device) node and reports this data to the WEB server.
Dividing the multiple hard disks into multiple capability containers according to the performance of the multiple hard disks.
In this embodiment, after the WEB server has collected the hard disk performance of the OSDs on all nodes of the cluster, it creates different CRUSH BUCKETs (containers) according to the performance distribution of the hard disks. For example, all HDD hard disks are placed in one economy CRUSH BUCKET, and all SSD hard disks are placed in one high-performance CRUSH BUCKET.
Dividing, according to the preset disaster tolerance domains, the multiple capability containers into one or more disaster tolerance containers and generating the corresponding disaster tolerance rules.
In this embodiment, the preset disaster tolerance domains include, but are not limited to, disaster tolerance domains divided by the user. All hosts that initially join the cluster are placed in the same default disaster tolerance domain. The user then manually creates different disaster tolerance domains through the interface and places hosts that belong to the same security group into the same disaster tolerance domain. Based on the disaster tolerance domains created by the user, CRUSH BUCKETs of different performance are further created for each disaster tolerance domain on top of the previously created performance-based CRUSH BUCKETs, and the corresponding CRUSH RULEs (rules) are generated.
Generating topology information according to the one or more disaster tolerance containers and the corresponding disaster tolerance rules.
Sending the topology information to the cluster of the distributed file system.
In this embodiment, the CRUSH MAP (topology) text described above is delivered to the CEPH cluster by a preset CRUSH TOOL (tool). The user thus has storage resources of different performance and data safety levels available for use.
In the technical solution of this embodiment, first, the true storage performance of each OSD can be obtained by analyzing the actual performance of the hard disk where each OSD resides. Second, by dividing the OSDs according to hard disk performance, storage resources of different performance can be offered to the user, so that storage resources are not wasted and the advantages of CEPH can be fully exploited. Third, the disaster tolerance domains are created by the user, which ensures that OSDs of different performance are distributed across different disaster tolerance domains. Therefore, according to the technical solution of this embodiment, when a disaster tolerance domain is damaged by a power failure or for other reasons, no data is lost, which ensures data safety.
It should be noted that, in this document, the terms "include", "comprise", or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.
The serial numbers of the above embodiments of the present invention are for description only and do not represent the relative merits of the embodiments.
Through the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be realized by means of software plus a necessary general hardware platform, and of course also by hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions for causing a terminal (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to execute the methods described in the embodiments of the present invention.
The embodiments of the present invention have been described above with reference to the accompanying drawings, but the present invention is not limited to the above specific embodiments. The above specific embodiments are merely illustrative rather than restrictive. Under the inspiration of the present invention, those of ordinary skill in the art can make many further forms without departing from the purpose of the present invention and the scope protected by the claims, all of which fall within the protection of the present invention.

Claims (10)

1. A distributed file system disaster tolerance configuration method, characterized by comprising the following steps:
obtaining the performance of multiple hard disks where object storage device nodes in a distributed file system reside;
dividing the multiple hard disks into multiple capability containers according to the performance of the multiple hard disks;
dividing, according to preset disaster tolerance domains, the multiple capability containers into one or more disaster tolerance containers and generating corresponding disaster tolerance rules;
generating topology information according to the one or more disaster tolerance containers and the corresponding disaster tolerance rules;
sending the topology information to the cluster of the distributed file system.
2. The distributed file system disaster tolerance configuration method according to claim 1, characterized in that dividing the multiple hard disks into multiple capability containers according to the performance of the multiple hard disks specifically comprises:
determining the performance of the multiple hard disks according to whether each of the multiple hard disks is a mechanical hard disk or a solid state hard disk.
3. The distributed file system disaster tolerance configuration method according to claim 2, characterized in that dividing the multiple hard disks into multiple capability containers according to the performance of the multiple hard disks further comprises:
obtaining, for a mechanical hard disk among the multiple hard disks, the rotational speed of the mechanical hard disk;
determining the performance of the mechanical hard disk according to the rotational speed of the mechanical hard disk.
4. The distributed file system disaster tolerance configuration method according to claim 1, characterized in that, before sending the topology information to the cluster of the distributed file system, the method further comprises:
judging whether the topology information would cause a disaster tolerance rule currently in use to be deleted, and, when the judgment result is no, sending the topology information to the cluster of the distributed file system.
5. The distributed file system disaster tolerance configuration method according to claim 1, characterized in that, before obtaining the performance of the multiple hard disks where the object storage device nodes in the distributed file system reside, the method further comprises:
judging whether the hard disk where each object storage device node in the distributed file system resides has been updated, and, when the judgment result is yes, obtaining the performance of the multiple hard disks where the object storage device nodes in the distributed file system reside.
6. A distributed file system disaster tolerance configuration device, characterized by comprising:
a hard disk performance obtaining module, configured to obtain the performance of multiple hard disks where object storage device nodes in a distributed file system reside;
a capability container division module, configured to divide the multiple hard disks into multiple capability containers according to the performance of the multiple hard disks;
a disaster tolerance container division module, configured to divide, according to preset disaster tolerance domains, the multiple capability containers into one or more disaster tolerance containers and to generate corresponding disaster tolerance rules;
a topology information generation module, configured to generate topology information according to the one or more disaster tolerance containers and the corresponding disaster tolerance rules;
a topology information sending module, configured to send the topology information to the cluster of the distributed file system.
7. The distributed file system disaster tolerance configuration device according to claim 6, characterized in that the hard disk performance obtaining module determines the performance of the multiple hard disks according to whether each of the multiple hard disks is a mechanical hard disk or a solid state hard disk.
8. The distributed file system disaster tolerance configuration device according to claim 7, characterized in that the hard disk performance obtaining module obtains, for a mechanical hard disk among the multiple hard disks, the rotational speed of the mechanical hard disk, and determines the performance of the mechanical hard disk according to the rotational speed of the mechanical hard disk.
9. The distributed file system disaster tolerance configuration device according to claim 6, characterized by further comprising:
an abnormality detection module, configured to judge whether the topology information would cause a disaster tolerance rule currently in use to be deleted, wherein, when the judgment result is no, the topology information sending module sends the topology information to the cluster of the distributed file system.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores one or more programs, and the one or more programs can be executed by one or more processors to realize the steps of the distributed file system disaster tolerance configuration method according to any one of claims 1 to 5.
CN201710910147.8A 2017-09-29 2017-09-29 Distributed file system disaster tolerance configuration method, device and readable storage medium storing program for executing Pending CN109582509A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710910147.8A CN109582509A (en) 2017-09-29 2017-09-29 Distributed file system disaster tolerance configuration method, device and readable storage medium storing program for executing

Publications (1)

Publication Number Publication Date
CN109582509A true CN109582509A (en) 2019-04-05

Family

ID=65919137

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710910147.8A Pending CN109582509A (en) 2017-09-29 2017-09-29 Distributed file system disaster tolerance configuration method, device and readable storage medium storing program for executing

Country Status (1)

Country Link
CN (1) CN109582509A (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101635638A (en) * 2008-07-25 2010-01-27 中兴通讯股份有限公司 Disaster tolerance system and disaster tolerance method thereof
US7770057B1 (en) * 2005-10-27 2010-08-03 Symantec Operating Corporation System and method for customized disaster recovery reports
CN102521389A (en) * 2011-12-23 2012-06-27 天津神舟通用数据技术有限公司 Postgresql database cluster system mixedly using solid state drives and hard disk drive and optimizing method thereof
CN103023968A (en) * 2012-11-15 2013-04-03 中科院成都信息技术有限公司 Network distributed storage and reading method for file
CN103500147A (en) * 2013-09-27 2014-01-08 浪潮电子信息产业股份有限公司 Embedded and layered storage method of PB-class cluster storage system
CN103929454A (en) * 2013-01-15 2014-07-16 中国移动通信集团四川有限公司 Load balancing storage method and system in cloud computing platform
CN104579765A (en) * 2014-12-27 2015-04-29 北京奇虎科技有限公司 Disaster tolerance method and device for cluster system
CN105095486A (en) * 2015-08-17 2015-11-25 浪潮(北京)电子信息产业有限公司 Cluster database disaster recovery method and device
US20160210204A1 (en) * 2013-12-17 2016-07-21 Hitachi Data Systems Corporation Distributed disaster recovery file sync server system
CN106534308A (en) * 2016-11-14 2017-03-22 中国银联股份有限公司 Method and device for solving data block access hotspot problem in distributed storage system
CN107145406A (en) * 2017-05-14 2017-09-08 四川盛世天成信息技术有限公司 A kind of disaster-tolerant backup method and system based on Clustering

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110222014A (en) * 2019-06-11 2019-09-10 苏州浪潮智能科技有限公司 Distributed file system crush map maintaining method and associated component
CN112349335A (en) * 2019-08-08 2021-02-09 佛山市顺德区顺达电脑厂有限公司 Method for locating hard disk physical installation position of cluster storage system
CN111090629A (en) * 2019-12-24 2020-05-01 上海达梦数据库有限公司 Data file storage method, device, equipment and storage medium
CN111090629B (en) * 2019-12-24 2024-02-06 上海达梦数据库有限公司 Data file storage method, device, equipment and storage medium
CN111026337A (en) * 2019-12-30 2020-04-17 中科星图股份有限公司 Distributed storage method based on machine learning and ceph thought

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20190405