CN109495543A - Method and device for managing monitors in a Ceph cluster - Google Patents
- Publication number: CN109495543A
- Application number: CN201811204207.5A
- Authority: CN (China)
- Prior art keywords: monitor, metric value, stability metric, backup, given threshold
- Legal status: Granted (as listed by Google Patents; the listed status is an assumption and is not a legal conclusion)
Classifications
- H04L67/54: Network services; presence management, e.g. monitoring or registration for receipt of user log-on information, or the connection status of the users
- H04L41/0893: Configuration management of networks or network elements; assignment of logical groups to network elements
- H04L43/16: Arrangements for monitoring or testing data switching networks; threshold monitoring
- H04L67/10: Protocols in which an application is distributed across nodes in the network
- H04L67/1097: Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Debugging And Monitoring (AREA)
- Hardware Redundancy (AREA)
Abstract
This application discloses a method for managing monitors in a Ceph cluster. The monitors of the Ceph cluster include several primary monitors and several backup monitors, and a corresponding stability metric value is maintained for each monitor. The method includes: when the network state of any monitor is detected to change from UP to DOWN, adding one increment to the stability metric value of that monitor; if that monitor is a primary monitor, judging whether its stability metric value is greater than or equal to a first given threshold; if the result is yes, selecting from the several backup monitors one backup monitor whose stability metric value is less than or equal to a second given threshold to serve as a primary monitor, and making that monitor a backup monitor. With this method, the role of each monitor is adjusted according to its stability metric value and more stable monitors are selected as primary monitors, which improves the stability of the Ceph cluster.
Description
Technical field
This application relates to the technical field of data storage, and in particular to a method and device for managing monitors in a Ceph cluster.
Background art
Ceph is a unified, distributed file system designed for excellent performance, reliability and scalability. In a Ceph cluster, several monitors are jointly responsible for managing, maintaining and publishing the state information of the cluster. One leader is elected from among these monitors, and the other monitors act as ordinary participating members (peons). Under the direction of the leader, they generate the latest version of the cluster map and send it to all object storage devices (Object-based Storage Device, OSD) and clients in the Ceph cluster. OSDs use the cluster map to maintain data, and clients use the cluster map to address data. In general, a monitor can be deployed on a physical host by itself, or a monitor and a storage node can be deployed on the same physical host.
When a leader election is held, the monitors first jointly form a committee (quorum), and the committee members then elect a leader among themselves. Each monitor, as a member of the quorum, maintains the health status of the whole Ceph cluster and the important information in it. Monitors therefore play a key role in the Ceph cluster, and their health status directly affects the stability of the whole cluster.
During a leader election, Ceph cannot provide service externally until a leader has been elected and the master version of the cluster map has been formed under the direction of the leader. If unstable factors are present, such as a monitor in the quorum restarting or flapping and delay in the network, leader elections are initiated in the quorum repeatedly. The whole monitor cluster then stays in the election state, which wastes resources, harms the stability of the Ceph cluster, and prevents it from providing service externally.
Summary of the invention
This application provides a method and device for managing monitors in a Ceph cluster, to solve the problem in the related art that frequent monitor anomalies cause leader elections to be initiated repeatedly in the quorum, which makes the Ceph cluster unstable and unable to provide service externally.
To achieve the above object, the embodiments of this application adopt the following technical solutions:
In a first aspect, an embodiment of this application provides a method for managing monitors in a Ceph cluster. The monitors of the Ceph cluster include several primary monitors and several backup monitors, and a corresponding stability metric value is maintained for each monitor. The method includes:
When the network state of any monitor is detected to change from UP to DOWN, adding one increment to the stability metric value of that monitor;
If that monitor is a primary monitor, judging whether the stability metric value of that monitor is greater than or equal to a first given threshold;
If the result is yes, selecting from the several backup monitors one backup monitor whose stability metric value is less than or equal to a second given threshold to serve as a primary monitor, and making that monitor a backup monitor, wherein the first given threshold is greater than the second given threshold.
Optionally, after one increment is added to the stability metric value of the monitor, the method further includes:
Starting the decay timer corresponding to the monitor, and decaying the stability metric value of the monitor according to a preset decay function within the current decay period.
Optionally, the step of decaying the stability metric value of the monitor according to the preset decay function within the current decay period includes:
Within the current decay period, decaying the stability metric value of the monitor with a specified decay coefficient;
Wherein the specified decay coefficient satisfies: the stability metric value of the monitor at the end of the current decay period is half of its stability metric value at the start of the current decay period.
Optionally, the method further includes:
While the monitor is in the current decay period, if its network state is detected to change from UP to DOWN, adding one increment to its stability metric value and restarting the decay timer, so that the monitor enters the next decay period;
When the current decay period of the monitor ends, restarting the decay timer, so that the monitor enters the next decay period.
Optionally, the method further includes:
If the stability metric value of the monitor after decay is less than or equal to a third given threshold, setting the stability metric value of the monitor to the initial value, wherein the third given threshold is less than the one increment.
Optionally, the step of selecting from the several backup monitors one backup monitor whose stability metric value is less than or equal to the second given threshold to serve as a primary monitor includes:
Determining the stability metric value of each backup monitor, and judging whether there is a backup monitor whose stability metric value is less than or equal to the second given threshold;
If it is determined that M backup monitors have stability metric values less than or equal to the second given threshold, selecting from those M backup monitors the one with the smallest stability metric value as the primary monitor;
If it is determined that N of the M backup monitors share the minimum stability metric value, randomly selecting one of those N backup monitors as the primary monitor.
In a second aspect, an embodiment of this application provides a device for managing monitors in a Ceph cluster. The monitors of the Ceph cluster include several primary monitors and several backup monitors, and a corresponding stability metric value is maintained for each monitor. The device includes:
A monitoring unit, configured to add one increment to the stability metric value of any monitor when the network state of that monitor is detected to change from UP to DOWN;
A judging unit, configured to judge, when that monitor is determined to be a primary monitor, whether the stability metric value of that monitor is greater than or equal to a first given threshold;
A selecting unit, configured to, when the result is yes, select from the several backup monitors one backup monitor whose stability metric value is less than or equal to a second given threshold to serve as a primary monitor, and make that monitor a backup monitor, wherein the first given threshold is greater than the second given threshold.
Optionally, the device further includes a metric adjustment unit. After one increment is added to the stability metric value of the monitor, the metric adjustment unit is configured to:
Start the decay timer corresponding to the monitor, and decay the stability metric value of the monitor according to a preset decay function within the current decay period.
Optionally, when decaying the stability metric value of the monitor according to the preset decay function within the current decay period, the metric adjustment unit is configured to:
Within the current decay period, decay the stability metric value of the monitor with a specified decay coefficient;
Wherein the specified decay coefficient satisfies: the stability metric value of the monitor at the end of the current decay period is half of its stability metric value at the start of the current decay period.
Optionally, while the monitor is in the current decay period, if the monitoring unit detects that the network state of the monitor changes from UP to DOWN, the metric adjustment unit adds one increment to the stability metric value of the monitor and restarts the decay timer, so that the monitor enters the next decay period;
When the decay period of the monitor ends, the metric adjustment unit restarts the decay timer, so that the monitor enters the next decay period.
Optionally, the metric adjustment unit is further configured to:
If the stability metric value of the monitor after decay is less than or equal to a third given threshold, set the stability metric value of the monitor to the initial value, wherein the third given threshold is less than the one increment.
Optionally, when selecting from the several backup monitors one backup monitor whose stability metric value is less than or equal to the second given threshold to serve as a primary monitor, the selecting unit is configured to:
Determine the stability metric value of each backup monitor, and judge whether there is a backup monitor whose stability metric value is less than or equal to the second given threshold;
If it is determined that M backup monitors have stability metric values less than or equal to the second given threshold, select from those M backup monitors the one with the smallest stability metric value as the primary monitor;
If it is determined that N of the M backup monitors share the minimum stability metric value, randomly select one of those N backup monitors as the primary monitor.
In a third aspect, an embodiment of this application further provides a computing device, which includes:
A memory, for storing program instructions;
A processor, for calling the program instructions stored in the memory and, according to the obtained program instructions, executing the steps of the method of any one of the first aspect.
In a fourth aspect, an embodiment of this application further provides a computer-readable storage medium. The computer-readable storage medium stores computer-executable instructions, and the computer-executable instructions are used to make a computer execute the steps of the method of any one of the first aspect.
The beneficial effects achieved by this application are as follows:
In summary, in the embodiments of this application, when the network state of any monitor is detected to change from UP to DOWN, one increment is added to the stability metric value of that monitor; it is judged whether there is a primary monitor whose stability metric value is greater than or equal to the first given threshold; if the result is yes, one backup monitor whose stability metric value is less than or equal to the second given threshold is selected from the several backup monitors to serve as a primary monitor, and that primary monitor is made a backup monitor, wherein the first given threshold is greater than the second given threshold.
With the monitor management method provided by the embodiments of this application, the stability metric value of each monitor in the Ceph cluster is adjusted dynamically according to the network state of each monitor, so that the health status of each monitor is tracked, and the role of each monitor is adjusted dynamically and automatically according to its health status. More stable monitors are selected as primary monitors, which reduces the probability that a primary monitor fails, improves the stability of the quorum, and thereby improves the stability of the Ceph cluster.
Description of the drawings
Fig. 1 is a schematic diagram of the role configuration of each host in a Ceph cluster provided in the embodiments of this application;
Fig. 2 is a detailed flowchart of a method for managing monitors in a Ceph cluster provided in the embodiments of this application;
Fig. 3 is a schematic diagram of the change of the stability metric value of a monitor provided in the embodiments of this application;
Fig. 4 is a structural schematic diagram of a device for managing monitors in a Ceph cluster provided in the embodiments of this application;
Fig. 5 is a structural schematic diagram of a computing device provided in the embodiments of this application.
The realization of the purpose, the functional characteristics and the advantages of this application are further described below with reference to the accompanying drawings and the embodiments.
Specific embodiment
First, the term "and/or" in the embodiments of this application merely describes an association between associated objects and indicates that three relationships may exist. For example, "A and/or B" can mean: A exists alone, A and B both exist, or B exists alone. In addition, the character "/" herein generally indicates an "or" relationship between the objects before and after it.
When this application uses ordinal numbers such as "first", "second", "third" or "fourth", they should be understood as serving only to distinguish items, unless the context shows that they really express an order.
The technical solutions in the embodiments of this application are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of this application, not all of them. All other embodiments obtained by those of ordinary skill in the art based on the embodiments in this application without creative effort fall within the scope of protection of this application.
In the embodiments of this application, several primary monitors and backup monitors are predefined in the Ceph cluster, and a corresponding stability metric value is maintained for each primary monitor and each backup monitor.
A primary monitor is a member of the committee (quorum) and is used to maintain the health status of the whole Ceph cluster and the important information in it; a backup monitor is a candidate monitor. The stability metric value of a monitor characterizes how frequently that monitor fails within a period of time, and can therefore be used to characterize the stability of the monitor.
As an example, Fig. 1 is a schematic diagram of the role configuration of each host in a Ceph cluster in the embodiments of this application. The Ceph cluster includes at least several primary monitors (e.g., host 10, host 11, ..., host n) and several backup monitors (e.g., host 20, host 21, ..., host m), and may further include other hosts (e.g., host 30, host 31, ..., host p). Primary monitors and backup monitors can be used both as monitoring nodes and as storage nodes, whereas the other hosts can be used only as storage nodes.
The embodiments of this application propose a method for managing monitors in a Ceph cluster, which dynamically adjusts which monitors act as primary monitors, that is, as members of the quorum. More stable monitors are selected as primary monitors, which reduces the probability that a primary monitor fails, improves the stability of the quorum, and thereby improves the stability of the Ceph cluster.
As an example, referring to Fig. 2, the detailed flow of the method for managing monitors in a Ceph cluster provided in the embodiments of this application is as follows:
Step 200: When the network state of any monitor is detected to change from UP to DOWN, add one increment to the stability metric value of that monitor.
In the embodiments of this application, a corresponding stability metric value is maintained separately for each monitor in the Ceph cluster (including primary monitors and backup monitors), and the initial value of each stability metric value is 0. When the monitor management method provided by this application is executed, the network state of each monitor is monitored. If the network state of a monitor is detected to change from UP to DOWN, the monitor is determined to have gone offline, and its stability metric value needs to be adjusted; specifically, one increment k (e.g., k=100) is added to it. In other words, each time the network of a monitor goes DOWN, the stability metric value of that monitor increases by k.
In practice, when the network state of a monitor changes from DOWN to UP, its stability metric value is not adjusted.
For example, assume the increment k is 100 and the stability metric value of primary monitor 1 is 130 when its network state is detected to change from UP to DOWN. Then 100 is added to the original stability metric value, and the stability metric value of primary monitor 1 becomes 230.
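Step 200 can be sketched as follows. This is a minimal illustration rather than the implementation described in the application: the names `MonitorState`, `on_network_state` and `INCREMENT_K` are hypothetical, and the value 100 is simply the example value of k used above.

```python
from dataclasses import dataclass

INCREMENT_K = 100.0  # example value of the increment k; hypothetical constant name


@dataclass
class MonitorState:
    """Per-monitor bookkeeping (hypothetical structure, not from the source)."""
    name: str
    is_primary: bool
    network_up: bool = True
    metric: float = 0.0  # stability metric value; the initial value is 0


def on_network_state(mon: MonitorState, now_up: bool) -> bool:
    """Record a newly observed network state.

    Returns True only when an UP -> DOWN transition was seen, in which case
    one increment has been added to the stability metric value.
    A DOWN -> UP transition leaves the metric unchanged.
    """
    went_down = mon.network_up and not now_up
    if went_down:
        mon.metric += INCREMENT_K
    mon.network_up = now_up
    return went_down
```

With the numbers from the example above, a primary monitor whose metric is 130 ends up at 230 after one UP to DOWN transition, and a later DOWN to UP transition does not change the value.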
In the embodiments of this application, in an optional implementation, after one increment is added to the stability metric value of the monitor, the decay timer corresponding to the monitor is started, and the stability metric value of the monitor is decayed according to a preset decay function within the current decay period.
That is, in the embodiments of this application a corresponding decay timer is configured for each monitor, and the decay period is t. When the stability metric value of a monitor is not the initial value, the decay timer corresponding to that monitor is started and the monitor is in a decay period (the current decay period). Within that period, the stability metric value of the monitor decreases linearly, and the decrease must satisfy the following requirement: at the end of the period, the stability metric value of the monitor has dropped to a designated value.
Specifically, in an optional implementation, the step of decaying the stability metric value of the monitor according to the preset decay function within the current decay period includes: within the current decay period, decaying the stability metric value of the monitor with a specified decay coefficient, wherein the specified decay coefficient satisfies: the stability metric value of the monitor at the end of the current decay period is half of its stability metric value at the start of the current decay period.
For example, assume the decay period is t and the stability metric value of primary monitor 1 is 180 when the current decay period starts. To ensure that the stability metric value has dropped to 90 by the end of the current decay period, the value 180 must be decayed with the specified decay coefficient (e.g., -90/t) within the current decay period. Clearly, when the decay period is fixed, the size of the decay coefficient depends on the current stability metric value of the monitor.
Further, while the monitor is in the current decay period, if its network state changes from UP to DOWN, one increment is added to its stability metric value and the decay timer is restarted, so that the monitor enters the next decay period.
That is, while a monitor is in a decay period, if its network state is again detected to change from UP to DOWN (the monitor has gone offline again), one increment is added to its stability metric value and, at the same time, the decay timer is restarted so that the monitor enters a new decay period.
For example, assume the increment k is 100 and that, in the current decay period, the stability metric value of primary monitor 1 is to be reduced from 230 to 115 with decay coefficient K1. If the network state of primary monitor 1 is again detected to change from UP to DOWN when its stability metric value has dropped to 150 (not yet 115), 100 is added to the stability metric value, raising it to 250, and the decay timer corresponding to primary monitor 1 is restarted. The current decay period ends and the next decay period begins, in which the stability metric value of primary monitor 1 is to be reduced from 250 to 125 with decay coefficient K2.
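The interaction between a repeated DOWN event and the decay timer can be sketched as below. This is a simplified model under stated assumptions: time is passed in explicitly instead of using a real timer, the class name `DecayingMetric` and its methods are hypothetical, and the reset to the initial value at the third given threshold is deliberately left out.

```python
class DecayingMetric:
    """Stability metric with a restartable decay timer (illustrative only)."""

    def __init__(self, period: float, increment: float = 100.0) -> None:
        if period <= 0:
            raise ValueError("period must be positive")
        self.period = period
        self.increment = increment
        self.start_value = 0.0   # value when the current decay period began
        self.start_time = None   # None means the timer has not been started

    def value(self, now: float) -> float:
        """Metric at time `now`, falling linearly to half over each period."""
        if self.start_time is None:
            return self.start_value
        value, start = self.start_value, self.start_time
        # A period that ran to completion restarts the timer at half the value.
        while now - start >= self.period:
            value /= 2.0
            start += self.period
        fraction = (now - start) / self.period
        return value * (1.0 - fraction / 2.0)

    def on_down(self, now: float) -> None:
        """UP -> DOWN: add one increment and restart the decay timer."""
        self.start_value = self.value(now) + self.increment
        self.start_time = now
```

In this sketch a DOWN event in the middle of a period takes the partially decayed value, adds the increment, and uses the sum as the starting value of a fresh period, which mirrors the 150 to 250 to 125 sequence in the example above.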
When the current decay period of the monitor ends, the decay timer is restarted, so that the monitor enters the next decay period.
For example, assume decay timer 1, which corresponds to primary monitor 1, times out. After the stability metric value of primary monitor 1 has dropped from 400 to 200, decay timer 1 is restarted immediately, so that primary monitor 1 enters the next decay period.
Step 210: If the monitor is a primary monitor, judge whether the stability metric value of the monitor is greater than or equal to the first given threshold.
Specifically, in the embodiments of this application, the stability metric value of each monitor can be maintained dynamically by monitoring the network state of each monitor. After one increment is added to the stability metric value of the monitor, if the monitor is a primary monitor, it is necessary to judge whether its stability metric value is greater than or equal to the first given threshold. If so, step 220 is executed; otherwise, step 200 is executed.
In practice, if the stability metric value of a monitor is greater than or equal to the first given threshold, the health status of that monitor is determined to be unstable, and it is no longer suitable to act as a primary monitor and be a member of the committee. If the monitor is a primary monitor, a stable backup monitor that is suitable to act as a primary monitor must be selected from the backup monitors to replace it and join the committee; if the monitor is a backup monitor, its role remains unchanged.
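The branch taken in step 210 can be written as a single predicate. The sketch below is only an illustration: the function name `next_step` is hypothetical, and the threshold of 300 used in the usage note is an assumed value, since the application leaves the first given threshold to be chosen per deployment.

```python
def next_step(is_primary: bool, metric: float, first_threshold: float) -> int:
    """Return the step number to execute after the metric was incremented.

    220: the monitor is a primary monitor whose stability metric value is
         greater than or equal to the first given threshold, so it must be
         replaced.
    200: keep monitoring; a backup monitor keeps its role regardless of
         its stability metric value.
    """
    if is_primary and metric >= first_threshold:
        return 220
    return 200
```

With an assumed first given threshold of 300, a primary monitor at 300 proceeds to step 220, a primary monitor at 299 returns to step 200, and a backup monitor at 500 also returns to step 200.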
Of course, the first given threshold can be set according to different application scenarios and/or different user requirements, which is not specifically limited in the embodiments of this application.
Step 220: If the result is yes, select from the several backup monitors one backup monitor whose stability metric value is less than or equal to the second given threshold to serve as a primary monitor, and make the monitor a backup monitor, wherein the first given threshold is greater than the second given threshold.
Specifically, if the stability metric value of a primary monitor is determined to be greater than or equal to the first given threshold, the stability metric value of each backup monitor is determined, and it is judged whether there is a backup monitor whose stability metric value is less than or equal to the second given threshold. If it is determined that M backup monitors have stability metric values less than or equal to the second given threshold, the backup monitor with the smallest stability metric value is selected from those M backup monitors as the primary monitor.
For example, assume the second given threshold is 150, the stability metric value of backup monitor 1 is 130, the stability metric value of backup monitor 2 is 120, and the stability metric value of backup monitor 3 is 0. The stability metric values of backup monitor 1, backup monitor 2 and backup monitor 3 are all less than the second given threshold. Since the stability metric value of backup monitor 3 is the smallest, backup monitor 3 can be selected as the primary monitor.
Further, if it is determined that N of the M backup monitors share the minimum stability metric value, one backup monitor is randomly selected from those N backup monitors as the primary monitor.
For example, assume the second given threshold is 150, the stability metric value of backup monitor 1 is 130, the stability metric value of backup monitor 2 is 120, and the stability metric values of backup monitor 3, backup monitor 4 and backup monitor 5 are all 0. Then any one of backup monitor 3, backup monitor 4 and backup monitor 5 can be selected as the primary monitor.
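The selection rule of step 220 can be sketched as follows. This is an illustration under stated assumptions: the function name `select_new_primary`, the dictionary representation of the backup monitors and the optional `rng` parameter are hypothetical, and returning `None` when no backup monitor qualifies is an assumption, because the text above does not say what happens in that case.

```python
import random
from typing import Mapping, Optional


def select_new_primary(
    backup_metrics: Mapping[str, float],
    second_threshold: float,
    rng: Optional[random.Random] = None,
) -> Optional[str]:
    """Pick the backup monitor that should become a primary monitor.

    backup_metrics maps a backup monitor name to its stability metric value.
    """
    # The M candidates: backups at or below the second given threshold.
    eligible = {name: value for name, value in backup_metrics.items()
                if value <= second_threshold}
    if not eligible:
        return None  # assumption: the source does not cover this case
    lowest = min(eligible.values())
    # The N candidates that share the minimum value; choose one at random.
    tied = sorted(name for name, value in eligible.items() if value == lowest)
    chooser = rng if rng is not None else random
    return chooser.choice(tied)
```

Passing a seeded `random.Random` instance makes the tie-break reproducible in tests, while omitting it falls back to the module-level generator.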
Of course, the second given threshold can be set according to different application scenarios and/or different user requirements, which is not specifically limited in the embodiments of this application.
Further, if the stability metric value of the monitor after decay is less than or equal to the third given threshold, the stability metric value of the monitor is set to the initial value (e.g., 0), wherein the third given threshold is less than the one increment.
For example, assume the initial stability metric value of primary monitor 1 is 0, the increment is 100, and the third given threshold is 40. When the stability metric value of primary monitor 1 drops to 40, the stability metric value of primary monitor 1 is set to 0.
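The reset rule can be sketched in a few lines. The function names `check_third_threshold` and `reset_if_settled` are hypothetical. The comment on why the third given threshold must stay below the increment is an interpretation rather than something stated in the text: with a threshold at or above the increment, a monitor that has just gone DOWN once from the initial value could be reset straight away.

```python
def check_third_threshold(third_threshold: float, increment: float) -> None:
    """Enforce the stated constraint: third given threshold < one increment."""
    if not third_threshold < increment:
        raise ValueError("third given threshold must be less than the increment")


def reset_if_settled(metric: float, third_threshold: float,
                     initial_value: float = 0.0) -> float:
    """Return the initial value once the decayed metric is at or below the
    third given threshold; otherwise return the metric unchanged."""
    return initial_value if metric <= third_threshold else metric
```

With the example values (increment 100, third given threshold 40), a decayed value of 40 is reset to 0, while 41 is left as it is.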
Of course, the third given threshold can be set according to different application scenarios and/or different user requirements, which is not specifically limited in the embodiments of this application.
It should be understood that the specific embodiments described here are only used to explain this application and are not intended to limit it.
Below with reference to shown in Fig. 3, to the stability metric value of a monitor (e.g., host 10) in the embodiment of the present application
Change procedure elaborates.
Specifically, assume that host 10 is a primary monitor in the initial stage and that the preset initial stability metric value of host 10 is 0.
At time T1, the network state of host 10 is detected changing from UP to DOWN. The stability metric value of host 10 is increased by k, the decay timer corresponding to host 10 is started, and the first decay period begins, where each decay period has duration t. During the decay, at time T2, the network state of host 10 is again detected changing from UP to DOWN; the stability metric value of host 10 is increased by k, the decay timer is restarted, and the second decay period begins. The same sequence repeats at time T3, which starts the third decay period, and at time T4, which starts the fourth decay period.
At this point the stability metric value of host 10 (X) exceeds the first given threshold. A backup monitor whose stability metric value is less than or equal to the second given threshold (e.g., host 20) must be selected from the backup monitors to join the committee in place of host 10, and host 10 is no longer a member of the committee.
At time T5, when the fourth decay period ends, the stability metric value of host 10 has fallen from X to X/2, and the decay timer is restarted to begin the fifth decay period. At time T6, when the fifth decay period ends, the value has fallen from X/2 to X/4, and the decay timer is restarted to begin the sixth decay period. At time T7, when the sixth decay period ends, the value has fallen from X/4 to X/8.
What happens next depends on the third given threshold. If the third given threshold equals X/8, the stability metric value of host 10 is set to 0 directly. If the third given threshold is greater than X/8 and less than X/4, the stability metric value of host 10 is set to 0 during the sixth decay period, at the moment it decays from X/4 to the third given threshold. If the third given threshold is less than X/8, the decay timer is restarted and the seventh decay period begins; decay continues until the stability metric value of host 10 reaches the third given threshold, at which point it is set to 0.
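The three cases at the end of the Fig. 3 walkthrough can be checked numerically with a short sketch. It assumes a continuous exponential decay that halves the value once per decay period, which is only one function consistent with the description; the function names and the values X = 400 and t = 10 are hypothetical.

```python
import math


def metric_after(x: float, elapsed: float, period: float) -> float:
    """Metric value `elapsed` time units after the last flap, halving once per `period`."""
    return x * 2.0 ** (-elapsed / period)


def time_to_reset(x: float, third_threshold: float, period: float) -> float:
    """Time after the last flap until the metric decays to the third threshold."""
    if third_threshold <= 0:
        raise ValueError("third_threshold must be positive")
    if x <= third_threshold:
        return 0.0
    return period * math.log2(x / third_threshold)


X, T = 400.0, 10.0  # hypothetical value at T4 and decay period length

at_x8 = time_to_reset(X, X / 8, T)      # threshold equal to X/8
between = time_to_reset(X, 75.0, T)     # threshold between X/8 and X/4
below_x8 = time_to_reset(X, X / 16, T)  # threshold below X/8
```

Under these assumptions a threshold of X/8 is reached after three periods (at T7), a threshold between X/8 and X/4 is reached partway through the third period after T4 (the sixth decay period), and a threshold of X/16 is reached only after a fourth period (the seventh decay period), which matches the three cases in the text.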
In practice, the cluster map of a ceph cluster includes the monitor map, OSD map, PG map, CRUSH map and MDS map. In an optional implementation of the embodiments of the present application, the cluster map of the ceph cluster further includes a health map. Data such as the information of each primary monitor, the information of each backup monitor, the stability metric value of each monitor, the increment, the decay period, the first given threshold, the second given threshold and the third given threshold are stored in the health map, which is maintained by the primary monitors of the ceph cluster in the same way that the cluster map is currently maintained.
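As a rough illustration of what the health map holds, the fields listed above could be grouped as follows. This is a hypothetical Python sketch, not Ceph's actual data structure or API: the class name, field names and the default values 300 and 60 are assumptions, while 150, 100 and 40 come from the examples in the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class HealthMap:
    """Hypothetical container for the data the text says the health map stores."""
    primary_monitors: List[str]
    backup_monitors: List[str]
    stability_metric: Dict[str, float] = field(default_factory=dict)
    increment: float = 100.0         # from the example in the text
    decay_period_s: float = 60.0     # assumed value
    first_threshold: float = 300.0   # assumed value
    second_threshold: float = 150.0  # from the example in the text
    third_threshold: float = 40.0    # from the example in the text

    def validate(self) -> None:
        """Check the two ordering constraints stated in the text."""
        if not self.first_threshold > self.second_threshold:
            raise ValueError("first threshold must exceed second threshold")
        if not self.third_threshold < self.increment:
            raise ValueError("third threshold must be less than one increment")


health = HealthMap(primary_monitors=["mon-a"], backup_monitors=["mon-b", "mon-c"])
health.validate()
```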
Based on the same inventive concept as the method embodiments above, the embodiments of the present application further provide a device for managing monitors in a ceph cluster. The monitors of the ceph cluster include several primary monitors and several backup monitors, and a corresponding stability metric value is maintained for each monitor. By way of example, Fig. 4 is a schematic structural diagram of a device for managing monitors in a ceph cluster provided in the embodiments of the present application. The device includes:
Monitoring unit 40, configured to increase the stability metric value of any monitor by one increment when the network state of that monitor is detected changing from UP to DOWN;
Judging unit 41, configured to judge whether there is a primary monitor whose stability metric value is greater than or equal to the first given threshold;
Selecting unit 42, configured to, when the judgment result is yes, select from the several backup monitors a backup monitor whose stability metric value is less than or equal to the second given threshold as a primary monitor, and to make that primary monitor a backup monitor, where the first given threshold is greater than the second given threshold.
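How the three units cooperate on a single UP-to-DOWN event can be sketched as one function. All names here are hypothetical. Two points are assumptions rather than statements from the text: ties are broken by monitor name for determinism, and roles are left unchanged when no backup qualifies.

```python
from typing import Dict, Optional, Set


def on_link_down(name: str, metrics: Dict[str, float], primaries: Set[str],
                 backups: Set[str], increment: float, first_threshold: float,
                 second_threshold: float) -> Optional[str]:
    """Handle one UP-to-DOWN event; return the promoted backup, if any."""
    # Monitoring unit: add one increment to the flapping monitor's metric.
    metrics[name] = metrics.get(name, 0.0) + increment
    # Judging unit: only a primary at or above the first threshold is replaced.
    if name not in primaries or metrics[name] < first_threshold:
        return None
    # Selecting unit: consider backups at or below the second threshold.
    candidates = [b for b in backups if metrics.get(b, 0.0) <= second_threshold]
    if not candidates:
        return None  # assumption: keep the current roles if no backup qualifies
    chosen = min(candidates, key=lambda b: (metrics.get(b, 0.0), b))
    primaries.remove(name)
    primaries.add(chosen)
    backups.remove(chosen)
    backups.add(name)
    return chosen


metrics = {"mon-a": 200.0, "mon-b": 130.0, "mon-c": 0.0}
primaries, backups = {"mon-a"}, {"mon-b", "mon-c"}
promoted = on_link_down("mon-a", metrics, primaries, backups,
                        increment=100.0, first_threshold=300.0,
                        second_threshold=150.0)
```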
Optionally, the device further includes a measurement adjustment unit 43. After the stability metric value of the monitor is increased by one increment, the measurement adjustment unit 43 is configured to:
Start the decay timer corresponding to the monitor, and decay the stability metric value of the monitor according to a preset decay function within the current decay period.
Optionally, when decaying the stability metric value of the monitor according to the preset decay function within the current decay period, the measurement adjustment unit 43 is configured to:
Decay the stability metric value of the monitor with a specified decay coefficient within the current decay period;
Where the specified decay coefficient satisfies the following condition: the stability metric value at the end of the current decay period is half of the stability metric value at the start of the current decay period.
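One decay function that satisfies this halving condition is exponential decay. This is an assumption for illustration: the text fixes only the value at the end of a decay period, not the shape of the curve within it. With S_0 the stability metric value at the start of the period, t the length of the decay period, and \tau the time elapsed within the period:

```latex
S(\tau) = S_0 \cdot 2^{-\tau/t} = S_0\, e^{-\lambda \tau},
\qquad \lambda = \frac{\ln 2}{t},
\qquad 0 \le \tau \le t,
\qquad\text{so that}\qquad S(t) = \frac{S_0}{2}.
```

Under this choice the specified decay coefficient is \lambda, and it depends only on the period length t.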
Optionally, if the monitoring unit 40 detects the network state of the monitor changing from UP to DOWN while the monitor is within its current decay period, the measurement adjustment unit 43 increases the stability metric value of the monitor by one increment and restarts the decay timer, so that the monitor enters the next decay period;
If the decay period of the monitor ends, the measurement adjustment unit 43 restarts the decay timer, so that the monitor enters the next decay period.
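The two timer-restart cases can be sketched as event handlers on a small state object. The class and method names are hypothetical, and the model is deliberately simplified: the halving is applied only when a decay period ends, and any partial decay before a flap inside a period is ignored.

```python
from dataclasses import dataclass


@dataclass
class MonitorMetric:
    """Simplified per-monitor state: metric value and index of the current decay period."""
    value: float = 0.0
    period_index: int = 0  # 0 means no decay period has started yet

    def on_flap(self, increment: float) -> None:
        """UP-to-DOWN detected: add one increment and restart the decay timer."""
        self.value += increment
        self.period_index += 1  # restarting the timer begins the next period

    def on_period_end(self) -> None:
        """Decay period ended: the value has halved; restart the decay timer."""
        self.value /= 2
        self.period_index += 1


m = MonitorMetric()
m.on_flap(100.0)    # first decay period begins
m.on_flap(100.0)    # flap inside the period: second decay period begins
m.on_period_end()   # second period ends: value halves, third period begins
```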
Optionally, the measurement adjustment unit 43 is further configured to:
Set the stability metric value of the monitor to the initial value if the stability metric value of the monitor after decay is less than or equal to the third given threshold, where the third given threshold is less than one increment.
Optionally, when selecting from the several backup monitors a backup monitor whose stability metric value is less than or equal to the second given threshold as a primary monitor, the selecting unit 42 is configured to:
Determine the stability metric value of each backup monitor, and judge whether there is a backup monitor whose stability metric value is less than or equal to the second given threshold;
If it is determined that the stability metric values of M backup monitors are less than or equal to the second given threshold, select the backup monitor with the smallest stability metric value among the M backup monitors as the primary monitor;
If it is determined that N backup monitors share the minimum stability metric value among the M backup monitors, randomly select one backup monitor from the N backup monitors as the primary monitor.
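The selection rule, including the random tie-break, can be sketched as follows. The function name and the optional `rng` parameter are hypothetical; the sample values are those of the earlier example (second given threshold 150; stability metric values 130, 120, 0, 0 and 0).

```python
import random
from typing import Dict, Optional


def select_backup(backup_metrics: Dict[str, float], second_threshold: float,
                  rng: Optional[random.Random] = None) -> Optional[str]:
    """Pick the backup to promote, or None if no backup qualifies."""
    rng = rng or random.Random()
    # The M backups whose metric is at or below the second threshold.
    eligible = {n: v for n, v in backup_metrics.items() if v <= second_threshold}
    if not eligible:
        return None
    lowest = min(eligible.values())
    # The N backups that share the minimum metric; choose one at random.
    tied = sorted(n for n, v in eligible.items() if v == lowest)
    return rng.choice(tied)


backups = {"backup-1": 130, "backup-2": 120,
           "backup-3": 0, "backup-4": 0, "backup-5": 0}
choice = select_backup(backups, second_threshold=150)
```

With these values `choice` is one of `backup-3`, `backup-4` or `backup-5`. Passing a seeded `random.Random` makes the choice reproducible in tests.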
The above units may be configured as one or more integrated circuits that implement the above method, for example: one or more application-specific integrated circuits (Application Specific Integrated Circuit, ASIC), one or more microprocessors (digital signal processor, DSP), or one or more field-programmable gate arrays (Field Programmable Gate Array, FPGA). As another example, when one of the above units is implemented by a processing element scheduling program code, the processing element may be a general-purpose processor, such as a central processing unit (Central Processing Unit, CPU) or another processor that can invoke program code. As a further example, these units may be integrated together and implemented in the form of a system on chip (system-on-a-chip, SOC).
The embodiments of the present application also provide a management device for managing monitors in a ceph cluster. By way of example, Fig. 5 is a schematic structural diagram of the management device provided in the embodiments of the present application. The device includes at least a processor 50 and a memory 51, where:
The memory 51 is configured to store program instructions; the processor 50 calls the program instructions stored in the memory 51 and executes the above method embodiments according to the obtained program instructions. The specific implementation and technical effects are similar and are not repeated here.
Optionally, the present application also provides a computing device that includes at least one processing element (or chip) for executing the above method embodiments.
Optionally, the present application also provides a program product, such as a computer-readable storage medium. The computer-readable storage medium stores computer-executable instructions that cause a computer to execute the above method embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed device and method may be implemented in other ways. For example, the device embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in an actual implementation. For instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or units, and may be electrical, mechanical or in another form.
Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of an embodiment.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional units.
An integrated unit implemented in the form of a software functional unit may be stored in a computer-readable storage medium. The software functional unit is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) or a processor to execute some of the steps of the methods described in the embodiments of the present application. The storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (Read-Only Memory, ROM), random access memory (Random Access Memory, RAM), a magnetic disk or an optical disc.
The above are only preferred embodiments of the present application and do not limit its patent scope. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present application, whether applied directly or indirectly in other related technical fields, likewise falls within the scope of patent protection of the present application.
Claims (12)
1. A method for managing monitors in a ceph cluster, characterized in that the monitors of the ceph cluster include several primary monitors and several backup monitors, and a corresponding stability metric value is maintained for each monitor, the method comprising:
when the network state of any monitor is detected changing from UP to DOWN, increasing the stability metric value of said monitor by one increment;
if said monitor is a primary monitor, judging whether the stability metric value of said monitor is greater than or equal to a first given threshold;
if the judgment result is yes, selecting from the several backup monitors a backup monitor whose stability metric value is less than or equal to a second given threshold as a primary monitor, and making said monitor a backup monitor, wherein the first given threshold is greater than the second given threshold.
2. The method according to claim 1, characterized in that, after the stability metric value of said monitor is increased by one increment, the method further comprises:
starting the decay timer corresponding to said monitor, and decaying the stability metric value of said monitor according to a preset decay function within the current decay period.
3. The method according to claim 2, characterized in that the step of decaying the stability metric value of said monitor according to the preset decay function within the current decay period comprises:
within the current decay period, decaying the stability metric value of said monitor with a specified decay coefficient;
wherein the specified decay coefficient satisfies: at the end of the current decay period, the stability metric value of said monitor is half of the stability metric value of said monitor at the start of the current decay period.
4. The method according to claim 2, characterized in that the method further comprises:
while said monitor is within the current decay period, if the network state of said monitor is detected changing from UP to DOWN, increasing the stability metric value of said monitor by one increment and restarting the decay timer, so that said monitor enters the next decay period;
when the current decay period of said monitor ends, restarting the decay timer, so that said monitor enters the next decay period.
5. The method according to any one of claims 2-4, characterized in that the method further comprises:
if the stability metric value of said monitor after decay is less than or equal to a third given threshold, setting the stability metric value of said monitor to the initial value, wherein the third given threshold is less than the one increment.
6. The method according to any one of claims 1-4, characterized in that the step of selecting from the several backup monitors a backup monitor whose stability metric value is less than or equal to the second given threshold as a primary monitor comprises:
determining the stability metric value of each backup monitor, and judging whether there is a backup monitor whose stability metric value is less than or equal to the second given threshold;
if it is determined that the stability metric values of M backup monitors are less than or equal to the second given threshold, selecting the backup monitor with the smallest stability metric value from the M backup monitors as the primary monitor;
if it is determined that N backup monitors share the minimum stability metric value among the M backup monitors, randomly selecting one backup monitor from the N backup monitors as the primary monitor.
7. A device for managing monitors in a ceph cluster, characterized in that the monitors of the ceph cluster include several primary monitors and several backup monitors, and a corresponding stability metric value is maintained for each monitor, the device comprising:
a monitoring unit, configured to increase the stability metric value of any monitor by one increment when the network state of said monitor is detected changing from UP to DOWN;
a judging unit, configured to judge, when said monitor is determined to be a primary monitor, whether the stability metric value of said monitor is greater than or equal to a first given threshold;
a selecting unit, configured to, when the judgment result is yes, select from the several backup monitors a backup monitor whose stability metric value is less than or equal to a second given threshold as a primary monitor, and to make said monitor a backup monitor, wherein the first given threshold is greater than the second given threshold.
8. The device according to claim 7, characterized in that the device further comprises a measurement adjustment unit, and after the stability metric value of said monitor is increased by one increment, the measurement adjustment unit is configured to:
start the decay timer corresponding to said monitor, and decay the stability metric value of said monitor according to a preset decay function within the current decay period.
9. The device according to claim 8, characterized in that, when decaying the stability metric value of said monitor according to the preset decay function within the current decay period, the measurement adjustment unit is configured to:
decay the stability metric value of said monitor with a specified decay coefficient within the current decay period;
wherein the specified decay coefficient satisfies: at the end of the current decay period, the stability metric value of said monitor is half of the stability metric value of said monitor at the start of the current decay period.
10. The device according to claim 8, characterized in that:
while said monitor is within the current decay period, if the monitoring unit detects the network state of said monitor changing from UP to DOWN, the measurement adjustment unit increases the stability metric value of said monitor by one increment and restarts the decay timer, so that said monitor enters the next decay period;
when the decay period of said monitor ends, the measurement adjustment unit restarts the decay timer, so that said monitor enters the next decay period.
11. A computing device, characterized by comprising:
a memory, configured to store program instructions;
a processor, configured to call the program instructions stored in the memory and to execute, according to the obtained program instructions, the steps of the method according to any one of claims 1 to 6.
12. A computer-readable storage medium, characterized in that the computer-readable storage medium stores computer-executable instructions, and the computer-executable instructions cause a computer to execute the steps of the method according to any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811204207.5A CN109495543B (en) | 2018-10-16 | 2018-10-16 | Management method and device for monitors in ceph cluster |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811204207.5A CN109495543B (en) | 2018-10-16 | 2018-10-16 | Management method and device for monitors in ceph cluster |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109495543A true CN109495543A (en) | 2019-03-19 |
CN109495543B CN109495543B (en) | 2021-08-24 |
Family
ID=65690856
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811204207.5A Active CN109495543B (en) | 2018-10-16 | 2018-10-16 | Management method and device for monitors in ceph cluster |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109495543B (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101106443A (en) * | 2007-08-10 | 2008-01-16 | 中兴通讯股份有限公司 | A system and method for controlling switch of primary and backup board |
CN101132314A (en) * | 2007-09-21 | 2008-02-27 | 中兴通讯股份有限公司 | Method for implementing redundancy backup |
CN102238364A (en) * | 2010-04-22 | 2011-11-09 | 上海国际技贸联合有限公司 | Method for redundancy of key equipment in rail transit television monitoring system |
US20140298091A1 (en) * | 2013-04-01 | 2014-10-02 | Nebula, Inc. | Fault Tolerance for a Distributed Computing System |
CN105119754A (en) * | 2015-09-08 | 2015-12-02 | 烽火通信科技股份有限公司 | System and method for performing virtual master-to-slave shift to keep TCP connection |
US20150363124A1 (en) * | 2012-01-17 | 2015-12-17 | Amazon Technologies, Inc. | System and method for data replication using a single master failover protocol |
US20170288948A1 (en) * | 2016-03-30 | 2017-10-05 | Juniper Networks, Inc. | Failure handling for active-standby redundancy in evpn data center interconnect |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111600742A (en) * | 2020-04-02 | 2020-08-28 | 华东计算技术研究所(中国电子科技集团公司第三十二研究所) | Method and system for dynamically switching main monitor of distributed storage system |
CN111600742B (en) * | 2020-04-02 | 2023-03-24 | 华东计算技术研究所(中国电子科技集团公司第三十二研究所) | Method and system for dynamically switching main monitor of distributed storage system |
CN111756571A (en) * | 2020-05-28 | 2020-10-09 | 苏州浪潮智能科技有限公司 | Cluster node fault processing method, device, equipment and readable medium |
WO2021238275A1 (en) * | 2020-05-28 | 2021-12-02 | 苏州浪潮智能科技有限公司 | Cluster node fault processing method and apparatus, and device and readable medium |
CN111756571B (en) * | 2020-05-28 | 2022-02-18 | 苏州浪潮智能科技有限公司 | Cluster node fault processing method, device, equipment and readable medium |
US11750437B2 (en) | 2020-05-28 | 2023-09-05 | Inspur Suzhou Intelligent Technology Co., Ltd. | Cluster node fault processing method and apparatus, and device and readable medium |
CN112597243A (en) * | 2020-12-22 | 2021-04-02 | 新华三大数据技术有限公司 | Method and device for accelerating synchronous state in Ceph cluster |
CN112597243B (en) * | 2020-12-22 | 2022-05-27 | 新华三大数据技术有限公司 | Method and device for accelerating synchronous state in Ceph cluster |
CN117743181A (en) * | 2023-12-25 | 2024-03-22 | 杭州云掣科技有限公司 | System for constructing observable control surface |
Also Published As
Publication number | Publication date |
---|---|
CN109495543B (en) | 2021-08-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109495543A (en) | The management method and device of monitor in a kind of ceph cluster | |
CN111049705B (en) | Method and device for monitoring distributed storage system | |
CN108268372B (en) | Mock test processing method and device, storage medium and computer equipment | |
CN108810046A (en) | A kind of method, apparatus and equipment of election leadership person Leader | |
US20120084788A1 (en) | Complex event distributing apparatus, complex event distributing method, and complex event distributing program | |
CN106899681A (en) | The method and server of a kind of information pushing | |
KR20190126406A (en) | Method and apparatus for processing resource requests | |
CN105760230B (en) | A kind of method and device of adjust automatically cloud host operation | |
CN109379238B (en) | CTDB main node election method, device and system of distributed cluster | |
CN110048896B (en) | Cluster data acquisition method, device and equipment | |
CN106445473A (en) | Container deployment method and apparatus | |
CN109597800B (en) | Log distribution method and device | |
CN109284229B (en) | Dynamic adjustment method based on QPS and related equipment | |
CN106960060A (en) | The management method and device of a kind of data-base cluster | |
CN110471749A (en) | Task processing method, device, computer readable storage medium and computer equipment | |
CN106708623B (en) | Object resource processing method, device and system | |
CN113918647A (en) | Distributed database elastic expansion method, device, equipment and storage medium | |
CN107330061B (en) | File deletion method and device based on distributed storage | |
CN106682198B (en) | Method and device for realizing automatic database deployment | |
CN111130834B (en) | Method and device for processing network elasticity strategy | |
CN114267440B (en) | Medical order information processing method and device and computer readable storage medium | |
CN113542775B (en) | Live broadcast keep-alive service system, live broadcast keep-alive management method, server and medium | |
CN115563160A (en) | Data processing method, data processing device, computer equipment and computer readable storage medium | |
CN112162698B (en) | Cache partition reconstruction method, device, equipment and readable storage medium | |
CN113742581A (en) | List generation method and device, electronic equipment and readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||