CN109298945A - Ceph distributed storage monitoring and tuning management method for a big data platform

Info

Publication number
CN109298945A
Authority
CN
China
Prior art keywords
distributed storage
tuning
ceph distributed
monitoring
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811210909.4A
Other languages
Chinese (zh)
Inventor
张彤
李姝�
张永静
郑春
郑春一
李世成
周羽
朱盼盼
高晓琼
左晓辉
司敬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jinghang Computing Communication Research Institute
Original Assignee
Beijing Jinghang Computing Communication Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jinghang Computing Communication Research Institute filed Critical Beijing Jinghang Computing Communication Research Institute
Priority to CN201811210909.4A priority Critical patent/CN109298945A/en
Publication of CN109298945A publication Critical patent/CN109298945A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5083 Techniques for rebalancing the load in a distributed system
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08 Configuration management of networks or network elements
    • H04L41/0803 Configuration setting
    • H04L41/0823 Configuration setting characterised by the purposes of a change of settings, e.g. optimising configuration for enhancing reliability
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08 Configuration management of networks or network elements
    • H04L41/0893 Assignment of logical groups to network elements
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/16 Threshold monitoring
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention belongs to the technical field of cloud storage, and in particular relates to a Ceph distributed storage monitoring and tuning management method for a big data platform. It addresses two problems: monitoring and managing a Ceph distributed storage system is operationally complex, and the data resources needed for decision-making are difficult to mine. The method is implemented on a monitoring and tuning management system comprising an operation monitoring management database, a monitoring management module, a status alarm module, an alarm status rule preset module, a performance tuning module, and a configuration management module. The invention enables efficient, low-cost, unified operation management of a Ceph distributed storage system, significantly improves the system's operation management efficiency, effectively lowers the technical threshold for deploying Ceph distributed storage systems in production environments, and facilitates large-scale adoption of the system.

Description

Ceph distributed storage monitoring and tuning management method for a big data platform
Technical field
The invention belongs to the technical field of cloud storage, and in particular relates to a Ceph distributed storage monitoring and tuning management method for a big data platform. It addresses two problems: monitoring and managing a Ceph distributed storage system is operationally complex, and the data resources needed for decision-making are difficult to mine.
Background
Ceph is a widely used distributed storage system characterized by high scalability, high reliability, high performance, and multiple replicas. Operation and maintenance after a Ceph cluster is deployed is a major obstacle to its adoption, because management and maintenance must be performed through a large number of complex command lines. General information system operation and maintenance personnel, or the system administrators of the user organization, need a period of special training to master them, so the technical threshold and the operation and maintenance cost are both high. Meanwhile, the cluster running state monitoring commands officially provided by Ceph generate massive amounts of status data; even technicians in the field must spend considerable time reading the data before they can analyze the running state of the cluster. There is no intuitive display of status information and no retention of historical data, so it is difficult to form statistical analysis data, and even more difficult to connect and integrate with the user organization's existing big data platform. Unified operation monitoring and management therefore cannot be achieved.
Both of the above aspects constrain the adoption of Ceph distributed storage systems in production environments. It is therefore necessary to provide a Ceph distributed storage monitoring and management method for a big data platform to solve these problems.
Summary of the invention
(1) Technical problem to be solved
The technical problem to be solved by the invention is how to provide a Ceph distributed storage monitoring and management method for a big data platform that lowers the technical threshold, improves data fusion efficiency, and achieves comprehensive, convenient, and efficient monitoring and management of distributed storage.
(2) Technical solution
To solve the above technical problem, the invention provides a Ceph distributed storage monitoring and tuning management method for a big data platform. The method is implemented on a Ceph distributed storage monitoring and tuning management system, the system comprising: an operation monitoring management database, a monitoring management module, a status alarm module, an alarm status rule preset module, a performance tuning module, and a configuration management module;
The method comprises the following steps:
Step 1: The performance tuning module deploys Ceph distributed storage clusters with different node counts and user scales, and from these deployments builds a prefabricated Ceph distributed storage cluster performance tuning template library for the OSD_MAX_WRITE_SIZE and OSD_MAP_CACHE_SIZE parameters. In the initial stage of Ceph distributed storage cluster deployment, template configuration is performed according to this prefabricated template library, so that the OSD_MAX_WRITE_SIZE and OSD_MAP_CACHE_SIZE parameters involved in cluster performance optimization are set within a reasonable range, and an initial tuning instruction is generated and sent to the configuration management module;
Step 2: The configuration management module receives the initial tuning instruction from the performance tuning module and issues and configures the relevant parameters according to the tuning instruction;
Step 3: The alarm status rule preset module determines, for each parameter representing the health of the Ceph distributed storage cluster, the upper and lower threshold range under normal conditions;
Step 4: The operation monitoring management database extracts the running state data of the Ceph distributed storage cluster through a data extraction interface and stores it to form a database, providing accurate data support for monitoring alarms and performance tuning of the cluster;
Step 5: The monitoring management module reads and analyzes the running state data in the operation monitoring management database, monitors the health status of the Ceph distributed storage cluster in real time, extracts the set of parameters representing the health of the cluster, generates operation monitoring state data from the extracted parameter set, and sends it to the status alarm module;
Step 6: The status alarm module receives the operation monitoring state data generated by the monitoring management module, analyzes it, and matches the current parameters representing the health of the Ceph distributed storage cluster against the upper and lower threshold ranges preset by the alarm status rule preset module. If a parameter falls outside its threshold range, an alarm is triggered: alarm information is generated and promptly submitted to the big data platform through a RESTful interface for unified alarming, together with a prompt giving the emergency handling measure corresponding to the alarm;
Step 7: After the Ceph distributed storage cluster has run for a period of time, the performance tuning module computes and analyzes the operation monitoring state parameters from the monitoring management module, and adjusts the specific values of the OSD_MAX_WRITE_SIZE and OSD_MAP_CACHE_SIZE parameters in the performance tuning template library according to the current state of the cluster reflected by those parameters. After the adjustment, an updated tuning instruction is generated and sent to the configuration management module;
Step 8: The configuration management module receives the updated tuning instruction from the performance tuning module and issues and configures the relevant parameters according to the tuning instruction.
Wherein, in Step 2 and Step 8, the configuration management module also receives external commands through a human-computer interaction interface in order to perform read operations and write operations on the Ceph distributed storage cluster and to control whether an OSD joins the Ceph distributed storage cluster.
Wherein, the running state data includes the PG count and OSD running state data.
Wherein, the running state data includes the current value of the OSD_MAX_WRITE_SIZE parameter and the current value of the OSD_MAP_CACHE_SIZE parameter.
Wherein, the parameters representing the health of the Ceph distributed storage cluster include: PGs per OSD, and a status parameter indicating whether an OSD's storage is about to become full.
Wherein, in Step 5, the monitoring management module also submits the operation monitoring state data to the big data platform through a RESTful interface for further data mining and data display.
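As a minimal sketch of the threshold matching in Steps 3 and 6, assuming the health parameters arrive as a simple name-to-value mapping, the check could look as follows in Python. The parameter names (`pgs_per_osd`, `osd_used_ratio`), the numeric bounds, and the hint texts are hypothetical illustrations; the invention does not specify them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThresholdRule:
    """Upper and lower bound for one health parameter (Step 3)."""
    lower: float
    upper: float
    emergency_hint: str


# Hypothetical preset rules; the bounds are illustrative, not recommendations.
RULES = {
    "pgs_per_osd": ThresholdRule(30, 300, "Review the PG count of the affected pools."),
    "osd_used_ratio": ThresholdRule(0.0, 0.85, "Add OSDs or free space before the OSD fills up."),
}


def match_alarms(state, rules=RULES):
    """Return one alarm record per parameter outside its threshold range (Step 6)."""
    alarms = []
    for name, value in state.items():
        rule = rules.get(name)
        if rule is None:
            continue  # no rule was preset for this parameter
        if value < rule.lower or value > rule.upper:
            alarms.append({
                "parameter": name,
                "value": value,
                "lower": rule.lower,
                "upper": rule.upper,
                "hint": rule.emergency_hint,
            })
    return alarms
```

Each returned record carries both the violated range and the emergency handling hint, so it could be forwarded to the big data platform without a second lookup.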
(3) Beneficial effects
Compared with the prior art, the invention enables efficient, low-cost, unified operation management of a Ceph distributed storage system, significantly improves the system's operation management efficiency, effectively lowers the technical threshold for deploying Ceph distributed storage systems in production environments, and facilitates large-scale adoption of the system.
Brief description of the drawings
Fig. 1 is a logical architecture diagram of the system according to one embodiment of the invention;
Fig. 2 is a flow chart of monitoring and alarming according to one embodiment of the invention;
Fig. 3 is a flow chart of performance tuning according to one embodiment of the invention.
Detailed description of the embodiments
To make the purpose, content, and advantages of the invention clearer, specific embodiments of the invention are described in further detail below with reference to the drawings and examples.
For convenience of description, some terms used in the invention are first defined and explained with reference to Fig. 1.
Ceph OSD (object storage device node): the full name is Object Storage Device. Its main functions are storing data, replicating data, balancing data, recovering data, and performing heartbeat checks with other OSDs; it also reports certain changes to the Ceph Monitor.
Ceph Monitor (cluster monitoring node): the monitor of the Ceph cluster. It maintains the health status of the cluster and also maintains the various maps in the Ceph cluster, such as the OSD Map, Monitor Map, PG Map, and CRUSH Map. These maps are collectively called the Cluster Map and are used to manage information such as all members, relationships, and attributes in the cluster, as well as the distribution of data.
PG: placement group, the logical storage unit of Ceph.
OSD_MAX_WRITE_SIZE: the maximum size (MB) of a single write to an OSD.
OSD_MAP_CACHE_SIZE: the size (MB) of the cache that retains OSD Maps.
PGs per OSD: the number of PGs on a single OSD.
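To make the last definition concrete, the following sketch counts PGs per OSD from a PG-to-OSD placement mapping. The input format (a dictionary from PG identifier to the list of OSD ids holding a copy) and the sample data are assumptions made for illustration; they are not the output format of any particular Ceph command.

```python
from collections import Counter


def pgs_per_osd(pg_to_osds):
    """Count how many PGs are placed on each OSD.

    pg_to_osds maps a PG identifier to the list of OSD ids holding a copy.
    """
    counts = Counter()
    for osd_ids in pg_to_osds.values():
        counts.update(set(osd_ids))  # count each PG at most once per OSD
    return dict(counts)


# Hypothetical placement of three PGs with three replicas each on four OSDs.
sample = {"1.0": [0, 1, 2], "1.1": [1, 2, 3], "1.2": [0, 2, 3]}
per_osd = pgs_per_osd(sample)
```

In this sample, OSD 2 holds a copy of all three PGs, while the other OSDs hold two each.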
To solve the problems of the prior art, the invention provides a Ceph distributed storage monitoring and tuning management method for a big data platform. The method is implemented on a Ceph distributed storage monitoring and tuning management system. As shown in Fig. 1, the system comprises: an operation monitoring management database, a monitoring management module, a status alarm module, an alarm status rule preset module, a performance tuning module, and a configuration management module;
The method comprises the following steps:
Step 1: The performance tuning module deploys Ceph distributed storage clusters with different node counts and user scales, and from these deployments builds a prefabricated Ceph distributed storage cluster performance tuning template library for the OSD_MAX_WRITE_SIZE and OSD_MAP_CACHE_SIZE parameters. In the initial stage of Ceph distributed storage cluster deployment, template configuration is performed according to this prefabricated template library, so that the OSD_MAX_WRITE_SIZE and OSD_MAP_CACHE_SIZE parameters involved in cluster performance optimization are set within a reasonable range, and an initial tuning instruction is generated and sent to the configuration management module;
Step 2: The configuration management module receives the initial tuning instruction from the performance tuning module and issues and configures the relevant parameters according to the tuning instruction;
Step 3: The alarm status rule preset module determines, for each parameter representing the health of the Ceph distributed storage cluster, the upper and lower threshold range under normal conditions;
Step 4: The operation monitoring management database, as the foundation of the whole system, extracts the running state data of the Ceph distributed storage cluster through a data extraction interface and stores it to form a database, providing accurate data support for monitoring alarms and performance tuning of the cluster;
Step 5: The monitoring management module reads and analyzes the running state data in the operation monitoring management database, monitors the health status of the Ceph distributed storage cluster in real time, extracts the set of parameters representing the health of the cluster, generates operation monitoring state data from the extracted parameter set, and sends it to the status alarm module;
Step 6: The status alarm module receives the operation monitoring state data generated by the monitoring management module, analyzes it, and matches the current parameters representing the health of the Ceph distributed storage cluster against the upper and lower threshold ranges preset by the alarm status rule preset module. If a parameter falls outside its threshold range, an alarm is triggered: alarm information is generated and promptly submitted to the big data platform through a RESTful interface for unified alarming, together with a prompt giving the emergency handling measure corresponding to the alarm;
Step 7: After the Ceph distributed storage cluster has run for a period of time, the performance tuning module computes and analyzes the operation monitoring state parameters from the monitoring management module, and adjusts the specific values of the OSD_MAX_WRITE_SIZE and OSD_MAP_CACHE_SIZE parameters in the performance tuning template library according to the current state of the cluster reflected by those parameters. After the adjustment, an updated tuning instruction is generated and sent to the configuration management module;
Step 8: The configuration management module receives the updated tuning instruction from the performance tuning module and issues and configures the relevant parameters according to the tuning instruction.
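The template library of Step 1 and the initial tuning instruction handed over in Step 2 could be sketched as follows. This is one possible layout under stated assumptions: the `TuningTemplate` fields, the selection rule (the first template whose node and user limits cover the cluster), and every numeric value are hypothetical placeholders rather than values disclosed by the invention.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TuningTemplate:
    max_nodes: int            # template applies to clusters up to this many nodes
    max_users: int            # and up to this many users
    osd_max_write_size: int   # MB
    osd_map_cache_size: int   # MB


# Hypothetical template library, ordered from the smallest to the largest scale.
TEMPLATE_LIBRARY = [
    TuningTemplate(5, 100, 90, 500),
    TuningTemplate(20, 1000, 256, 1024),
    TuningTemplate(100, 10000, 512, 2048),
]


def select_template(nodes, users, library=TEMPLATE_LIBRARY):
    """Pick the first template that covers the given node count and user scale."""
    for template in library:
        if nodes <= template.max_nodes and users <= template.max_users:
            return template
    return library[-1]  # fall back to the largest template


def initial_tuning_instruction(nodes, users):
    """Build the initial tuning instruction for the configuration management module."""
    template = select_template(nodes, users)
    return {
        "OSD_MAX_WRITE_SIZE": template.osd_max_write_size,
        "OSD_MAP_CACHE_SIZE": template.osd_map_cache_size,
    }
```

Keeping the library as ordered data rather than branching logic matches the later statement that the template library can be extended: a new scale tier is a new entry, not new code.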
With the technical solution of the above system, convenient, efficient, and unified monitoring and management of a Ceph distributed storage system can be achieved. Through a visual management interface, graphical information display, and integrated operation management, the efficiency of operation, maintenance, and supervision is significantly improved, fully meeting the system's actual management and maintenance needs in production environments.
Wherein, in Step 2 and Step 8, the configuration management module also receives external commands through a human-computer interaction interface in order to perform read operations and write operations on the Ceph distributed storage cluster and to control whether an OSD joins the Ceph distributed storage cluster.
Wherein, the running state data includes the PG count and OSD running state data.
Wherein, the running state data includes the current value of the OSD_MAX_WRITE_SIZE parameter and the current value of the OSD_MAP_CACHE_SIZE parameter.
Wherein, the parameters representing the health of the Ceph distributed storage cluster include: PGs per OSD, and a status parameter indicating whether an OSD's storage is about to become full.
Wherein, in Step 5, the monitoring management module also submits the operation monitoring state data to the big data platform through a RESTful interface for further data mining and data display.
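The RESTful submission of operation monitoring state data might be prepared as in the sketch below, which only builds the HTTP request and does not send it. The endpoint URL, the JSON field names, and the use of POST are all assumptions; the invention states only that a RESTful interface is used.

```python
import json
import urllib.request

# Hypothetical endpoint; the platform's actual URL and schema are not specified.
PLATFORM_URL = "http://bigdata.example.internal/api/ceph/monitor-state"


def build_submission(state, url=PLATFORM_URL):
    """Wrap operation monitoring state data in a JSON POST request."""
    body = json.dumps({"source": "ceph-monitor", "state": state}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_submission({"pgs_per_osd": 120, "osd_near_full": False})
# urllib.request.urlopen(request) would send it; omitted so the sketch runs offline.
```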
In addition, the invention also provides a Ceph distributed storage monitoring and tuning management system for a big data platform. As shown in Fig. 1, the system comprises: an operation monitoring management database, a monitoring management module, a status alarm module, an alarm status rule preset module, a performance tuning module, and a configuration management module;
The operation monitoring management database, as the foundation of the whole system, is used to extract the running state data of the Ceph distributed storage cluster through a data extraction interface and store it to form a database, providing accurate data support for monitoring alarms and performance tuning of the cluster;
The monitoring management module is used to read and analyze the running state data in the operation monitoring management database, monitor the health status of the Ceph distributed storage cluster in real time, extract the set of parameters representing the health of the cluster, generate operation monitoring state data from the extracted parameter set, and send it to the status alarm module;
The alarm status rule preset module is used to determine, for each parameter representing the health of the Ceph distributed storage cluster, the upper and lower threshold range under normal conditions;
The status alarm module is used to receive the operation monitoring state data generated by the monitoring management module, analyze it, and match the current parameters representing the health of the Ceph distributed storage cluster against the upper and lower threshold ranges preset by the alarm status rule preset module. If a parameter falls outside its threshold range, an alarm is triggered: alarm information is generated and promptly submitted to the big data platform through a RESTful interface for unified alarming, together with a prompt giving the emergency handling measure corresponding to the alarm;
The performance tuning module is used to deploy Ceph distributed storage clusters with different node counts and user scales, and from these deployments build a prefabricated Ceph distributed storage cluster performance tuning template library for the OSD_MAX_WRITE_SIZE and OSD_MAP_CACHE_SIZE parameters. In the initial stage of Ceph distributed storage cluster deployment, template configuration is performed according to this prefabricated template library, so that the OSD_MAX_WRITE_SIZE and OSD_MAP_CACHE_SIZE parameters involved in cluster performance optimization are set within a reasonable range, and an initial tuning instruction is generated and sent to the configuration management module. After the Ceph distributed storage cluster has run for a period of time, the performance tuning module computes and analyzes the operation monitoring state parameters from the monitoring management module, and adjusts the specific values of the OSD_MAX_WRITE_SIZE and OSD_MAP_CACHE_SIZE parameters in the performance tuning template library according to the current state of the cluster reflected by those parameters. After the adjustment, an updated tuning instruction is generated and sent to the configuration management module;
The configuration management module is used to receive the initial tuning instruction or the updated tuning instruction from the performance tuning module and to issue and configure the relevant parameters according to the tuning instruction;
Wherein, the configuration management module is also used to receive external commands through a human-computer interaction interface in order to perform read operations and write operations on the Ceph distributed storage cluster and to control whether an OSD joins the Ceph distributed storage cluster.
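One way the configuration management module could turn a tuning instruction into a deployable configuration is to render it as an `[osd]` section in ceph.conf style, as sketched below. The invention does not say how parameters are issued to the cluster, so the choice of a configuration-file fragment and the function name `render_osd_section` are assumptions; `osd_max_write_size` and `osd_map_cache_size` are the lower-case Ceph option names corresponding to the two parameters.

```python
import configparser
import io


def render_osd_section(instruction):
    """Render a tuning instruction as an [osd] section in ceph.conf style."""
    parser = configparser.ConfigParser()
    parser["osd"] = {
        # Ceph option names are lower-case with underscores.
        "osd_max_write_size": str(instruction["OSD_MAX_WRITE_SIZE"]),
        "osd_map_cache_size": str(instruction["OSD_MAP_CACHE_SIZE"]),
    }
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


fragment = render_osd_section({"OSD_MAX_WRITE_SIZE": 256, "OSD_MAP_CACHE_SIZE": 1024})
```

Distributing the fragment to the storage nodes and restarting or reloading the OSDs is outside the scope of this sketch.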
With the technical solution of the above system, convenient, efficient, and unified monitoring and management of a Ceph distributed storage system can be achieved. Through a visual management interface, graphical information display, and integrated operation management, the efficiency of operation, maintenance, and supervision is significantly improved, fully meeting the system's actual management and maintenance needs in production environments.
Wherein, the running state data includes the PG count and OSD running state data.
Wherein, the running state data includes the current value of the OSD_MAX_WRITE_SIZE parameter and the current value of the OSD_MAP_CACHE_SIZE parameter.
Wherein, the parameters representing the health of the Ceph distributed storage cluster include: PGs per OSD, and a status parameter indicating whether an OSD's storage is about to become full.
Wherein, the monitoring management module is also used to submit the operation monitoring state data to the big data platform through a RESTful interface for further data mining and data display.
Embodiment 1
Monitoring alarm embodiment
As shown in Fig. 2, this embodiment is the monitoring alarm workflow of the Ceph distributed storage monitoring and tuning management system proposed by the invention:
Step 1: Select a monitoring management node and deploy the system. Once deployed, the system captures the running data and status information of the Ceph cluster;
Step 2: The configuration information and status information of the cluster are all stored in the operation monitoring management database;
Step 3: The monitoring management module computes and analyzes the data in the database to obtain the running state of the Ceph cluster, and supplies it to the unified data mining module and data display module of the big data platform, which provide users with unified data analysis and high-quality data display;
Step 4: The status alarm module performs further matching calculations on the data analyzed by the monitoring management module. While reporting the alarm condition to the big data platform, it submits the emergency handling measures to the configuration management module;
Step 5: The configuration management module issues the emergency handling configuration submitted by the status alarm module, performs emergency repair on the distributed storage cluster, and stores the corresponding handling information in the operation monitoring management database.
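The storage of status information in the operation monitoring management database, and the retention of historical data that the background section finds missing in stock Ceph tooling, could be sketched with SQLite as follows. The table name `run_state`, its columns, and the use of SQLite are assumptions; the invention does not name a database engine or schema.

```python
import sqlite3
import time


def open_database(path=":memory:"):
    """Open the (hypothetical) operation monitoring management database."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS run_state ("
        "ts REAL NOT NULL, parameter TEXT NOT NULL, value REAL NOT NULL)"
    )
    return conn


def store_state(conn, state, ts=None):
    """Append one snapshot of running state data, one row per parameter."""
    ts = time.time() if ts is None else ts
    conn.executemany(
        "INSERT INTO run_state (ts, parameter, value) VALUES (?, ?, ?)",
        [(ts, name, float(value)) for name, value in state.items()],
    )
    conn.commit()


def history(conn, parameter):
    """Return the retained (timestamp, value) history of one parameter."""
    rows = conn.execute(
        "SELECT ts, value FROM run_state WHERE parameter = ? ORDER BY ts",
        (parameter,),
    )
    return rows.fetchall()


conn = open_database()
store_state(conn, {"pgs_per_osd": 120}, ts=1.0)
store_state(conn, {"pgs_per_osd": 135}, ts=2.0)
```

Because every snapshot is appended rather than overwritten, the same table can serve both real-time monitoring (latest row) and later statistical analysis (full history).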
Embodiment 2
Performance tuning embodiment
As shown in Fig. 3, this embodiment is the performance tuning workflow of the Ceph distributed storage monitoring and tuning management system proposed by the invention:
Step 1: Select a monitoring management node and deploy the system. Once deployed, the system captures the running data and status information of the Ceph cluster;
Step 2: The configuration information and status information of the cluster are all stored in the operation monitoring management database; the monitoring management module computes and analyzes the data in the database and generates operation monitoring management data;
Step 3: The performance tuning module performs a comprehensive calculation and analysis of the operation monitoring management data to obtain the optimal configuration of the current cluster for specific parameters such as the PG count, generates a tuning template, and submits it to the configuration management module;
Step 4: After receiving the template, the configuration management module extracts each parameter in the template and issues it to the distributed storage cluster for tuning, while storing the corresponding parameter data in the operation monitoring management database.
Wherein, all work such as status display, configuration operations, alarm prompts, and tuning selection is provided to users as a unified service by the big data platform through a unified interface.
Wherein, the tuning template library and the alarm rule library are extensible; a unified upgrade and maintenance interface for the template library and the rule library is provided externally by the big data platform.
Wherein, performance tuning refers specifically to adjusting and optimizing the configuration parameters of the Ceph cluster; it does not involve optimizing the performance of the physical storage devices.
The above is only a preferred embodiment of the invention. It should be noted that a person of ordinary skill in the art may make several improvements and modifications without departing from the technical principles of the invention, and these improvements and modifications shall also be regarded as falling within the scope of protection of the invention.
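The invention does not disclose the calculation used to adjust parameter values after the cluster has run for a while. Purely as an illustration of the idea of keeping adjusted values within a reasonable range, the sketch below proposes new values from two observed quantities and clamps them. The heuristics (1.5 times the largest observed write, doubling the map cache above 50 OSDs), the inputs, and the ranges in `RANGES` are all invented for this example.

```python
# Hypothetical "reasonable ranges" in MB for the two tuned parameters.
RANGES = {
    "OSD_MAX_WRITE_SIZE": (90, 512),
    "OSD_MAP_CACHE_SIZE": (500, 2048),
}


def clamp(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))


def updated_instruction(current, observed_write_mb, osd_count):
    """Propose updated parameter values and keep them inside RANGES.

    current is the tuning instruction in force; observed_write_mb and
    osd_count stand in for the operation monitoring state parameters.
    """
    proposal = {
        # Illustrative heuristic: leave headroom above the largest write seen.
        "OSD_MAX_WRITE_SIZE": int(observed_write_mb * 1.5),
        # Illustrative heuristic: enlarge the map cache for larger clusters.
        "OSD_MAP_CACHE_SIZE": current["OSD_MAP_CACHE_SIZE"] * (2 if osd_count > 50 else 1),
    }
    return {name: clamp(value, *RANGES[name]) for name, value in proposal.items()}
```

Whatever rule replaces these heuristics, the clamping step reflects the requirement that the tuned parameters stay within a reasonable range before an updated tuning instruction is issued.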

Claims (6)

1. a kind of monitoring of Ceph distributed storage and tuning management method towards big data platform, which is characterized in that described The monitoring of Ceph distributed storage is monitored based on Ceph distributed storage with tuning management method, system with tuning management method to be implemented, The system comprises: operation monitoring management database, monitoring management module, state alarm module, alarm status rule preset mould Block, Performance tuning module, configuration management module;
Described method includes following steps:
Step 1: the performance tuning module deploys Ceph distributed storage clusters with different node counts and user scales and, from these deployments, builds a prefabricated Ceph distributed storage cluster performance tuning template library for the OSD_MAX_WRITE_SIZE and OSD_MAP_CACHE_SIZE parameters; at the initial stage of Ceph distributed storage cluster deployment, template configuration is carried out according to the prefabricated template library, so that the OSD_MAX_WRITE_SIZE and OSD_MAP_CACHE_SIZE parameters involved in Ceph distributed storage cluster performance optimization are set within a reasonable range, and an initial tuning instruction is generated and sent to the configuration management module;
Step 2: the configuration management module receives the initial tuning instruction from the performance tuning module, and issues and configures the relevant parameters according to the tuning instruction;
Step 3: the alarm status rule presetting module determines, for each parameter representing the health of the Ceph distributed storage cluster, the upper and lower threshold range of that parameter under normal conditions;
Step 4: the operation monitoring management database extracts the running state data of the Ceph distributed storage cluster through a data extraction interface and stores it to form a database, providing accurate data support for monitoring, alarming and performance tuning of the cluster;
Step 5: the monitoring management module reads and analyzes the running state data in the operation monitoring management database, monitors the health of the Ceph distributed storage cluster in real time, extracts the set of parameters representing the health of the Ceph distributed storage cluster, generates operation monitoring state data from the extracted parameter set, and sends it to the state alarm module;
Step 6: the state alarm module receives the operation monitoring state data generated by the monitoring management module, analyzes it, and matches the current values of the parameters representing the health of the Ceph distributed storage cluster against the upper and lower threshold ranges preset by the alarm status rule presetting module; if a value falls outside its threshold range, an alarm is triggered, and alarm information is generated and promptly submitted through a RESTful interface to the big data platform for unified alarming, together with a prompt of the emergency handling measures corresponding to the alarm;
Step 7: after the Ceph distributed storage cluster has run for a period of time, the performance tuning module calculates and analyzes the operation monitoring state parameters from the monitoring management module, adjusts the specific values of the OSD_MAX_WRITE_SIZE and OSD_MAP_CACHE_SIZE parameters in the Ceph distributed storage cluster performance tuning template library according to the current condition of the Ceph distributed storage cluster as reflected by the operation monitoring state parameters, and after the adjustment generates an updated tuning instruction and sends it to the configuration management module;
Step 8: the configuration management module receives the updated tuning instruction from the performance tuning module, and issues and configures the relevant parameters according to the tuning instruction.
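As an illustration of Steps 1 and 2, the following is a minimal Python sketch of how a prefabricated template library might be queried at deployment time to produce an initial tuning instruction. The claim does not specify the library's structure or any parameter values, so the `TuningTemplate` class, the `select_template` and `tuning_instruction` helpers, the selection rule, and every number below are hypothetical placeholders rather than the method as claimed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TuningTemplate:
    """One entry of a hypothetical tuning template library."""
    max_nodes: int            # largest node count this template was tested for
    max_users: int            # largest user scale this template was tested for
    osd_max_write_size: int   # placeholder value, not a recommendation
    osd_map_cache_size: int   # placeholder value, not a recommendation


# Assumed library, ordered from the smallest to the largest deployment.
TEMPLATE_LIBRARY = [
    TuningTemplate(5, 100, 90, 50),
    TuningTemplate(10, 1000, 256, 512),
    TuningTemplate(50, 10000, 512, 1024),
]


def select_template(nodes, users, library=TEMPLATE_LIBRARY):
    """Return the first template that covers the given cluster size."""
    for template in library:
        if nodes <= template.max_nodes and users <= template.max_users:
            return template
    # Fall back to the largest template when the cluster exceeds all entries.
    return library[-1]


def tuning_instruction(template):
    """Build the instruction handed to the configuration management module."""
    return {
        "osd_max_write_size": template.osd_max_write_size,
        "osd_map_cache_size": template.osd_map_cache_size,
    }


if __name__ == "__main__":
    print(tuning_instruction(select_template(nodes=8, users=500)))
```

The updated instruction of Step 7 could reuse `tuning_instruction` after the template values are adjusted; the adjustment rule itself is not given in the claim and is therefore not sketched here.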
2. The Ceph distributed storage monitoring and tuning management method for a big data platform according to claim 1, characterized in that, in step 2 and step 8, the configuration management module also receives external commands through a human-machine interface to perform read operations and write operations on the Ceph distributed storage cluster and to control whether an OSD joins the Ceph distributed storage cluster.
3. The Ceph distributed storage monitoring and tuning management method for a big data platform according to claim 1, characterized in that the running state data includes the PG quantity and OSD running state data.
4. The Ceph distributed storage monitoring and tuning management method for a big data platform according to claim 1, characterized in that the running state data includes the current value of the OSD_MAX_WRITE_SIZE parameter and the current value of the OSD_MAP_CACHE_SIZE parameter.
5. The Ceph distributed storage monitoring and tuning management method for a big data platform according to claim 1, characterized in that the parameters representing the health of the Ceph distributed storage cluster include: PGs per OSD, and a state parameter indicating whether an OSD is full.
6. The Ceph distributed storage monitoring and tuning management method for a big data platform according to claim 1, characterized in that, in step 5, the monitoring management module also submits the operation monitoring state data through a RESTful interface to the big data platform for further data mining and data display.
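The threshold matching of Steps 3 and 6, applied to the health parameters named in claim 5, can be sketched roughly as follows in Python. The rule table, the threshold values, the `osd_used_ratio` field (standing in for the "OSD is full" state parameter), and the JSON layout of the alarm message are all assumptions made for illustration; the claims do not define them, and the sketch only builds the message body without sending it to any RESTful endpoint.

```python
import json

# Hypothetical alarm rules: parameter name -> (lower threshold, upper threshold).
# The numbers are placeholders, not values taken from the patent.
RULES = {
    "pgs_per_osd": (30, 300),
    "osd_used_ratio": (0.0, 0.85),
}


def check_alarms(state, rules):
    """Compare each monitored value with its preset threshold range."""
    alarms = []
    for name, (lower, upper) in rules.items():
        if name not in state:
            continue  # parameter not reported in this monitoring cycle
        value = state[name]
        if value < lower or value > upper:
            alarms.append(
                {"parameter": name, "value": value, "lower": lower, "upper": upper}
            )
    return alarms


def alarm_payload(cluster, alarms):
    """Serialize alarms as a JSON body that could be posted to the platform."""
    return json.dumps({"cluster": cluster, "alarms": alarms}, sort_keys=True)


if __name__ == "__main__":
    state = {"pgs_per_osd": 350, "osd_used_ratio": 0.5}
    print(alarm_payload("demo-cluster", check_alarms(state, RULES)))
```

Keeping the rules in a plain table mirrors the separation in the claims between the alarm status rule presetting module, which owns the thresholds, and the state alarm module, which only performs the matching.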
CN201811210909.4A 2018-10-17 2018-10-17 The monitoring of Ceph distributed storage and tuning management method towards big data platform Pending CN109298945A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811210909.4A CN109298945A (en) 2018-10-17 2018-10-17 The monitoring of Ceph distributed storage and tuning management method towards big data platform

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811210909.4A CN109298945A (en) 2018-10-17 2018-10-17 The monitoring of Ceph distributed storage and tuning management method towards big data platform

Publications (1)

Publication Number Publication Date
CN109298945A true CN109298945A (en) 2019-02-01

Family

ID=65157192

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811210909.4A Pending CN109298945A (en) 2018-10-17 2018-10-17 The monitoring of Ceph distributed storage and tuning management method towards big data platform

Country Status (1)

Country Link
CN (1) CN109298945A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102929667A (en) * 2012-10-24 2013-02-13 曙光信息产业(北京)有限公司 Method for optimizing hadoop cluster performance
CN105718351A (en) * 2016-01-08 2016-06-29 北京汇商融通信息技术有限公司 Hadoop cluster-oriented distributed monitoring and management system
CN106100938A (en) * 2016-08-19 2016-11-09 浪潮(北京)电子信息产业有限公司 The monitoring of a kind of distributed cluster system and alarm method and system
CN107454140A (en) * 2017-06-27 2017-12-08 北京溢思得瑞智能科技研究院有限公司 A kind of Ceph cluster automatically dispose method and system based on big data platform
US20180255138A1 (en) * 2017-03-06 2018-09-06 At&T Intellectual Property I, L.P. Reliable data storage for decentralized computer systems

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
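Li Chen (李琛): "Design and Implementation of an Open Distributed File Storage Service" (开放分布式文件存储服务的设计与实现), China Master's Theses Full-text Database, Information Science and Technology series *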

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110442646A (en) * 2019-07-29 2019-11-12 北京易捷思达科技发展有限公司 A kind of ceph data simultaneous module main side write performance optimization system and method
CN110442646B (en) * 2019-07-29 2021-01-12 北京易捷思达科技发展有限公司 Write performance optimization system and method for master end of ceph data synchronization module
WO2021129367A1 (en) * 2019-12-23 2021-07-01 深圳前海微众银行股份有限公司 Method and apparatus for monitoring distributed storage system
CN111290909A (en) * 2020-01-19 2020-06-16 山东汇贸电子口岸有限公司 System and method for monitoring and alarming ceph cluster
CN111510338A (en) * 2020-03-09 2020-08-07 苏州浪潮智能科技有限公司 Distributed block storage network sub-health test method, device and storage medium
CN111510338B (en) * 2020-03-09 2022-04-26 苏州浪潮智能科技有限公司 Distributed block storage network sub-health test method, device and storage medium
CN113282241A (en) * 2021-05-26 2021-08-20 上海仪电(集团)有限公司中央研究院 Ceph distributed storage-based hard disk weight optimization method and device
CN113282241B (en) * 2021-05-26 2024-04-09 上海仪电(集团)有限公司中央研究院 Hard disk weight optimization method and device based on Ceph distributed storage

Similar Documents

Publication Publication Date Title
CN109298945A (en) The monitoring of Ceph distributed storage and tuning management method towards big data platform
CN109218109A (en) The monitoring of Ceph distributed storage and tuning management system towards big data platform
CN104809597B (en) Data resource management platform based on data fusion
CN103532744B (en) A kind of intelligent grid information communication integral supporting platform
CN107454140A (en) A kind of Ceph cluster automatically dispose method and system based on big data platform
CN106533754A (en) Fault diagnosis method and expert system for college teaching servers
CN104979912A (en) Monitoring method of photovoltaic power generation system and system thereof
CN106451761B (en) Large-area power failure automatic monitoring and analyzing system based on dynamic data driving
US20210390422A1 (en) Knowledge-Base Information Sensing Method And System For Operations And Maintenance Of Data Center
CN103825755A (en) Power secondary system modeling method and system
CN108985467A (en) Secondary device lean management-control method based on artificial intelligence
CN110032643A (en) A kind of building maintenance work order analysis method, device, storage medium and client
CN107633307A (en) Power supply-distribution system Root alarm detection method, device, terminal and computer-readable storage medium
CN105205185B (en) The method of data interaction and data modeling between monitoring system and management information system
CN112241424A (en) Air traffic control equipment application system and method based on knowledge graph
CN110007905A (en) A kind of generation method and system of the software development scheme based on big data
CN107918560A (en) A kind of server apparatus management method and device
CN111045363B (en) Intelligent operation and maintenance management and control cloud platform of information communication network
CN112116790B (en) CORS early warning monitoring system based on flow frame
CN113206867A (en) Intelligent data acquisition monitoring system and method and timing acquisition service module
CN117076426A (en) Traffic intelligent engine system construction method and device based on flow batch integration
CN106649034A (en) Visual intelligent operation and maintenance method and platform
CN116094174A (en) Knowledge graph-based power grid operation and maintenance monitoring method, system, equipment and medium
CN116187774A (en) Artificial intelligence operation and maintenance management system for data center
CN110647070A (en) Power environment monitoring system for super-large-scale data center

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190201