CN112866020A - Cloud center intelligent alarm processing system and method - Google Patents
- Publication number
- CN112866020A
- Authority
- CN
- China
- Prior art keywords
- alarm
- resource
- index
- indexes
- duty
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/06—Management of faults, events, alarms or notifications
- H04L41/0631—Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/06—Management of faults, events, alarms or notifications
- H04L41/0677—Localisation of faults
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Telephonic Communication Services (AREA)
Abstract
The invention discloses a cloud center intelligent alarm processing system and method, belonging to the technical field of IT operation and maintenance. The system comprises resource management, index definition, duty management, alarm rules, alarm view templates, alarms and alarm views. Resource management manages resources, which include virtual machines, physical machines, switches, storage devices, virtual devices, middleware and application software. Index definition defines indexes, which are data acquisition items such as CPU, memory and network traffic. Duty management schedules the daily on-duty personnel. When an abnormality occurs, an alarm is triggered in real time according to the alarm rules, and an alarm view is generated according to the alarm view template, the alarm view templates corresponding one-to-one to the alarm rules. The system helps to troubleshoot problems quickly and find the root cause of faults, improves operation and maintenance efficiency, and has good value for popularization and application.
Description
Technical Field
The invention relates to the technical field of IT operation and maintenance, and particularly provides a cloud center intelligent alarm processing system and method.
Background
Today, with the rapid development of cloud computing, cloud centers are springing up everywhere. A cloud center is equipped with a cloud management system, which ensures that its services run normally, and with an operation and maintenance management system, which ensures the long-term stable operation of the system. The operation and maintenance management system generally manages the various devices in the cloud center; when the system becomes abnormal, discovering and resolving the abnormality in time is a difficult problem for it.
Existing cloud center operation and maintenance management systems depend heavily on operation and maintenance personnel, and the monitoring data of the resources is collected separately, with no links between the data sets. When an alarm or fault occurs, the problem has to be investigated manually through various searches, which is time-consuming and laborious.
Disclosure of Invention
The technical task of the invention is to provide a cloud center intelligent alarm processing system that associates the resources of the cloud center with one another and also associates the alarms, which helps to troubleshoot problems quickly, find the root cause of faults and improve operation and maintenance efficiency.
The invention further provides a cloud center intelligent alarm processing method.
To achieve this purpose, the invention provides the following technical solution:
A cloud center intelligent alarm processing system comprises resource management, index definition, duty management, alarm rules, an alarm view template, an alarm and an alarm view. The resource management manages resources, which include virtual machines, physical machines, switches, storage devices, virtual devices, middleware and application software. The index definition defines indexes, which are data acquisition items and include CPU, memory and network traffic. The duty management schedules the daily on-duty personnel. When an abnormality occurs, an alarm is triggered in real time according to the alarm rules. An alarm view is generated according to the alarm view template, the alarm view templates corresponding one-to-one to the alarm rules. An alarm is triggered when an index is outside its threshold range, and the alarm view is generated according to the alarm rule associated with the alarm information.
Preferably, the attributes of a resource in the resource management include a code, a name, a resource type and associated resources. Resources are either entered into the system manually or discovered and entered automatically. Each resource can have its own personalized attributes; the attributes are persisted in a database, and each resource can be listed and maintained separately. The associated resources of a resource are the resource to which it belongs and the resources to which it is connected. For example, if a virtual machine is located on a physical machine, the resource to which the virtual machine belongs is that physical machine, so the physical machine is an associated resource of the virtual machine. As another example, if a physical machine is connected to the network through a switch, the switch is one of the associated resources of the physical machine. One resource can have multiple associated resources, and associated resources are configured simply by configuring the associated resource type.
Preferably, the attributes of an index in the index definition include a code, a name, the resource to which the index belongs, and a unit. The code in the index definition must be consistent with the code of the index in the data acquisition layer.
Preferably, the alarm rule comprises a rule name, the index used by the rule, a threshold for the index, and a processing suggestion for when the index is outside the threshold range. The alarm rules may define alarm levels: severe, primary, secondary, general and warning. An alarm threshold is set for each alarm level.
Preferably, the alarm view template comprises associated indexes, an index display form, a time range and an on-duty setting.
An associated index is an index associated with the index in the alarm rule. If the index in the alarm rule is a.1 and the resource to which it belongs is A, the associated indexes may be other indexes of resource A or indexes of other resources. The other resources are generally the associated resources of resource A, that is, resources that have an affiliation or connection relationship with resource A. An index can be set as an associated index if it has some business relationship with the rule's index and can help in investigating the alarm. When associated indexes are configured, candidates are taken first from the other indexes of resource A and from the indexes of the associated resources of resource A. The index display form is the form in which an associated index is displayed, such as a line graph or a histogram. The time range is the period over which the performance data of the associated indexes is displayed; it may be set to the hour before the alarm was generated, the two hours before, and so on. The on-duty setting refers to the on-duty personnel and their contact details within the time range.
Duty management is an indispensable module of the cloud center, through which the cloud center schedules its daily on-duty personnel. When the cloud center system generates an alarm or a fault cannot be repaired automatically, the person on duty that day can be found quickly through duty management and contacted so that the problem is solved as soon as possible, which keeps the cloud center system running healthily. The duty management of the invention mainly maintains the schedule of daily operation and maintenance on-duty personnel, who can be contacted quickly by telephone or e-mail.
A cloud center intelligent alarm processing method comprises: managing resources and configuring associated resources for them; defining indexes for each resource, the indexes being consistent with those of data acquisition; maintaining the daily on-duty personnel and their contact details through duty management; configuring alarm rules so that, when performance data is outside the threshold range, an alarm is triggered and an alarm processing suggestion is given; and dynamically generating an alarm view from an alarm view template, the view displaying performance charts of the indexes associated with the alarming index within a specified time range, showing whether those indexes generated alarms, identifying the on-duty personnel and their contact details within that time range, and giving processing suggestions.
Preferably, associated resources are configured for each resource, each index belongs to one resource, an alarm rule is set for the index, the alarm rule corresponds to an alarm view template, and an alarm view is dynamically generated when an alarm is triggered.
Preferably, each resource is configured with a resource type, and an associated resource is a resource that has an affiliation or connection relationship with the resource.
Compared with the prior art, the cloud center intelligent alarm processing method has the following outstanding beneficial effects: by associating resources, associating alarms and constructing an alarm view, the method helps to locate problems quickly when the cloud center system is abnormal, find the root cause of faults, improve operation and maintenance efficiency and ensure the long-term stable operation of the system, and it has good value for popularization and application.
Drawings
FIG. 1 is an architecture diagram of the cloud center intelligent alarm processing system according to the present invention;
FIG. 2 is a topology diagram of resource indexes and resource associations in the cloud center intelligent alarm processing system according to the present invention;
FIG. 3 is a structural relationship diagram of the cloud center intelligent alarm processing system according to the present invention.
Detailed Description
The cloud center intelligent alarm processing system and method of the present invention will be further described in detail with reference to the accompanying drawings and embodiments.
Examples
As shown in FIG. 1, the cloud center intelligent alarm processing system of the present invention includes resource management, index definition, duty management, alarm rules, an alarm view template, an alarm and an alarm view.
Resource management manages the resources, which include virtual machines, physical machines, switches, storage devices, virtual devices, middleware and application software. The attributes of a resource include a code, a name, a resource type and associated resources. Resources are either entered into the system manually or discovered and entered automatically. Each resource can have its own personalized attributes; the attributes are persisted in a database, and each resource can be listed and maintained separately. The associated resources of a resource are the resource to which it belongs and the resources to which it is connected. For example, if a virtual machine is located on a physical machine, the resource to which the virtual machine belongs is that physical machine, so the physical machine is an associated resource of the virtual machine. As another example, if a physical machine is connected to the network through a switch, the switch is one of the associated resources of the physical machine. One resource can have multiple associated resources, and associated resources are configured simply by configuring the associated resource type.
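The resource model described above can be sketched as a small data structure. This is a minimal illustration only: the class names `Resource` and `ResourceRegistry`, the field names, the in-memory dictionary standing in for the database, and the sample codes such as `vm-01` are all assumptions, not part of the disclosure.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Resource:
    """A managed resource with the four attributes named in the text."""
    code: str
    name: str
    resource_type: str
    # Codes of the resources this one belongs to or is connected to.
    associated_codes: List[str] = field(default_factory=list)
    # Per-resource "personalized" attributes; a plain dict is an assumption.
    extra: Dict[str, str] = field(default_factory=dict)


class ResourceRegistry:
    """In-memory stand-in (hypothetical) for the database holding resources."""

    def __init__(self) -> None:
        self._by_code: Dict[str, Resource] = {}

    def add(self, resource: Resource) -> None:
        self._by_code[resource.code] = resource

    def associated(self, code: str) -> List[Resource]:
        """Return the registered resources associated with `code`."""
        owner = self._by_code[code]
        return [self._by_code[c] for c in owner.associated_codes
                if c in self._by_code]


registry = ResourceRegistry()
registry.add(Resource("sw-01", "Switch 1", "switch"))
registry.add(Resource("pm-01", "Physical machine 1", "physical_machine", ["sw-01"]))
registry.add(Resource("vm-01", "Virtual machine 1", "virtual_machine", ["pm-01"]))

# The virtual machine belongs to the physical machine; the physical
# machine reaches the network through the switch.
vm_associated = [r.name for r in registry.associated("vm-01")]
pm_associated = [r.name for r in registry.associated("pm-01")]
print(vm_associated, pm_associated)
```

Storing associations as codes rather than object references keeps each resource independently maintainable, which matches the statement that every resource can be listed and maintained separately.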
Index definition defines the indexes. An index is a data acquisition item, such as CPU, memory or network traffic. As shown in FIG. 2, each resource may correspond to multiple indexes: resource A corresponds to indexes a.1, a.2 and a.3; resource B corresponds to indexes b.1 and b.2; resource C corresponds to indexes c.1 and c.2; resource D corresponds to index d.1. The basic attributes of an index definition are a code, a name, the resource to which the index belongs, and a unit. The code in the index definition must be consistent with the code of the index in the data acquisition layer.
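The consistency requirement between index definitions and the data acquisition layer could be checked as in the following sketch. The names `IndexDefinition` and `unmatched_codes`, the sample index names and units, and the set of collected codes are hypothetical and serve only to illustrate the rule.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class IndexDefinition:
    """An index with the four basic attributes named in the text."""
    code: str
    name: str
    resource_code: str
    unit: str


def unmatched_codes(definitions: Iterable[IndexDefinition],
                    collected_codes: Iterable[str]) -> List[str]:
    """Return defined index codes that the acquisition layer does not report."""
    defined = {d.code for d in definitions}
    return sorted(defined - set(collected_codes))


definitions = [
    IndexDefinition("a.1", "CPU usage", "A", "%"),
    IndexDefinition("a.2", "Memory usage", "A", "%"),
    IndexDefinition("b.1", "Network traffic", "B", "Mbit/s"),
]
# Codes assumed to arrive from the data acquisition layer.
collected = {"a.1", "b.1", "b.2"}

missing = unmatched_codes(definitions, collected)
print(missing)
```

Any code returned by the check identifies an index definition for which no performance data would ever be found, so alarm rules built on it could not fire.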
Duty management is used to schedule the daily on-duty personnel. It is an indispensable module of the cloud center, through which the cloud center schedules its daily on-duty personnel. When the cloud center system generates an alarm or a fault cannot be repaired automatically, the person on duty that day can be found quickly through duty management and contacted so that the problem is solved as soon as possible, which keeps the cloud center system running healthily. The duty management of the invention mainly maintains the schedule of daily operation and maintenance on-duty personnel, who can be contacted quickly by telephone or e-mail.
When an abnormality occurs, an alarm is triggered in real time according to the alarm rules. An alarm rule includes a rule name, the index used by the rule, a threshold for the index, and a processing suggestion for when the index is outside the threshold range. The alarm rules may define alarm levels: severe, primary, secondary, general and warning. An alarm threshold is set for each alarm level.
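A rule with one threshold per alarm level might be evaluated as sketched below. The text does not say how thresholds are compared, so this sketch assumes that larger values are worse and that each threshold is a lower bound for its level; the class name `AlarmRule`, the method `evaluate` and the numeric thresholds are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Alarm levels from the text, ordered from most to least severe.
LEVELS = ["severe", "primary", "secondary", "general", "warning"]


@dataclass
class AlarmRule:
    name: str
    index_code: str
    # One threshold per level; "value >= threshold" is an assumed comparison.
    thresholds: Dict[str, float]
    suggestion: str

    def evaluate(self, value: float) -> Optional[str]:
        """Return the most severe level whose threshold is reached, else None."""
        for level in LEVELS:
            limit = self.thresholds.get(level)
            if limit is not None and value >= limit:
                return level
        return None


rule = AlarmRule(
    name="wa.1",
    index_code="a.1",
    thresholds={"severe": 95.0, "primary": 90.0, "secondary": 85.0,
                "general": 80.0, "warning": 70.0},
    suggestion="ha.1",
)

level_high = rule.evaluate(91.5)   # reaches the "primary" threshold
level_none = rule.evaluate(60.0)   # within the normal range
print(level_high, level_none)
```

Checking levels from most to least severe means a single reading maps to exactly one level, which is what a color-coded alarm list needs.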
An alarm view is generated according to the alarm view template, the alarm view templates corresponding one-to-one to the alarm rules. The alarm view template comprises associated indexes, an index display form, a time range and an on-duty setting. An associated index is an index associated with the index in the alarm rule. If the index in the alarm rule is a.1 and the resource to which it belongs is A, the associated indexes may be other indexes of resource A or indexes of other resources. The other resources are generally the associated resources of resource A, that is, resources that have an affiliation or connection relationship with resource A. An index can be set as an associated index if it has some business relationship with the rule's index and can help in investigating the alarm. When associated indexes are configured, candidates are taken first from the other indexes of resource A and from the indexes of the associated resources of resource A. The index display form is the form in which an associated index is displayed, such as a line graph or a histogram. The time range is the period over which the performance data of the associated indexes is displayed; it may be set to the hour before the alarm was generated, the two hours before, and so on. The on-duty setting refers to the on-duty personnel and their contact details within the time range.
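The selection of candidate associated indexes can be illustrated with the resources and indexes of FIG. 2. The function name `candidate_associated_indexes` is hypothetical, and the association of resource A with resource B is an assumption made for the example, since the text does not list the associations drawn in the figure.

```python
from typing import Dict, List


def candidate_associated_indexes(rule_index: str,
                                 owner: str,
                                 indexes_by_resource: Dict[str, List[str]],
                                 associations: Dict[str, List[str]]) -> List[str]:
    """List the other indexes of the owner, then those of its associated resources."""
    candidates = [i for i in indexes_by_resource.get(owner, []) if i != rule_index]
    for resource in associations.get(owner, []):
        candidates.extend(indexes_by_resource.get(resource, []))
    return candidates


# Resources and indexes as listed for FIG. 2.
indexes_by_resource = {
    "A": ["a.1", "a.2", "a.3"],
    "B": ["b.1", "b.2"],
    "C": ["c.1", "c.2"],
    "D": ["d.1"],
}
# Assumed association for illustration: resource A is associated with B.
associations = {"A": ["B"]}

candidates = candidate_associated_indexes("a.1", "A", indexes_by_resource, associations)
print(candidates)
```

A template editor could present this list first and still allow an operator to add an index from an unrelated resource when a business relationship justifies it.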
According to the configured alarm rule, an alarm is triggered when an index is outside its threshold range. The alarm information is displayed in a color that depends on the alarm level set in the alarm rule, together with the threshold and the value of the index at the time the alarm was generated. The alarm view for the alarm information can be generated dynamically from the alarm rule associated with the alarm information and the alarm view template associated with that rule. Generating the alarm view draws on the performance data, the duty management system, the alarm rule and the alarm view template; the structural relationship between the alarm view and these modules is shown in FIG. 3.
The generated alarm view comprises: a time range, associated index charts, on-duty personnel and alarm processing suggestions.
(1) Time range. If the time range set in the alarm view template is the hour before alarm generation and the alarm was generated at 10:10:00 on day z of month y of year xxxx, the time range displayed in the alarm view is 09:10:00 to 10:10:00 on that day.
(2) Associated index charts. According to the associated indexes and display form set in the alarm view template, the performance data of each associated index within the time range indicated in (1) is read and a chart is generated dynamically. If an associated index has an alarm rule and generated an alarm within the time range indicated in (1), the alarm is marked in the chart.
(3) On-duty personnel. If the on-duty setting is enabled in the alarm view template, the on-duty personnel and their contact details for the time range indicated in (1) are read from the duty system.
(4) Alarm processing suggestions. Each alarm rule is configured with an alarm processing suggestion. The alarm view lists, item by item, the processing suggestion for the alarm itself and the processing suggestions for the associated indexes that generated alarms. For example, suppose the alarm rule configured for index a.1 is wa.1, with processing suggestion ha.1. The associated indexes in the alarm view template corresponding to wa.1 are a.2, b.1 and b.2, and the processing suggestions in their alarm rules are ha.2, hb.1 and hb.2 respectively. If, when wa.1 triggers an alarm, a.2 and b.1 have generated no alarm within the time range specified by the alarm view template of wa.1 while b.2 has, the alarm processing suggestions displayed are ha.1 and hb.2.
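Items (1) and (4) can be sketched together: computing the displayed time range and selecting which processing suggestions to show. The function names `view_window` and `suggestions_for_view` are hypothetical, and the concrete date is a placeholder for the unspecified "xxxx year y month z day" in the text.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Tuple

Window = Tuple[datetime, datetime]


def view_window(alarm_time: datetime, lookback: timedelta) -> Window:
    """Time range shown in the alarm view: the lookback period before the alarm."""
    return alarm_time - lookback, alarm_time


def suggestions_for_view(rule_suggestion: str,
                         associated: List[str],
                         suggestion_by_index: Dict[str, str],
                         alarm_times: Dict[str, List[datetime]],
                         window: Window) -> List[str]:
    """The rule's own suggestion, plus those of associated indexes that alarmed."""
    start, end = window
    shown = [rule_suggestion]
    for index_code in associated:
        if any(start <= t <= end for t in alarm_times.get(index_code, [])):
            shown.append(suggestion_by_index[index_code])
    return shown


# Placeholder date; only the 10:10:00 alarm time comes from the text.
alarm_time = datetime(2021, 1, 12, 10, 10, 0)
window = view_window(alarm_time, timedelta(hours=1))

# In the worked example only b.2 generated an alarm inside the window.
alarm_times = {"b.2": [datetime(2021, 1, 12, 9, 40, 0)]}
shown = suggestions_for_view(
    "ha.1",
    ["a.2", "b.1", "b.2"],
    {"a.2": "ha.2", "b.1": "hb.1", "b.2": "hb.2"},
    alarm_times,
    window,
)
print(window[0].time(), shown)
```

The same window would also bound the performance data read for the charts in item (2) and the duty roster lookup in item (3), so computing it once keeps the four parts of the view consistent.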
The cloud center intelligent alarm processing method comprises: managing resources and configuring associated resources for them; defining indexes for each resource, the indexes being consistent with those of data acquisition; maintaining the daily on-duty personnel and their contact details through duty management; configuring alarm rules so that, when performance data is outside the threshold range, an alarm is triggered and an alarm processing suggestion is given; and dynamically generating an alarm view from an alarm view template, the view displaying performance charts of the indexes associated with the alarming index within a specified time range, showing whether those indexes generated alarms, identifying the on-duty personnel and their contact details within that time range, and giving processing suggestions.
In the method, associated resources are configured for each resource, each index belongs to one resource, an alarm rule is set for the index, the alarm rule corresponds to an alarm view template, and an alarm view is dynamically generated when an alarm is triggered. Each resource is configured with a resource type, and an associated resource is a resource that has an affiliation or connection relationship with the resource.
The embodiments described above are merely preferred embodiments of the present invention; ordinary changes and substitutions made by those skilled in the art within the technical scope of the present invention fall within its scope of protection.
Claims (8)
1. A cloud center intelligent alarm processing system, characterized in that: the system comprises resource management, index definition, duty management, alarm rules, an alarm view template, an alarm and an alarm view, wherein the resource management is used for managing resources, and the resources comprise virtual machines, physical machines, switches, storage devices, virtual devices, middleware and application software; the index definition defines indexes, wherein the indexes are data acquisition items and comprise CPU, memory and network traffic; the duty management is used for scheduling daily on-duty personnel; when an abnormality occurs, an alarm is triggered in real time according to the alarm rules; an alarm view is generated according to the alarm view template, wherein the alarm view templates correspond one-to-one to the alarm rules; and an alarm is triggered when an index is outside the threshold range, and the alarm view is generated according to the alarm rule associated with the alarm information.
2. The cloud center intelligent alarm processing system according to claim 1, characterized in that: the attributes of a resource in the resource management include a code, a name, a resource type and associated resources.
3. The cloud center intelligent alarm processing system according to claim 2, characterized in that: the attributes of an index in the index definition include a code, a name, the resource to which the index belongs, and a unit.
4. The cloud center intelligent alarm processing system according to claim 3, characterized in that: the alarm rule comprises a rule name, the index used by the rule, a threshold for the index, and a processing suggestion for when the index is outside the threshold range.
5. The cloud center intelligent alarm processing system according to claim 4, characterized in that: the alarm view template comprises associated indexes, an index display form, a time range and an on-duty setting.
6. A cloud center intelligent alarm processing method, characterized in that: the method manages resources and configures associated resources for the resources; defines indexes for each resource, wherein the indexes are consistent with the indexes of data acquisition; maintains the daily on-duty personnel and their contact details through duty management; configures alarm rules so that, when performance data is outside the threshold range, an alarm is triggered and an alarm processing suggestion is given; and dynamically generates an alarm view through an alarm view template, displaying performance charts of the associated indexes of the index within a specified time range, showing whether alarms were generated, finding the on-duty personnel and their contact details within the specified time range, and giving processing suggestions.
7. The cloud center intelligent alarm processing method according to claim 6, characterized in that: associated resources are configured for each resource, each index belongs to one resource, an alarm rule is set for the index, the alarm rule corresponds to an alarm view template, and an alarm view is dynamically generated when an alarm is triggered.
8. The cloud center intelligent alarm processing method according to claim 7, characterized in that: each resource is configured with a resource type, and an associated resource is a resource that has an affiliation or connection relationship with the resource.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110036592.2A CN112866020A (en) | 2021-01-12 | 2021-01-12 | Cloud center intelligent alarm processing system and method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110036592.2A CN112866020A (en) | 2021-01-12 | 2021-01-12 | Cloud center intelligent alarm processing system and method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112866020A true CN112866020A (en) | 2021-05-28 |
Family
ID=76002885
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110036592.2A Pending CN112866020A (en) | 2021-01-12 | 2021-01-12 | Cloud center intelligent alarm processing system and method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112866020A (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103746849A (en) * | 2014-01-14 | 2014-04-23 | 浪潮电子信息产业股份有限公司 | IT (information technology) operation and maintenance management system based on mobile intelligent terminal |
CN104410535A (en) * | 2014-12-23 | 2015-03-11 | 浪潮电子信息产业股份有限公司 | Intelligent monitoring and alarming method for cloud resources |
CN108829558A (en) * | 2018-05-22 | 2018-11-16 | 郑州云海信息技术有限公司 | A kind of intelligent operation management method and system of data center's alarm |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108512689A (en) | Micro services business monitoring method and server | |
CN109284251A (en) | Blog management method, device, computer equipment and storage medium | |
CN104407964A (en) | Centralized monitoring system and method based on data center | |
CN107124298A (en) | Alert aggregation method and system | |
CN114253228B (en) | Industrial equipment object modeling method and device based on digital twin | |
CN109240876A (en) | Example monitoring method, computer readable storage medium and terminal device | |
CN112783901A (en) | Internet of things time sequence big data processing method based on Internet of things middleware | |
CN102571413B (en) | Method for resource management under cluster environment | |
CN112579558A (en) | Method, device, storage medium and equipment for displaying topological graph | |
US8850321B2 (en) | Cross-domain business service management | |
CN113590432A (en) | Database inspection method and device | |
CN114157679A (en) | Cloud-native-based distributed application monitoring method, device, equipment and medium | |
CN109858807A (en) | Enterprise operation monitoring method and system | |
CN112866020A (en) | Cloud center intelligent alarm processing system and method | |
CN112784129A (en) | Pump station equipment operation and maintenance data supervision platform | |
CN111833110A (en) | Customer life cycle positioning method and device, electronic equipment and storage medium | |
CN116136801B (en) | Cloud platform data processing method and device, electronic equipment and storage medium | |
CN115269554A (en) | Tree data management method, device, equipment and medium based on multi-service scene | |
CN110471373B (en) | Information processing method, program, and information processing apparatus | |
CN114860851A (en) | Data processing method, device, equipment and storage medium | |
CN113962656A (en) | Power grid data asset management method, system, equipment and storage medium | |
CN109189786B (en) | Method for periodically generating custom report form for network element management system | |
CN113326401A (en) | Method and system for generating field blood margin | |
CN113971500A (en) | Data subdivision management method and device and data management platform | |
CN111597179B (en) | Method and device for automatically cleaning data, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20210528 |