WO2020029407A1 - Alarm data management method and apparatus, and computer device and storage medium - Google Patents

Alarm data management method and apparatus, and computer device and storage medium

Info

Publication number
WO2020029407A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
alarm
preset
processing result
status
Prior art date
Application number
PCT/CN2018/108271
Other languages
French (fr)
Chinese (zh)
Inventor
李嘉勇
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2020029407A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/32 Monitoring with visual or acoustical indication of the functioning of the machine
    • G06F11/324 Display of status information
    • G06F11/327 Alarm or error message display

Definitions

  • the present application relates to the field of Internet technologies, and in particular, to a method, a device, a computer device, and a storage medium for managing alarm data.
  • server clusters are mostly managed by a management platform.
  • the processing and display of alarm data of existing servers usually have the following problems:
  • the alarm information displayed on the page is vague: for example, the page shows only which servers have failed and cannot clearly show the cause of the fault; no traceback function is provided for recovered alarms, so the specific time at which an alarm recovered cannot be clearly seen, and the recovered alarm data is not reused for further analysis. The existing methods for processing server alarm data therefore cannot meet users' needs, and a method for managing alarm data is needed to solve the above problems.
  • This application provides a method, a device, a computer device, and a storage medium for managing alarm data, and aims to provide timely and accurate alarm information.
  • This application provides a method for managing alarm data, which includes:
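  • periodically polling a server of a server cluster to collect status data of the server, where the status data includes identification bit information;
  • determining whether the status data is valid data;
  • if the status data is valid data, determining alarm data in the status data according to the identification bit information, and storing the alarm data in a first preset data table;
  • periodically polling the first preset data table to obtain the alarm data, processing the alarm data using a preset processing rule to obtain a data processing result, and saving the data processing result in a second preset data table;
  • periodically polling the second preset data table to obtain the data processing result, generating a pending event according to the data processing result, and sending the pending event to a terminal so that the terminal displays the pending event.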
  • This application provides a device for managing alarm data, which includes:
  • a data collection unit configured to periodically poll a server of a server cluster to collect status data of the server, where the status data includes identification bit information;
  • a data judging unit configured to judge whether the status data is valid data;
  • a data determining unit configured to determine alarm data in the status data according to the identification bit information if the status data is valid data, and store the alarm data in a first preset data table;
  • a data processing unit configured to periodically poll the first preset data table to obtain the alarm data, process the alarm data using a preset processing rule to obtain a data processing result, and save the data processing result in a second preset data table;
  • an event generating unit configured to periodically poll the second preset data table to obtain the data processing result, generate a pending event according to the data processing result, and send the pending event to a terminal so that the terminal displays the pending event.
  • the present application also provides a computer device including a memory, a processor, and a computer program stored on the memory and executable on the processor. When the processor executes the program, the steps of the method for managing alarm data according to any one of the above are implemented.
  • the application also provides a computer storage medium storing a computer program; when the computer program is executed by a processor, it causes the processor to execute the steps of the method for managing alarm data described in any of the embodiments provided in the application.
  • the embodiments of the present application provide a method, a device, a computer device, and a storage medium for managing alarm data.
  • the status data of the servers is collected by periodically polling the servers in the server cluster, and the status data includes identification bit information; when the status data is determined to be valid data, the alarm data in the status data is determined according to the identification bit information, and the alarm data is stored in a first preset data table; the first preset data table is polled periodically to obtain the alarm data, the alarm data is processed using a preset processing rule to obtain a data processing result, and the data processing result is saved in a second preset data table; the second preset data table is polled periodically to obtain the data processing result, a pending event is generated according to the data processing result, and the pending event is sent to a terminal so that the terminal displays the pending event.
  • This method not only provides timely and accurate alarms but also allows the alarm information to be analyzed retrospectively.
  • FIG. 1 is a schematic flowchart of a method for managing alarm data according to an embodiment of the present application;
  • FIG. 2 is a schematic flowchart of sub-steps of the method for managing alarm data in FIG. 1;
  • FIG. 3 is a schematic flowchart of sub-steps of a method for managing alarm data in FIG. 1;
  • FIG. 4 is a schematic flowchart of a method for managing alarm data according to another embodiment of the present application.
  • FIG. 5 is a schematic block diagram of a device for managing alarm data according to an embodiment of the present application.
  • FIG. 6 is a schematic block diagram of a device for managing alarm data according to another embodiment of the present application.
  • FIG. 7 is a schematic block diagram of a computer device according to an embodiment of the present application.
  • Embodiments of the present application provide a method, an apparatus, a computer device, and a storage medium for managing alarm data. This alarm data management method is applied to the server side of the centralized management platform.
  • the centralized management platform is a platform software system developed based on out-of-band devices.
  • the platform software system includes a client and a server.
  • the client is configured in the terminal.
  • the server is configured in the management server.
  • the management server cooperates with the terminal to achieve centralized management of the servers in a server cluster.
  • the out-of-band device may be, for example, a baseboard management controller (BMC).
  • a PC operation and maintenance automation platform, also known as an out-of-band management platform, can be developed based on the IPMI / REDFISH protocol.
  • REDFISH also supports data center power supply / cooling fields and network switches.
  • FIG. 1 is a schematic flowchart of a method for managing alarm data according to an embodiment of the present application.
  • the management method is applied to a management server, and the management server is configured with the server side of a centralized management platform. As shown in FIG. 1, the management method includes steps S101 to S105.
  • S101 Periodically poll a server of a server cluster to collect status data of the server, where the status data includes identification bit information.
  • the polling interval can be set according to the actual situation, for example according to the processing capacity of the management server; the specific duration is not limited here and may be, for example, 2 minutes or 4 minutes.
  • the status data of the server refers to the status data of hardware components of the server, and the hardware components include hard disks, memory, power supplies, fans, and the like.
  • the status data is collected by a built-in detection tool in the server.
  • the status data includes identification bit information, and the identification bit information includes a normal identifier, an abnormal identifier, or an alarm identifier.
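The periodic collection described for S101 can be sketched as follows. This is a minimal illustration only: the `StatusData` fields, the string encoding of the identification bits, and the `fake_collect` stand-in for the server's built-in detection tool are all assumptions, not details given in the application.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable, List

# Identification bit values named in the text; the string encoding is an assumption.
NORMAL, ABNORMAL, ALARM = "normal", "abnormal", "alarm"


@dataclass
class StatusData:
    host_id: str     # e.g. the host SN number
    component: str   # e.g. "hard_disk", "memory", "power", "fan"
    value: object    # raw reading; None models a failed collection
    flag: str        # identification bit information


def poll_once(hosts: Iterable[str],
              collect: Callable[[str], List[StatusData]]) -> List[StatusData]:
    """Collect status data from every server in the cluster once."""
    collected: List[StatusData] = []
    for host in hosts:
        collected.extend(collect(host))
    return collected


def poll_forever(hosts, collect, handle, interval_s=120.0,
                 rounds=None, sleep=time.sleep):
    """Repeat poll_once every interval_s seconds (120 s mirrors the 2-minute example)."""
    done = 0
    while rounds is None or done < rounds:
        handle(poll_once(hosts, collect))
        done += 1
        if rounds is None or done < rounds:
            sleep(interval_s)


def fake_collect(host: str) -> List[StatusData]:
    # Hypothetical stand-in for the server's built-in detection tool.
    return [StatusData(host, "fan", 4200, NORMAL),
            StatusData(host, "hard_disk", "degraded", ALARM)]
```

Passing `rounds` and `sleep` as parameters is a design choice made here so that the loop can be exercised without waiting for real intervals.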
  • S102 Determine whether the status data is valid data.
  • Specifically, the method includes: obtaining a data value corresponding to the status data; and determining whether the status data is valid data according to the data value. That is, it is detected whether the data value of the status data is a valid value, for example by detecting whether the data value is a null value. If the data value is a null value, the status data is determined to be invalid data; otherwise the status data is determined to be valid data. If the status data is valid data, step S103 is performed. If the status data is invalid data, the reason corresponding to the invalid data is obtained; the reasons include network disconnection, an incorrect authentication password, or a script error.
  • the cause of invalid data is identified as follows: when status data is obtained, the network is checked first; if the network is down, the data is invalid and is marked as a network failure. If the network is normal, the corresponding password is entered for authentication; if authentication fails, the data is invalid and is marked as an authentication password error. If authentication succeeds, a script is called to obtain the various hardware information values; if the script reports an error, the data is invalid and is marked as a script error.
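The validity check and the order in which invalid-data causes are identified might look like the sketch below. The three callables (`network_ok`, `authenticate`, `run_script`) are hypothetical hooks, and treating a null script result as a script error is an assumption made for this illustration.

```python
from enum import Enum
from typing import Callable, Optional, Tuple


class InvalidReason(Enum):
    NETWORK_DOWN = "network disconnection"
    AUTH_FAILED = "incorrect authentication password"
    SCRIPT_ERROR = "script error"


def collect_with_reason(
    network_ok: Callable[[], bool],
    authenticate: Callable[[], bool],
    run_script: Callable[[], object],
) -> Tuple[Optional[object], Optional[InvalidReason]]:
    """Return (value, None) for valid data or (None, reason) for invalid data."""
    if not network_ok():            # checked first
        return None, InvalidReason.NETWORK_DOWN
    if not authenticate():          # checked only when the network is up
        return None, InvalidReason.AUTH_FAILED
    try:
        value = run_script()        # reads the hardware information values
    except Exception:
        return None, InvalidReason.SCRIPT_ERROR
    if value is None:
        # Assumption: a null value returned by the script counts as a script error.
        return None, InvalidReason.SCRIPT_ERROR
    return value, None


def is_valid(value: object) -> bool:
    """Status data is valid when its data value is not null."""
    return value is not None
```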
  • S103 Determine alarm data in the status data according to the identification bit information, and save the alarm data in a first preset data table.
  • the valid data includes identification bit information, which specifically includes identifiers such as normal, abnormal, and alarm. If the identification bit information is an abnormal identifier or an alarm identifier, alarm data exists in the valid status data, and the alarm data corresponding to the identification bit information is saved into a first preset data table. The first preset data table is stored in a preset database; the preset database is the database corresponding to the management server, on which the centralized management platform is configured.
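Selecting alarm data by its identification bit and saving it to the first preset data table could be sketched with SQLite as below. The table name `alarm_raw`, its columns, and the use of SQLite are assumptions; the application does not specify a database or schema.

```python
import sqlite3

ALARM_FLAGS = {"abnormal", "alarm"}  # flags that mark a record as alarm data


def init_first_table(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) first preset data table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS alarm_raw ("
        "id INTEGER PRIMARY KEY, host_id TEXT, component TEXT, "
        "flag TEXT, collected_at TEXT)"
    )


def store_alarms(conn: sqlite3.Connection, records) -> int:
    """Insert only records whose flag marks an alarm; return how many were stored."""
    alarms = [r for r in records if r["flag"] in ALARM_FLAGS]
    conn.executemany(
        "INSERT INTO alarm_raw (host_id, component, flag, collected_at) "
        "VALUES (:host_id, :component, :flag, :collected_at)",
        alarms,
    )
    conn.commit()
    return len(alarms)
```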
  • S104 Periodically poll the first preset data table to obtain the alarm data, process the alarm data using a preset processing rule to obtain a data processing result, and save the data processing result to a second preset data table.
  • the polling interval here can also be set according to the actual situation, and the first preset data table is polled at the set interval to obtain the alarm data.
  • the preset processing rule is used to process the acquired alarm data to obtain a data processing result, and the data processing result is stored in a second preset data table, which is also stored in the preset database.
  • the preset database is the database corresponding to the management server, and the use of the first preset data table and the second preset data table makes it convenient to process and analyze the alarm data.
  • processing the alarm data using a preset processing rule to obtain a data processing result specifically includes sub-steps S104a to S104d.
  • S104a Obtain the component category and host identifier of the server, and classify the alarm data according to the component category to obtain component alarm data;
  • S104b Obtain the host information of the server according to the host identifier;
  • S104c Perform repeated-alarm processing on the component alarm data according to the component category corresponding to the component alarm data;
  • S104d Generate a data processing result based on the host information and the component alarm data that has undergone the repeated-alarm processing.
  • the component category is a component name corresponding to different hardware components of the server, such as a hard disk, a power supply, a memory, a disk array, and a fan.
  • the host identifier is the host SN number; the host name can also be used.
  • the alarm data is classified according to the component category into, for example, hard disk alarm data, power alarm data, and memory alarm data. The host information of the server is looked up according to the host identifier; the host information includes the host name, host SN number, host brand, KVM IP, host manufacturer, host model, and the like. Performing repeated-alarm processing on the component alarm data according to its component category means removing repeated alarms for the same component.
  • for example, if alarm data for a hard disk component is collected and the alarm occurs at several different times, the repeated alarms are merged into one piece of component alarm data. The data processing result is then generated from the host information and the component alarm data that has undergone repeated-alarm processing. Specifically, an alarm data record may be generated from the component alarm data and the host information, and this alarm data record is the data processing result.
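Sub-steps S104a to S104d can be sketched as a single function. The dictionary layout, the choice to keep the earliest occurrence of a repeated alarm, and the `alarm_count` field are assumptions for illustration; the text only states that repeated alarms of the same component are removed.

```python
from collections import defaultdict
from typing import Dict, List


def process_alarms(alarms: List[dict], host_info: Dict[str, dict]) -> List[dict]:
    """Classify by component, drop repeated alarms, and attach host information."""
    # S104a: group alarm data by component category.
    by_component: Dict[str, List[dict]] = defaultdict(list)
    for alarm in alarms:
        by_component[alarm["component"]].append(alarm)

    results = []
    for component, items in by_component.items():
        # S104c: keep one record per host and component, counting the repeats.
        merged: Dict[str, dict] = {}
        for item in sorted(items, key=lambda a: a["collected_at"]):
            entry = merged.setdefault(
                item["host_id"],
                {"host_id": item["host_id"], "component": component,
                 "first_seen": item["collected_at"], "alarm_count": 0},
            )
            entry["alarm_count"] += 1
        # S104b + S104d: look up host information and build the result record.
        for host_id, entry in merged.items():
            results.append({**entry, "host": host_info.get(host_id, {})})
    return results
```

Each returned record corresponds to one row that would be saved to the second preset data table.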
  • S105 Periodically poll the second preset data table to obtain the data processing result, generate a pending event according to the data processing result, and send the pending event to a terminal so that the terminal displays the pending event.
  • the polling interval can be set according to the actual situation and is not limited here; it may be, for example, 2 minutes or 4 minutes. The second preset data table is polled periodically to obtain the data processing result from it, and a pending event is generated according to the data processing result; the pending event includes the host information, the number of alarms, the alarm components, the pending processing time, and the like.
  • the pending events are posted on the centralized management platform and displayed in a sequence corresponding to their generation time, so that management personnel can manage the alarm data by processing the pending events.
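Building pending events from data processing results and ordering them by generation time might be sketched as follows. The `PendingEvent` fields and the oldest-first ordering are assumptions; the text says only that events are displayed in a sequence corresponding to their generation time.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass(order=True)
class PendingEvent:
    created_at: datetime                    # generation time, the only sort key
    host_name: str = field(compare=False)
    component: str = field(compare=False)
    alarm_count: int = field(compare=False)


def build_event(result: dict, created_at: datetime) -> PendingEvent:
    """Turn one data processing result into a pending event."""
    return PendingEvent(
        created_at=created_at,
        host_name=result["host"].get("name", "unknown"),
        component=result["component"],
        alarm_count=result["alarm_count"],
    )


def display_order(events: List[PendingEvent]) -> List[PendingEvent]:
    """Order events by generation time, oldest first (an assumed direction)."""
    return sorted(events)
```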
  • step S105 includes sub-steps S105a to S105d.
  • S105a Generate a pending event according to the data processing result, where the pending event includes an ignore control, and the ignore control is used to trigger display of ignore times for the user to select;
  • S105b Send the pending event to the terminal so that the terminal displays the pending event through the centralized management platform;
  • S105c Receive a user-selected ignore time sent by the terminal;
  • S105d Modify the identification bit information of the alarm data in the first preset data table according to the ignore time.
  • a pending event is generated according to the data processing result; the pending event includes the host information together with the number of alarms and the alarm components generated from the component alarm data, and it also includes an ignore control.
  • when the ignore control is triggered, preset ignore times are displayed for the user to select, such as ignore for 7 days or ignore for 30 days.
  • modifying the identification bit information in the status data according to the ignore time specifically means replacing the alarm identifier with the normal identifier for the selected period, for example 30 days, and then repeating the step of determining alarm data in the status data according to the identification bit information.
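One way to model the ignore time is to store an expiry alongside the record and suppress the alarm flag until it passes. This is a sketch under assumptions: the `AlarmRecord` class, the `ignored_until` field, and the restore-on-expiry behaviour are illustrative and not taken from the application.

```python
from datetime import datetime, timedelta
from typing import Optional


class AlarmRecord:
    """Hypothetical in-memory stand-in for a row of the first preset data table."""

    def __init__(self, flag: str):
        self.flag = flag
        self.ignored_until: Optional[datetime] = None


def ignore(record: AlarmRecord, days: int, now: datetime) -> None:
    """Replace the alarm identifier with the normal identifier for `days` days."""
    record.flag = "normal"
    record.ignored_until = now + timedelta(days=days)


def effective_flag(record: AlarmRecord, fresh_flag: str, now: datetime) -> str:
    """Flag to use on the next poll: suppressed while the ignore window is open."""
    if record.ignored_until is not None and now < record.ignored_until:
        return "normal"
    # Window expired or never set: the freshly collected flag applies again.
    record.ignored_until = None
    record.flag = fresh_flag
    return fresh_flag
```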
  • the method further includes: judging whether the alarm data is a disappearing alarm according to the identification bit information in the alarm data. If the alarm data is a disappearing alarm, the alarm data is archived and classified according to the host information of the server corresponding to the alarm data, and the archived classification result is sent to the terminal for display through a report page of the centralized management platform. Specifically, if the pending event has been processed, for example because the user selected an ignore time, the identification bit information of the alarm data in the first preset data table is modified according to the processing result, and whether the alarm data is a disappearing alarm is then determined from the modified identification bit information. If the alarm data is a disappearing alarm, the alarm data is classified according to the brand, component, location, or use of the server, and different reports are used to display the different archive classification results.
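Archiving disappearing alarms by one attribute of the server can be sketched as a simple grouping. Treating a flag of `"normal"` as the marker of a disappearing alarm is an assumption, as is the record layout with a nested `host` dictionary.

```python
from collections import defaultdict
from typing import Dict, List


def archive_recovered(alarms: List[dict], key: str = "brand") -> Dict[str, List[dict]]:
    """Group alarms whose flag has returned to normal by one attribute.

    The attribute is looked up on the alarm itself first (e.g. "component"),
    then in its host information (e.g. "brand").
    """
    report: Dict[str, List[dict]] = defaultdict(list)
    for alarm in alarms:
        if alarm["flag"] != "normal":   # still active, so not archived
            continue
        group = alarm.get(key) or alarm["host"].get(key, "unknown")
        report[group].append(alarm)
    return dict(report)
```

Calling the function once per attribute (brand, component, and so on) yields one grouping per report.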
  • the method of the above embodiment polls the servers of the server cluster periodically to collect the status data of the servers, where the status data includes identification bit information; when the status data is determined to be valid data, the alarm data in the status data is determined according to the identification bit information and stored in a first preset data table; the first preset data table is polled periodically to obtain the alarm data, the alarm data is processed using a preset processing rule to obtain a data processing result, and the data processing result is saved in a second preset data table; the second preset data table is polled periodically to obtain the data processing result, a pending event is generated based on the data processing result, and the pending event is sent to a terminal so that the terminal displays the pending event.
  • This method not only provides timely and accurate alarms but also allows the alarm information to be analyzed retrospectively.
  • FIG. 4 is a schematic flowchart of a method for managing alarm data according to another embodiment of the present application.
  • the management method is applied to a management server, and the management server is configured with the server side of a centralized management platform. As shown in FIG. 4, the management method includes steps S201 to S207.
  • S201 Periodically poll a server in a server cluster to collect status data of the server, where the status data includes identification bit information.
  • the state data of the server refers to the state data of hardware components of the server, and the hardware components include hard disks, memory, power supplies, fans, and the like.
  • the status data is collected by a built-in detection tool in the server.
  • the status data includes identification bit information, and the identification bit information includes a normal identifier, an abnormal identifier, or an alarm identifier.
  • S202 Determine whether the status data is valid data according to the data value of the status data. It is detected whether the data value corresponding to the status data is a null value: if it is not a null value, the status data is determined to be valid data; if it is a null value, the status data is determined to be invalid data. If the status data is valid data, step S203 is performed; if the status data is invalid data, step S205 is performed.
  • S203 Determine alarm data in the status data according to the identification bit information, and save the alarm data in a first preset data table.
  • the valid data includes identification bit information, which specifically includes identifiers such as normal, abnormal, and alarm. If the identification bit information is an abnormal identifier or an alarm identifier, alarm data exists in the valid status data, and the alarm data corresponding to the identification bit information is saved into a first preset data table. The first preset data table is stored in a preset database; the preset database is the database corresponding to the management server, on which the centralized management platform is configured.
  • S204 Periodically poll the first preset data table to obtain the alarm data, process the alarm data using a preset processing rule to obtain a data processing result, and save the data processing result to a second preset data table.
  • the first preset data table is polled periodically at a set interval to obtain the alarm data.
  • the preset processing rule is used to process the acquired alarm data to obtain a data processing result, and the data processing result is stored in a second preset data table, which is also stored in the preset database.
  • the preset database is the database corresponding to the management server, and the use of the first preset data table and the second preset data table makes it convenient to process and analyze the alarm data.
  • S205 Obtain the invalid reason corresponding to the invalid status data. Specifically, time information corresponding to the moment at which the status data became invalid is obtained; log information corresponding to the time information is obtained from the server; and the invalid reason corresponding to the invalid data is obtained from the log information. Because the server's log information records the reasons for invalid data, the reason can be read from it; the reasons include a disconnected network, an incorrect authentication password, or a script error.
  • S206 Save the status data and the invalid reason as the data processing result in the second preset data table, so that a pending event can be generated and sent to the terminal for display and management personnel can handle the event in a timely manner.
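Looking up the invalid reason in the server log by time, and pairing it with the status data as a data processing result, could be sketched as below. The log format (timestamp and message pairs), the keyword matching, and the one-minute window are all hypothetical; the application does not describe how the log is structured.

```python
from datetime import datetime, timedelta
from typing import Iterable, Optional, Tuple

# Hypothetical keywords mapped to the three reasons named in the text.
REASON_KEYWORDS = {
    "network": "network disconnected",
    "auth": "authentication password error",
    "script": "script error",
}


def find_invalid_reason(
    log_lines: Iterable[Tuple[datetime, str]],
    invalid_at: datetime,
    window: timedelta = timedelta(minutes=1),
) -> Optional[str]:
    """Return the reason from the first log entry within `window` of invalid_at."""
    for timestamp, message in log_lines:
        if abs(timestamp - invalid_at) <= window:
            lowered = message.lower()
            for keyword, reason in REASON_KEYWORDS.items():
                if keyword in lowered:
                    return reason
    return None


def build_invalid_result(status: dict, log_lines, invalid_at: datetime) -> dict:
    """Pair the invalid status data with its reason, ready for the second table."""
    reason = find_invalid_reason(log_lines, invalid_at)
    return {"status": status, "invalid_reason": reason or "unknown"}
```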
  • S207 Periodically poll the second preset data table at the set interval to obtain the data processing result, generate a pending event according to the data processing result, and send the pending event to the terminal.
  • the terminal displays the pending event through the centralized management platform, so that management personnel can act on the pending event.
  • the method of the above embodiment processes the collected status data to obtain valid data, confirms the alarm data in the valid status data, saves the alarm data in a first preset data table, and then saves the processing result to a second preset data table, on which convergence, deduplication, and archiving are performed. Pending events are generated according to the processing results and published on the centralized management platform.
  • the centralized management platform thus realizes the management of alarm data, enables accurate analysis of fault data, and facilitates statistics on fault data.
  • FIG. 5 is a schematic block diagram of a device for managing alarm data according to an embodiment of the present application.
  • the present application further provides an alarm data management device.
  • the device for managing alarm data includes a unit for performing management of the above-mentioned alarm data, and the device may be configured in a server.
  • the alarm data management device 400 includes a data collection unit 401, a data judging unit 402, a data determining unit 403, a data processing unit 404, and an event generating unit 405.
  • a data collection unit 401 is configured to periodically poll a server of a server cluster to collect status data of the server, where the status data includes identification bit information.
  • the data judging unit 402 is configured to judge whether the status data is valid data.
  • a data determining unit 403 is configured to determine alarm data in the status data according to the identification bit information, and store the alarm data in a first preset data table.
  • a data processing unit 404 is configured to periodically poll the first preset data table to obtain the alarm data, process the alarm data using a preset processing rule to obtain a data processing result, and save the data processing result in a second preset data table.
  • the data processing unit 404 includes: an identifier acquisition subunit 4041, an information acquisition subunit 4042, a repetition processing subunit 4043, and a result generation subunit 4044.
  • the identifier acquisition subunit 4041 is configured to acquire a component category and a host identifier of the server, and to classify the alarm data according to the component category to obtain component alarm data;
  • the information acquisition subunit 4042 is configured to obtain the host information of the server according to the host identifier;
  • the repetition processing subunit 4043 is configured to perform repeated-alarm processing on the component alarm data according to the component category corresponding to the component alarm data;
  • the result generation subunit 4044 is configured to generate a data processing result according to the host information and the component alarm data that has undergone the repeated-alarm processing.
  • an event generating unit 405 is configured to periodically poll the second preset data table to obtain the data processing result, generate a pending event according to the data processing result, and send the pending event to a terminal so that the terminal displays the pending event.
  • the event generating unit 405 includes: an event generating sub-unit 4051, an event display sub-unit 4052, a time receiving sub-unit 4053, and an identification modification sub-unit 4054.
  • the event generating subunit 4051 is configured to generate a pending event according to the data processing result, where the pending event includes an ignore control, and the ignore control is used to trigger the display of ignore times for the user to select;
  • the event display subunit 4052 is configured to send the pending event to the terminal so that the terminal displays the pending event through the centralized management platform;
  • the time receiving subunit 4053 is configured to receive a user-selected ignore time sent by the terminal;
  • the identification modification subunit 4054 is configured to modify the identification bit information of the alarm data in the first preset data table according to the ignore time.
  • the event generating unit 405 is further configured to: determine whether the alarm data is a disappearing alarm according to the identification bit information in the alarm data; if the alarm data is a disappearing alarm, archive and classify the alarm data according to the host information of the server corresponding to the alarm data; and send the archived classification results to the terminal for display through a report page of the centralized management platform.
  • FIG. 6 is a schematic block diagram of an alarm data management apparatus according to another embodiment of the present application.
  • the present application further provides an alarm data management device.
  • the device for managing alarm data includes a unit for performing management of the above-mentioned alarm data, and the device may be configured in a server.
  • the alarm data management device 500 includes a data collection unit 501, a data judging unit 502, a data determining unit 503, a data processing unit 504, a cause obtaining unit 505, a result saving unit 506, and an event generating unit 507.
  • a data collection unit 501 is configured to periodically poll a server of a server cluster to collect status data of the server, where the status data includes identification bit information.
  • the data judging unit 502 is configured to judge whether the status data is valid data.
  • a data determining unit 503 is configured to determine alarm data in the status data according to the identification bit information, and store the alarm data in a first preset data table.
  • a data processing unit 504 is configured to periodically poll the first preset data table to obtain the alarm data, process the alarm data using a preset processing rule to obtain a data processing result, and save the data processing result in a second preset data table.
  • the cause obtaining unit 505 is configured to obtain an invalid cause corresponding to the status data being invalid data.
  • a result saving unit 506 is configured to save the status data and the invalidation cause as the data processing result in the second preset data table.
  • an event generating unit 507 is configured to periodically poll the second preset data table to obtain the data processing result, generate a pending event according to the data processing result, and send the pending event to a terminal so that the terminal displays the pending event.
  • the above apparatus may be implemented in the form of a computer program, and the computer program may be run on a computer device as shown in FIG. 7.
  • FIG. 7 is a schematic block diagram of a computer device according to an embodiment of the present application.
  • the computer device 700 may be a server.
  • the computer device 700 includes a processor 720, a memory, and a network interface 750 connected through a system bus 710.
  • the memory may include a non-volatile storage medium 730 and an internal memory 740.
  • the non-volatile storage medium 730 can store an operating system 731 and a computer program 732.
  • when the computer program 732 is executed, it can cause the processor 720 to execute any of the methods for managing alarm data.
  • the processor 720 is configured to provide computing and control capabilities to support the operation of the entire computer device 700.
  • the internal memory 740 provides an environment for running the computer program 732 in the non-volatile storage medium 730.
  • when the computer program 732 is executed by the processor 720, the processor 720 can execute any of the methods for managing alarm data.
  • the network interface 750 is used for network communication, such as sending assigned tasks.
  • the structure shown in FIG. 7 is only a block diagram of the part of the structure related to the solution of the present application and does not constitute a limitation on the computer device 700 to which the solution is applied; a specific computer device 700 may include more or fewer components than shown in the figure, combine certain components, or have a different component arrangement.
  • the processor 720 is configured to run a program code stored in a memory to implement the process steps of the embodiments of the foregoing methods.
  • the processor 720 may be a central processing unit (CPU), or it may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like.
  • the general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
  • FIG. 7 does not constitute a limitation on the computer device 700, which may include more or fewer components than shown in the figure, combine some components, or have a different component layout.
  • the computer program can be stored in a storage medium, which is a computer-readable storage medium.
  • the computer program may be stored in a storage medium of a computer system and executed by at least one processor in the computer system, so as to implement the process steps of the embodiments including the foregoing methods.
  • the computer-readable storage medium may be any medium that can store program code, such as a USB flash drive (U disk), a removable hard disk, a read-only memory (ROM), a magnetic disk, or an optical disk.
  • the disclosed apparatus and method for managing alarm data may be implemented in other ways.
  • the embodiment of the device for managing alarm data described above is merely exemplary.
  • the division of each unit is only a logical function division, and there may be another division manner in actual implementation.
  • multiple units or components can be combined or integrated into another system, or some features can be ignored or not implemented.
  • the units in the apparatus of the embodiment of the present application may be combined, divided, and deleted according to actual needs.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist separately physically, or two or more units may be integrated into one unit.
  • the above integrated unit may be implemented in the form of hardware or in the form of software functional unit.
  • if the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • the technical solution of this application, in essence, or the part of it that contributes to the existing technology, or all or part of the technical solution, may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a terminal, or a network device, etc.) to perform all or part of the steps of the method described in the embodiments of the present application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)
  • Alarm Systems (AREA)

Abstract

An alarm data management method and apparatus, and a computer device and a storage medium. The method comprises: regularly polling servers in a server cluster to collect state data of the servers, wherein the state data comprises flag bit information (S101); determining whether the state data is valid data (S102); determining alarm data in the state data according to the flag bit information, and storing the alarm data in a first preset data table (S103); regularly polling the first preset data table to acquire the alarm data, processing the alarm data by means of a preset processing rule to obtain a data processing result, and storing the data processing result in a second preset data table (S104); and regularly polling the second preset data table to acquire the data processing result, generating an event to be processed according to the data processing result, and sending the event to be processed to a terminal, so that the terminal displays the event to be processed (S105).

Description

Alarm data management method and apparatus, computer device, and storage medium
This application claims priority to Chinese patent application No. 201810897093.0, filed with the Chinese Patent Office on August 8, 2018 and entitled "Management Method, Device, Computer Equipment, and Storage Medium for Alarm Data", the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of Internet technologies, and in particular to a method, an apparatus, a computer device, and a storage medium for managing alarm data.
Background
At present, with the development of Internet technology, server clusters are mostly managed centrally through a management platform. However, the existing processing and display of server alarm data usually has the following problems: the alarm information shown on the page is vague, for example showing only which servers have failed without clearly showing the cause of the failure; no traceback function is provided for recovered alarms, so the specific time at which an alarm recovered cannot be seen clearly; and recovered alarm data is not put to secondary use. The existing ways of processing server alarm data therefore cannot meet users' needs, and it is necessary to provide a method for managing alarm data that solves the above problems.
Summary
The present application provides a method, an apparatus, a computer device, and a storage medium for managing alarm data, with the aim of providing timely and accurate alarm information.
The present application provides a method for managing alarm data, which includes:
periodically polling the servers of a server cluster to collect status data of the servers, the status data including identification bit information;
determining whether the status data is valid data;
if the status data is valid data, determining alarm data in the status data according to the identification bit information, and storing the alarm data in a first preset data table;
periodically polling the first preset data table to obtain the alarm data, processing the alarm data using a preset processing rule to obtain a data processing result, and storing the data processing result in a second preset data table;
periodically polling the second preset data table to obtain the data processing result, generating a pending event according to the data processing result, and sending the pending event to a terminal so that the terminal displays the pending event.
The present application provides an apparatus for managing alarm data, which includes:
a data collection unit, configured to periodically poll the servers of a server cluster to collect status data of the servers, the status data including identification bit information;
a data judging unit, configured to determine whether the status data is valid data;
a data determining unit, configured to, if the status data is valid data, determine alarm data in the status data according to the identification bit information and store the alarm data in a first preset data table;
a data processing unit, configured to periodically poll the first preset data table to obtain the alarm data, process the alarm data using a preset processing rule to obtain a data processing result, and store the data processing result in a second preset data table;
an event generating unit, configured to periodically poll the second preset data table to obtain the data processing result, generate a pending event according to the data processing result, and send the pending event to a terminal so that the terminal displays the pending event.
The present application also provides a computer device including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor, when executing the program, implements the steps of any of the methods for managing alarm data provided in the present application.
The present application also provides a computer storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of the method for managing alarm data described in any embodiment provided in the present application.
The embodiments of the present application provide a method, an apparatus, a computer device, and a storage medium for managing alarm data. The servers of a server cluster are polled periodically to collect status data of the servers, the status data including identification bit information. When the status data is determined to be valid data, alarm data in the status data is determined according to the identification bit information and stored in a first preset data table. The first preset data table is polled periodically to obtain the alarm data, the alarm data is processed using a preset processing rule to obtain a data processing result, and the data processing result is stored in a second preset data table. The second preset data table is polled periodically to obtain the data processing result, a pending event is generated according to the data processing result, and the pending event is sent to a terminal so that the terminal displays it. The method not only provides timely and accurate alarms, but also allows alarm information to be traced back and analyzed.
Brief Description of the Drawings
To explain the technical solutions of the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show some embodiments of the present application; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic flowchart of a method for managing alarm data according to an embodiment of the present application;
FIG. 2 is a schematic flowchart of sub-steps of the method for managing alarm data in FIG. 1;
FIG. 3 is a schematic flowchart of sub-steps of the method for managing alarm data in FIG. 1;
FIG. 4 is a schematic flowchart of a method for managing alarm data according to another embodiment of the present application;
FIG. 5 is a schematic block diagram of an apparatus for managing alarm data according to an embodiment of the present application;
FIG. 6 is a schematic block diagram of an apparatus for managing alarm data according to another embodiment of the present application;
FIG. 7 is a schematic block diagram of a computer device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in the embodiments of the present application. The described embodiments are some, but not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without creative effort fall within the protection scope of the present application.
The embodiments of the present application provide a method, an apparatus, a computer device, and a storage medium for managing alarm data. The management method is applied to the server side of a centralized management platform.
The centralized management platform is a platform software system developed on the basis of out-of-band devices. The platform software system includes a client and a server side: the client is deployed on a terminal, the server side is deployed on a management server, and the management server and the terminal cooperate to manage the servers of a server cluster centrally. The out-of-band device may be, for example, a BMC (Baseboard Management Controller). When the out-of-band device is installed on a server, a PC (Personal Computer) hardware operation and maintenance automation platform, also called an out-of-band management platform, can be developed on the basis of the IPMI/REDFISH protocols. Besides servers, REDFISH also supports data center power supply and cooling equipment and network switches. It adopts the RESTful API industry standard for infrastructure and uses the HTTPS protocol and the JSON data format, which makes it easier to integrate with DevOps tools. The IPMI and REDFISH industry standards make it possible to develop a set of tools that obtain the physical status data of PC hardware and also manage that hardware remotely. If this data and these management functions are then centralized, a PC hardware operation and maintenance management platform, namely the centralized management platform, can be built.
Please refer to FIG. 1, which is a schematic flowchart of a method for managing alarm data according to an embodiment of the present application. The management method is applied to a management server on which the server side of the centralized management platform is deployed. As shown in FIG. 1, the management method includes steps S101 to S105.
S101. Periodically poll the servers of a server cluster to collect status data of the servers, where the status data includes identification bit information.
In this embodiment, the polling interval can be set according to the actual situation, for example according to the processing capacity of the management server. The specific interval is not limited here and may be, for example, 2 minutes or 4 minutes. The status data of a server refers to the status data of the server's hardware components, which include hard disks, memory, power supplies, fans, and the like. The status data is collected by a detection tool built into the server. The status data includes identification bit information, and the identification bit information includes a normal identifier, an abnormal identifier, an alarm identifier, or the like.
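By way of illustration only, one polling round of step S101 might be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of the present application: the `StatusRecord` fields, the flag strings, and the `collect` callback (which stands in for the server's built-in detection tool) are all hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional

# Hypothetical flag values standing in for the normal / abnormal / alarm identifiers.
FLAG_NORMAL, FLAG_ABNORMAL, FLAG_ALARM = "normal", "abnormal", "alarm"


@dataclass
class StatusRecord:
    host_sn: str            # host serial number (assumed identifier)
    component: str          # e.g. "disk", "memory", "power", "fan"
    value: Optional[str]    # raw reading; None when collection failed
    flag: str               # one of the FLAG_* values


def poll_cluster(servers: Iterable[str],
                 collect: Callable[[str], List[StatusRecord]]) -> List[StatusRecord]:
    """Run one polling round: query every server and gather its status records."""
    records: List[StatusRecord] = []
    for host in servers:
        records.extend(collect(host))
    return records
```

A scheduler would call `poll_cluster` once per polling interval (for example every 2 or 4 minutes); the scheduling itself is left out of the sketch.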
S102. Determine whether the status data is valid data.
In this embodiment, whether the status data is valid data is determined according to the data value of the status data. Specifically, the data value corresponding to the status data is obtained, and whether the status data is valid data is determined according to that data value. If the data value corresponding to the status data is a valid value, the status data is determined to be valid data, and step S103 is performed. For example, it is checked whether the data value corresponding to the status data is a null value; if it is a null value, the status data is determined to be invalid data. If the status data is invalid data, the reason for the invalid data is obtained; the reasons include an unreachable network, an incorrect authentication password, a script error, and the like.
Specifically, invalid data arises as follows. When status data is acquired, the network is checked first; if the network is unreachable, the data is invalid and is marked as "network unreachable". If the network is reachable, the corresponding password is entered for authentication; if authentication fails, the data is invalid and is marked as "authentication password error". If the network is normal, a script is called to obtain the various hardware information values; if the script reports an error, the data is invalid and is marked as "script error".
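The validity check and the three causes of invalid data described above can be sketched as a single classification function. This is an illustrative sketch only: the function name, the boolean parameters, and the `"empty value"` label for a null value with no recorded cause are assumptions, not part of the present application.

```python
from typing import Optional


def classify_status(value: Optional[str],
                    network_ok: bool = True,
                    auth_ok: bool = True,
                    script_ok: bool = True) -> str:
    """Return "valid", or the reason the status data is considered invalid.

    The checks run in the order described in the text: network first,
    then password authentication, then the collection script.
    """
    if not network_ok:
        return "network unreachable"
    if not auth_ok:
        return "authentication password error"
    if not script_ok:
        return "script error"
    if value is None or value == "":
        return "empty value"   # assumed label: null value with no recorded cause
    return "valid"
```

Only records classified as `"valid"` would proceed to step S103; the others carry their reason forward.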
S103. Determine alarm data in the status data according to the identification bit information, and store the alarm data in a first preset data table.
In this embodiment, the valid data includes identification bit information, which is specifically a normal, abnormal, or alarm identifier. If the identification bit information is an abnormal identifier or an alarm identifier, the valid status data contains alarm data, and the alarm data corresponding to the identification bit information is stored in the first preset data table. The first preset data table is stored in a preset database, the preset database is the database corresponding to the management server, and the management database is configured with the centralized management platform.
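As a hedged illustration of step S103, the sketch below filters records by their flag and writes the alarms to a table using Python's standard `sqlite3` module. The table name `alarm_raw`, its columns, and the flag strings are hypothetical; the present application does not specify a database engine or schema.

```python
import sqlite3

ALARM_FLAGS = {"abnormal", "alarm"}  # hypothetical flag values


def store_alarms(conn, records):
    """Keep records whose flag marks an abnormal or alarm state and insert them.

    Returns the number of alarm rows written to the first (raw alarm) table.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS alarm_raw ("
        "host_sn TEXT, component TEXT, value TEXT, flag TEXT, collected_at TEXT)"
    )
    alarms = [r for r in records if r["flag"] in ALARM_FLAGS]
    conn.executemany(
        "INSERT INTO alarm_raw VALUES "
        "(:host_sn, :component, :value, :flag, :collected_at)",
        alarms,
    )
    conn.commit()
    return len(alarms)
```

Records with a normal flag are simply skipped, so the first table holds only data that still needs processing.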
S104. Periodically poll the first preset data table to obtain the alarm data, process the alarm data using a preset processing rule to obtain a data processing result, and store the data processing result in a second preset data table.
In this embodiment, the polling interval can likewise be set according to the actual situation, and the first preset data table is polled at the set interval to obtain the alarm data. The obtained alarm data is processed using the preset processing rule to obtain a data processing result, and the data processing result is stored in the second preset data table. The second preset data table is also stored in the preset database, which is the database corresponding to the management server. Providing a first preset data table and a second preset data table facilitates processing and analysis of the alarm data.
In an embodiment, processing the alarm data using the preset processing rule to obtain the data processing result specifically includes sub-steps S104a to S104d, as shown in FIG. 2. S104a: obtain the component category and host identifier of the server, and classify the alarm data according to the component category to obtain component alarm data. S104b: obtain the host information of the server according to the host identifier. S104c: perform duplicate-alarm processing on the component alarm data according to the component category corresponding to the component alarm data. S104d: generate the data processing result according to the component alarm data that has undergone duplicate-alarm processing and the host information.
Specifically, the component category is the component name corresponding to each hardware component of the server, such as hard disk, power supply, memory, disk array, or fan. The host identifier is the host SN number; the host name may also be used. The alarm data is classified according to the component category, for example into hard disk alarm data, power supply alarm data, and memory alarm data. The host information of the server is searched for and obtained according to the host identifier; the host information includes the host name, host SN number, host brand, KVM IP, host manufacturer, host model, and the like. Performing duplicate-alarm processing on the component alarm data according to its component category means removing duplicate alarms for the same component: for example, if the collected alarm data shows that a hard disk component raised an alarm at several different times, only one piece of component alarm data is kept. The data processing result is generated according to the component alarm data that has undergone duplicate-alarm processing and the host information; specifically, an alarm data record may be generated from the component alarm data and the host information, and this alarm data record is the data processing result.
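Sub-steps S104a to S104d can be sketched in a few lines. This is an assumption-laden illustration rather than the claimed processing rule: the record keys (`host_sn`, `component`, `collected_at`), the `occurrences` counter, and the choice to key duplicates on host plus component are all hypothetical.

```python
def process_alarms(alarms, host_info):
    """Deduplicate alarms per (host, component) and attach host information.

    alarms:    list of dicts with "host_sn", "component", "collected_at" (ISO string)
    host_info: dict mapping host_sn to a dict of host details
    """
    merged = {}
    # Visit alarms oldest first so the kept record is the earliest occurrence.
    for alarm in sorted(alarms, key=lambda a: a["collected_at"]):
        key = (alarm["host_sn"], alarm["component"])   # classification by component
        if key not in merged:
            merged[key] = {**alarm,
                           "first_seen": alarm["collected_at"],
                           "occurrences": 0}
        merged[key]["occurrences"] += 1                # repeated alarms collapse here
    # Join each surviving alarm with its host information to form the result.
    return [{**alarm, "host": host_info.get(host_sn, {})}
            for (host_sn, _component), alarm in merged.items()]
```

Keeping an `occurrences` count is a design choice of the sketch: it preserves the fact that the alarm repeated while still yielding one record per component.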
S105. Periodically poll the second preset data table to obtain the data processing result, generate a pending event according to the data processing result, and send the pending event to a terminal so that the terminal displays the pending event.
In this embodiment, the polling interval can be set according to the actual situation; the specific interval is not limited here and may be, for example, 2 minutes or 4 minutes. The second preset data table is polled periodically to obtain the data processing result from it, and a pending event is generated according to the data processing result. The pending event includes the host information as well as the number of alarms, the alarming components, and the pending time derived from the component alarm data. The pending events are published on the centralized management platform and displayed in the order of their generation times, so that management personnel can manage the alarm data by handling the pending events.
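A possible shape for the pending events of step S105 is sketched below. The grouping by host, the field names, and the use of the earliest `first_seen` value as the event's generation time are assumptions made for illustration; the present application only states that an event carries host information, the alarm count, the alarming components, and a pending time.

```python
from collections import defaultdict


def build_pending_events(results):
    """Group data processing results by host into pending events, oldest first."""
    by_host = defaultdict(list)
    for result in results:
        by_host[result["host_sn"]].append(result)

    events = []
    for host_sn, items in by_host.items():
        events.append({
            "host_sn": host_sn,
            "alarm_count": len(items),
            "components": sorted({item["component"] for item in items}),
            # Assumed: the event is dated by its earliest contributing alarm.
            "created_at": min(item["first_seen"] for item in items),
        })
    # Display order follows the generation time of each event.
    return sorted(events, key=lambda e: e["created_at"])
```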
In an embodiment, step S105 includes sub-steps S105a to S105d, as shown in FIG. 3. S105a: generate a pending event according to the data processing result, where the pending event includes an ignore control, and the ignore control is used to trigger the display of ignore times for the user to choose from. S105b: send the pending event to the terminal so that the terminal displays the pending event through the centralized management platform. S105c: receive the ignore time selected by the user and sent by the terminal. S105d: modify the identification bit information of the alarm data in the first preset data table according to the ignore time.
Specifically, a pending event is generated according to the data processing result. The pending event includes the host information as well as the number of alarms and the alarming components derived from the component alarm data, and it also includes an ignore control. When it is detected that the user has clicked the ignore control, the preset ignore times are displayed for the user to choose from, for example ignore for 7 days or ignore for 30 days. The ignore time selected by the user is obtained, for example ignore for 30 days. The identification bit information in the status data is modified according to the ignore time: specifically, the identification bit information in the status data is changed from the alarm identifier to the normal identifier for a period of 30 days, and the step of determining the alarm data in the status data according to the identification bit information is performed again.
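The effect of an ignore time on the identification bit can be illustrated as follows. The sketch assumes that the original flag is restored once the ignore period expires, which the text implies by giving the change a period of 30 days but does not state outright; the field names `original_flag` and `ignored_until` are hypothetical.

```python
from datetime import datetime, timedelta


def apply_ignore(alarm: dict, ignore_days: int, now: datetime) -> dict:
    """Return a copy of the alarm marked normal until the ignore period ends."""
    return {**alarm,
            "flag": "normal",
            "original_flag": alarm["flag"],          # kept so it can be restored
            "ignored_until": now + timedelta(days=ignore_days)}


def effective_flag(alarm: dict, now: datetime) -> str:
    """Flag to use at time `now`: the original flag again once the period expires."""
    until = alarm.get("ignored_until")
    if until is not None and now >= until:
        return alarm["original_flag"]                # assumed: ignore has lapsed
    return alarm["flag"]
```

With this shape, re-running the alarm determination of step S103 on `effective_flag` skips the record while it is ignored and picks it up again afterwards.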
In addition, after the identification bit information of the alarm data in the first preset data table is modified according to the ignore time, the method further includes: determining, according to the identification bit information in the alarm data, whether the alarm data is a disappeared alarm; if the alarm data is a disappeared alarm, archiving and classifying the alarm data according to the host information of the server corresponding to the alarm data; and sending the archive classification result to the terminal for display on a report page of the centralized management platform. Specifically, when a pending event is handled, for example when the user selects an ignore time, the identification bit information of the alarm data in the first preset data table is modified according to the handling result, and whether the alarm data is a disappeared alarm is then determined according to the identification bit information in the alarm data. If the alarm data is a disappeared alarm, the alarm data is classified according to the brand, component, location, or purpose of the server, and the different archive classification results are displayed in different reports.
The method of the above embodiment periodically polls the servers of a server cluster to collect status data of the servers, the status data including identification bit information. When the status data is determined to be valid data, alarm data in the status data is determined according to the identification bit information and stored in a first preset data table. The first preset data table is polled periodically to obtain the alarm data, the alarm data is processed using a preset processing rule to obtain a data processing result, and the data processing result is stored in a second preset data table. The second preset data table is polled periodically to obtain the data processing result, a pending event is generated according to the data processing result, and the pending event is sent to a terminal so that the terminal displays it. The method not only provides timely and accurate alarms, but also allows alarm information to be traced back and analyzed.
Please refer to FIG. 4, which is a schematic flowchart of a method for managing alarm data according to another embodiment of the present application. The management method is applied to a management server on which the server side of the centralized management platform is deployed. As shown in FIG. 4, the management method includes steps S201 to S207.
S201. Periodically poll the servers of a server cluster to collect status data of the servers, where the status data includes identification bit information.
In this embodiment, the status data of a server refers to the status data of the server's hardware components, which include hard disks, memory, power supplies, fans, and the like. The status data is collected by a detection tool built into the server. The status data includes identification bit information, and the identification bit information includes a normal identifier, an abnormal identifier, an alarm identifier, or the like.
S202. Determine whether the status data is valid data.
In this embodiment, whether the status data is valid data is determined according to the data value of the status data. It is checked whether the data value corresponding to the status data is a null value: if the data value is not a null value, the status data is determined to be valid data; if the data value is a null value, the status data is determined to be invalid data. If the status data is valid data, step S203 is performed; if the status data is invalid data, step S205 is performed.
S203、根据所述标识位信息确定所述状态数据中的告警数据,将所述告警数据保存在第一预设数据表中。S203. Determine alarm data in the status data according to the identification bit information, and save the alarm data in a first preset data table.
在本实施例中,所述有效数据中包括标识位信息,该标识位信息具体为正常、异常和告警等标识,如果标识位信息为异常标识或告警标识,则表明所述有效状态数据存在告警数据,并将标识位信息对应的告警数据保存至第一预设数据表中,其中第一预设数据表保存在预设数据库中,所述预设数据库为管理服务器对应的数据库,该管理数据库配置有所述集中管理平台。In this embodiment, the valid data includes identification bit information, and the identification bit information specifically includes identifiers such as normal, abnormal, and alarm. If the identification bit information is an abnormal identifier or an alarm identifier, it indicates that an alarm exists in the valid status data. Data, and save the alarm data corresponding to the identification information into a first preset data table, where the first preset data table is stored in a preset database, the preset database is a database corresponding to the management server, and the management database The centralized management platform is configured.
S204. Periodically poll the first preset data table to obtain the alarm data, process the alarm data using a preset processing rule to obtain a data processing result, and save the data processing result to a second preset data table.
In this embodiment, the first preset data table is polled cyclically at a set interval to obtain the alarm data. The obtained alarm data is processed using a preset processing rule to obtain a data processing result, and the data processing result is saved to the second preset data table. The second preset data table is also stored in the preset database, which is the database corresponding to the management server. Providing a first preset data table and a second preset data table facilitates the processing and analysis of the alarm data.
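The timed polling described here might look like the loop below. It is a sketch under stated assumptions: `fetch` and `handle` are hypothetical callables standing in for reading the first preset data table and applying the preset processing rule, and the number of rounds is bounded so the example terminates, whereas a real service would presumably run until stopped.

```python
import time


def poll_table(fetch, handle, interval_seconds, rounds, sleep=time.sleep):
    """Call fetch() every interval_seconds and pass each row to handle()."""
    processed = 0
    for i in range(rounds):
        for row in fetch():
            handle(row)
            processed += 1
        if i < rounds - 1:
            # Wait for the set interval before the next polling round.
            sleep(interval_seconds)
    return processed
```

Passing `sleep` as a parameter keeps the loop testable without real delays; the same function could serve the polling of the second preset data table with a different `fetch` and `handle`.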
S205. Obtain the invalidation reason corresponding to the status data being invalid data.
In this embodiment, time information corresponding to the invalid status data is obtained; log information corresponding to the time information is obtained from the server according to the time information; and the invalidation reason corresponding to the invalid data is obtained from the log information. This is possible because the server's log information records why the data was invalid; the reasons include a network failure, an incorrect authentication password, a script error, and the like.
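The lookup of an invalidation reason by time can be sketched as below. The log line format (`ISO timestamp | message`), the 60-second matching window, and the function name are assumptions made for illustration; the application does not describe the log format or how closely the timestamps must match.

```python
from datetime import datetime, timedelta


def find_invalidation_reason(log_lines, invalid_at, window=timedelta(seconds=60)):
    """Return the first log message recorded within `window` of `invalid_at`."""
    for line in log_lines:
        stamp, sep, message = line.partition(" | ")
        if not sep:
            continue  # not in the assumed "timestamp | message" format
        try:
            logged_at = datetime.fromisoformat(stamp)
        except ValueError:
            continue  # skip lines whose timestamp cannot be parsed
        if abs(logged_at - invalid_at) <= window:
            return message
    return None
```

Returning `None` when no entry matches leaves it to the caller to decide how to record status data whose reason cannot be found.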
S206. Save the status data and the invalidation reason to the second preset data table as the data processing result.
In this embodiment, the status data and the corresponding invalidation reason are saved to the second preset data table as the data processing result, so that a pending event can be generated and sent to the terminal for display, allowing administrators to handle the pending event promptly.
S207. Periodically poll the second preset data table to obtain the data processing result, generate a pending event according to the data processing result, and send the pending event to a terminal so that the terminal displays the pending event.
In this embodiment, the second preset data table is polled cyclically at a set interval to obtain the data processing result; a pending event is generated according to the data processing result; and the pending event is sent to the terminal so that the terminal displays it through the centralized management platform, allowing administrators to act on the pending event.
The method of the above embodiment processes the collected status data to obtain valid data, identifies the alarm data in the valid status data, and saves the alarm data in the first preset data table; it then saves the processing result to the second preset data table, which allows convergence, deduplication, and archiving to be performed. Pending events are generated according to the processing result and published on the centralized management platform, through which the alarm data can be managed. The fault data can thereby be analyzed accurately and compiled into statistics more easily.
FIG. 5 is a schematic block diagram of an alarm data management apparatus provided by an embodiment of the present application. As shown in FIG. 5, corresponding to the above alarm data management method, the present application further provides an alarm data management apparatus. The apparatus includes units for performing the above alarm data management method and may be configured in a server. As shown in FIG. 5, the alarm data management apparatus 400 includes a data collection unit 401, a data judging unit 402, a data determining unit 403, a data processing unit 404, and an event generation unit 405.
The data collection unit 401 is configured to periodically poll the servers of a server cluster to collect status data of the servers, where the status data includes identification bit information.
The data judging unit 402 is configured to determine whether the status data is valid data.
The data determining unit 403 is configured to determine the alarm data in the status data according to the identification bit information, and save the alarm data in a first preset data table.
The data processing unit 404 is configured to periodically poll the first preset data table to obtain the alarm data, process the alarm data using a preset processing rule to obtain a data processing result, and save the data processing result to a second preset data table.
The data processing unit 404 includes an identifier acquisition subunit 4041, an information acquisition subunit 4042, a repeat processing subunit 4043, and a result generation subunit 4044. Specifically, the identifier acquisition subunit 4041 is configured to obtain the component type and host identifier of the server, and classify the alarm data according to the component category to obtain component alarm data; the information acquisition subunit 4042 is configured to obtain the host information of the server according to the host identifier; the repeat processing subunit 4043 is configured to perform repeated-alarm processing on the component alarm data according to the component category corresponding to the component alarm data; and the result generation subunit 4044 is configured to generate a data processing result from the component alarm data that has undergone the repeated-alarm processing and the host information.
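The four subunits can be pictured as one small pipeline: group alarms by component category, drop repeats within each group, and attach host information. In the sketch below the field names, the `hosts` lookup dictionary, and the rule that two alarms repeat each other when they share a host identifier and an identifier value are assumptions; the application does not define the repeated-alarm rule.

```python
from collections import defaultdict


def process_alarms(alarms, hosts):
    """Classify by component, drop repeated alarms, and attach host info."""
    by_component = defaultdict(list)
    for alarm in alarms:
        by_component[alarm["component"]].append(alarm)

    results = []
    for component, group in by_component.items():
        seen = set()
        for alarm in group:
            # Assumed repeat rule: same host and same flag within one category.
            key = (alarm["host_id"], alarm["flag"])
            if key in seen:
                continue
            seen.add(key)
            results.append({**alarm, "host": hosts.get(alarm["host_id"])})
    return results
```

Each returned record corresponds to one entry of the data processing result that would be written to the second preset data table.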
The event generation unit 405 is configured to periodically poll the second preset data table to obtain the data processing result, generate a pending event according to the data processing result, and send the pending event to a terminal so that the terminal displays the pending event.
The event generation unit 405 includes an event generation subunit 4051, an event display subunit 4052, a time receiving subunit 4053, and an identifier modification subunit 4054. Specifically, the event generation subunit 4051 is configured to generate a pending event according to the data processing result, where the pending event includes an ignore control that triggers the display of ignore times for the user to select; the event display subunit 4052 is configured to send the pending event to the terminal so that the terminal displays the pending event through the centralized management platform; the time receiving subunit 4053 is configured to receive the user-selected ignore time sent by the terminal; and the identifier modification subunit 4054 is configured to modify the identification bit information of the alarm data in the first preset data table according to the ignore time.
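A possible reading of the identifier modification is sketched below: once the user picks an ignore time, the alarm record's identifier is changed and an expiry is stored, so later polling rounds can skip the alarm until the expiry passes. The `"ignored"` identifier value, the `ignore_until` field, and both function names are hypothetical; the application says only that the identification bit information is modified according to the ignore time.

```python
from datetime import datetime, timedelta


def apply_ignore(alarm, ignore_minutes, now):
    """Return a copy of the alarm marked as ignored until now + ignore_minutes."""
    updated = dict(alarm)
    updated["flag"] = "ignored"  # hypothetical identifier value
    updated["ignore_until"] = now + timedelta(minutes=ignore_minutes)
    return updated


def is_suppressed(alarm, now):
    """True while an ignored alarm is still inside its ignore period."""
    until = alarm.get("ignore_until")
    return alarm.get("flag") == "ignored" and until is not None and now < until
```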
In addition, the event generation unit 405 is further configured to: determine, according to the identification bit information in the alarm data, whether the alarm data is a disappeared alarm; if the alarm data is a disappeared alarm, archive and classify the alarm data according to the host information of the server corresponding to the alarm data; and send the archiving and classification result to the terminal for display on a report page of the centralized management platform.
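Archiving disappeared alarms by host information can be sketched as a simple grouping step. The `"disappeared"` identifier value and the choice of a `rack` field as the grouping key are assumptions made for illustration; the application does not say which part of the host information drives the classification.

```python
from collections import defaultdict


def archive_disappeared(alarms, hosts):
    """Group disappeared alarms by an (assumed) host attribute for reporting."""
    archive = defaultdict(list)
    for alarm in alarms:
        if alarm["flag"] != "disappeared":  # hypothetical identifier value
            continue
        host = hosts.get(alarm["host_id"], {})
        archive[host.get("rack", "unknown")].append(alarm)
    return dict(archive)
```

The resulting dictionary is the kind of structure a report page could render as one section per group.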
FIG. 6 is a schematic block diagram of an alarm data management apparatus provided by another embodiment of the present application. As shown in FIG. 6, corresponding to the above alarm data management method, the present application further provides an alarm data management apparatus. The apparatus includes units for performing the above alarm data management method and may be configured in a server. As shown in FIG. 6, the alarm data management apparatus 500 includes a data collection unit 501, a data judging unit 502, a data determining unit 503, a data processing unit 504, a reason acquisition unit 505, a result saving unit 506, and an event generation unit 507.
The data collection unit 501 is configured to periodically poll the servers of a server cluster to collect status data of the servers, where the status data includes identification bit information.
The data judging unit 502 is configured to determine whether the status data is valid data.
The data determining unit 503 is configured to determine the alarm data in the status data according to the identification bit information, and save the alarm data in a first preset data table.
The data processing unit 504 is configured to periodically poll the first preset data table to obtain the alarm data, process the alarm data using a preset processing rule to obtain a data processing result, and save the data processing result to a second preset data table.
The reason acquisition unit 505 is configured to obtain the invalidation reason corresponding to the status data being invalid data.
The result saving unit 506 is configured to save the status data and the invalidation reason to the second preset data table as the data processing result.
The event generation unit 507 is configured to periodically poll the second preset data table to obtain the data processing result, generate a pending event according to the data processing result, and send the pending event to a terminal so that the terminal displays the pending event.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the alarm data management apparatus and units described above may be understood by reference to the corresponding processes in the foregoing method embodiments, and are not repeated here.
The above apparatus may be implemented in the form of a computer program, and the computer program may run on a computer device as shown in FIG. 7.
Refer to FIG. 7, which is a schematic block diagram of a computer device provided by an embodiment of the present application. The computer device 700 may be a server.
Referring to FIG. 7, the computer device 700 includes a processor 720, a memory, and a network interface 750 connected through a system bus 710, where the memory may include a non-volatile storage medium 730 and an internal memory 740.
The non-volatile storage medium 730 may store an operating system 731 and a computer program 732. When executed, the computer program 732 may cause the processor 720 to perform any one of the alarm data management methods.
The processor 720 provides computing and control capabilities and supports the operation of the entire computer device 700.
The internal memory 740 provides an environment for running the computer program 732 stored in the non-volatile storage medium 730. When executed by the processor 720, the computer program 732 may cause the processor 720 to perform any one of the alarm data management methods.
The network interface 750 is used for network communication, such as sending assigned tasks. Those skilled in the art will understand that the structure shown in FIG. 7 is only a block diagram of the part of the structure related to the solution of the present application and does not limit the computer device 700 to which the solution is applied; a specific computer device 700 may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components. The processor 720 is configured to run program code stored in the memory to implement the process steps of the above method embodiments.
It should be understood that, in the embodiments of the present application, the processor 720 may be a central processing unit (CPU), and may also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor or any conventional processor.
Those skilled in the art will understand that the structure of the computer device 700 shown in FIG. 7 does not limit the computer device 700, which may include more or fewer components than shown, combine certain components, or have a different arrangement of components.
Those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments may be carried out by a computer program instructing the relevant hardware. The computer program may be stored in a storage medium, and the storage medium is a computer-readable storage medium. In the embodiments of the present application, the computer program may be stored in a storage medium of a computer system and executed by at least one processor of the computer system to implement the process steps of the above method embodiments.
The computer-readable storage medium may be any medium that can store program code, such as a magnetic disk, an optical disc, a USB flash drive, a removable hard disk, or a read-only memory (ROM).
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To illustrate the interchangeability of hardware and software clearly, the composition and steps of each example have been described above in general terms according to their functions. Whether these functions are performed in hardware or software depends on the specific application and the design constraints of the technical solution. Skilled persons may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of the present application.
In the several embodiments provided in the present application, it should be understood that the disclosed alarm data management apparatus and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in an actual implementation; multiple units or components may be combined or integrated into another system, and some features may be omitted or not performed.
The steps in the methods of the embodiments of the present application may be reordered, combined, or deleted according to actual needs.
The units in the apparatuses of the embodiments of the present application may be combined, divided, or deleted according to actual needs.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a terminal, a network device, or the like) to perform all or part of the steps of the methods described in the embodiments of the present application.
The above are only specific embodiments of the present application, but the scope of protection of the present application is not limited thereto. Any person skilled in the art can readily conceive of various equivalent modifications or replacements within the technical scope disclosed in the present application, and these modifications or replacements shall fall within the scope of protection of the present application. Therefore, the scope of protection of the present application shall be subject to the scope of protection of the claims.

Claims (20)

  1. A method for managing alarm data, comprising:
    periodically polling the servers of a server cluster to collect status data of the servers, the status data including identification bit information;
    determining whether the status data is valid data;
    if the status data is valid data, determining the alarm data in the status data according to the identification bit information, and saving the alarm data in a first preset data table;
    periodically polling the first preset data table to obtain the alarm data, processing the alarm data using a preset processing rule to obtain a data processing result, and saving the data processing result to a second preset data table; and
    periodically polling the second preset data table to obtain the data processing result, generating a pending event according to the data processing result, and sending the pending event to a terminal so that the terminal displays the pending event.
  2. The method for managing alarm data according to claim 1, wherein processing the valid alarm data using a preset processing rule to obtain a data processing result comprises:
    obtaining the component type and host identifier of the server, and classifying the alarm data according to the component category to obtain component alarm data;
    obtaining the host information of the server according to the host identifier;
    performing repeated-alarm processing on the component alarm data according to the component category corresponding to the component alarm data; and
    generating a data processing result from the component alarm data that has undergone the repeated-alarm processing and the host information.
  3. The method for managing alarm data according to claim 1, wherein generating a pending event according to the data processing result and sending the pending event to a terminal so that the terminal displays the pending event comprises:
    generating a pending event according to the data processing result, the pending event including an ignore control that triggers the display of ignore times for the user to select; sending the pending event to the terminal so that the terminal displays the pending event through a centralized management platform; receiving the user-selected ignore time sent by the terminal; and modifying the identification bit information of the alarm data in the first preset data table according to the ignore time.
  4. The method for managing alarm data according to claim 3, further comprising, after modifying the identification bit information of the alarm data in the first preset data table according to the ignore time:
    determining, according to the identification bit information in the alarm data, whether the alarm data is a disappeared alarm; and if the alarm data is a disappeared alarm, archiving and classifying the alarm data according to the host information of the server corresponding to the alarm data.
  5. The method for managing alarm data according to claim 1, further comprising, after determining whether the status data is valid data:
    if the status data is invalid data, obtaining the invalidation reason corresponding to the status data being invalid data; and saving the status data and the invalidation reason to the second preset data table as the data processing result.
  6. The method for managing alarm data according to claim 5, wherein obtaining the invalidation reason corresponding to the status data being invalid data comprises:
    obtaining time information corresponding to the status data being invalid data; obtaining, from the server according to the time information, log information corresponding to the time information; and obtaining the invalidation reason corresponding to the invalid data according to the log information.
  7. The method for managing alarm data according to claim 1, wherein determining whether the status data is valid data comprises:
    obtaining the data value corresponding to the status data; and determining whether the status data is valid data according to the data value.
  8. An apparatus for managing alarm data, comprising:
    a data collection unit, configured to periodically poll the servers of a server cluster to collect status data of the servers, the status data including identification bit information;
    a data judging unit, configured to determine whether the status data is valid data;
    a data determining unit, configured to, if the status data is valid data, determine the alarm data in the status data according to the identification bit information, and save the alarm data in a first preset data table;
    a data processing unit, configured to periodically poll the first preset data table to obtain the alarm data, process the alarm data using a preset processing rule to obtain a data processing result, and save the data processing result to a second preset data table; and
    an event generation unit, configured to periodically poll the second preset data table to obtain the data processing result, generate a pending event according to the data processing result, and send the pending event to a terminal so that the terminal displays the pending event.
  9. A computer device, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, implements the following steps:
    periodically polling the servers of a server cluster to collect status data of the servers, the status data including identification bit information;
    determining whether the status data is valid data;
    if the status data is valid data, determining the alarm data in the status data according to the identification bit information, and saving the alarm data in a first preset data table;
    periodically polling the first preset data table to obtain the alarm data, processing the alarm data using a preset processing rule to obtain a data processing result, and saving the data processing result to a second preset data table; and
    periodically polling the second preset data table to obtain the data processing result, generating a pending event according to the data processing result, and sending the pending event to a terminal so that the terminal displays the pending event.
  10. The computer device according to claim 9, wherein, when the processor executes the computer program to process the valid alarm data using a preset processing rule to obtain a data processing result, the following steps are specifically implemented:
    obtaining the component type and host identifier of the server, and classifying the alarm data according to the component category to obtain component alarm data; obtaining the host information of the server according to the host identifier; performing repeated-alarm processing on the component alarm data according to the component category corresponding to the component alarm data; and generating a data processing result from the component alarm data that has undergone the repeated-alarm processing and the host information.
  11. The computer device according to claim 9, wherein, when the processor executes the computer program to generate a pending event according to the data processing result and send the pending event to a terminal so that the terminal displays the pending event, the following steps are specifically implemented:
    generating a pending event according to the data processing result, the pending event including an ignore control that triggers the display of ignore times for the user to select; sending the pending event to the terminal so that the terminal displays the pending event through a centralized management platform; receiving the user-selected ignore time sent by the terminal; and modifying the identification bit information of the alarm data in the first preset data table according to the ignore time.
  12. The computer device according to claim 11, wherein, after the processor executes the computer program to modify the identification bit information of the alarm data in the first preset data table according to the ignore time, the following steps are further implemented:
    determining, according to the identification bit information in the alarm data, whether the alarm data is a disappeared alarm; and if the alarm data is a disappeared alarm, archiving and classifying the alarm data according to the host information of the server corresponding to the alarm data.
  13. The computer device according to claim 9, wherein, after executing the computer program to implement the determining of whether the status data is valid data, the processor further implements the following steps:
    If the status data is invalid data, obtaining the invalidity reason for which the status data is invalid; and saving the status data and the invalidity reason, as the data processing result, to the second preset data table.
  14. The computer device according to claim 9, wherein, when executing the computer program to implement the determining of whether the status data is valid data, the processor specifically implements the following steps: obtaining a data value corresponding to the status data; and determining whether the status data is valid data according to the data value.
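Claims 13 and 14 together describe a validity check driven by the data value, with the invalidity reason recorded when the check fails. A minimal sketch follows; the concrete rules (value present, numeric, within 0 to 100) and the reason strings are invented for illustration, since the claims leave the criteria open.

```python
def check_status(status):
    """Return (is_valid, reason) based on the status data's data value."""
    value = status.get("value")
    if value is None:
        return False, "missing data value"
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return False, "non-numeric data value"
    if not 0 <= value <= 100:  # assumed range, e.g. a utilisation percentage
        return False, "data value out of range"
    return True, None


def record_invalid(status, second_table):
    """Save invalid status data plus its reason to the second table."""
    ok, reason = check_status(status)
    if not ok:
        second_table.append({"status": status, "invalid_reason": reason})
    return ok
```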
  15. A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program which, when executed by a processor, causes the processor to perform the following steps:
    Periodically polling the servers of a server cluster to collect status data of the servers, the status data including identification bit information;
    Determining whether the status data is valid data; if the status data is valid data, determining the alarm data in the status data according to the identification bit information, and saving the alarm data in a first preset data table; periodically polling the first preset data table to obtain the alarm data, processing the alarm data using a preset processing rule to obtain a data processing result, and saving the data processing result to a second preset data table; and periodically polling the second preset data table to obtain the data processing result, generating a pending event according to the data processing result, and sending the pending event to a terminal so that the terminal displays the pending event.
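The three polling stages of this claim can be pictured as a small pipeline connected by two tables. The sketch below runs a single pass of each stage using in-memory lists; the timer that would trigger each stage, the validity rule, the flag value `1` for an alarm, and the summary format are all simplifying assumptions rather than details from the claims.

```python
def collect(servers, first_table):
    """Stage 1: one polling pass over the cluster."""
    for server in servers:
        status = server()  # each server is modelled as a callable returning a dict
        if status.get("value") is None:  # assumed validity rule
            continue
        if status["flag"] == 1:  # assumed: flag 1 marks alarm data
            first_table.append(status)


def process(first_table, second_table):
    """Stage 2: drain the first table and apply a (trivial) processing rule."""
    while first_table:
        alarm = first_table.pop(0)
        second_table.append({
            "host_id": alarm["host_id"],
            "summary": f"{alarm['host_id']}: value={alarm['value']}",
        })


def publish(second_table, send):
    """Stage 3: turn each processing result into a pending event for the terminal."""
    while second_table:
        send({"type": "pending_event", **second_table.pop(0)})


def run_once(servers, send):
    """Run one pass of each stage in order; a scheduler would repeat this."""
    first_table, second_table = [], []
    collect(servers, first_table)
    process(first_table, second_table)
    publish(second_table, send)
```

Decoupling the stages through tables means each one can be polled on its own schedule, which appears to be the point of the two preset data tables.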
  16. The computer-readable storage medium according to claim 15, wherein, when the computer program is executed by the processor and causes the processor to perform the processing of the valid alarm data using the preset processing rule to obtain the data processing result, the following steps are specifically performed:
    Obtaining the component type and host identifier of the server, and classifying the alarm data according to the component category to obtain component alarm data; obtaining host information of the server according to the host identifier; performing repeated-alarm processing on the component alarm data according to the component category corresponding to the component alarm data; and generating the data processing result according to the component alarm data that has undergone the repeated-alarm processing and the host information.
  17. The computer-readable storage medium according to claim 15, wherein, when the computer program is executed by the processor and causes the processor to perform the generating of a pending event according to the data processing result and the sending of the pending event to a terminal so that the terminal displays the pending event, the following steps are specifically performed:
    Generating a pending event according to the data processing result, wherein the pending event includes an ignore control, and the ignore control is used to trigger display of ignore times for the user to select; sending the pending event to the terminal so that the terminal displays the pending event through a centralized management platform; receiving the user-selected ignore time sent by the terminal; and modifying the identification bit information of the alarm data in the first preset data table according to the ignore time.
  18. The computer-readable storage medium according to claim 17, wherein, after the computer program is executed by the processor and causes the processor to perform the modifying of the identification bit information of the alarm data in the first preset data table according to the ignore time, the following steps are further performed:
    Determining, according to the identification bit information in the alarm data, whether the alarm data is a disappearance alarm; and if the alarm data is a disappearance alarm, archiving and classifying the alarm data according to the host information of the server corresponding to the alarm data.
  19. The computer-readable storage medium according to claim 15, wherein, after the computer program is executed by the processor and causes the processor to perform the determining of whether the status data is valid data, the following steps are further performed: if the status data is invalid data, obtaining the invalidity reason for which the status data is invalid; and saving the status data and the invalidity reason, as the data processing result, to the second preset data table.
  20. The computer-readable storage medium according to claim 15, wherein, when the computer program is executed by the processor and causes the processor to perform the determining of whether the status data is valid data, the following steps are specifically performed: obtaining a data value corresponding to the status data; and determining whether the status data is valid data according to the data value.
PCT/CN2018/108271 2018-08-08 2018-09-28 Alarm data management method and apparatus, and computer device and storage medium WO2020029407A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810897093.0A CN108763038B (en) 2018-08-08 2018-08-08 Alarm data management method and device, computer equipment and storage medium
CN201810897093.0 2018-08-08

Publications (1)

Publication Number Publication Date
WO2020029407A1 (en) 2020-02-13

Family

ID=63969354

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/108271 WO2020029407A1 (en) 2018-08-08 2018-09-28 Alarm data management method and apparatus, and computer device and storage medium

Country Status (2)

Country Link
CN (1) CN108763038B (en)
WO (1) WO2020029407A1 (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111427749A (en) * 2020-04-01 2020-07-17 山东汇贸电子口岸有限公司 Monitoring tool and method for ironic service in openstack environment
CN112052287A (en) * 2020-09-02 2020-12-08 北京世纪互联宽带数据中心有限公司 Centralized management method, device and system for data center cluster
CN112286949A (en) * 2020-11-20 2021-01-29 深圳市和讯华谷信息技术有限公司 Application list updating method and device, computer equipment and storage medium
CN113468025A (en) * 2021-07-28 2021-10-01 浙江大华技术股份有限公司 Data warning method, system, device and storage medium
CN113542253A (en) * 2021-07-12 2021-10-22 杭州安恒信息技术股份有限公司 Network flow detection method, device, equipment and medium
CN113904913A (en) * 2021-08-19 2022-01-07 济南浪潮数据技术有限公司 Alarm processing method, device, equipment and storage medium based on pipeline
CN114650218A (en) * 2020-12-17 2022-06-21 中移(苏州)软件技术有限公司 Data acquisition method, equipment, system and storage medium
CN114661727A (en) * 2022-04-06 2022-06-24 西安热工研究院有限公司 General method for fan equipment fault and alarm data acquisition
CN114816302A (en) * 2022-05-07 2022-07-29 苏州琞能能源科技有限公司 Battery replacement alarm data processing method, device, equipment and medium
CN114978861A (en) * 2022-05-31 2022-08-30 济南浪潮数据技术有限公司 Operation and maintenance management method, device and medium based on multi-manufacturer equipment alarm
CN117113340A (en) * 2023-10-20 2023-11-24 杭州美创科技股份有限公司 Host computer sag detection method, device, computer equipment and storage medium

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111260170A (en) * 2018-11-30 2020-06-09 重庆小雨点小额贷款有限公司 Agricultural product management method, agricultural product management device, server and storage medium
CN112069163A (en) * 2019-06-10 2020-12-11 阿里巴巴集团控股有限公司 Data processing method and device, electronic equipment and storage medium
CN110287241B (en) * 2019-06-27 2023-09-08 深圳前海微众银行股份有限公司 Method and device for generating alarm data report
CN110321362A (en) * 2019-07-05 2019-10-11 广东利元亨智能装备股份有限公司 Data processing method and device and electronic equipment
CN110675079B (en) * 2019-09-30 2024-06-07 腾讯科技(深圳)有限公司 Fault data processing method and device and computer equipment
CN111339293B (en) * 2020-02-11 2023-08-22 支付宝(杭州)信息技术有限公司 Data processing method and device for alarm event and classifying method for alarm event
CN111475370A (en) * 2020-03-06 2020-07-31 平安科技(深圳)有限公司 Operation and maintenance monitoring method, device and equipment based on data center and storage medium
CN113487074A (en) * 2021-06-28 2021-10-08 平安信托有限责任公司 Accident early warning method, device and equipment based on product ex-warehouse and storage medium
CN113448763B (en) * 2021-07-16 2022-07-26 广东电网有限责任公司 Dynamic expansion grouping alarm service method for full life cycle management
CN113778800B (en) * 2021-09-14 2023-08-18 上海绚显科技有限公司 Error information processing method, device, system, equipment and storage medium
CN114116282B (en) * 2021-11-12 2023-08-18 苏州浪潮智能科技有限公司 Method and device for reporting and repairing network additional storage faults
CN114866399A (en) * 2022-04-13 2022-08-05 京东科技控股股份有限公司 Alarm method and device of application system, electronic equipment and computer storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1308278A (en) * 2001-02-15 2001-08-15 华中科技大学 IP fault-tolerant method for colony server
CN103401698A (en) * 2013-07-02 2013-11-20 北京奇虎科技有限公司 Monitoring system used for alarming server status in server cluster operation
CN105718351A (en) * 2016-01-08 2016-06-29 北京汇商融通信息技术有限公司 Hadoop cluster-oriented distributed monitoring and management system

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101296466B (en) * 2008-06-12 2012-12-12 高新兴科技集团股份有限公司 Method for shielding alarm generated by base station
WO2018081110A1 (en) * 2016-10-24 2018-05-03 Wandering WiFi LLC Systems and methods for monitoring battery life
CN106778873B (en) * 2016-12-19 2019-09-27 北京市天元网络技术股份有限公司 A kind of warning information classification method of disposal and device based on white list rule
CN107832200A (en) * 2017-10-24 2018-03-23 平安科技(深圳)有限公司 Alert processing method, device, computer equipment and storage medium

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111427749B (en) * 2020-04-01 2023-07-11 山东汇贸电子口岸有限公司 Monitoring tool and method for ironic service in openstack environment
CN111427749A (en) * 2020-04-01 2020-07-17 山东汇贸电子口岸有限公司 Monitoring tool and method for ironic service in openstack environment
CN112052287A (en) * 2020-09-02 2020-12-08 北京世纪互联宽带数据中心有限公司 Centralized management method, device and system for data center cluster
CN112286949A (en) * 2020-11-20 2021-01-29 深圳市和讯华谷信息技术有限公司 Application list updating method and device, computer equipment and storage medium
CN112286949B (en) * 2020-11-20 2024-05-17 深圳市和讯华谷信息技术有限公司 Application list updating method and device, computer equipment and storage medium
CN114650218A (en) * 2020-12-17 2022-06-21 中移(苏州)软件技术有限公司 Data acquisition method, equipment, system and storage medium
CN114650218B (en) * 2020-12-17 2023-12-12 中移(苏州)软件技术有限公司 Data acquisition method, device, system and storage medium
CN113542253A (en) * 2021-07-12 2021-10-22 杭州安恒信息技术股份有限公司 Network flow detection method, device, equipment and medium
CN113468025A (en) * 2021-07-28 2021-10-01 浙江大华技术股份有限公司 Data warning method, system, device and storage medium
CN113904913A (en) * 2021-08-19 2022-01-07 济南浪潮数据技术有限公司 Alarm processing method, device, equipment and storage medium based on pipeline
CN114661727A (en) * 2022-04-06 2022-06-24 西安热工研究院有限公司 General method for fan equipment fault and alarm data acquisition
CN114816302A (en) * 2022-05-07 2022-07-29 苏州琞能能源科技有限公司 Battery replacement alarm data processing method, device, equipment and medium
CN114978861A (en) * 2022-05-31 2022-08-30 济南浪潮数据技术有限公司 Operation and maintenance management method, device and medium based on multi-manufacturer equipment alarm
CN117113340A (en) * 2023-10-20 2023-11-24 杭州美创科技股份有限公司 Host computer sag detection method, device, computer equipment and storage medium
CN117113340B (en) * 2023-10-20 2024-01-23 杭州美创科技股份有限公司 Host computer sag detection method, device, computer equipment and storage medium

Also Published As

Publication number Publication date
CN108763038B (en) 2022-04-12
CN108763038A (en) 2018-11-06

Similar Documents

Publication Publication Date Title
WO2020029407A1 (en) Alarm data management method and apparatus, and computer device and storage medium
CN113328872B (en) Fault repairing method, device and storage medium
WO2020000745A1 (en) Log management method and apparatus, computer device, and storage medium
WO2021068814A1 (en) Method, apparatus, server, and computer-readable storage medium for monitoring for exception of hardware device
US20150127814A1 (en) Monitoring Server Method
CN117280327B (en) Detecting data center large scale interruptions through near real time/offline data using machine learning models
WO2020000758A1 (en) Server acceptance method and apparatus, computer device, and storage medium
WO2020000760A1 (en) Server management method and device, computer apparatus, and storage medium
CN108897496B (en) Disk array configuration management method and device, computer equipment and storage medium
JP6787340B2 (en) Log analysis system, log analysis method and program
WO2021139322A1 (en) Method and apparatus for processing network device data, computer equipment and storage medium
JP5588295B2 (en) Information processing apparatus and failure recovery method
CN113608964A (en) Cluster automation monitoring method and device, electronic equipment and storage medium
US9021078B2 (en) Management method and management system
US20190296960A1 (en) System and method for event processing order guarantee
CN111625386A (en) Monitoring method and device for power-on overtime of system equipment
US20210334153A1 (en) Remote error detection method adapted for a remote computer device to detect errors that occur in a service computer device
WO2016095716A1 (en) Fault information processing method and related device
JP2016181022A (en) Information processing apparatus, information processing program, information processing method, and data center system
US20230359514A1 (en) Operation-based event suppression
CN115102838B (en) Emergency processing method and device for server downtime risk and electronic equipment
WO2019241199A1 (en) System and method for predictive maintenance of networked devices
CN111258845A (en) Detection of event storms
CN109144765B (en) Report generation method, report generation device, computer equipment and storage medium
CN111414274A (en) Far-end eliminating method for abnormal state of cabinet applied to data center

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 18929467; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 18929467; Country of ref document: EP; Kind code of ref document: A1)