WO2020015061A1 - Monitoring alarm method, device and system for WebLogic server, and computer storage medium

Info

Publication number
WO2020015061A1
Authority
WO
WIPO (PCT)
Prior art keywords
monitoring
alarm
template
host
monitoring template
Prior art date
Application number
PCT/CN2018/103336
Other languages
French (fr)
Chinese (zh)
Inventor
袁小伟
Original Assignee
平安科技(深圳)有限公司
Application filed by 平安科技(深圳)有限公司

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/30: Monitoring
    • G06F11/32: Monitoring with visual or acoustical indication of the functioning of the machine

Definitions

  • The present application relates to the technical field of monitoring and alarming for WebLogic servers, and in particular to a monitoring and alarming method, device, and system for a WebLogic server, and a computer storage medium.
  • WebLogic is a Java application server for developing, integrating, deploying, and managing large distributed web applications, network applications, and database applications.
  • It brings Java Enterprise standard security to the development, integration, deployment, and management of large-scale network applications. It has the advantages of easy development, strong scalability, flexibility, and reliability, and is described as the best J2EE (Java 2 Platform, Enterprise Edition) application server currently on the market.
  • The main purpose of this application is to provide a monitoring and alarming method, device, and system for a WebLogic server, and a computer storage medium, which aim to simplify the monitoring deployment of WebLogic servers while improving the timeliness of abnormality monitoring alarms.
  • The present application provides a method for monitoring and alarming a WebLogic server.
  • The method is applied to a monitoring and alarming system.
  • The monitoring and alarming system includes a monitoring host, a data collection host, and multiple WebLogic servers.
  • The monitoring host is communicatively connected to the data collection host, the data collection host is communicatively connected to each of the WebLogic servers, and the method includes:
  • When the monitoring host receives a monitoring instruction, it obtains the performance indicators of the corresponding WebLogic server through the data collection host according to the monitoring instruction;
  • The monitoring host determines a corresponding monitoring template according to the monitoring instruction, calculates and integrates the performance indicators through the monitoring template, and detects whether the calculated and integrated performance indicators meet a corresponding preset alarm rule;
  • If the calculated and integrated performance indicators meet the corresponding preset alarm rule, the monitoring host performs an alarm reminder.
  • The present application further provides a monitoring and alarming device for a WebLogic server.
  • The monitoring and alarming device for a WebLogic server includes:
  • An acquisition module, configured to obtain, when the monitoring host receives a monitoring instruction, the performance indicators of the corresponding WebLogic server through the data collection host according to the monitoring instruction;
  • A detection module, configured to determine a corresponding monitoring template according to the monitoring instruction, calculate and integrate the performance indicators through the monitoring template, and detect whether the calculated and integrated performance indicators meet a corresponding preset alarm rule;
  • An alarm module, configured to perform an alarm reminder if the calculated and integrated performance indicators meet the corresponding preset alarm rule.
  • The present application also provides a monitoring and alarming system for a WebLogic server.
  • The monitoring and alarming system includes a monitoring host, a data collection host, and multiple WebLogic servers.
  • The present application further provides a computer storage medium that stores computer-readable instructions; when the computer-readable instructions are executed by a processor, the steps of the WebLogic server monitoring and alarming method described above are implemented.
  • The present application provides a monitoring and alarming method, device, and system for a WebLogic server, and a computer storage medium.
  • The method is applied to a monitoring and alarming system.
  • The monitoring and alarming system includes a monitoring host, a data collection host, and multiple WebLogic servers.
  • The monitoring host has a built-in monitoring template set.
  • The monitoring template set includes multiple monitoring templates.
  • The monitoring template calculates and integrates the obtained performance indicators and detects whether the result meets the corresponding preset alarm rule; if it does, an alarm reminder is issued so that the responsible operation and maintenance personnel can carry out maintenance checks promptly. The application therefore does not require a monitoring module to be installed on each WebLogic server, which makes deployment more convenient. Nor is it necessary to go to each WebLogic server and inspect its monitoring data to determine whether an abnormality has occurred, which improves the timeliness of abnormality monitoring alarms.
  • FIG. 1 is a schematic structural diagram of a terminal in a hardware operating environment according to an embodiment of the present application;
  • FIG. 2 is a schematic flowchart of a first embodiment of the monitoring and alarming method of a WebLogic server of this application;
  • FIG. 3 is a schematic architecture diagram of a monitoring and alarming system according to an embodiment of the present application;
  • FIG. 4 is a schematic flowchart of a second embodiment of the monitoring and alarming method of a WebLogic server of this application;
  • FIG. 5 is a schematic flowchart of a third embodiment of the monitoring and alarming method of a WebLogic server of this application;
  • FIG. 6 is a functional module diagram of a first embodiment of the monitoring and alarming device of a WebLogic server of this application.
  • FIG. 1 is a schematic structural diagram of a terminal in a hardware operating environment according to an embodiment of the present application.
  • The terminal is a monitoring host.
  • The monitoring host may be a terminal device such as a PC, a tablet computer, or a portable computer.
  • The monitoring host has a built-in monitoring template set.
  • The monitoring template set includes multiple monitoring templates, such as a JVM CPU usage monitoring template, a JDBC connection number monitoring template, a blocked thread monitoring template, an abnormal request monitoring template, a GC monitoring template, and a slow request monitoring template.
  • The terminal may include a processor 1001 (such as a CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005.
  • The communication bus 1002 is used to implement connection and communication between these components.
  • The user interface 1003 may include a display and an input unit such as a keyboard; optionally, the user interface 1003 may further include a standard wired interface and a wireless interface.
  • The network interface 1004 may optionally include a standard wired interface and a wireless interface (such as a Wi-Fi interface).
  • The memory 1005 may be a high-speed RAM memory or a non-volatile memory, such as disk storage.
  • The memory 1005 may optionally be a storage device independent of the foregoing processor 1001.
  • The terminal structure shown in FIG. 1 does not constitute a limitation on the terminal; the terminal may include more or fewer components than shown in the figure, combine some components, or arrange the components differently.
  • The memory 1005, as a computer storage medium in FIG. 1, may include an operating system, a network communication module, and computer-readable instructions.
  • The network communication module may be used to connect to the data collection host and perform data communication with it, and the processor 1001 may be used to call the computer-readable instructions stored in the memory 1005 and execute the WebLogic server monitoring and alarming method provided by the embodiments of the present application.
  • This application provides a method for monitoring and alarming a WebLogic server.
  • FIG. 2 is a schematic flowchart of a first embodiment of the monitoring and alarming method of a WebLogic server of this application.
  • The monitoring and alarming method of the WebLogic server is applied to a monitoring and alarming system.
  • The monitoring and alarming system includes a monitoring host, a data collection host, and multiple WebLogic servers.
  • The monitoring host is communicatively connected to the data collection host, and the data collection host is communicatively connected to each of the WebLogic servers.
  • The monitoring and alarming method of the WebLogic server includes:
  • Step S10: When the monitoring host receives a monitoring instruction, it obtains the performance indicators of the corresponding WebLogic server through the data collection host according to the monitoring instruction;
  • FIG. 3 is a schematic diagram of a monitoring and alarming system architecture according to an embodiment of the present application.
  • The monitoring and alarming system includes a monitoring host, a data collection host, and multiple WebLogic servers.
  • The monitoring host is communicatively connected to the data collection host, and the data collection host is communicatively connected to each of the WebLogic servers.
  • A monitoring template set is built into the monitoring host and includes multiple monitoring templates, for example a JVM CPU (Java Virtual Machine Central Processing Unit) usage monitoring template, a JDBC (Java DataBase Connectivity) connection number monitoring template, a blocked thread monitoring template, an abnormal request monitoring template, a GC (Garbage Collection) monitoring template, and a slow request monitoring template.
  • Each monitoring template is configured with corresponding monitoring alarm rules.
  • The monitoring host is used to obtain the performance indicators of the corresponding WebLogic server from the data collection host according to the received monitoring instruction, and then to calculate and integrate those performance indicators through the monitoring template to detect whether an abnormal situation exists;
  • The data collection host is used to receive and store the performance indicators reported by each WebLogic server, and also to receive the monitoring instructions from the monitoring host.
  • The monitoring host can obtain the performance indicators of each WebLogic server directly through the data collection host in the system, without having to go to the corresponding WebLogic server and check its monitoring data to determine whether an abnormality has occurred, which improves the timeliness of abnormality monitoring alarms.
  • The monitoring instruction received by the monitoring host may be sent by a management terminal, or it may be a preset timed monitoring instruction.
  • The monitoring instruction may include a monitoring type and a WebLogic server name.
  • The monitoring type can include, but is not limited to, JVM CPU usage monitoring, JDBC connection monitoring, blocked thread monitoring, abnormal request monitoring, GC monitoring, and slow request monitoring.
  • The WebLogic server name can be the system name or number assigned to the server in the monitoring and alarming system, or the server's IP address.
  • The monitoring instruction may also include only the WebLogic server name, in which case all abnormal conditions are monitored by default.
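As an illustration of the instruction format just described, the following is a minimal sketch, not the applicant's implementation. The class name `MonitoringInstruction`, the function `resolve_monitoring_types`, and the string identifiers for the monitoring types are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Monitoring types named in the description; the identifiers are hypothetical.
ALL_MONITORING_TYPES: Tuple[str, ...] = (
    "jvm_cpu_usage",
    "jdbc_connections",
    "blocked_threads",
    "abnormal_requests",
    "gc",
    "slow_requests",
)


@dataclass(frozen=True)
class MonitoringInstruction:
    """Hypothetical container for a monitoring instruction."""
    server_name: str                       # system name, number, or IP address
    monitoring_type: Optional[str] = None  # None: no type given in the instruction


def resolve_monitoring_types(instruction: MonitoringInstruction) -> Tuple[str, ...]:
    """Return the monitoring types to run for this instruction."""
    if instruction.monitoring_type is None:
        # Only the server name was supplied: monitor all conditions by default.
        return ALL_MONITORING_TYPES
    if instruction.monitoring_type not in ALL_MONITORING_TYPES:
        raise ValueError(f"unknown monitoring type: {instruction.monitoring_type}")
    return (instruction.monitoring_type,)
```

An instruction built as `MonitoringInstruction("10.0.0.5")` would select every monitoring type, while `MonitoringInstruction("10.0.0.5", "gc")` would select GC monitoring only.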
  • The monitoring host obtains the performance indicators of the corresponding WebLogic server from the data collection host according to the monitoring instruction.
  • The performance indicators may include the number of completed requests, the number of JDBC connections, the number of idle threads, the number of execution threads, the number of hogging (exclusive) threads, JVM CPU usage, the pending number, and heap usage.
  • The number of completed requests refers to the number of requests completed within a preset time, and is used for abnormal request volume monitoring;
  • The number of JDBC connections refers to the number of Java database connections, and is used for JDBC connection number monitoring and slow request monitoring;
  • The number of idle threads refers to the number of currently idle threads, and is generally greater than 50;
  • The number of execution threads refers to the number of threads currently processing requests, and is used to assess the load on WebLogic;
  • The number of hogging threads refers to the number of threads that have not completed their request after more than 30 s, and is used for blocked thread monitoring;
  • JVM CPU usage is the CPU usage of the Java virtual machine, and is used for JVM CPU usage monitoring;
  • The pending number refers to the number of waiting requests, and is used for slow request monitoring;
  • Heap usage is the heap memory usage, and is used for GC monitoring.
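The relationship between performance indicators and monitoring types listed above can be sketched as a small lookup table. This is an illustrative sketch only: the `PerformanceSample` class, its field names, and the `INDICATORS_BY_MONITORING_TYPE` mapping are assumptions introduced here, not names from the application.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class PerformanceSample:
    """Hypothetical record of one set of indicators reported by a WebLogic server."""
    timestamp: float          # seconds since some epoch
    completed_requests: int
    jdbc_connections: int
    idle_threads: int
    executing_threads: int
    hogging_threads: int
    jvm_cpu_usage: float      # percent
    pending_requests: int
    heap_usage: float         # percent


# Which indicators each monitoring type reads, following the list above.
INDICATORS_BY_MONITORING_TYPE: Dict[str, Tuple[str, ...]] = {
    "abnormal_requests": ("completed_requests",),
    "jdbc_connections": ("jdbc_connections",),
    "blocked_threads": ("hogging_threads",),
    "jvm_cpu_usage": ("jvm_cpu_usage",),
    "slow_requests": ("pending_requests", "jdbc_connections"),
    "gc": ("heap_usage",),
}


def select_indicators(sample: PerformanceSample, monitoring_type: str) -> Dict[str, float]:
    """Pick only the indicators that the given monitoring type needs."""
    names = INDICATORS_BY_MONITORING_TYPE[monitoring_type]
    return {name: getattr(sample, name) for name in names}
```

Keeping the mapping in data rather than in code makes it straightforward to add a monitoring type without touching the selection logic.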
  • Step S20: The monitoring host determines a corresponding monitoring template according to the monitoring instruction, calculates and integrates the performance indicators through the monitoring template, and detects whether the calculated and integrated performance indicators meet a corresponding preset alarm rule;
  • Step S30: If the calculated and integrated performance indicators meet the corresponding preset alarm rule, the monitoring host performs an alarm reminder.
  • After obtaining the performance indicators, the monitoring host determines the corresponding monitoring template according to the monitoring instruction.
  • Through the monitoring template, it calculates and integrates the obtained performance indicators and detects whether the calculated and integrated performance indicators meet the corresponding preset alarm rule. If they do, an alarm reminder is performed to prompt the corresponding operation and maintenance personnel to carry out maintenance checks in a timely manner, or to notify the responsible person of the abnormal situation and prompt them to resolve it. If they do not, no alarm reminder is performed.
  • When obtaining the performance indicators, the monitoring host can obtain all performance indicators of the corresponding WebLogic server, then determine the corresponding monitoring template according to the monitoring instruction, and then use the monitoring template to select the relevant indicators from them for calculation and integration.
  • Alternatively, the performance indicators that need to be obtained may be determined first according to the monitoring instruction. For example, when the monitoring instruction is for JVM CPU usage monitoring, only the JVM CPU usage data for a period of time needs to be obtained, and the corresponding monitoring template (such as the JVM CPU usage monitoring template) is then used to calculate and integrate that JVM CPU usage data.
  • the application provides a method for monitoring and alarming a WebLogic server.
  • the method for monitoring and alarming a WebLogic server is applied to a monitoring and alarming system.
  • the monitoring and alarming system includes a monitoring host, a data collection host, and multiple WebLogic servers.
  • The monitoring host has a built-in monitoring template set.
  • The monitoring template set includes multiple monitoring templates.
  • This application does not need a monitoring module to be installed on each WebLogic server, which makes deployment more convenient. At the same time, it is not necessary to go to the corresponding WebLogic server and check its monitoring data to determine whether an abnormality has occurred, which improves the timeliness of abnormality monitoring alarms.
  • Step S20 may include:
  • Step S21: If the monitoring template is a JVM CPU usage monitoring template, a JDBC connection number monitoring template, a blocked thread monitoring template, an abnormal request volume monitoring template, or a GC monitoring template, use the monitoring template to calculate the average value of the corresponding indicator among the performance indicators within a preset time, and detect whether the average value is greater than a corresponding first preset threshold;
  • If the average value is greater than the corresponding first preset threshold, step S30 is performed: the monitoring host performs an alarm reminder.
  • After the monitoring host determines the corresponding monitoring template according to the monitoring instruction, if the monitoring template is a JVM CPU usage monitoring template, a JDBC connection number monitoring template, a blocked thread monitoring template, an abnormal request volume monitoring template, or a GC monitoring template, the monitoring template calculates the average value of the corresponding indicator among the performance indicators within a preset time and detects whether the average value is greater than the corresponding first preset threshold. If the average value is greater than the corresponding first preset threshold, an alarm reminder is performed. If the average value is less than or equal to the corresponding first preset threshold, no alarm reminder is performed.
  • The preset alarm rule corresponding to the JVM CPU usage monitoring template is that an alarm is triggered if the average JVM CPU usage within 5 minutes is higher than 80%. If the monitoring template is the JVM CPU usage monitoring template, then when JVM CPU usage is monitored through it, the JVM CPU usage data for the 5 minutes before the current time can be selected from the obtained performance indicators and averaged, and the average is checked against 80%; if it is greater, an alarm reminder is performed.
  • The preset alarm rule corresponding to the JDBC connection number monitoring template is that an alarm is triggered if the number of JDBC connections exceeds 10 within 5 minutes. If the monitoring template is the JDBC connection number monitoring template, then when the JDBC connection number is monitored through it, the JDBC connection data for the 5 minutes before the current time can be selected from the obtained performance indicators and averaged, and the average is checked against 10; if it is greater, an alarm reminder is performed.
  • The preset alarm rule corresponding to the blocked thread monitoring template is that an alarm is triggered if the number of hogging threads is greater than 5 within 5 minutes. If the monitoring template is the blocked thread monitoring template, the hogging thread data for the 5 minutes before the current time can be selected from the obtained performance indicators and averaged, and the average is checked against 5; if it is greater, an alarm reminder is performed.
  • The preset alarm rule corresponding to the abnormal request volume monitoring template is that an alarm is triggered when the number of completed requests is higher than 10 within 5 minutes. If the monitoring template is the abnormal request volume monitoring template, then when the abnormal request volume is monitored through it, the completed request data for the 5 minutes before the current time can be selected from the obtained performance indicators and averaged, and the average is checked against 10; if it is greater, an alarm reminder is performed.
  • The preset alarm rule corresponding to the GC monitoring template is that an alarm is triggered when the heap usage is continuously higher than 80% within 10 minutes. If the monitoring template is the GC monitoring template, then when GC monitoring is performed through it, the heap usage data for the 5 minutes before the current time can be selected from the obtained performance indicators and averaged, and the average is checked against 80%; if it is greater, an alarm reminder is performed.
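The windowed-average rules above all follow one pattern: select the samples from the preset time before the current moment, average them, and compare the average with a threshold. A minimal sketch of that pattern follows. The names `AverageRule` and `average_exceeds` are hypothetical, samples are assumed to be `(timestamp_in_seconds, value)` pairs, and treating an empty window as "no alarm" is an assumption of this sketch rather than something the text specifies.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class AverageRule:
    """Hypothetical alarm rule: average over a time window compared with a threshold."""
    window_seconds: float
    threshold: float


# JVM CPU usage rule from the text: average over 5 minutes higher than 80%.
JVM_CPU_RULE = AverageRule(window_seconds=5 * 60, threshold=80.0)


def average_exceeds(samples: Iterable[Tuple[float, float]],
                    rule: AverageRule,
                    now: float) -> bool:
    """Return True if the windowed average is strictly greater than the threshold."""
    window_start = now - rule.window_seconds
    recent = [value for ts, value in samples if window_start <= ts <= now]
    if not recent:
        # Assumption: with no data in the window there is nothing to alarm on.
        return False
    return sum(recent) / len(recent) > rule.threshold
```

The other average-based templates would differ only in the indicator fed in and in the `AverageRule` values, for example a threshold of 10 for the JDBC connection number.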
  • Step S20 may further include:
  • Step S22: If the monitoring template is a slow request monitoring template, use the slow request monitoring template to detect whether the current pending number in the performance indicators is greater than a second preset threshold, and whether the current number of JDBC connections in the performance indicators is greater than a third preset threshold;
  • If the current pending number is greater than the second preset threshold and the current number of JDBC connections is greater than the third preset threshold, step S30 is performed: an alarm reminder is performed.
  • After the monitoring host determines the corresponding monitoring template according to the monitoring instruction, if the monitoring template is a slow request monitoring template, whose preset alarm rule is that an alarm is triggered when the pending number is higher than 5 and the number of JDBC connections exceeds 10, the real-time pending number and JDBC connection number (that is, the current pending number and the current JDBC connection number) can be taken from the obtained performance indicators, and it is detected whether the current pending number is greater than 5 and whether the current number of JDBC connections is greater than 10. If the current pending number is greater than 5 and the current number of JDBC connections is greater than 10, an alarm reminder is performed. If only one of the conditions is met, that is, only the current pending number is greater than 5 or only the current number of JDBC connections is greater than 10, no alarm reminder is performed.
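The slow request rule combines two real-time indicators with a logical AND. A short sketch under that reading is given below; the function name `slow_request_alarm` and its parameter names are hypothetical, and the defaults of 5 and 10 are the example values of the second and third preset thresholds from the text.

```python
def slow_request_alarm(pending: int,
                       jdbc_connections: int,
                       pending_threshold: int = 5,
                       jdbc_threshold: int = 10) -> bool:
    """Alarm only when BOTH current values exceed their thresholds."""
    return pending > pending_threshold and jdbc_connections > jdbc_threshold
```

With these defaults, `slow_request_alarm(6, 11)` would raise an alarm, while `slow_request_alarm(6, 10)` and `slow_request_alarm(5, 11)` would not, because only one condition holds in each case.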
  • The performance indicators of the WebLogic server are calculated and integrated through a monitoring template, for example by deciding whether to alarm based on the average value of a performance indicator within a preset time, or by performing a combined monitoring alarm on several performance indicators, instead of raising an alarm as soon as a single performance indicator exceeds a preset range. In this way, the application can achieve more accurate alarms and reduce the false alarm rate.
  • FIG. 4 is a schematic flowchart of a second embodiment of a monitoring and alarming method of a WebLogic server of the present application.
  • Step S30 may include:
  • Step S31: If the calculated and integrated performance indicators meet the corresponding preset alarm rule, the monitoring host determines a corresponding alarm level according to the monitoring template;
  • Step S32: The monitoring host determines a corresponding alarm mode according to the alarm level, and performs an alarm reminder according to that alarm mode.
  • Because abnormalities of the WebLogic server differ in urgency, different alarm levels can be set according to the types of abnormalities monitored by the different monitoring templates.
  • When an abnormality is detected, the alarm is performed according to the alarm level corresponding to the monitoring template.
  • the alarm levels can include critical (very severe), major (warning), warning (info).
  • The correspondence between monitoring templates and alarm levels can be: the JVM CPU usage monitoring template corresponds to a critical-level alarm, the JDBC connection number monitoring template corresponds to a major-level alarm, the blocked thread monitoring template corresponds to a critical-level alarm, the abnormal request volume monitoring template corresponds to a warning-level alarm, the GC monitoring template corresponds to a warning-level alarm, and the slow request monitoring template corresponds to a critical-level alarm.
  • The correspondence between monitoring templates and alarm levels is not limited to the above and can be set according to the actual situation.
  • The alarm mode may include the alarm form, the number of alarms, and the alarm objects.
  • The alarm forms include email reminders, SMS reminders, and phone reminders.
  • the critical level only corresponds to the email reminder, the reminder number is 5 times, and the alert object is the first responsible person;
  • The major level corresponds to email reminders and SMS reminders, 5 of each, and the alert object is the first responsible person;
  • The warning level corresponds to email reminders and SMS reminders, 10 of each, and the alert objects are the first and second responsible persons;
  • The critical level corresponds to email reminders, SMS reminders, and phone reminders, with 10 email reminders, 10 SMS reminders, and 2 phone reminders, and the alert objects are the first and second responsible persons.
  • When an alarm has two alert objects, the first responsible person can be reminded first; if that person does not respond, the reminder can switch to the second responsible person. For example, if 10 emails have been sent for a single alarm and none has been read, the alarm is automatically sent to the second responsible person.
  • The correspondence between alarm levels and alarm modes can be set according to the actual situation and is not specifically limited here.
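The level-to-mode correspondence and the hand-over to the second responsible person can be sketched as a lookup table plus one decision function. This is an assumption-laden illustration: `AlarmMode`, `ALARM_MODES`, and `next_recipient` are hypothetical names, only the three levels whose modes are stated unambiguously in the text (major, warning, critical) are included, and the hand-over condition of 10 unread emails is taken from the example above.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class AlarmMode:
    """Hypothetical alarm mode: reminder counts per channel and who is alerted."""
    reminders: Dict[str, int]       # channel name -> number of reminders
    recipients: Tuple[str, ...]     # "first" and/or "second" responsible person


ALARM_MODES: Dict[str, AlarmMode] = {
    "major": AlarmMode({"email": 5, "sms": 5}, ("first",)),
    "warning": AlarmMode({"email": 10, "sms": 10}, ("first", "second")),
    "critical": AlarmMode({"email": 10, "sms": 10, "phone": 2}, ("first", "second")),
}


def next_recipient(mode: AlarmMode,
                   emails_sent: int,
                   emails_read: int,
                   escalate_after: int = 10) -> str:
    """Remind the first responsible person until the unread-email limit is reached."""
    can_escalate = "second" in mode.recipients
    if can_escalate and emails_sent >= escalate_after and emails_read == 0:
        return "second"
    return "first"
```

Because the text notes that the correspondence can be set according to the actual situation, holding it in a table like `ALARM_MODES` keeps it configurable without code changes.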
  • Some monitored performance indicators may meet the preset alarm conditions during certain periods of time without indicating an abnormal situation. To avoid alarm reminders in such cases, the following steps may be included before step S32:
  • Step S33: The monitoring host determines a corresponding alarm mask time period according to the alarm level, and detects whether the current time is within the alarm mask time period;
  • If the current time is not within the alarm mask time period, step S32 is performed: the monitoring host determines a corresponding alarm mode according to the alarm level, and performs an alarm reminder according to that alarm mode;
  • If the current time is within the alarm mask time period, step S34 is performed: the monitoring host does not perform an alarm reminder.
  • The corresponding alarm mask time period is determined according to the alarm level, and it is detected whether the current time (that is, the time when the abnormality is detected) is within the alarm mask time period. If the current time is not within the alarm mask time period, the subsequent step S32 is performed. If the current time is within the alarm mask time period, no alarm reminder is performed.
  • The alarm mask time period can be set according to actual needs, or a default setting can be adopted. In specific embodiments, the alarm mask time period may also be determined directly from the monitoring template, without first determining the alarm level corresponding to the monitoring template and then determining the alarm mask time period from the alarm level.
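The mask check in steps S33 and S34 amounts to a time-of-day range test before alerting. The sketch below illustrates it under stated assumptions: the names `MASK_PERIODS`, `in_mask_period`, and `should_alert` are hypothetical, the 01:00 to 05:00 mask for the warning level is an invented example value, and a level with no configured mask is assumed never to be masked.

```python
from datetime import time
from typing import Dict, Tuple

# Hypothetical mask configuration: alarm level -> (start, end) time of day.
MASK_PERIODS: Dict[str, Tuple[time, time]] = {
    "warning": (time(1, 0), time(5, 0)),
}


def in_mask_period(now: time, period: Tuple[time, time]) -> bool:
    """Return True if `now` falls in [start, end), allowing periods that cross midnight."""
    start, end = period
    if start <= end:
        return start <= now < end
    # The period wraps past midnight, e.g. 22:00 to 06:00.
    return now >= start or now < end


def should_alert(level: str, now: time) -> bool:
    """Step S33 sketch: alert unless the current time is inside the level's mask period."""
    period = MASK_PERIODS.get(level)
    return period is None or not in_mask_period(now, period)
```

Keying the table by monitoring template instead of alarm level would give the variant mentioned in the text, where the mask period is determined directly from the template.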
  • Different alarm levels are set according to the urgency of the abnormality of the WebLogic server, thereby achieving graded alarms and making the monitoring and alarming system more standardized and user-friendly.
  • FIG. 5 is a schematic flowchart of a third embodiment of a method for monitoring and alarming a WebLogic server of the present application.
  • the method for monitoring and alarming of a WebLogic server further includes:
  • Step S40: The monitoring host records a corresponding alarm event in a preset alarm log according to the alarm situation; the alarm event includes the abnormal time, the abnormality type, and the alarm mode.
  • To help operation and maintenance personnel understand the specifics of an alarm, the monitoring host also records the alarm situation after the alarm is performed. Specifically, an alarm log is set up in the monitoring host in advance to record alarm events. Because the monitoring host monitors multiple different WebLogic servers, multiple alarm logs can also be set up so that alarms are recorded separately for each server. When the monitoring host raises an alarm, the corresponding alarm event is recorded in the alarm log.
  • The alarm event may include the abnormal time (the time when the abnormality occurred on the WebLogic server), the abnormality type, and the alarm mode.
  • Step S50: The monitoring host determines the high-frequency alarm time and/or the high-frequency abnormality type within a preset statistical period according to the preset alarm log, and generates a corresponding high-frequency alarm report from the monitoring data corresponding to the high-frequency alarm time and/or the high-frequency abnormality type.
  • The monitoring host may also compute statistics such as the high-frequency alarm time and the high-frequency abnormality type within a certain period of time from the preset alarm log.
  • The high-frequency alarm time is the time at which alarms occur most frequently. For example, if there were 300 alarms in June and 200 of them occurred between 9 am and 10 am, the high-frequency alarm time in June is 9 am to 10 am.
  • The statistical period can be set according to the actual situation, for example one month or one week per statistical cycle; the criterion for "high frequency" can likewise be set according to the actual situation.
  • The high-frequency abnormality type is the type of abnormality that occurs most often. For example, if there were 300 alarms in June and 200 of them were of the JVM CPU usage abnormality type, the high-frequency abnormality type in June is abnormal JVM CPU usage; similarly, the statistical period and the "high frequency" criterion can be set according to the actual situation.
  • The monitoring host can also provide related reports for operation and maintenance personnel to view. For example, after computing the high-frequency alarm time and/or the high-frequency abnormality type, the monitoring host obtains the monitoring data corresponding to them and generates a corresponding high-frequency alarm report.
  • Operation and maintenance personnel can then carry out maintenance and related optimizations based on the report to ensure the normal operation of the WebLogic server.
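Steps S40 and S50 can be illustrated with a simple alarm event record and a frequency count over the statistical period. This is a minimal sketch, not the applicant's method: `AlarmEvent` and `high_frequency` are hypothetical names, alarm time is bucketed by hour of day, and the "high frequency" criterion is assumed here to be a minimum share of all alarms in the period (`min_share`), since the text leaves that criterion configurable.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class AlarmEvent:
    """Hypothetical alarm log entry with the three fields named in step S40."""
    abnormal_time: datetime
    abnormal_type: str
    alarm_mode: str


def high_frequency(events: Iterable[AlarmEvent],
                   start: datetime,
                   end: datetime,
                   min_share: float = 0.5) -> Tuple[Optional[int], Optional[str]]:
    """Return (hour of day, abnormality type) that dominate alarms in [start, end)."""
    in_period = [e for e in events if start <= e.abnormal_time < end]
    if not in_period:
        return (None, None)

    def dominant(counter: Counter):
        key, count = counter.most_common(1)[0]
        # Assumed criterion: the top bucket must hold at least min_share of alarms.
        return key if count / len(in_period) >= min_share else None

    hours = Counter(e.abnormal_time.hour for e in in_period)
    types = Counter(e.abnormal_type for e in in_period)
    return (dominant(hours), dominant(types))
```

With the June example from the text, 200 of 300 alarms between 9 am and 10 am gives a share of about 0.67, so hour 9 would be reported as the high-frequency alarm time under the default `min_share` of 0.5.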
  • the application also provides a monitoring and alarm device for a WebLogic server.
  • FIG. 6 is a schematic diagram of functional modules of a first embodiment of a monitoring and alarm device of a WebLogic server of the present application.
  • the monitoring and alarm device of the WebLogic server includes:
  • the obtaining module 10 is configured to, when the monitoring host receives the monitoring instruction, obtain the performance index of the corresponding WebLogic server through the data collection host according to the monitoring instruction;
  • the detection module 20 is configured for the monitoring host to determine a corresponding monitoring template according to the monitoring instruction, calculate and integrate the performance index through the monitoring template, and detect whether the calculated and integrated performance index meets a corresponding preset alarm rule;
  • the alarm module 30 is configured to, if the calculated and integrated performance index meets a corresponding preset alarm rule, the monitoring host performs an alarm reminder.
  • the virtual function modules of the monitoring and alarming device of the WebLogic server are stored in the memory 1005 of the monitoring host shown in FIG. 1 and are used to implement all functions of the computer-readable instructions. When the modules are executed by the processor 1001, the monitoring host can obtain the performance indexes of each WebLogic server through the data collection host, calculate and integrate the performance indexes through a monitoring template, and perform an alarm reminder when it detects that the calculated and integrated performance indexes meet the corresponding preset alarm rule.
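  • the monitoring template includes a Java virtual machine central processing unit (JVM CPU) usage monitoring template, a Java Database Connectivity (JDBC) connection number monitoring template, a blocked thread monitoring template, an abnormal request volume monitoring template, a garbage collection (GC) monitoring template, and a slow request monitoring template.
  • the detection module 20 is specifically configured to, if the monitoring template is a JVM CPU usage monitoring template, a JDBC connection number monitoring template, a blocked thread monitoring template, an abnormal request volume monitoring template, or a GC monitoring template, calculate through the monitoring template the average value of the corresponding indicator in the performance index within a preset time, and detect whether the average value is greater than a corresponding first preset threshold;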
  • the alarm module 30 is specifically configured to, if the average value is greater than a corresponding first preset threshold, the monitoring host performs an alarm reminder.
  • the detection module 20 is specifically configured to, if the monitoring template is a slow request monitoring template, detect through the slow request monitoring template whether the current pending count in the performance index is greater than a second preset threshold, and detect whether the current number of JDBC connections in the performance index is greater than a third preset threshold;
  • the alarm module 30 is specifically configured to: if the current pending count is greater than the second preset threshold and the current number of JDBC connections is greater than the third preset threshold, the monitoring host performs an alarm reminder.
  • the alarm module 30 includes:
  • a level determining unit configured to: if the calculated and integrated performance index meets a corresponding preset alarm rule, the monitoring host determines a corresponding alarm level according to the monitoring template;
  • An alarm reminding unit is used for the monitoring host to determine a corresponding alarm mode according to the alarm level, and to perform an alarm reminder according to the corresponding alarm mode.
  • the alarm module 30 further includes:
  • a time detection unit configured to determine, by the monitoring host, a corresponding alarm mask time period according to the alarm level, and detect whether the current time is within the alarm mask time period;
  • the alarm module 30 is specifically configured to: if the current time is within the alarm mask time period, the monitoring host does not perform an alarm reminder.
  • the monitoring and alarm device of the WebLogic server further includes:
  • a reporting module for the monitoring host to determine a high-frequency alarm time and/or a high-frequency anomaly type within a preset statistical period according to the preset alarm log, and to generate a corresponding high-frequency alarm report from the monitoring data corresponding to the high-frequency alarm time and/or the high-frequency anomaly type.
  • each module in the monitoring and alarming device of the WebLogic server corresponds to the steps in the embodiment of the monitoring and alarming method of the WebLogic server, and the functions and implementation processes thereof will not be repeated here.
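As an illustration of how the level determining unit, the alarm reminding unit, and the time detection unit of the alarm module 30 could fit together, the following sketch maps a monitoring template to an alarm level, the level to an alarm mode, and suppresses the reminder inside the mask time period. All levels, modes, time windows, and names (`TEMPLATE_LEVEL`, `LEVEL_MODE`, `LEVEL_MASK`, `decide_alarm`) are assumptions made for the example, not values given in the application.

```python
from datetime import time

# Hypothetical alarm level per monitoring template.
TEMPLATE_LEVEL = {"jvm_cpu": "critical", "slow_request": "major", "gc": "minor"}
# Hypothetical alarm mode per alarm level.
LEVEL_MODE = {"critical": "phone", "major": "sms", "minor": "email"}
# Hypothetical mask time periods per level as (start, end); critical is never masked.
LEVEL_MASK = {"major": (time(0, 0), time(6, 0)), "minor": (time(22, 0), time(7, 0))}

def in_window(now, window):
    """Return True if `now` lies in [start, end), allowing windows that cross midnight."""
    start, end = window
    if start <= end:
        return start <= now < end
    return now >= start or now < end

def decide_alarm(template, now):
    """Return the alarm mode to use, or None if the alarm is masked at `now`."""
    level = TEMPLATE_LEVEL[template]
    window = LEVEL_MASK.get(level)
    if window is not None and in_window(now, window):
        return None  # current time is inside the mask time period: no reminder
    return LEVEL_MODE[level]
```

For example, `decide_alarm("gc", time(23, 30))` returns `None` because the assumed mask period for minor alarms runs from 22:00 to 07:00, while `decide_alarm("jvm_cpu", time(3, 0))` returns `"phone"`.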
  • the application also provides a monitoring and alarming system of a WebLogic server.
  • the monitoring and alarming system of the WebLogic server includes a monitoring host, a data collection host, and multiple WebLogic servers.
  • the monitoring and alarm system also includes a memory, a processor, and computer-readable instructions stored in the memory and executable by the processor; when the computer-readable instructions are executed by the processor, the steps of the method for monitoring and alarming of the WebLogic server according to any one of the above embodiments are implemented.
  • the present application also provides a computer storage medium, and the computer-readable storage medium may be a non-volatile readable storage medium.
  • the computer storage medium stores computer-readable instructions. When the computer-readable instructions are executed by a processor, the steps of the method for monitoring and alarming a WebLogic server according to any one of the foregoing embodiments are implemented.

Abstract

The present application provides a monitoring alarm method for a WebLogic server, the method being applied to a monitoring alarm system. The system comprises a monitoring host, a data collecting host and a plurality of WebLogic servers. The method comprises: when receiving a monitoring instruction, the monitoring host acquiring performance indexes of a corresponding WebLogic server by means of the data collecting host according to the monitoring instruction; determining a corresponding monitoring template according to the monitoring instruction, computing and integrating the performance indexes by means of the monitoring template, and detecting whether computed and integrated performance indexes meet a corresponding pre-set alarm rule; and if the computed and integrated performance indexes meet the corresponding pre-set alarm rule, giving an alarm reminder. The present application further provides a monitoring alarm device and system for a WebLogic server, and a computer storage medium. The present application can improve the timeliness in raising an exception monitoring alarm while realizing the simplification of monitoring deployment of a WebLogic server.

Description

WebLogic server monitoring and alarming method, device, system and computer storage medium
This application claims priority to Chinese patent application No. 201810800086.4, filed with the Chinese Patent Office on July 18, 2018 and entitled "Monitoring and Alarming Method, Device, System, and Computer Storage Medium of WebLogic Server", the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the technical field of monitoring and alarming of a WebLogic server, and in particular, to a monitoring and alarming method, device, system, and computer storage medium of a WebLogic server.
Background
WebLogic is a Java application server for developing, integrating, deploying, and managing large distributed web applications, network applications, and database applications. It brings the dynamic capabilities of Java and the security of the Java Enterprise standard to the development, integration, deployment, and management of large network applications. It offers easy development, strong scalability, and high flexibility and reliability, and is one of the best J2EE (Java 2 Platform, Enterprise Edition) tools currently on the market.
In a WebLogic system, applications are deployed on a large number of WebLogic servers, so the running state of the WebLogic servers may directly affect the handling of front-end business. Monitoring WebLogic servers for abnormalities is therefore important. At present, monitoring a WebLogic server requires configuring a corresponding monitoring module on each WebLogic server, which makes deployment relatively complicated. In addition, when staff want to check whether a given WebLogic server is running abnormally, they must view the monitoring data on that server before they can judge whether an abnormality has occurred, so abnormalities often cannot be discovered in time. In the prior art, therefore, the monitoring deployment of WebLogic servers is complicated and abnormality alarms are not timely.
Summary of the Invention
The main purpose of this application is to provide a monitoring and alarming method, device, system, and computer storage medium for a WebLogic server, which aim to improve the timeliness of abnormality alarms while simplifying the monitoring deployment of the WebLogic server.
To achieve the above purpose, the present application provides a monitoring and alarming method for a WebLogic server. The method is applied to a monitoring and alarming system that includes a monitoring host, a data collection host, and multiple WebLogic middleware servers. The monitoring host is communicatively connected to the data collection host, and the data collection host is communicatively connected to each of the WebLogic servers. The method includes:
when the monitoring host receives a monitoring instruction, obtaining, according to the monitoring instruction, the performance index of the corresponding WebLogic server through the data collection host;
determining, by the monitoring host, a corresponding monitoring template according to the monitoring instruction, calculating and integrating the performance index through the monitoring template, and detecting whether the calculated and integrated performance index meets a corresponding preset alarm rule;
if the calculated and integrated performance index meets the corresponding preset alarm rule, performing, by the monitoring host, an alarm reminder.
In addition, to achieve the above purpose, the present application further provides a monitoring and alarming device for a WebLogic server. The monitoring and alarming device includes:
an acquisition module, configured to, when the monitoring host receives a monitoring instruction, obtain the performance index of the corresponding WebLogic server through the data collection host according to the monitoring instruction;
a detection module, configured for the monitoring host to determine a corresponding monitoring template according to the monitoring instruction, calculate and integrate the performance index through the monitoring template, and detect whether the calculated and integrated performance index meets a corresponding preset alarm rule;
an alarm module, configured for the monitoring host to perform an alarm reminder if the calculated and integrated performance index meets the corresponding preset alarm rule.
In addition, to achieve the above purpose, the present application further provides a monitoring and alarming system for a WebLogic server. The system includes a monitoring host, a data collection host, and multiple WebLogic servers, and further includes a memory, a processor, and computer-readable instructions stored in the memory and executable by the processor. When the computer-readable instructions are executed by the processor, the steps of the monitoring and alarming method of the WebLogic server described above are implemented.
In addition, to achieve the above purpose, the present application further provides a computer storage medium storing computer-readable instructions. When the computer-readable instructions are executed by a processor, the steps of the monitoring and alarming method of the WebLogic server described above are implemented.
The present application provides a monitoring and alarming method, device, system, and computer storage medium for a WebLogic server. The method is applied to a monitoring and alarming system that includes a monitoring host, a data collection host, and multiple WebLogic servers. The monitoring host has a built-in monitoring template set that includes multiple monitoring templates. On receiving a monitoring instruction, the monitoring host obtains the performance index of the corresponding WebLogic server directly through the data collection host in the system, calculates and integrates the obtained performance index through a monitoring template, and detects whether the calculated and integrated performance index meets the corresponding preset alarm rule. If the rule is met, the monitoring host performs an alarm reminder to prompt the responsible operation and maintenance personnel to carry out maintenance and inspection in time. The present application therefore does not require a monitoring module to be installed on each WebLogic server, which makes deployment more convenient. It also removes the need to view the monitoring data on each WebLogic server separately to judge whether an abnormality has occurred, which improves the timeliness of abnormality alarms.
Brief Description of the Drawings
FIG. 1 is a schematic structural diagram of a terminal in a hardware operating environment according to an embodiment of the present application;
FIG. 2 is a schematic flowchart of a first embodiment of the monitoring and alarming method of a WebLogic server of the present application;
FIG. 3 is a schematic architecture diagram of a monitoring and alarming system according to an embodiment of the present application;
FIG. 4 is a schematic flowchart of a second embodiment of the monitoring and alarming method of a WebLogic server of the present application;
FIG. 5 is a schematic flowchart of a third embodiment of the monitoring and alarming method of a WebLogic server of the present application;
FIG. 6 is a schematic diagram of the functional modules of a first embodiment of the monitoring and alarming device of a WebLogic server of the present application.
The implementation, functional features, and advantages of the present application will be further described with reference to the embodiments and the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are only used to explain the present application and are not intended to limit it.
Please refer to FIG. 1, which is a schematic structural diagram of a terminal in a hardware operating environment according to an embodiment of the present application.
In the embodiments of the present application, the terminal is a monitoring host, which may be a terminal device such as a PC, a tablet computer, or a portable computer. The monitoring host has a built-in monitoring template set that includes multiple monitoring templates, for example a JVM CPU usage monitoring template, a JDBC connection number monitoring template, a blocked thread monitoring template, an abnormal request volume monitoring template, a GC monitoring template, and a slow request monitoring template.
As shown in FIG. 1, the terminal may include a processor 1001 (for example a CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 is used to implement connection and communication between these components. The user interface 1003 may include a display and an input unit such as a keyboard, and may optionally also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (such as a Wi-Fi interface). The memory 1005 may be a high-speed RAM memory or a non-volatile memory, such as disk storage. The memory 1005 may optionally also be a storage device independent of the processor 1001.
Those skilled in the art will understand that the terminal structure shown in FIG. 1 does not limit the terminal, which may include more or fewer components than shown, combine certain components, or use a different arrangement of components.
With continued reference to FIG. 1, the memory 1005, as a computer storage medium, may include an operating system, a network communication module, and computer-readable instructions. In the terminal shown in FIG. 1, the network communication module may be used to connect to the data collection host and exchange data with it, and the processor 1001 may be used to call the computer-readable instructions stored in the memory 1005 and execute the monitoring and alarming method of the WebLogic server provided by the embodiments of the present invention.
Based on the above hardware structure, the embodiments of the monitoring and alarming method of the WebLogic server of the present application are proposed.
The present application provides a monitoring and alarming method for a WebLogic server.
Please refer to FIG. 2, which is a schematic flowchart of a first embodiment of the monitoring and alarming method of a WebLogic server of the present application.
In this embodiment, the monitoring and alarming method of the WebLogic server is applied to a monitoring and alarming system that includes a monitoring host, a data collection host, and multiple WebLogic servers. The monitoring host is communicatively connected to the data collection host, and the data collection host is communicatively connected to each of the WebLogic servers. The method includes:
Step S10: when the monitoring host receives a monitoring instruction, it obtains the performance index of the corresponding WebLogic server through the data collection host according to the monitoring instruction;
In this embodiment, the monitoring and alarming method of the WebLogic server is applied to a monitoring and alarming system. Specifically, please refer to FIG. 3, which is a schematic architecture diagram of the monitoring and alarming system according to an embodiment of the present application. The system includes a monitoring host, a data collection host, and multiple WebLogic servers. The monitoring host is communicatively connected to the data collection host, and the data collection host is communicatively connected to each of the WebLogic servers. The monitoring host has a built-in monitoring template set that includes multiple monitoring templates, for example a JVM CPU (Java Virtual Machine Central Processing Unit) usage monitoring template, a JDBC (Java DataBase Connectivity) connection number monitoring template, a blocked thread monitoring template, an abnormal request volume monitoring template, a GC (Garbage Collection) monitoring template, and a slow request monitoring template. Each monitoring template is configured with corresponding monitoring alarm rules. The monitoring host obtains the performance index of the corresponding WebLogic server from the data collection host according to the received monitoring instruction, then calculates and integrates that performance index through a monitoring template to detect whether an abnormal situation exists. The data collection host receives and stores the performance indexes reported by each WebLogic server and, on receiving a monitoring instruction from the monitoring host, sends the performance index of the corresponding WebLogic server to the monitoring host. With this monitoring and alarming system in place, the monitoring host can obtain the performance indexes of each WebLogic server directly through the data collection host, without anyone having to view the monitoring data on each WebLogic server separately to judge whether an abnormality has occurred, which improves the timeliness of abnormality alarms.
In this embodiment, the monitoring instruction received by the monitoring host may be sent by a management terminal or may be a preset timed monitoring instruction. The monitoring instruction may include a monitoring type and a WebLogic server name. The monitoring type may include, but is not limited to, JVM CPU usage monitoring, JDBC connection number monitoring, blocked thread monitoring, abnormal request volume monitoring, GC monitoring, and slow request monitoring. The WebLogic server name may be the system name or number assigned to the server in the monitoring and alarming system, or its IP address. The monitoring instruction may also include only the WebLogic server name, in which case all abnormal conditions are monitored by default.
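One possible in-memory representation of such a monitoring instruction is sketched below. The class name `MonitoringInstruction`, the field names, and the string identifiers for the monitoring types are hypothetical; the only behavior taken from the text is that an instruction naming just a server defaults to monitoring every type.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical identifiers for the six monitoring types named in the text.
ALL_MONITOR_TYPES = (
    "jvm_cpu", "jdbc_connections", "blocked_threads",
    "abnormal_requests", "gc", "slow_request",
)

@dataclass(frozen=True)
class MonitoringInstruction:
    server: str                          # system name, number, or IP address
    monitor_type: Optional[str] = None   # None means "monitor everything"

    def types_to_check(self) -> Tuple[str, ...]:
        """Return the monitoring types this instruction asks for."""
        if self.monitor_type is None:
            return ALL_MONITOR_TYPES
        if self.monitor_type not in ALL_MONITOR_TYPES:
            raise ValueError(f"unknown monitoring type: {self.monitor_type}")
        return (self.monitor_type,)
```

With this shape, `MonitoringInstruction("wls-01").types_to_check()` yields all six types, while `MonitoringInstruction("10.0.0.5", "gc").types_to_check()` yields only `("gc",)`.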
Then, the monitoring host obtains the performance index of the corresponding WebLogic server from the data collection host according to the monitoring instruction. The performance index may include the number of completed requests, the number of JDBC connections, the number of idle threads, the number of execute threads, the number of hogging threads, JVM CPU usage, the pending count, and heap usage. The number of completed requests is the number of requests processed within a preset time and is used for abnormal request volume monitoring. The number of JDBC connections is the number of Java database connections and is used for JDBC connection number monitoring and slow request monitoring. The number of idle threads is the number of currently idle threads, which is generally more than 50. The number of execute threads is the number of threads currently processing requests and is used to judge the load on WebLogic. The number of hogging threads is the number of threads that have not completed their request after more than 30 s and is used for blocked thread monitoring. JVM CPU usage is the CPU usage of the Java virtual machine and is used for JVM CPU usage monitoring. The pending count is the number of requests waiting to be processed and is used for slow request monitoring. Heap usage is the heap memory usage and is used for GC monitoring.
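The correspondence between monitoring types and the indicators they consume, as listed above, can be written down as a small lookup table. The key and field names below are hypothetical, and the sample values are invented for illustration.

```python
# Which indicators each monitoring type reads, following the paragraph above.
METRICS_BY_MONITOR_TYPE = {
    "abnormal_requests": ("completed_requests",),
    "jdbc_connections": ("jdbc_connections",),
    "slow_request": ("pending_requests", "jdbc_connections"),
    "blocked_threads": ("hogging_threads",),
    "jvm_cpu": ("jvm_cpu_usage",),
    "gc": ("heap_usage",),
}

# Invented sample of one server's performance index.
sample = {
    "completed_requests": 1800,
    "jdbc_connections": 40,
    "idle_threads": 64,
    "execute_threads": 12,
    "hogging_threads": 2,
    "jvm_cpu_usage": 37.5,   # percent
    "pending_requests": 12,
    "heap_usage": 61.0,      # percent
}

def select_metrics(performance_index, monitor_type):
    """Pick only the indicators that the given monitoring type needs."""
    names = METRICS_BY_MONITOR_TYPE[monitor_type]
    return {name: performance_index[name] for name in names}
```

Idle and execute thread counts are collected but, in this sketch, not tied to a template, since the text relates them to general load assessment rather than to one of the six templates.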
Step S20: the monitoring host determines a corresponding monitoring template according to the monitoring instruction, calculates and integrates the performance index through the monitoring template, and detects whether the calculated and integrated performance index meets a corresponding preset alarm rule;
If the calculated and integrated performance index meets the corresponding preset alarm rule, step S30 is performed: the monitoring host performs an alarm reminder.
In this embodiment, after the monitoring host receives the monitoring instruction and obtains the performance index of the corresponding WebLogic server through the data collection host, it determines the corresponding monitoring template according to the monitoring instruction, calculates and integrates the obtained performance index through that template, and detects whether the calculated and integrated performance index meets the corresponding preset alarm rule. If it does, the monitoring host performs an alarm reminder to prompt the responsible operation and maintenance personnel to carry out maintenance and inspection in time, or to notify the responsible person of the abnormal situation and prompt them to resolve it. If the calculated and integrated performance index does not meet the corresponding preset alarm rule, no alarm reminder is performed.
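The flow of steps S10 to S30 can be sketched as a single function with the data collection host and the alarm channel passed in as callables. This is a minimal sketch under assumptions: the function and parameter names are hypothetical, the only template shown is a slow request template (alarm when both the pending count and the JDBC connection count exceed their thresholds), and the threshold values are invented.

```python
# Hypothetical second and third preset thresholds for the slow request template.
PENDING_THRESHOLD = 20
JDBC_THRESHOLD = 100

TEMPLATES = {
    "slow_request": {
        # Pick the indicators this template needs from the full performance index.
        "integrate": lambda m: (m["pending_requests"], m["jdbc_connections"]),
        # The rule is met only when both values exceed their thresholds.
        "rule": lambda v: v[0] > PENDING_THRESHOLD and v[1] > JDBC_THRESHOLD,
    },
}

def run_monitoring(instruction, fetch_metrics, notify, templates=TEMPLATES):
    """Run one monitoring pass; return True if an alarm reminder was sent."""
    metrics = fetch_metrics(instruction["server"])        # step S10
    template = templates[instruction["monitor_type"]]     # step S20
    integrated = template["integrate"](metrics)
    if template["rule"](integrated):
        notify(instruction["server"], instruction["monitor_type"], integrated)  # step S30
        return True
    return False

# Stand-ins for the data collection host and the alarm channel.
sent = []

def fake_fetch(server):
    return {"pending_requests": 30, "jdbc_connections": 120}

def record(server, monitor_type, value):
    sent.append((server, monitor_type, value))

alarmed = run_monitoring({"server": "wls-01", "monitor_type": "slow_request"}, fake_fetch, record)
```

Passing the fetch and notify functions as parameters keeps the sketch independent of any particular transport between the monitoring host and the data collection host.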
It should be noted that, when obtaining the performance index, the monitoring host may obtain all performance indicators of the corresponding WebLogic server, then determine the corresponding monitoring template according to the monitoring instruction, and then use the monitoring template to select the relevant indicators for calculation and integration. In a specific embodiment, the monitoring host may instead first determine from the monitoring instruction which indicators need to be obtained. For example, when the monitoring instruction is to monitor JVM CPU usage, the monitoring host may obtain only the JVM CPU usage data for a period of time, and then calculate and integrate that data through the corresponding monitoring template (such as the JVM CPU usage monitoring template).
The present application provides a monitoring and alarming method for a WebLogic server. The method is applied to a monitoring and alarming system that includes a monitoring host, a data collection host, and multiple WebLogic servers. The monitoring host has a built-in monitoring template set that includes multiple monitoring templates. On receiving a monitoring instruction, the monitoring host obtains the performance index of the corresponding WebLogic server directly through the data collection host in the system, calculates and integrates the obtained performance index through a monitoring template, and detects whether the calculated and integrated performance index meets the corresponding preset alarm rule. If the rule is met, the monitoring host performs an alarm reminder to prompt the responsible operation and maintenance personnel to carry out maintenance and inspection in time. The present application therefore does not require a monitoring module to be installed on each WebLogic server, which makes deployment more convenient. It also removes the need to view the monitoring data on each WebLogic server separately to judge whether an abnormality has occurred, which improves the timeliness of abnormality alarms.
Further, the monitoring templates include a JVM CPU usage monitoring template, a JDBC connection count monitoring template, a blocked thread monitoring template, an abnormal request volume monitoring template, a GC monitoring template, and a slow request monitoring template. Different monitoring templates monitor different performance indicators, and their calculation and integration rules and corresponding preset alarm rules also differ. Therefore, step S20 may include:
Step S21: if the monitoring template is the JVM CPU usage monitoring template, the JDBC connection count monitoring template, the blocked thread monitoring template, the abnormal request volume monitoring template, or the GC monitoring template, calculate, through the monitoring template, the average value of the corresponding indicator among the performance indicators within a preset time, and detect whether the average value is greater than the corresponding first preset threshold;
In this case, if the average value is greater than the corresponding first preset threshold, step S30 is executed: the monitoring host issues an alarm reminder.
In this embodiment, after the monitoring host determines the corresponding monitoring template according to the monitoring instruction, if that template is the JVM CPU usage monitoring template, the JDBC connection count monitoring template, the blocked thread monitoring template, the abnormal request volume monitoring template, or the GC monitoring template, the template is used to calculate the average value of the corresponding indicator within a preset time and to detect whether that average is greater than the corresponding first preset threshold. If the average is greater than the threshold, an alarm reminder is issued. If the average is less than or equal to the threshold, no alarm reminder is issued.
Specifically, the preset alarm rule for the JVM CPU usage monitoring template is that an alarm is triggered if the average JVM CPU usage stays above 80% for 5 minutes. When the JVM CPU usage monitoring template is used, the JVM CPU usage data from the 5 minutes before the current time is selected from the obtained performance indicators, its average over those 5 minutes is calculated, and the average is checked against 80%. If it is greater, an alarm reminder is issued.
The preset alarm rule for the JDBC connection count monitoring template is that an alarm is triggered if the number of JDBC connections exceeds 10 within 5 minutes. When the JDBC connection count monitoring template is used, the JDBC connection count data from the 5 minutes before the current time is selected from the obtained performance indicators, its average over those 5 minutes is calculated, and the average is checked against 10. If it is greater, an alarm reminder is issued.
The preset alarm rule for the blocked thread monitoring template is that an alarm is triggered if the number of hogging threads is greater than 5 within 5 minutes. When the blocked thread monitoring template is used, the hogging thread count data from the 5 minutes before the current time is selected from the obtained performance indicators, its average over those 5 minutes is calculated, and the average is checked against 5. If it is greater, an alarm reminder is issued.
The preset alarm rule for the abnormal request volume monitoring template is that an alarm is triggered if the number of completed requests is higher than 10 within 5 minutes. When the abnormal request volume monitoring template is used, the completed request count data from the 5 minutes before the current time is selected from the obtained performance indicators, its average over those 5 minutes is calculated, and the average is checked against 10. If it is greater, an alarm reminder is issued.
The preset alarm rule for the GC monitoring template is that an alarm is triggered if heap usage stays above 80% for 10 minutes. When the GC monitoring template is used, the heap usage data from the 5 minutes before the current time is selected from the obtained performance indicators, its average over those 5 minutes is calculated, and the average is checked against 80%. If it is greater, an alarm reminder is issued.
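The average-over-a-window check of step S21 can be illustrated with a minimal Python sketch. The rule values (5 minutes, 80%, 10 connections, 5 hogging threads) come from the description above; the names `AverageRule`, `RULES`, `should_alarm`, the metric names, and the `(timestamp, value)` sample format are hypothetical and are not part of the application.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class AverageRule:
    metric: str           # hypothetical name of the sampled indicator
    window_seconds: int   # look-back window ("preset time")
    threshold: float      # "first preset threshold"


# Thresholds as stated in the description; the keys are illustrative only.
RULES = {
    "jvm_cpu": AverageRule("jvm_cpu_usage_percent", 300, 80.0),
    "jdbc_connections": AverageRule("jdbc_connection_count", 300, 10.0),
    "hogging_threads": AverageRule("hogging_thread_count", 300, 5.0),
}


def should_alarm(rule, samples, now):
    """Return True if the windowed average is strictly above the threshold.

    samples: list of (timestamp_seconds, value) pairs for rule.metric.
    """
    recent = [v for t, v in samples if now - rule.window_seconds <= t <= now]
    if not recent:
        return False  # no data in the window: nothing to alarm on
    return mean(recent) > rule.threshold
```

Because the comparison is strict, an average exactly equal to the threshold does not alarm, which matches the "less than or equal to" branch described for this embodiment.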
In addition, step S20 may further include:
Step S22: if the monitoring template is the slow request monitoring template, detect, through the slow request monitoring template, whether the current pending count among the performance indicators is greater than a second preset threshold, and detect whether the current JDBC connection count among the performance indicators is greater than a third preset threshold;
In this case, if the current pending count is greater than the second preset threshold and the current JDBC connection count is greater than the third preset threshold, step S30 is executed: an alarm reminder is issued.
In this embodiment, after the monitoring host determines the corresponding monitoring template according to the monitoring instruction, if that template is the slow request monitoring template, whose preset alarm rule is that an alarm is triggered when the pending count is higher than 5 and the JDBC connection count exceeds 10, the real-time pending count and JDBC connection count (that is, the current pending count and the current JDBC connection count) are taken from the obtained performance indicators. The host checks whether the current pending count is greater than 5 and whether the current JDBC connection count is greater than 10. If both conditions hold, an alarm reminder is issued. If only one of the two conditions holds, no alarm reminder is issued.
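The combined condition of step S22 amounts to a logical AND over two instantaneous values. A minimal Python sketch follows; the function name `slow_request_alarm` and its parameter names are hypothetical, and the default thresholds 5 and 10 are the example values given above.

```python
def slow_request_alarm(pending_count, jdbc_connections,
                       pending_threshold=5, jdbc_threshold=10):
    """Alarm only when BOTH current values exceed their thresholds."""
    return (pending_count > pending_threshold
            and jdbc_connections > jdbc_threshold)
```

Requiring both conditions is what distinguishes this template from the single-indicator templates: a high pending count alone, or a high connection count alone, does not raise an alarm.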
In this embodiment, the performance indicators of the WebLogic server are calculated and integrated through monitoring templates: whether to alarm is decided from the average of an indicator over a preset time, or from a combination of different indicators, rather than alarming as soon as a single indicator goes beyond a preset range. The present application can therefore alarm more accurately and reduce the false alarm rate.
Further, referring to FIG. 4, FIG. 4 is a schematic flowchart of a second embodiment of the monitoring and alarm method for WebLogic servers of the present application.
Based on the first embodiment shown in FIG. 2 above, in this embodiment, step S30 may include:
Step S31: if the calculated and integrated performance indicators satisfy the corresponding preset alarm rule, the monitoring host determines the corresponding alarm level according to the monitoring template;
Step S32: the monitoring host determines the corresponding alarm mode according to the alarm level, and issues an alarm reminder in that alarm mode.
In this embodiment, because abnormal conditions of a WebLogic server differ in urgency, different alarm levels can be set according to the type of abnormality each monitoring template monitors. When an abnormality detected through a monitoring template requires an alarm reminder, the alarm is issued at the level corresponding to that template. The alarm levels may include critical (very severe), major (important), warning, and info (notice). The correspondence between monitoring templates and alarm levels may be: the JVM CPU usage monitoring template corresponds to a critical alarm, the JDBC connection count monitoring template to a major alarm, the blocked thread monitoring template to a critical alarm, the abnormal request volume monitoring template to a warning alarm, the GC monitoring template to a warning alarm, and the slow request monitoring template to a critical alarm. The correspondence is not limited to the above and may be set according to actual conditions.
The corresponding alarm mode is then determined according to the alarm level, and the alarm reminder is issued in that mode. Different alarm levels correspond to different alarm modes. An alarm mode may include the alarm form, the number of alarms, and the alarm recipients; alarm forms include email alarms, SMS alarms, and telephone alarms. For example, the critical level corresponds to email reminders only, 5 reminders, sent to the first responsible person; the major level corresponds to email and SMS reminders, 5 of each, sent to the first responsible person; the warning level corresponds to email and SMS reminders, 10 of each, sent to the first and second responsible persons; the critical level corresponds to email, SMS, and telephone reminders, 10 emails, 10 SMS messages, and 2 telephone calls, sent to the first and second responsible persons. In practice, it is also possible to remind only once and to continue reminding only if the email has not been read, the SMS has not been viewed, or the call has not been answered. In addition, the first responsible person may be reminded first, with reminders switching to the second responsible person once a certain number has been reached. For example, if a single alarm has sent 10 consecutive emails and none has been read, an email is automatically sent to the second responsible person; or, if the first responsible person does not answer the call, the second responsible person is called. The correspondence between alarm levels and alarm modes may be set according to actual conditions and is not specifically limited here.
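Steps S31 and S32 can be pictured as two table lookups followed by an escalation check. The Python sketch below is illustrative only: the dictionary keys, `AlarmMode`, `plan_alarm`, and `recipient_for_email` are hypothetical names, and for the critical level it assumes the variant from the example above that includes telephone reminders.

```python
from dataclasses import dataclass


@dataclass
class AlarmMode:
    channels: dict     # alarm form -> number of reminders
    recipients: tuple  # who is notified


# Template -> level, following the example correspondence in the description.
TEMPLATE_LEVEL = {
    "jvm_cpu": "critical",
    "jdbc_connections": "major",
    "blocked_threads": "critical",
    "abnormal_requests": "warning",
    "gc": "warning",
    "slow_requests": "critical",
}

# Level -> mode, following the example values in the description.
LEVEL_MODE = {
    "major": AlarmMode({"email": 5, "sms": 5}, ("first_owner",)),
    "warning": AlarmMode({"email": 10, "sms": 10},
                         ("first_owner", "second_owner")),
    "critical": AlarmMode({"email": 10, "sms": 10, "phone": 2},
                          ("first_owner", "second_owner")),
}


def plan_alarm(template):
    """Step S31 then S32: template -> alarm level -> alarm mode."""
    level = TEMPLATE_LEVEL[template]
    return level, LEVEL_MODE[level]


def recipient_for_email(unread_emails_sent, escalation_limit=10):
    """Escalate to the second owner after `escalation_limit` unread emails."""
    if unread_emails_sent >= escalation_limit:
        return "second_owner"
    return "first_owner"
```

Keeping both mappings as plain data reflects the statement that these correspondences may be set according to actual conditions.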
Some monitored performance indicators may satisfy the preset alarm condition during a certain time period without indicating an abnormal condition. To avoid issuing alarm reminders in such cases, the following steps may be included before step S32:
Step S33: the monitoring host determines the corresponding alarm masking time period according to the alarm level, and detects whether the current time falls within that alarm masking time period;
If the current time is not within the alarm masking time period, step S32 is executed: the monitoring host determines the corresponding alarm mode according to the alarm level, and issues an alarm reminder in that alarm mode;
If the current time is within the alarm masking time period, step S34 is executed: the monitoring host does not issue an alarm reminder.
In this embodiment, after the corresponding alarm level has been determined according to the monitoring template, the corresponding alarm masking time period is determined according to that alarm level, and the host detects whether the current time (that is, the time at which the abnormality was detected) falls within the alarm masking time period. If the current time is not within the alarm masking time period, the subsequent step S32 is executed. If it is, no alarm reminder is issued. The alarm masking time period may be set as needed, or a default setting may be used. In specific embodiments, the alarm masking time period may also be determined directly from the monitoring template, without first determining the template's alarm level and then deriving the masking time period from that level.
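The masking check of steps S33 and S34 can be sketched in Python as follows. The window `01:00` to `03:00` for the warning level is an invented example value, and `MASK_WINDOWS`, `in_mask_window`, and `should_notify` are hypothetical names; the application does not specify concrete masking periods.

```python
from datetime import time


def in_mask_window(now, start, end):
    """True if `now` lies in [start, end); handles windows crossing midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end


# Hypothetical per-level masking windows; levels without an entry are never masked.
MASK_WINDOWS = {
    "warning": (time(1, 0), time(3, 0)),
}


def should_notify(level, now):
    """Step S33: suppress the reminder when `now` falls in the level's window."""
    window = MASK_WINDOWS.get(level)
    if window is None:
        return True
    return not in_mask_window(now, *window)
```

For example, `should_notify("warning", time(2, 0))` evaluates to `False` under the assumed window, while a critical alarm at the same time is still delivered.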
In this embodiment, different alarm levels are set according to the urgency of the WebLogic server's abnormal conditions, which achieves graded alarms and makes the monitoring and alarm system more standardized and user-friendly.
Further, referring to FIG. 5, FIG. 5 is a schematic flowchart of a third embodiment of the monitoring and alarm method for WebLogic servers of the present application.
Based on the first embodiment shown in FIG. 2 above, after step S30, the monitoring and alarm method for WebLogic servers further includes:
Step S40: the monitoring host records a corresponding alarm event in a preset alarm log according to the alarm situation, the alarm event including the abnormality time, the abnormality type, and the alarm mode.
In this embodiment, to help operation and maintenance personnel understand the details of an alarm, the monitoring host records the alarm situation after issuing an alarm. Specifically, an alarm log is set up in the monitoring host in advance to record alarm events. Because the monitoring host monitors multiple different WebLogic servers, multiple alarm logs may be set up so that alarms are recorded separately. After the monitoring host issues an alarm, it records the corresponding alarm event in the alarm log. The alarm event may include the abnormality time (the time at which the abnormality occurred on the WebLogic server), the abnormality type, the alarm mode, and similar content.
Step S50: the monitoring host determines the high-frequency alarm time and/or the high-frequency abnormality type within a preset statistical period according to the preset alarm log, and generates a corresponding high-frequency alarm report from the monitoring data corresponding to the high-frequency alarm time and/or the high-frequency abnormality type.
Further, to give operation and maintenance personnel an overall view of the alarms caused by abnormalities of the WebLogic servers, the monitoring server may also compile, from the preset alarm log, data such as the high-frequency alarm time and the high-frequency abnormality type within a certain time period. The high-frequency alarm time is the time at which alarms occur most often. For example, if there were 300 alarms in June and 200 of them occurred between 9 and 10 in the morning, the high-frequency alarm time for April is 9 to 10 in the morning. In practice, the length of the statistical period may be set according to actual conditions, for example one month or one week. The criterion for "high-frequency" may likewise be set according to actual conditions: either a count threshold or a proportion threshold may be used. The high-frequency abnormality type is the type of abnormality that occurs most often. For example, if there were 300 alarms in June and 200 of them were of the type JVM CPU usage abnormality, the high-frequency abnormality type for April is JVM CPU usage abnormality. Similarly, the length of the statistical period and the criterion for "high-frequency" may be set according to actual conditions. Once these overall statistics are available, the monitoring host may also produce a report for operation and maintenance personnel. For example, when the monitoring host compiles the high-frequency alarm time and/or high-frequency abnormality type for a time period from the preset alarm log, it obtains the monitoring data corresponding to them and generates a corresponding high-frequency alarm report. Operation and maintenance personnel can use the report to carry out maintenance and related optimization, ensuring the normal operation of the WebLogic server.
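The statistics of step S50 can be sketched in Python using a proportion threshold, one of the two "high-frequency" criteria mentioned above. The event format (a dict with `time` and `type` keys), the hourly bucketing, the function name `high_frequency`, and the default share of 0.5 are all assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime


def high_frequency(events, min_share=0.5):
    """Return (hours, types) whose share of all alarm events is >= min_share.

    events: list of dicts with 'time' (datetime) and 'type' (str),
            i.e. entries read from the alarm log for one statistical period.
    """
    total = len(events)
    if total == 0:
        return [], []
    by_hour = Counter(e["time"].hour for e in events)
    by_type = Counter(e["type"] for e in events)
    hot_hours = sorted(h for h, n in by_hour.items() if n / total >= min_share)
    hot_types = sorted(t for t, n in by_type.items() if n / total >= min_share)
    return hot_hours, hot_types
```

With 200 of 300 alarms falling in the 9 o'clock hour, as in the example above, that hour's share is about 0.67 and it would be reported as a high-frequency alarm time under the assumed 0.5 threshold.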
The present application also provides a monitoring and alarm device for WebLogic servers.
Referring to FIG. 6, FIG. 6 is a schematic diagram of the functional modules of a first embodiment of the monitoring and alarm device for WebLogic servers of the present application.
In this embodiment, the monitoring and alarm device for WebLogic servers includes:
An obtaining module 10, configured for the monitoring host, on receiving a monitoring instruction, to obtain the performance indicators of the corresponding WebLogic server through the data collection host according to the monitoring instruction;
A detection module 20, configured for the monitoring host to determine the corresponding monitoring template according to the monitoring instruction, to calculate and integrate the performance indicators through the monitoring template, and to detect whether the calculated and integrated performance indicators satisfy the corresponding preset alarm rule;
An alarm module 30, configured for the monitoring host to issue an alarm reminder if the calculated and integrated performance indicators satisfy the corresponding preset alarm rule.
The virtual functional modules of the above monitoring and alarm device are stored in the memory 1005 of the monitoring host shown in FIG. 1 and implement all functions of the computer-readable instructions. When executed by the processor 1001, the modules obtain the performance indicators of each WebLogic server through the data collection host, calculate and integrate those indicators through a monitoring template, and issue an alarm reminder when the integrated indicators are found to satisfy the corresponding preset alarm rule.
Further, the monitoring templates include a Java virtual machine (JVM) CPU usage monitoring template, a Java Database Connectivity (JDBC) connection count monitoring template, a blocked thread monitoring template, an abnormal request volume monitoring template, a garbage collection (GC) monitoring template, and a slow request monitoring template.
Further, the detection module 20 is specifically configured to: if the monitoring template is the JVM CPU usage monitoring template, the JDBC connection count monitoring template, the blocked thread monitoring template, the abnormal request volume monitoring template, or the GC monitoring template, calculate, through the monitoring template, the average value of the corresponding indicator among the performance indicators within a preset time, and detect whether the average value is greater than the corresponding first preset threshold;
The alarm module 30 is specifically configured for the monitoring host to issue an alarm reminder if the average value is greater than the corresponding first preset threshold.
Further, the detection module 20 is specifically configured to: if the monitoring template is the slow request monitoring template, detect, through the slow request monitoring template, whether the current pending count among the performance indicators is greater than a second preset threshold, and detect whether the current JDBC connection count among the performance indicators is greater than a third preset threshold;
The alarm module 30 is specifically configured for the monitoring host to issue an alarm reminder if the current pending count is greater than the second preset threshold and the current JDBC connection count is greater than the third preset threshold.
Further, the alarm module 30 includes:
A level determining unit, configured for the monitoring host to determine the corresponding alarm level according to the monitoring template if the calculated and integrated performance indicators satisfy the corresponding preset alarm rule;
An alarm reminding unit, configured for the monitoring host to determine the corresponding alarm mode according to the alarm level and to issue an alarm reminder in that alarm mode.
Further, the alarm module 30 also includes:
A time detection unit, configured for the monitoring host to determine the corresponding alarm masking time period according to the alarm level and to detect whether the current time is within the alarm masking time period;
The alarm module 30 is specifically configured so that the monitoring host does not issue an alarm reminder if the current time is within the alarm masking time period.
Further, the monitoring and alarm device for WebLogic servers also includes:
A recording module, configured for the monitoring host to record a corresponding alarm event in a preset alarm log according to the alarm situation, the alarm event including the abnormality time, the abnormality type, and the alarm mode;
A reporting module, configured for the monitoring host to determine the high-frequency alarm time and/or the high-frequency abnormality type within a preset statistical period according to the preset alarm log, and to generate a corresponding high-frequency alarm report from the monitoring data corresponding to the high-frequency alarm time and/or the high-frequency abnormality type.
The functions of the modules in the above monitoring and alarm device correspond to the steps in the embodiments of the above monitoring and alarm method for WebLogic servers; their functions and implementation are not repeated here.
The present application also provides a monitoring and alarm system for WebLogic servers. The system includes a monitoring host, a data collection host, and multiple WebLogic servers, and further includes a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor. When executed by the processor, the computer-readable instructions implement the steps of the monitoring and alarm method for WebLogic servers according to any one of the above embodiments.
The specific embodiments of the monitoring and alarm system for WebLogic servers of the present application are essentially the same as the embodiments of the above monitoring and alarm method, and are not repeated here.
The present application also provides a computer storage medium; the computer-readable storage medium may be a non-volatile readable storage medium. The computer storage medium stores computer-readable instructions which, when executed by a processor, implement the steps of the monitoring and alarm method for WebLogic servers according to any one of the above embodiments.
The specific embodiments of the computer storage medium of the present application are essentially the same as the embodiments of the above monitoring and alarm method for WebLogic servers, and are not repeated here.
The above are only preferred embodiments of the present application and do not thereby limit its patent scope. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present application, or any direct or indirect application in other related technical fields, is likewise included within the scope of patent protection of the present application.

Claims (20)

  1. A monitoring and alarm method for WebLogic servers, characterized in that the method is applied to a monitoring and alarm system, the monitoring and alarm system comprising a monitoring host, a data collection host, and multiple WebLogic middleware servers, the monitoring host being communicatively connected to the data collection host, and the data collection host being communicatively connected to each of the multiple WebLogic servers, the method comprising the following steps:
    the monitoring host, on receiving a monitoring instruction, obtains the performance indicators of the corresponding WebLogic server through the data collection host according to the monitoring instruction;
    the monitoring host determines the corresponding monitoring template according to the monitoring instruction, calculates and integrates the performance indicators through the monitoring template, and detects whether the calculated and integrated performance indicators satisfy the corresponding preset alarm rule;
    if the calculated and integrated performance indicators satisfy the corresponding preset alarm rule, the monitoring host issues an alarm reminder.
  2. The monitoring and alarm method for a WebLogic server according to claim 1, characterized in that the monitoring template comprises a Java Virtual Machine (JVM) central processing unit (CPU) usage monitoring template, a Java Database Connectivity (JDBC) connection count monitoring template, a blocked thread monitoring template, an abnormal request volume monitoring template, a garbage collection (GC) monitoring template, and a slow request monitoring template.
  3. The monitoring and alarm method for a WebLogic server according to claim 2, characterized in that the step of calculating and integrating the performance indicators through the monitoring template and detecting whether the calculated and integrated performance indicators satisfy the corresponding preset alarm rule comprises:
    if the monitoring template is the JVM CPU usage monitoring template, the JDBC connection count monitoring template, the blocked thread monitoring template, the abnormal request volume monitoring template, or the GC monitoring template, calculating, through the monitoring template, the average value of the corresponding indicator among the performance indicators within a preset time, and detecting whether the average value is greater than a corresponding first preset threshold;
    the step of the monitoring host issuing an alarm reminder if the calculated and integrated performance indicators satisfy the corresponding preset alarm rule comprises:
    if the average value is greater than the corresponding first preset threshold, the monitoring host issues an alarm reminder.
  4. The monitoring and alarm method for a WebLogic server according to claim 2, characterized in that the step of calculating and integrating the performance indicators through the monitoring template and detecting whether the calculated and integrated performance indicators satisfy the corresponding preset alarm rule further comprises:
    if the monitoring template is the slow request monitoring template, detecting, through the slow request monitoring template, whether the current pending count among the performance indicators is greater than a second preset threshold, and detecting whether the current JDBC connection count among the performance indicators is greater than a third preset threshold;
    the step of the monitoring host issuing an alarm reminder if the calculated and integrated performance indicators satisfy the corresponding preset alarm rule comprises:
    if the current pending count is greater than the second preset threshold and the current JDBC connection count is greater than the third preset threshold, the monitoring host issues an alarm reminder.
  5. The monitoring and alarm method for a WebLogic server according to claim 1, characterized in that the step of the monitoring host issuing an alarm reminder if the calculated and integrated performance indicators satisfy the corresponding preset alarm rule comprises:
    if the calculated and integrated performance indicators satisfy the corresponding preset alarm rule, the monitoring host determines a corresponding alarm level according to the monitoring template;
    the monitoring host determines a corresponding alarm mode according to the alarm level, and issues an alarm reminder according to the corresponding alarm mode.
  6. The monitoring and alarm method for a WebLogic server according to claim 5, characterized in that, before the step of the monitoring host determining a corresponding alarm mode according to the alarm level and issuing an alarm reminder according to the corresponding alarm mode, the method further comprises:
    the monitoring host determines a corresponding alarm mask period according to the alarm level, and detects whether the current time falls within the alarm mask period;
    if the current time does not fall within the alarm mask period, performing the step of the monitoring host determining a corresponding alarm mode according to the alarm level and issuing an alarm reminder according to the corresponding alarm mode;
    if the current time falls within the alarm mask period, the monitoring host does not issue an alarm reminder.
  7. The monitoring and alarm method for a WebLogic server according to claim 1, characterized in that the method further comprises:
    the monitoring host records a corresponding alarm event in a preset alarm log according to the alarm situation, the alarm event comprising an anomaly time, an anomaly type, and an alarm mode;
    the monitoring host determines, according to the preset alarm log, a high-frequency alarm time and/or a high-frequency anomaly type within a preset statistical period, and generates a corresponding high-frequency alarm report according to the monitoring data corresponding to the high-frequency alarm time and/or the high-frequency anomaly type, respectively.
  8. A monitoring and alarm device for a WebLogic server, characterized in that the monitoring and alarm device for a WebLogic server comprises:
    an acquisition module, configured for the monitoring host, upon receiving a monitoring instruction, to obtain, through the data collection host and according to the monitoring instruction, the performance indicators of the corresponding WebLogic server;
    a detection module, configured for the monitoring host to determine a corresponding monitoring template according to the monitoring instruction, calculate and integrate the performance indicators through the monitoring template, and detect whether the calculated and integrated performance indicators satisfy a corresponding preset alarm rule;
    an alarm module, configured for the monitoring host to issue an alarm reminder if the calculated and integrated performance indicators satisfy the corresponding preset alarm rule.
  9. The monitoring and alarm device for a WebLogic server according to claim 8, characterized in that the monitoring template comprises a Java Virtual Machine (JVM) central processing unit (CPU) usage monitoring template, a Java Database Connectivity (JDBC) connection count monitoring template, a blocked thread monitoring template, an abnormal request volume monitoring template, a garbage collection (GC) monitoring template, and a slow request monitoring template.
  10. The monitoring and alarm device for a WebLogic server according to claim 9, characterized in that the detection module is specifically configured to, if the monitoring template is the JVM CPU usage monitoring template, the JDBC connection count monitoring template, the blocked thread monitoring template, the abnormal request volume monitoring template, or the GC monitoring template, calculate, through the monitoring template, the average value of the corresponding indicator among the performance indicators within a preset time, and detect whether the average value is greater than a corresponding first preset threshold;
    the alarm module is specifically configured for the monitoring host to issue an alarm reminder if the average value is greater than the corresponding first preset threshold.
  11. The monitoring and alarm device for a WebLogic server according to claim 9, characterized in that the detection module is further specifically configured to, if the monitoring template is the slow request monitoring template, detect, through the slow request monitoring template, whether the current pending count among the performance indicators is greater than a second preset threshold, and detect whether the current JDBC connection count among the performance indicators is greater than a third preset threshold;
    the alarm module is further specifically configured for the monitoring host to issue an alarm reminder if the current pending count is greater than the second preset threshold and the current JDBC connection count is greater than the third preset threshold.
  12. The monitoring and alarm device for a WebLogic server according to claim 8, characterized in that the monitoring and alarm device for a WebLogic server further comprises:
    a level determination unit, configured for the monitoring host to determine a corresponding alarm level according to the monitoring template if the calculated and integrated performance indicators satisfy the corresponding preset alarm rule;
    an alarm reminder unit, configured for the monitoring host to determine a corresponding alarm mode according to the alarm level and to issue an alarm reminder according to the corresponding alarm mode.
  13. A monitoring and alarm system for a WebLogic server, characterized in that the monitoring and alarm system for a WebLogic server comprises a monitoring host, a data collection host, and a plurality of WebLogic servers, and further comprises a memory, a processor, and computer-readable instructions stored on the memory and executable by the processor, wherein the computer-readable instructions, when executed by the processor, implement the following steps:
    upon receiving a monitoring instruction, the monitoring host obtains, through the data collection host and according to the monitoring instruction, the performance indicators of the corresponding WebLogic server;
    the monitoring host determines a corresponding monitoring template according to the monitoring instruction, calculates and integrates the performance indicators through the monitoring template, and detects whether the calculated and integrated performance indicators satisfy a corresponding preset alarm rule;
    if the calculated and integrated performance indicators satisfy the corresponding preset alarm rule, the monitoring host issues an alarm reminder.
  14. The monitoring and alarm system for a WebLogic server according to claim 13, characterized in that the monitoring template comprises a Java Virtual Machine (JVM) central processing unit (CPU) usage monitoring template, a Java Database Connectivity (JDBC) connection count monitoring template, a blocked thread monitoring template, an abnormal request volume monitoring template, a garbage collection (GC) monitoring template, and a slow request monitoring template.
  15. The monitoring and alarm system for a WebLogic server according to claim 14, characterized in that the computer-readable instructions, when executed by the processor, further implement the following steps:
    if the monitoring template is the JVM CPU usage monitoring template, the JDBC connection count monitoring template, the blocked thread monitoring template, the abnormal request volume monitoring template, or the GC monitoring template, calculating, through the monitoring template, the average value of the corresponding indicator among the performance indicators within a preset time, and detecting whether the average value is greater than a corresponding first preset threshold;
    if the average value is greater than the corresponding first preset threshold, the monitoring host issues an alarm reminder.
  16. The monitoring and alarm system for a WebLogic server according to claim 14, characterized in that the computer-readable instructions, when executed by the processor, further implement the following steps:
    if the monitoring template is the slow request monitoring template, detecting, through the slow request monitoring template, whether the current pending count among the performance indicators is greater than a second preset threshold, and detecting whether the current JDBC connection count among the performance indicators is greater than a third preset threshold;
    if the current pending count is greater than the second preset threshold and the current JDBC connection count is greater than the third preset threshold, the monitoring host issues an alarm reminder.
  17. A computer storage medium, characterized in that computer-readable instructions are stored on the computer storage medium, wherein the computer-readable instructions, when executed by a processor, implement the following steps:
    upon receiving a monitoring instruction, the monitoring host obtains, through the data collection host and according to the monitoring instruction, the performance indicators of the corresponding WebLogic server;
    the monitoring host determines a corresponding monitoring template according to the monitoring instruction, calculates and integrates the performance indicators through the monitoring template, and detects whether the calculated and integrated performance indicators satisfy a corresponding preset alarm rule;
    if the calculated and integrated performance indicators satisfy the corresponding preset alarm rule, the monitoring host issues an alarm reminder.
  18. The computer storage medium according to claim 17, characterized in that the monitoring template comprises a Java Virtual Machine (JVM) central processing unit (CPU) usage monitoring template, a Java Database Connectivity (JDBC) connection count monitoring template, a blocked thread monitoring template, an abnormal request volume monitoring template, a garbage collection (GC) monitoring template, and a slow request monitoring template.
  19. The computer storage medium according to claim 18, characterized in that the computer-readable instructions, when executed by a processor, further implement the following steps:
    if the monitoring template is the JVM CPU usage monitoring template, the JDBC connection count monitoring template, the blocked thread monitoring template, the abnormal request volume monitoring template, or the GC monitoring template, calculating, through the monitoring template, the average value of the corresponding indicator among the performance indicators within a preset time, and detecting whether the average value is greater than a corresponding first preset threshold;
    if the average value is greater than the corresponding first preset threshold, the monitoring host issues an alarm reminder.
  20. The computer storage medium according to claim 18, characterized in that the computer-readable instructions, when executed by a processor, further implement the following steps:
    if the monitoring template is the slow request monitoring template, detecting, through the slow request monitoring template, whether the current pending count among the performance indicators is greater than a second preset threshold, and detecting whether the current JDBC connection count among the performance indicators is greater than a third preset threshold;
    if the current pending count is greater than the second preset threshold and the current JDBC connection count is greater than the third preset threshold, the monitoring host issues an alarm reminder.
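
Illustrative sketch (not part of the claims). The alarm rules recited in the claims can be read as two kinds of check: an average-over-a-window comparison for the JVM CPU, JDBC, blocked thread, abnormal request, and GC templates, and a two-condition check for the slow request template, followed by an alarm mask period test. The Python below is a minimal sketch of that reading under stated assumptions. The template names, the Thresholds fields, the function names, and the handling of a mask period that wraps past midnight are all hypothetical and are not taken from the application.

```python
from dataclasses import dataclass
from datetime import time
from statistics import mean
from typing import Optional, Sequence, Tuple

# Hypothetical identifiers for the templates that are evaluated by averaging
# one indicator over a preset time and comparing it with a first threshold.
AVERAGED_TEMPLATES = {
    "jvm_cpu", "jdbc_connections", "blocked_threads",
    "abnormal_requests", "gc",
}


@dataclass
class Thresholds:
    first: float = 0.0   # first preset threshold (averaged indicator)
    second: float = 0.0  # second preset threshold (pending count)
    third: float = 0.0   # third preset threshold (JDBC connection count)


def rule_satisfied(template: str, samples: Sequence[float], t: Thresholds,
                   pending: float = 0.0, jdbc: float = 0.0) -> bool:
    """Return True when the preset alarm rule for `template` is met."""
    if template in AVERAGED_TEMPLATES:
        # Average of the samples collected within the preset time.
        # An empty window never triggers (an assumption of this sketch).
        return bool(samples) and mean(samples) > t.first
    if template == "slow_request":
        # Both conditions must hold at the same time.
        return pending > t.second and jdbc > t.third
    raise ValueError(f"unknown monitoring template: {template}")


def in_mask_period(now: time, start: time, end: time) -> bool:
    """True if `now` lies in [start, end); the period may wrap past midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def should_alert(rule_met: bool, now: time,
                 mask: Optional[Tuple[time, time]] = None) -> bool:
    """Alert only when the rule is met and `now` is outside the mask period."""
    if not rule_met:
        return False
    if mask is not None and in_mask_period(now, mask[0], mask[1]):
        return False
    return True
```

For example, with a first threshold of 80, the samples 70, 90, and 95 average to 85 and satisfy the rule, while the slow request rule stays unsatisfied if only one of its two counts exceeds its threshold.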
PCT/CN2018/103336 2018-07-18 2018-08-30 Monitoring alarm method, device and system for weblogic server, and computer storage medium WO2020015061A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810800086.4 2018-07-18
CN201810800086.4A CN109726072B (en) 2018-07-18 2018-07-18 WebLogic server monitoring and alarming method, device and system and computer storage medium

Publications (1)

Publication Number Publication Date
WO2020015061A1 (en) 2020-01-23

Family

ID=66294570

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/103336 WO2020015061A1 (en) 2018-07-18 2018-08-30 Monitoring alarm method, device and system for weblogic server, and computer storage medium

Country Status (2)

Country Link
CN (1) CN109726072B (en)
WO (1) WO2020015061A1 (en)

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111324511A (en) * 2020-02-24 2020-06-23 北京达佳互联信息技术有限公司 Alarm rule generation method and device, electronic equipment and storage medium
CN111324520A (en) * 2020-03-06 2020-06-23 五八有限公司 Service interface monitoring method and device, electronic equipment and storage medium
CN111352806A (en) * 2020-03-31 2020-06-30 中国工商银行股份有限公司 Log data monitoring method and device
CN111414351A (en) * 2020-03-20 2020-07-14 中国建设银行股份有限公司 Performance diagnosis method and device of MySQL database
CN111444074A (en) * 2020-03-27 2020-07-24 北京贝斯平云科技有限公司 Data monitoring method and device, electronic equipment and readable storage medium
CN111737231A (en) * 2020-06-23 2020-10-02 平安普惠企业管理有限公司 Database automatic analysis method and device, computer equipment and storage medium
CN111782433A (en) * 2020-06-30 2020-10-16 北京百度网讯科技有限公司 Exception troubleshooting method, device, electronic equipment and storage medium
CN112149975A (en) * 2020-09-11 2020-12-29 杭州东方通信软件技术有限公司 APM monitoring system and method based on artificial intelligence
CN112254842A (en) * 2020-10-10 2021-01-22 广联达科技股份有限公司 Temperature monitoring method and system for mass concrete and electronic equipment
CN112684748A (en) * 2020-11-16 2021-04-20 航天信息股份有限公司 Monitoring method and system compatible with various monitored devices
CN113064800A (en) * 2021-04-19 2021-07-02 上海安畅网络科技股份有限公司 Early warning method and device, electronic equipment and readable storage medium
CN114973615A (en) * 2022-05-12 2022-08-30 北京软通智慧科技有限公司 Method and device for monitoring emergency, electronic equipment and storage medium
CN117395132A (en) * 2023-12-13 2024-01-12 江西云眼视界科技股份有限公司 Distributed alarm monitoring method, system, storage medium and electronic equipment

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112054913B (en) * 2019-06-05 2023-07-18 厦门网宿有限公司 Data monitoring system and method
CN110543402B (en) * 2019-09-09 2023-09-19 上海新炬网络技术有限公司 Automatic monitoring and dynamic adjusting method based on WebLogic middleware core parameters
CN110727586A (en) * 2019-09-16 2020-01-24 平安科技(深圳)有限公司 Host anomaly monitoring method and device, storage medium and server
CN111290909A (en) * 2020-01-19 2020-06-16 山东汇贸电子口岸有限公司 System and method for monitoring and alarming ceph cluster
CN111431738B (en) * 2020-03-10 2022-12-16 广州嘉为科技有限公司 Alarm monitoring method based on Internet operation and maintenance
CN111585785B (en) * 2020-03-27 2023-07-21 中国平安人寿保险股份有限公司 Method and device for shielding alarm information, computer equipment and storage medium
CN113806166A (en) * 2021-08-25 2021-12-17 合众人寿保险股份有限公司 Object monitoring method and device, storage medium and electronic equipment

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140040174A1 (en) * 2012-08-01 2014-02-06 Empire Technology Development Llc Anomaly detection for cloud monitoring
CN103744771A (en) * 2014-01-28 2014-04-23 中国工商银行股份有限公司 Method, equipment and system for monitoring host performance benchmark deviations, equipment and system
CN104301159A (en) * 2014-11-13 2015-01-21 中国建设银行股份有限公司 Monitoring method and system of server cluster
CN104954184A (en) * 2015-06-15 2015-09-30 四川长虹电器股份有限公司 Monitoring and alarming method and system for cloud background server cluster
CN107294764A (en) * 2017-04-26 2017-10-24 中国科学院信息工程研究所 Intelligent supervision method and intelligent monitoring system
CN108259270A (en) * 2018-01-11 2018-07-06 郑州云海信息技术有限公司 A kind of data center's system for unified management design method

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7376534B2 (en) * 2004-05-21 2008-05-20 Bea Systems, Inc. Watches and notifications
CN104811327A (en) * 2014-01-26 2015-07-29 中国移动通信集团江西有限公司 Monitoring warning voice automatic notification method and device
CN106161085B (en) * 2016-06-20 2019-05-03 深圳前海微众银行股份有限公司 The monitoring system and method for messaging bus
CN106649040A (en) * 2016-12-26 2017-05-10 上海新炬网络信息技术有限公司 Automatic monitoring method and device for performance of Weblogic middleware

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111324511B (en) * 2020-02-24 2023-10-31 北京达佳互联信息技术有限公司 Alarm rule generation method and device, electronic equipment and storage medium
CN111324511A (en) * 2020-02-24 2020-06-23 北京达佳互联信息技术有限公司 Alarm rule generation method and device, electronic equipment and storage medium
CN111324520A (en) * 2020-03-06 2020-06-23 五八有限公司 Service interface monitoring method and device, electronic equipment and storage medium
CN111414351A (en) * 2020-03-20 2020-07-14 中国建设银行股份有限公司 Performance diagnosis method and device of MySQL database
CN111444074A (en) * 2020-03-27 2020-07-24 北京贝斯平云科技有限公司 Data monitoring method and device, electronic equipment and readable storage medium
CN111352806A (en) * 2020-03-31 2020-06-30 中国工商银行股份有限公司 Log data monitoring method and device
CN111352806B (en) * 2020-03-31 2024-04-26 中国工商银行股份有限公司 Log data monitoring method and device
CN111737231A (en) * 2020-06-23 2020-10-02 平安普惠企业管理有限公司 Database automatic analysis method and device, computer equipment and storage medium
CN111782433A (en) * 2020-06-30 2020-10-16 北京百度网讯科技有限公司 Exception troubleshooting method, device, electronic equipment and storage medium
CN112149975A (en) * 2020-09-11 2020-12-29 杭州东方通信软件技术有限公司 APM monitoring system and method based on artificial intelligence
CN112254842A (en) * 2020-10-10 2021-01-22 广联达科技股份有限公司 Temperature monitoring method and system for mass concrete and electronic equipment
CN112684748A (en) * 2020-11-16 2021-04-20 航天信息股份有限公司 Monitoring method and system compatible with various monitored devices
CN113064800B (en) * 2021-04-19 2023-03-24 上海安畅网络科技股份有限公司 Early warning method and device, electronic equipment and readable storage medium
CN113064800A (en) * 2021-04-19 2021-07-02 上海安畅网络科技股份有限公司 Early warning method and device, electronic equipment and readable storage medium
CN114973615A (en) * 2022-05-12 2022-08-30 北京软通智慧科技有限公司 Method and device for monitoring emergency, electronic equipment and storage medium
CN117395132A (en) * 2023-12-13 2024-01-12 江西云眼视界科技股份有限公司 Distributed alarm monitoring method, system, storage medium and electronic equipment
CN117395132B (en) * 2023-12-13 2024-02-20 江西云眼视界科技股份有限公司 Distributed alarm monitoring method, system, storage medium and electronic equipment

Also Published As

Publication number Publication date
CN109726072B (en) 2022-01-14
CN109726072A (en) 2019-05-07

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 18927169; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 18927169; Country of ref document: EP; Kind code of ref document: A1)