CN111581060A - Prometheus-based log alarm system and method and related equipment - Google Patents

Publication number
CN111581060A
Authority
CN
China
Prior art keywords
log data
log
unit
alarm
prometheus
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010392449.2A
Other languages
Chinese (zh)
Other versions
CN111581060B (en)
Inventor
吴光华
刘志祥
陈丹
刘勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Kingdee Software China Co Ltd
Original Assignee
Kingdee Software China Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Kingdee Software China Co Ltd filed Critical Kingdee Software China Co Ltd
Priority to CN202010392449.2A priority Critical patent/CN111581060B/en
Publication of CN111581060A publication Critical patent/CN111581060A/en
Application granted granted Critical
Publication of CN111581060B publication Critical patent/CN111581060B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3006 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/302 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a software system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3051 Monitoring arrangements for monitoring the configuration of the computing system or of the computing system component, e.g. monitoring the presence of processing resources, peripherals, I/O links, software programs

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computing Systems (AREA)
  • Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

An embodiment of the application discloses a Prometheus-based log alarm system. The system comprises a log module, a monitoring module and an Elasticsearch module; the monitoring module comprises a data acquisition unit, a Prometheus service unit and an alarm unit; the log module is used for acquiring a keyword and sending the keyword to the data acquisition unit; the data acquisition unit is used for retrieving the Elasticsearch module at a first periodicity according to the keyword so as to obtain first log data corresponding to the keyword, and is also used for providing an http service for transmitting the first log data; the Prometheus service unit is used for acquiring and storing the first log data through the http service; the alarm unit is used for alarming if the number of the first log data in the Prometheus service unit is larger than a first threshold value.

Description

Prometheus-based log alarm system and method and related equipment
Technical Field
The present application relates to the field of automatic alarm, and in particular, to a Prometheus-based log alarm system, method, and related device.
Background
Elasticsearch is a Lucene-based search server. It provides a distributed, multi-user-capable full-text search engine with a RESTful web interface. Elasticsearch is developed in Java and was released as open source under the Apache license, and is currently a popular enterprise-level search engine. It is designed for use in cloud computing, supports real-time search, and is stable, reliable, fast, and convenient to install and use.
ElastAlert is a log alarm framework based on Elasticsearch. It works by combining Elasticsearch with two types of components, rule types and alert types: Elasticsearch is queried periodically and the data are passed to a rule type, which determines when a match is found. When a match occurs, it is passed to one or more alerts, which take action based on the match. Prometheus is an open-source combination of monitoring, alerting and a time series database. Prometheus requires very few external dependencies, is simple to install, and integrates with a large number of systems, such as Docker, HAProxy, Nginx and JMX. As it has developed, more and more companies and organizations have adopted Prometheus.
If two alarm systems, namely ElastAlert and Prometheus, exist in one company at the same time, problems such as repeated alarms and complex management are inevitable.
Disclosure of Invention
The application provides a Prometheus-based log alarm system, a Prometheus-based log alarm method and related equipment, which can combine Elasticsearch and Prometheus to reduce the management burden.
A first aspect of the application provides a Prometheus-based log alarm system.
The system comprises a log module, a monitoring module and an Elasticsearch module;
the monitoring module comprises a data acquisition unit, a Prometheus service unit and an alarm unit;
the log module is used for acquiring a keyword and sending the keyword to the data acquisition unit;
the data acquisition unit is used for retrieving the Elasticsearch module at a first periodicity according to the keyword so as to obtain first log data corresponding to the keyword, and is also used for providing an http service for transmitting the first log data;
the Prometheus service unit is used for acquiring and storing the first log data through the http service;
the alarm unit is used for alarming if the number of the first log data in the Prometheus service unit is larger than a first threshold value.
Optionally, the data acquisition unit includes a data acquisition subunit, a data format conversion subunit and an http service transmission subunit;
the data acquisition subunit is specifically used for retrieving the Elasticsearch module at the first periodicity according to the keyword to obtain first log data in a first format corresponding to the keyword;
the data format conversion subunit is used for converting the format of the first log data to obtain first log data in a second format that is adapted to the Prometheus service unit;
the data acquisition unit is specifically used for providing, through the http service transmission subunit, an http service for transmitting the first log data in the second format.
Optionally, the Prometheus service unit is specifically configured to obtain and store a first number of first log data through the http service at a second periodicity;
the alarm unit is specifically configured to alarm if the sum of the first number within a duration N×M is greater than a first threshold, where M is the period duration of the first periodicity, the period duration of the second periodicity is N×M, and N is an integer greater than 1.
Optionally, the log module is further configured to obtain period information of the second periodicity, where the period information includes N, and to send the period information to the data acquisition unit;
the data acquisition unit is specifically used for providing, according to the period information, the http service for transmitting the first log data at the second periodicity with N×M as the period.
The second aspect of the application provides a Prometheus-based log alarm method.
The method comprises the following steps: the monitoring module acquires a keyword;
the monitoring module retrieves an Elasticsearch module at a first periodicity according to the keyword to obtain first log data corresponding to the keyword;
the monitoring module provides an http service for transmitting the first log data;
the monitoring module acquires and stores the first log data through the http service;
and if the number of the first log data in the Prometheus service unit is larger than a first threshold value, the monitoring module alarms.
Optionally, the monitoring module retrieving the Elasticsearch module at the first periodicity according to the keyword to obtain the first log data corresponding to the keyword includes:
the monitoring module retrieves the Elasticsearch module at the first periodicity according to the keyword to obtain first log data in a first format corresponding to the keyword;
the method further comprises the following steps:
the monitoring module converts the format of the first log data to obtain first log data in a second format that is adapted to a Prometheus service unit;
the monitoring module acquiring the first log data through the http service includes:
the monitoring module acquires the first log data in the second format through the http service.
Optionally, the providing, by the monitoring module, of an http service for transmitting the first log data includes:
the monitoring module provides an http service for transmitting a first number of first log data at a second periodicity;
the monitoring module alarming if the number of the first log data in the Prometheus service unit is larger than a first threshold includes:
if the sum of the first number within a duration N×M is greater than the first threshold, the monitoring module alarms, where M is the period duration of the first periodicity, the period duration of the second periodicity is N×M, and N is an integer greater than 1.
Optionally, the method further comprises: the monitoring module acquires period information of the second periodicity, where the period information includes N;
the monitoring module provides, according to the period information, the http service for transmitting the first log data at the second periodicity with N×M as the period.
The third aspect of the application provides a Prometheus-based log alarm device.
The device comprises: a first obtaining unit, configured to obtain a keyword;
a second obtaining unit, configured to retrieve the Elasticsearch module at a first periodicity according to the keyword to obtain first log data corresponding to the keyword;
a processing unit, configured to provide an http service for transmitting the first log data;
a third obtaining unit, configured to obtain the first log data through the http service;
and an alarm unit, configured to alarm if the number of the first log data in the third obtaining unit is greater than a first threshold value.
Optionally, the second obtaining unit is specifically configured to retrieve the Elasticsearch module at the first periodicity according to the keyword to obtain first log data in a first format corresponding to the keyword.
The device further comprises: a conversion unit, configured to convert the format of the first log data to obtain first log data in a second format that is adapted to the Prometheus service unit.
The third obtaining unit is specifically configured to obtain the first log data in the second format through the http service.
Optionally, the processing unit is specifically configured to provide an http service for transmitting a first number of first log data at a second periodicity.
The alarm unit is specifically configured to alarm if the sum of the first number within a duration N×M is greater than a first threshold, where M is the period duration of the first periodicity, the period duration of the second periodicity is N×M, and N is an integer greater than 1.
Optionally, the first obtaining unit is further configured to obtain period information of the second periodicity;
the processing unit is specifically configured to provide, according to the period information, the http service for transmitting the first log data at the second periodicity with N×M as the period.
The fourth aspect of the application provides a Prometheus-based log alarm device.
The apparatus comprises: a memory, a processor;
wherein, the memory is used for storing programs; the processor is adapted to execute the program in the memory, including performing the method according to any of the above second aspects.
A fifth aspect of embodiments of the present application provides a computer storage medium, including:
the computer storage medium has stored therein instructions that, when executed on a computer, cause the computer to perform the method of any of the second aspects described above.
A sixth aspect of embodiments of the present application provides a computer program product, which, when executed on a computer, causes the computer to perform the method according to any one of the second aspects.
The above technical solution has the following advantages: the data acquisition unit can retrieve the Elasticsearch module to obtain first log data corresponding to the keyword and provide an http service for transmitting the first log data, and the Prometheus service unit can obtain the first log data through the http service. Therefore, the Prometheus-based log alarm system combines the advantages of both Elasticsearch and Prometheus and can be implemented within a Prometheus-based alarm framework, so that the management burden can be reduced.
Drawings
FIG. 1A is a block diagram of a Prometheus-based log alarm system according to an embodiment of the present application;
FIG. 1B is a schematic diagram of another framework of a Prometheus-based log alarm system according to an embodiment of the present application;
FIG. 2 is a flowchart illustrating a Prometheus-based log alarm method according to an embodiment of the present application;
FIG. 3 is another schematic flow chart of a Prometheus-based log alarm method in an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a Prometheus-based log alarm device in an embodiment of the present application;
FIG. 5 is another schematic structural diagram of a Prometheus-based log alarm device in an embodiment of the present application;
FIG. 6 is a schematic structural diagram of a Prometheus-based log alarm device in an embodiment of the present application.
Detailed Description
The application provides a Prometheus-based log alarm system, a Prometheus-based log alarm method and related equipment, which can combine Elasticsearch and Prometheus to reduce the management burden.
Cloud native is a developing technology direction, and its definition will continue to evolve. In a narrow sense, cloud native covers the cloud-native technologies represented by containers, service meshes, microservices and Serverless, and brings a new way of building applications. It not only supports Internet applications well, but also deeply influences new computing architectures and new intelligent data applications. Cloud-native technology helps organizations build and run elastically scalable applications in new dynamic environments such as public, private and hybrid clouds. Representative cloud-native technologies include containers, service meshes, microservices, immutable infrastructure, and declarative APIs.
Elasticsearch is an open-source, distributed full-text search engine with a RESTful interface, built on Lucene. Elasticsearch is also a distributed document database in which every field is indexed and searchable. It can scale to hundreds of servers storing and processing PB-level data, and can store, search and analyze large amounts of data in a short time.
Prometheus is a cloud-native monitoring tool that can be deployed on Kubernetes to observe workloads. It fills an important gap in the cloud-native world with comprehensive metrics and rich dashboards. Prometheus integrates with mainstream data visualization tools such as Grafana. Functionality that Kubernetes subsequently introduces for scaling and monitoring will depend on Prometheus, which makes Prometheus a crucial project in the construction of cloud-native platforms.
In the cloud-native era, Prometheus has almost become the de facto standard in the field of monitoring. Prometheus stores data in an efficient time series database, so that a single Prometheus instance can process a large amount of data efficiently. Its friendly and powerful PromQL syntax can be used to query monitoring data flexibly and to configure alarm rules. Meanwhile, its pull-model metric collection is widely adopted: a great number of applications implement the metrics interface of Prometheus to expose their own metrics so that Prometheus can collect them. It is therefore desirable to combine the advantages of Elasticsearch in search with the advantages of Prometheus in monitoring, to create a monitoring framework that has the advantages of both. Because Elasticsearch does not provide a suitable metrics interface that exposes its data for Prometheus to collect, a third party is required to adapt Elasticsearch to Prometheus; this yields the Prometheus-based log alarm system, method and related equipment.
The Prometheus-based log alarm system will be described below by way of specific embodiments.
Referring to fig. 1A, fig. 1A is a schematic diagram of a framework of a Prometheus-based log alarm system in an embodiment of the present application.
FIG. 1A includes a log module 101, a monitoring module 102, and an Elasticsearch module 103. The monitoring module 102 includes a data acquisition unit 104, a Prometheus service unit 105, and an alarm unit 106.
The monitoring module 102 is dedicated to monitoring and alarming, which may include operating system layer monitoring, application layer monitoring, business layer monitoring, cloud container monitoring, log monitoring, and the like. Log monitoring can be understood as log alarming. In practical applications, log data is generally stored in text form, and data stored in text form can be searched through Elasticsearch.
The log module 101 is software for collecting and querying log data. The log module 101 is configured to obtain a keyword and send the keyword to the data acquisition unit 104.
The Elasticsearch module 103 may be an Elasticsearch server storing log data.
The data acquisition unit 104 is configured to retrieve the Elasticsearch module 103 at a first periodicity according to the keyword to obtain the first log data, and to provide an http service for transmitting the first log data, which may also be understood as creating a metrics interface through which the Prometheus service unit 105 acquires the first log data. The period duration of the first periodicity is preset. The data acquisition unit 104 may agree on the period duration of the first periodicity with the Elasticsearch module 103 in advance, or the data acquisition unit 104 may use a default period duration of the Elasticsearch module 103 as the period duration of the first periodicity.
The Prometheus service unit 105 is a Prometheus tool for acquiring the first log data through the http service.
The alarm unit 106 is configured to alarm if the number of the first log data in the Prometheus service unit 105 is greater than a first threshold.
In this embodiment of the present application, the data acquisition unit 104 may retrieve the Elasticsearch module 103 and provide an http service for transmitting the first log data, and the Prometheus service unit 105 may obtain the first log data through the http service. Therefore, the Prometheus-based log alarm system combines the advantages of both Elasticsearch and Prometheus and can be implemented within a Prometheus-based alarm framework, so that the management burden can be reduced.
Referring to fig. 1B, fig. 1B is another schematic diagram of a Prometheus-based log alarm system in an embodiment of the present application.
Optionally, on the basis of fig. 1A, the data acquisition unit 104 includes a data acquisition sub-unit 107, a data format conversion sub-unit 108, and an http service transmission sub-unit 109.
The data acquisition subunit 107 is configured to retrieve the Elasticsearch module 103 at the first periodicity according to the keyword to obtain a first number of first log data in a first format corresponding to the keyword.
The data format conversion subunit 108 is configured to convert the format of the first log data to obtain first log data in a second format. Converting the format of the first log data makes the first log data compatible with the Prometheus service unit 105.
Optionally, the first format is a json format and the second format is a metrics format.
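The conversion from the json format to the metrics format can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes the retrieval result is a JSON object with a `count` field (the shape returned by the Elasticsearch count API), and the function name `to_metrics_text`, the metric name `log_keyword_matches`, the `keyword` label and the gauge type are all hypothetical choices.

```python
import json


def to_metrics_text(es_json: str, keyword: str,
                    metric: str = "log_keyword_matches") -> str:
    """Convert a JSON count result into Prometheus text exposition format.

    Assumption: es_json is a JSON object that carries the number of
    matching log entries in a "count" field.
    """
    count = int(json.loads(es_json)["count"])
    # Label values must escape backslash, double quote and newline.
    label = (keyword.replace("\\", "\\\\")
                    .replace('"', '\\"')
                    .replace("\n", "\\n"))
    return (
        f"# HELP {metric} Number of log entries matching the keyword.\n"
        f"# TYPE {metric} gauge\n"
        f'{metric}{{keyword="{label}"}} {count}\n'
    )


sample = to_metrics_text('{"count": 10, "_shards": {"total": 1}}', "Software X")
```

Here `sample` ends with the line `log_keyword_matches{keyword="Software X"} 10`, which a Prometheus server can parse when it scrapes the http service.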
Optionally, the data acquisition unit 104 is specifically configured to provide the http service for transmitting the first log data at a second periodicity; the Prometheus service unit 105 is specifically configured to acquire a first number of first log data through the http service at the second periodicity; the alarm unit 106 is specifically configured to alarm if the sum of the first number within a duration N×M is greater than a first threshold, where M is the period duration of the first periodicity, the period duration of the second periodicity is N×M, and N is an integer greater than 1. Because the Prometheus service unit 105 acquires the first log data at the second periodicity rather than the first, it acquires the first log data less frequently, which improves the efficiency of acquiring the first log data and reduces energy consumption.
Optionally, the log module 101 is further configured to obtain period information of the second periodicity, where the period information includes N, and to send the period information to the data acquisition unit 104. The data acquisition unit 104 is specifically configured to provide, according to the period information, the http service for transmitting the first log data at the second periodicity with N×M as the period. The log module 101 is user-facing: in addition to the keyword, a user can add the period information of the second periodicity in the log module 101 to control the period duration at which the data acquisition unit 104 provides the http service for transmitting the first log data. Because the period information and the keyword are both placed in the log module 101, a user can conveniently manage and modify the alarm system in this embodiment of the application, which reduces the complexity of management and modification.
Optionally, the log module 101 is further configured to obtain an interval duration between the first periodicity and a third periodicity, where the period duration of the first periodicity is M and the period duration of the third periodicity is also M; the log module 101 is further configured to send the interval duration to the data acquisition unit 104. The data acquisition unit 104 is specifically configured to retrieve the Elasticsearch module 103 at the third periodicity according to the keyword to obtain a second number of first log data. The Prometheus service unit 105 is further configured to obtain the second number of first log data through the http service; the alarm unit 106 is further configured to alarm if the second number is greater than the first threshold.
The period duration of the third periodicity is the same as that of the first periodicity. The data acquisition unit 104 starts the two periodicities from different time points, and the interval between the two start points is the interval duration. By using the keyword to retrieve the Elasticsearch module 103 in this way, log data of two time series can be generated. Even with the same keyword and the same period duration, the alarm results for log data of different time series may differ. For example, if the sum of the first number over 3 consecutive periods is greater than 50 and the sum of the second number over 3 consecutive periods is less than 50, the alarm unit 106 alarms for the sum of the first number and does not alarm for the sum of the second number. Therefore, because the log module 101 acquires the interval duration, the data acquisition unit 104 can acquire log data of two time series, which increases the probability of an alarm under the same keyword and improves alarm accuracy.
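How two periodicities with the same period duration but different start points divide time can be sketched as follows. The concrete values (a 15-second period, a 10-second interval duration, a 15:00:15 start and an arbitrary date) are illustrative assumptions, and the helper names `period_windows` and `fmt` are hypothetical.

```python
from datetime import datetime, timedelta


def period_windows(start: datetime, period_s: int, count: int):
    """Return `count` consecutive (begin, end) windows of period_s seconds."""
    step = timedelta(seconds=period_s)
    return [(start + i * step, start + (i + 1) * step) for i in range(count)]


def fmt(windows):
    """Render windows as HH:MM:SS-HH:MM:SS strings."""
    return [f"{b:%H:%M:%S}-{e:%H:%M:%S}" for b, e in windows]


M = 15         # period duration in seconds (assumed for illustration)
INTERVAL = 10  # interval duration between the two start points (assumed)

first_start = datetime(2020, 1, 1, 15, 0, 15)  # the date is arbitrary
third_start = first_start - timedelta(seconds=INTERVAL)

first = fmt(period_windows(first_start, M, 4))
third = fmt(period_windows(third_start, M, 4))
```

With these values, `first` begins with the window 15:00:15-15:00:30 and `third` begins with 15:00:05-15:00:20, so the same log entries fall into differently aligned windows in the two time series.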
Optionally, the data acquisition unit 104 is programmed using the Prometheus SDK. The data acquisition unit 104 needs to interface with the Prometheus service unit 105, which is a Prometheus tool. Using the Prometheus SDK therefore reduces the programming effort.
The foregoing describes the Prometheus-based log alarm system in the embodiment of the present application, and the following describes the Prometheus-based log alarm method in the embodiment of the present application.
Referring to fig. 2, fig. 2 is a schematic flowchart of a Prometheus-based log alarm method in an embodiment of the present application.
In step 201, the device obtains a keyword.
The device has computing capability and is provided with a monitoring module; the monitoring module may be software, and the device may be a notebook computer, a desktop computer, a server, or the like. The device may obtain multiple keywords in order to perform log alarming for each of them, as shown in Table 1, which lists the keywords obtained by the device. The embodiment of the present application is described by taking only one keyword as an example.
Number    Keyword
1         Software X
2         Software M
Table 1
In step 202, the device retrieves the Elasticsearch module at a first periodicity according to the keyword to obtain a first number of first log data.
Taking the keyword Software X above as an example, refer to Table 2, which illustrates the structure of the first log data. It should be noted that the keywords, time periods and numbers of logs in Table 2 are given as specific values only for the sake of a detailed description; in practical applications they may be modified according to actual needs. For example, the duration of the time period, the start time of the time period and the number of logs may all vary. The first number is the number of logs in one row of Table 2, for example 10.
Keyword       Time period          Number of logs
Software X    15:00:15-15:00:30    10
Software X    15:00:30-15:00:45    15
Software X    15:00:45-15:01:00    5
Software X    15:01:00-15:01:15    6
Table 2
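One retrieval in step 202 can be sketched as building a query body that counts the log entries containing the keyword within one time period. This is a sketch under stated assumptions: the field names `message` and `@timestamp` are hypothetical and depend on how the logs are indexed, and the function name `build_count_query` is illustrative. The body uses the standard `bool`, `match_phrase` and `range` clauses of the Elasticsearch query DSL and could be sent to an index's count endpoint.

```python
from datetime import datetime, timedelta


def build_count_query(keyword: str, end: datetime, period_s: int,
                      text_field: str = "message",
                      time_field: str = "@timestamp") -> dict:
    """Build a query body matching `keyword` in the period ending at `end`.

    Assumption: logs carry their text in `text_field` and their time in
    `time_field`; both names are placeholders.
    """
    begin = end - timedelta(seconds=period_s)
    return {
        "query": {
            "bool": {
                # The log text must contain the keyword as a phrase.
                "must": [{"match_phrase": {text_field: keyword}}],
                # Restrict to the half-open time period [begin, end).
                "filter": [{"range": {time_field: {
                    "gte": begin.isoformat(),
                    "lt": end.isoformat(),
                }}}],
            }
        }
    }


query = build_count_query("Software X", datetime(2020, 1, 1, 15, 0, 30), 15)
```

Calling the function once per period, with `end` advanced by the period duration each time, yields one count per row of a table such as the one above.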
In step 203, the device provides an http service for transmitting the first log data. The device is provided with a data acquisition unit, which is developed based on the Go version of the Prometheus SDK and can provide a standard http service externally by exposing a port.
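An http service of this kind can be sketched as follows. Note the substitution: the text names the Go version of the Prometheus SDK, whereas this sketch uses only the Python standard library to show the idea. The class and function names (`MetricsStore`, `make_handler`, `make_server`), the metric name `log_keyword_matches_total` and the `/metrics` path are assumptions for illustration.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class MetricsStore:
    """Thread-safe running totals of matched log entries per keyword."""

    def __init__(self):
        self._lock = threading.Lock()
        self._counts = {}

    def add(self, keyword, n):
        with self._lock:
            self._counts[keyword] = self._counts.get(keyword, 0) + n

    def render(self):
        """Render the totals in Prometheus text exposition format."""
        with self._lock:
            lines = ["# TYPE log_keyword_matches_total counter"]
            for kw, total in sorted(self._counts.items()):
                lines.append(
                    f'log_keyword_matches_total{{keyword="{kw}"}} {total}')
        return "\n".join(lines) + "\n"


def make_handler(store):
    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path != "/metrics":
                self.send_error(404)
                return
            body = store.render().encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type",
                             "text/plain; version=0.0.4; charset=utf-8")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, *args):
            pass  # keep the sketch quiet

    return Handler


def make_server(store, port=0):
    """Create (but do not start) a server; port 0 picks a free port."""
    return ThreadingHTTPServer(("127.0.0.1", port), make_handler(store))
```

Calling `make_server(store, 9100).serve_forever()` (the port is arbitrary) would expose the totals at `/metrics`, and a Prometheus server configured to scrape that address would then pull them at its own interval.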
In step 204, the device obtains first log data via an http service.
After the device provides the http service for transmitting the first log data, the device may obtain the first number of first log data through the http service to obtain time series data of the first log data. Refer to Table 3, which shows the time series data of the first log data.
Keyword       Time        Number of logs
Software X    15:00:15    10
Software X    15:00:30    25
Software X    15:00:45    30
Software X    15:01:00    36
Table 3
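The values in the time series table are the running totals of the per-period counts from the preceding table (10, 15, 5, 6). A short sketch of that relationship, with illustrative variable names:

```python
from itertools import accumulate

# Per-period numbers of logs for keyword "Software X".
per_period = [10, 15, 5, 6]

# Running totals, as stored in the time series.
cumulative = list(accumulate(per_period))

# The per-period counts can be recovered as differences of the totals.
deltas = [b - a for a, b in zip([0] + cumulative[:-1], cumulative)]
```

Here `cumulative` is `[10, 25, 30, 36]` and `deltas` reproduces the original per-period counts.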
In step 205, if the sum of the first number of the plurality of time periods obtained through the http service is greater than a first threshold, the device alarms.
Taking Table 3 above as an example, assume that an alarm is given if the sum of the first number over 2 consecutive time periods is greater than 20. The first number for the first time period 15:00:15-15:00:30 is 10 and the first number for the second time period 15:00:30-15:00:45 is 15, so the sum of the first number over the 2 time periods is 25 (the value recorded in the second row of Table 3), which is greater than 20, and the device alarms.
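The check in step 205 can be sketched as a sliding sum over consecutive periods. The function name `first_alarm_index` is hypothetical, and the second series below is a quieter set of counts chosen for illustration.

```python
def first_alarm_index(counts, window, threshold):
    """Return the index of the first run of `window` consecutive periods
    whose summed count exceeds `threshold`, or None if there is none."""
    for i in range(len(counts) - window + 1):
        if sum(counts[i:i + window]) > threshold:
            return i
    return None


# Per-period counts 10 and 15 sum to 25 > 20, so the first window alarms.
busy = first_alarm_index([10, 15, 5, 6], window=2, threshold=20)

# A quieter series: window sums are 17, 19 and 19, none above 20.
quiet = first_alarm_index([8, 9, 10, 9], window=2, threshold=20)
```

Here `busy` is `0` (the alarm fires on the first two periods) and `quiet` is `None`.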
The device in this embodiment of the application can retrieve the Elasticsearch module, provide the http service for transmitting the first log data, and acquire the first log data through the http service. Therefore, the Prometheus-based log alarm method combines the advantages of both Elasticsearch and Prometheus and can be implemented within a Prometheus-based alarm framework, so that the management burden can be reduced.
Referring to fig. 3, fig. 3 is another schematic flow chart of a Prometheus-based log alarm method in an embodiment of the present application.
In step 301, the device obtains the keyword, N, and the interval duration.
N is an integer greater than 1. The embodiment of the present application is described with N being 2.
The interval duration is the difference between the starting time point of the first periodicity and the starting time point of the third periodicity. One period of the first periodicity is M and one period of the second periodicity is N×M. The embodiment of the present application is described with an interval duration of 10 s.
In step 302, the device retrieves the Elasticsearch module at the first periodicity according to the keyword to obtain a first number of first log data. The device further retrieves the Elasticsearch module at the third periodicity according to the keyword to obtain a second number of first log data. The period duration of the third periodicity is the same as that of the first periodicity. For the first number of first log data, refer to Table 2; for the second number of first log data, refer to Table 4, in which the number of logs is the second number.
Keyword       Time period          Number of logs
Software X    15:00:05-15:00:20    8
Software X    15:00:20-15:00:35    9
Software X    15:00:35-15:00:50    10
Software X    15:00:50-15:01:05    9
Table 4
In step 303, the device provides an http service for transmitting the first log data. In the aforementioned step 301, the device obtained the alarm period N, that is, the value 2. The device therefore provides the http service for transmitting the first log data with 2 first periods as 1 period. It should be noted that the device may provide the http service for transmitting the first number of first log data and the http service for transmitting the second number of first log data on the same port, or on different ports. The device is provided with a data acquisition unit.
In step 304, the device obtains the first number of first log data and the second number of first log data through the http service.
Since the device provides the http service for transmitting the first log data with 2 periods of the first periodicity as 1 period, the device also obtains the first number of first log data through the http service at that same period, which yields time series data of the first number of first log data. Likewise, the device obtains the second number of first log data through the http service with 2 periods of the first periodicity as 1 period, which yields time series data of the second number of first log data. Table five is the time series data of the first number of first log data, and Table six is the time series data of the second number of first log data.
Keyword | Time point | Number of logs
Software X | 15:00:15 | 25
Software X | 15:00:45 | 36
Table five
Keyword | Time point | Number of logs
Software X | 15:00:05 | 17
Software X | 15:00:35 | 19
Table six
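The relation between Table four and Table six is a sum over every N consecutive periods. A minimal Python sketch of that arithmetic, with the hypothetical helper name `sum_per_cycle`:

```python
def sum_per_cycle(counts, n):
    """Sum every `n` consecutive per-period counts into one value per N x M cycle.

    A trailing group with fewer than `n` counts is dropped, because its cycle
    has not finished yet.
    """
    if n < 2:
        raise ValueError("N must be an integer greater than 1")
    return [sum(counts[i:i + n]) for i in range(0, len(counts) - n + 1, n)]


table_four = [8, 9, 10, 9]                # log counts per period, from Table four
table_six = sum_per_cycle(table_four, 2)  # N = 2 gives [17, 19], as in Table six
```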
In step 305, if the sum of the first number obtained through the http service over N consecutive periods is greater than a first threshold, the device alarms; and if the sum of the second number obtained through the http service over N consecutive periods is greater than the first threshold, the device alarms.
Since the device acquires the first log data from the http service with 2 periods of the first periodicity as 1 period, that is, at the second periodicity, the device can make the determination directly with the value 25 or 17, without itself summing the first row and the second row. For example, in the embodiment of the present application, the value 25 is greater than the first threshold of 20, so the sum of the first number obtained through the http service over 2 consecutive periods of the first periodicity is greater than 20, and the device alarms. The value 17 is less than 20, so the sum of the second number obtained through the http service over 2 consecutive periods of the first periodicity is less than 20, and the device does not alarm.
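The decision in step 305 reduces to a comparison of each scraped value against the first threshold. The sketch below replays the values of Table five and Table six against the threshold of 20 used in this embodiment; the function name `alarms` is hypothetical.

```python
def alarms(series, threshold):
    """Return the (time point, value) samples that exceed the threshold."""
    return [(ts, value) for ts, value in series if value > threshold]


FIRST_THRESHOLD = 20
table_five = [("15:00:15", 25), ("15:00:45", 36)]  # first number, summed over N = 2 periods
table_six = [("15:00:05", 17), ("15:00:35", 19)]   # second number, summed over N = 2 periods

first_alarms = alarms(table_five, FIRST_THRESHOLD)   # both samples exceed 20
second_alarms = alarms(table_six, FIRST_THRESHOLD)   # no sample exceeds 20
```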
Regarding the description of the beneficial effects of the Prometheus-based log alarm method in the embodiment of the present application, reference may be made to the foregoing description of fig. 1A and 1B on the beneficial effects of the Prometheus-based log alarm system.
The foregoing describes the Prometheus-based log alarm method in the embodiments of the present application; the Prometheus-based log alarm device in the embodiments of the present application is described below.
Referring to fig. 4, fig. 4 is a schematic structural diagram of a Prometheus-based log alarm device in an embodiment of the present application.
The device comprises:
A first obtaining unit 401, configured to obtain a keyword;
a second obtaining unit 402, configured to periodically retrieve the Elasticsearch module according to the keyword to obtain first log data corresponding to the keyword;
a processing unit 403, configured to provide an http service for transmitting the first log data;
a third obtaining unit 404, configured to obtain the first log data through an http service;
an alarm unit 405, configured to alarm if the number of the first log data in the third obtaining unit is greater than the first threshold.
In this embodiment, the second obtaining unit 402 may retrieve the Elasticsearch module, the processing unit 403 may provide an http service for transmitting the first log data, and the third obtaining unit 404 may obtain the first log data through the http service. Therefore, the Prometheus-based log alarm device can have the advantages of both Elasticsearch and Prometheus, and can be implemented under a Prometheus-based alarm framework, so that the management burden can be reduced.
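How units 401 to 405 hand data to one another can be sketched as a small Python class. This is a simplified in-memory model under stated assumptions: the class name `LogAlarmDevice`, its method names, and the injected `search` callable (standing in for the Elasticsearch retrieval) are hypothetical, and the http service is reduced to a list of exposed values.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class LogAlarmDevice:
    keyword: str                      # unit 401: the obtained keyword
    search: Callable[[str], int]      # unit 402: stand-in for retrieving Elasticsearch
    first_threshold: int              # threshold used by alarm unit 405
    exposed: List[int] = field(default_factory=list)  # unit 403: values offered for scraping
    scraped: List[int] = field(default_factory=list)  # unit 404: values obtained back

    def retrieve_and_expose(self) -> None:
        """Run one retrieval and publish the resulting log count."""
        self.exposed.append(self.search(self.keyword))

    def scrape(self) -> int:
        """Obtain the most recently exposed value (call after a retrieval)."""
        value = self.exposed[-1]
        self.scraped.append(value)
        return value

    def should_alarm(self) -> bool:
        """Alarm when the latest scraped count exceeds the first threshold."""
        return bool(self.scraped) and self.scraped[-1] > self.first_threshold


device = LogAlarmDevice("Software X", search=lambda kw: 25, first_threshold=20)
device.retrieve_and_expose()
device.scrape()
alarm = device.should_alarm()
```

Injecting `search` keeps the sketch runnable without a search cluster; a real deployment would replace it with an actual query.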
On the basis of the Prometheus-based log alarm device described in fig. 4, the Prometheus-based log alarm device may further include the following:
Referring to fig. 5, fig. 5 is another schematic structural diagram of a Prometheus-based log alarm device in the embodiment of the present application.
Optionally, the second obtaining unit 402 is specifically configured to periodically retrieve the Elasticsearch module according to the keyword to obtain the first log data in a first format corresponding to the keyword;
the device further includes: a converting unit 501, configured to convert the format of the first log data to obtain the first log data in a second format that is adapted to the Prometheus service unit;
the third obtaining unit 404 is specifically configured to obtain the first log data in the second format through the http service.
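The conversion performed by the converting unit 501 can be illustrated with a short Python sketch. It assumes, purely for illustration, that the first format is a JSON count response with a `count` field and that the second format is one line of the Prometheus text exposition format; the metric name `log_keyword_count` and the function name are hypothetical.

```python
import json


def to_prometheus_line(keyword, es_response_text, metric="log_keyword_count"):
    """Convert a JSON count response into one Prometheus exposition line."""
    count = int(json.loads(es_response_text)["count"])
    # Escape backslash, double quote and newline, as label values require.
    escaped = keyword.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'{metric}{{keyword="{escaped}"}} {count}'


first_format = (
    '{"count": 8, "_shards": {"total": 1, "successful": 1, "skipped": 0, "failed": 0}}'
)
second_format = to_prometheus_line("Software X", first_format)
```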
Optionally, the processing unit 403 is specifically configured to provide, at a second periodicity, an http service for transmitting the first number of first log data.
The alarm unit 405 is specifically configured to alarm if the sum of the first number within a duration N × M is greater than a first threshold, where M is the period duration of the first periodicity, the period duration of the second periodicity is N × M, and N is an integer greater than 1.
Optionally, the first obtaining unit 401 is further configured to obtain period information of the second periodicity;
the processing unit 403 is specifically configured to provide, according to the period information, the http service for transmitting the first log data at the second periodicity with N × M as the period.
The foregoing describes the functional units of the Prometheus-based log alarm device in the embodiments of the present application; the following describes the hardware structure of the Prometheus-based log alarm device in the embodiments of the present application.
Referring to fig. 6, fig. 6 is a schematic structural diagram of a Prometheus-based log alarm device in an embodiment of the present application.
The Prometheus-based log alarm device may be a desktop computer, a notebook computer, a server, or the like. The Prometheus-based log alarm device 600 may include one or more central processing units (CPUs) 601 and a memory 605, where the memory 605 stores one or more applications or data.
The memory 605 may be volatile storage or persistent storage. The program stored in the memory 605 may include one or more modules, each of which may include a series of instruction operations for the device. Further, the central processing unit 601 may be configured to communicate with the memory 605 and to execute, on the Prometheus-based log alarm device 600, the series of instruction operations in the memory 605.
The Prometheus-based log alarm device 600 may also include one or more power supplies 602, one or more wired or wireless network interfaces 603, one or more input-output interfaces 604, and/or one or more operating systems, such as Windows Server, Mac OS X™, Unix™, Linux™, FreeBSD™, etc.
In the embodiment of the present application, the processor 601 included in the Prometheus-based log alarm device further has the following functions:
acquiring a keyword; periodically retrieving the Elasticsearch module at a first periodicity according to the keyword to obtain first log data corresponding to the keyword; providing an http service for transmitting the first log data; acquiring the first log data through the http service; and alarming if the number of the first log data acquired through the http service is greater than a first threshold.
Optionally, in this embodiment of the present application, the processor 601 included in the Prometheus-based log alarm device further has the following functions:
periodically retrieving the Elasticsearch module at the first periodicity according to the keyword to obtain the first log data in a first format corresponding to the keyword;
converting the format of the first log data to obtain the first log data in a second format adapted to Prometheus;
acquiring the first log data in the second format through the http service.
Optionally, providing, at a second periodicity, the http service for transmitting the first log data;
and alarming if the number of the first log data acquired through the http service within a duration N × M is greater than the first threshold, where M is the period duration of the first periodicity, the period duration of the second periodicity is N times the period duration of the first periodicity, and N is an integer greater than 1.
Optionally, acquiring period information of the second periodicity, where the period information includes N, and sending the period information to the data acquisition unit;
and providing, according to the period information, the http service for transmitting the first log data at the second periodicity with N × M as the period.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative: the division of the units is only a logical division, and other divisions are possible in practice; for instance, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application, in essence or in the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk, and various other media capable of storing program code.

Claims (10)

1. A Prometheus-based log alarm system, comprising:
a log module, a monitoring module and an Elasticsearch module;
the monitoring module comprises a data acquisition unit, a Prometheus service unit and an alarm unit;
the log module is used for acquiring keywords and sending the keywords to the data acquisition unit;
the data acquisition unit is used for periodically retrieving the Elasticsearch module according to the keyword to obtain first log data corresponding to the keyword, and the data acquisition unit is also used for providing http service for transmitting the first log data;
the Prometheus service unit is used for acquiring and storing the first log data through the http service;
the alarm unit is used for giving an alarm if the number of the first log data in the Prometheus service unit is larger than a first threshold value.
2. The system of claim 1, wherein the data acquisition unit comprises a data acquisition subunit, a data format conversion subunit and an http service transmission subunit;
the data acquisition subunit is specifically configured to retrieve the Elasticsearch module at a first periodicity according to the keyword, so as to obtain the first log data in a first format corresponding to the keyword;
the data format conversion subunit is used for converting the format of the first log data to obtain the first log data in a second format which is suitable for the Prometheus service unit;
the data acquisition unit is specifically configured to provide an http service for transmitting the first log data in the second format through the http service transmission subunit.
3. The system according to claim 1 or 2,
the Prometheus service unit is specifically configured to obtain and store a first amount of the first log data through the http service at a second periodicity;
the alarm unit is specifically configured to alarm if the sum of the first number is greater than the first threshold within a duration N × M, where M is the period duration of the first periodicity, the period duration of the second periodicity is N × M, and N is an integer greater than 1.
4. The system of claim 3,
the log module is further configured to obtain period information of the second periodicity, where the period information includes N, and send the period information to the data acquisition unit;
the data acquisition unit is specifically configured to provide an http service for transmitting the first log data at a second periodicity with N × M as a period according to the period information.
5. A Prometheus-based log alarm method, comprising the following steps:
the monitoring module acquires keywords;
the monitoring module periodically retrieves an Elasticsearch module according to the keyword to obtain first log data corresponding to the keyword;
the monitoring module provides http service for transmitting the first log data;
the monitoring module acquires and stores the first log data through the http service;
and if the quantity of the first log data in the Prometheus service unit is larger than a first threshold value, the monitoring module gives an alarm.
6. The method of claim 5,
the monitoring module periodically retrieving an Elasticsearch module according to the keyword to obtain first log data corresponding to the keyword comprises:
the monitoring module retrieves the Elasticsearch module periodically according to the keyword to obtain the first log data in a first format corresponding to the keyword;
the method further comprises the following steps:
the monitoring module converts the format of the first log data to obtain the first log data in a second format adapted to the Prometheus service unit;
the acquiring, by the monitoring module, the first log data through the http service includes:
and acquiring the first log data in the second format through the http service.
7. The method according to claim 5 or 6,
the monitoring module providing the http service for transmitting the first log data comprises:
the monitoring module provides the http service for transmitting a first amount of the first log data at a second periodicity;
the monitoring module alarming if the number of the first log data in the Prometheus service unit is greater than a first threshold comprises:
if the sum of the first number is greater than the first threshold within a duration N × M, the monitoring module alarms, where M is the period duration of the first periodicity, the period duration of the second periodicity is N × M, and N is an integer greater than 1.
8. A Prometheus-based log alarm device, comprising:
a first obtaining unit configured to obtain a keyword;
a second obtaining unit, configured to periodically retrieve an Elasticsearch module according to the keyword, so as to obtain first log data corresponding to the keyword;
the processing unit is used for providing http service for transmitting the first log data;
a third obtaining unit, configured to obtain the first log data through the http service;
and the alarm unit is used for giving an alarm if the number of the first log data in the third acquisition unit is greater than a first threshold value.
9. A computer device, comprising: a memory, a processor;
wherein the memory is used for storing programs;
the processor is configured to execute the program in the memory, including performing the method of any of claims 5 to 7.
10. A computer storage medium having stored therein instructions that, when executed on a computer, cause the computer to perform the method of any one of claims 5 to 7.
CN202010392449.2A 2020-05-11 2020-05-11 Prometheus-based log alarm system, method and related equipment Active CN111581060B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010392449.2A CN111581060B (en) 2020-05-11 2020-05-11 Prometheus-based log alarm system, method and related equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010392449.2A CN111581060B (en) 2020-05-11 2020-05-11 Prometheus-based log alarm system, method and related equipment

Publications (2)

Publication Number Publication Date
CN111581060A true CN111581060A (en) 2020-08-25
CN111581060B CN111581060B (en) 2024-03-12

Family

ID=72126462

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010392449.2A Active CN111581060B (en) 2020-05-11 2020-05-11 Prometheus-based log alarm system, method and related equipment

Country Status (1)

Country Link
CN (1) CN111581060B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107565953A (en) * 2017-10-18 2018-01-09 南京邮电大学南通研究院有限公司 A kind of control circuit of transition detection device and clock frequency regulating system
CN108921551A (en) * 2018-06-11 2018-11-30 西安纸贵互联网科技有限公司 Alliance's block catenary system based on Kubernetes platform
CN109245931A (en) * 2018-09-19 2019-01-18 四川长虹电器股份有限公司 The log management of container cloud platform based on kubernetes and the implementation method of monitoring alarm
CN109656784A (en) * 2018-12-25 2019-04-19 新华三技术有限公司 A kind of log processing method and device
CN110175152A (en) * 2019-05-30 2019-08-27 深圳前海微众银行股份有限公司 A kind of log inquiring method, transfer server cluster and log query system
CN110321371A (en) * 2019-07-01 2019-10-11 腾讯科技(深圳)有限公司 Daily record data method for detecting abnormality, device, terminal and medium
CN110968482A (en) * 2019-12-18 2020-04-07 上海良鑫网络科技有限公司 Enterprise service and application intelligent monitoring system

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112199249A (en) * 2020-09-16 2021-01-08 中国建设银行股份有限公司 Monitoring data processing method, device, equipment and medium
CN112596994A (en) * 2020-12-25 2021-04-02 福州掌中云科技有限公司 XHProf-based PHP program performance detection method and device
CN113282920A (en) * 2021-05-28 2021-08-20 平安科技(深圳)有限公司 Log abnormity detection method and device, computer equipment and storage medium
CN113282920B (en) * 2021-05-28 2023-10-10 平安科技(深圳)有限公司 Log abnormality detection method, device, computer equipment and storage medium
CN113381884A (en) * 2021-06-02 2021-09-10 上海数禾信息科技有限公司 Full link monitoring method and device for monitoring alarm system
CN113626300A (en) * 2021-08-03 2021-11-09 上海上讯信息技术股份有限公司 Log management method and device
CN113722183A (en) * 2021-09-02 2021-11-30 北京金山云网络技术有限公司 Log alarm method and device and electronic equipment
CN115473783A (en) * 2022-08-04 2022-12-13 浪潮软件集团有限公司 Prometheus-based index alarm management system and method

Also Published As

Publication number Publication date
CN111581060B (en) 2024-03-12

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant