CN116974876B - Method for realizing millisecond-level monitoring alarm based on real-time flow frame - Google Patents

Method for realizing millisecond-level monitoring alarm based on real-time flow frame

Publication number
CN116974876B
Authority
CN
China
Prior art keywords
stream
log
rule
time
matching rule
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311213401.0A
Other languages
Chinese (zh)
Other versions
CN116974876A (en)
Inventor
张科
陈继政
张自平
张勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yunzhu Information Technology Chengdu Co ltd
Original Assignee
Yunzhu Information Technology Chengdu Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yunzhu Information Technology Chengdu Co ltd
Priority to CN202311213401.0A
Publication of CN116974876A
Application granted
Publication of CN116974876B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3452 Performance evaluation by statistical analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3065 Monitoring arrangements determined by the means or processing involved in reporting the monitored data
    • G06F11/3072 Monitoring arrangements determined by the means or processing involved in reporting the monitored data where the reporting involves data filtering, e.g. pattern matching, time or event triggered, adaptive or policy-based reporting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3466 Performance evaluation by tracing or monitoring
    • G06F11/3495 Performance evaluation by tracing or monitoring for systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Computer Hardware Design (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention relates to the technical field of monitoring and alarming, and discloses a method for realizing millisecond-level monitoring alarms based on a real-time streaming framework. The method comprises: receiving log data in real time to form a log stream; constructing a matching rule, receiving the matching rule in real time, and forming a rule stream; broadcasting the rule stream into the operator state of the log stream to form a rule broadcast stream; merging and matching the log stream with the rule broadcast stream to form an event stream; counting event streams, and starting a window calculation when the accumulated number of event streams reaches the number defined in the matching rule; and obtaining the logs and corresponding log times of all event streams in the window, extracting the maximum and minimum log times, calculating their difference, judging whether the difference meets the duration defined in the matching rule, and triggering an alarm if it does. Because data is pushed from upstream to downstream and computed on arrival rather than on a time step, the invention achieves millisecond-level monitoring; it also avoids creating multiple windows for each time step, which reduces resource and performance consumption.

Description

Method for realizing millisecond-level monitoring alarm based on real-time flow frame
Technical Field
The invention relates to the technical field of monitoring alarms, and in particular to a method for realizing millisecond-level monitoring alarms based on a real-time streaming framework.
Background
In the internet field, when a back-end service fails, developers need to check the logs as soon as possible. In a typical scenario, platform customers hit errors in specific situations and report them through customer service, and developers then review the logs to troubleshoot and trace the root cause. Current practice is to emit logs in advance, collect them through a platform, and feed them to a real-time stream computing framework that evaluates conditions on the logs in real time, checking whether the logs hit a rule a given number of times within a period; if the condition is met, an alarm is raised immediately to notify developers that a problem has occurred and needs attention. Alarms are currently computed mainly with open-source real-time streaming frameworks. Taking Flink as an example, achieving second-level alarms requires shortening the slide step, which cuts the time range into many time windows: the larger the time range, the more windows there are and the higher the performance cost. A shorter step also causes the same condition to be hit repeatedly, which consumes further resources.
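The cost described above can be made concrete with a little arithmetic. The sketch below is an illustration only, not part of the patent: it assumes a sliding time window whose size is an exact multiple of the slide step, in which case each record belongs to size / slide overlapping windows. The class and method names are hypothetical.

```java
import java.time.Duration;

// Hypothetical helper for illustration only; not taken from the patent or from Flink.
final class SlidingWindowCost {

    // Number of overlapping sliding windows that contain a single record,
    // assuming the window size is an exact multiple of the slide step.
    static long windowsPerRecord(Duration size, Duration slide) {
        long sizeMs = size.toMillis();
        long slideMs = slide.toMillis();
        if (sizeMs <= 0 || slideMs <= 0 || sizeMs % slideMs != 0) {
            throw new IllegalArgumentException("size must be a positive multiple of slide");
        }
        return sizeMs / slideMs;
    }

    public static void main(String[] args) {
        Duration size = Duration.ofMinutes(3);
        System.out.println(windowsPerRecord(size, Duration.ofMinutes(1))); // 3
        System.out.println(windowsPerRecord(size, Duration.ofSeconds(1))); // 180
        System.out.println(windowsPerRecord(size, Duration.ofMillis(1)));  // 180000
    }
}
```

Under these assumptions, a 3-minute rule evaluated with a 1-second step places every log in 180 windows, and a 1-millisecond step would place it in 180000, which is the growth the background section refers to.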
Therefore, the invention provides a method for realizing millisecond-level monitoring alarms based on a real-time streaming framework, so as to solve at least some of the above technical problems.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a method for realizing millisecond-level monitoring alarms based on a real-time streaming framework, so as to solve at least some of the above technical problems.
In order to achieve the above purpose, the technical scheme adopted by the invention is as follows:
A method for realizing millisecond-level monitoring alarms based on a real-time streaming framework includes the following steps:
step 1, receiving log data in real time to form a log stream;
step 2, constructing a matching rule, receiving the matching rule in real time, and forming a rule stream; broadcasting the rule stream into the operator state of the log stream to form a rule broadcast stream;
step 3, merging and matching the log stream and the rule broadcast stream to form an event stream;
step 4, counting event streams, and starting a window calculation when the accumulated number of event streams reaches the number defined in the matching rule;
step 5, acquiring the logs and corresponding log times of all event streams in the window, extracting the maximum and minimum log times, calculating their difference, judging whether the difference meets the duration defined in the matching rule, and triggering an alarm if it does.
Further, in step 1, log data in Kafka is received in real time, and the log stream is formed through the built-in add-source method for Kafka.
Further, in step 2, a matching rule is constructed, the matching rule pushed to Kafka by the server is received in real time, and the rule stream is formed through the built-in add-source method for Kafka; the rule stream is broadcast into the operator state of the log stream to form the rule broadcast stream by calling the broadcast method of the rule stream, setting the broadcast state, and returning the rule broadcast stream.
Further, in step 3, the rule broadcast stream is processed by a broadcast process function, the received matching rule is stored in the broadcast state, the matching rule in the broadcast state is obtained through the broadcast state method, each piece of log data of the log stream is matched against the matching rule, and the logs that satisfy the matching rule are filtered out to form the event stream.
Further, the event stream contains the log data and the information of the matching rule.
Further, in step 4, a counter is used to count the different event streams.
Further, in step 4, a window processing function is used to perform the window calculation.
Further, in step 5, a window processing function is used to obtain the logs of all event streams in the window and the corresponding log times.
Further, the method also includes step 6: after each window calculation is completed, removing from the window the log with the minimum log time.
Further, in step 6, an evictor is used to remove from the window the log with the minimum log time.
Compared with the prior art, the invention has the following beneficial effects:
The invention needs to open only one window per matching rule, which avoids creating multiple windows for each time step. An alarm is triggered once the count condition is met, and the data with the minimum time is then removed from the window, so the same set of logs does not trigger the alarm repeatedly. Reducing the number of windows and the number of triggers reduces resource and performance consumption. In addition, the upstream actively pushes data downstream, so data is computed as soon as it arrives; the calculation is not bound to a time step, and millisecond-level monitoring can be achieved.
Drawings
FIG. 1 is a flow chart of the method of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings, in order to make the objects, technical solutions and advantages of the present invention more apparent. It will be apparent that the described embodiments are only some, but not all, embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
In one embodiment of the present invention, as shown in FIG. 1, a method for realizing millisecond-level monitoring alarms based on a real-time streaming framework includes the following steps:
step 1, receiving log data in real time to form a log stream;
step 2, constructing a matching rule, receiving the matching rule in real time, and forming a rule stream; broadcasting the rule stream into the operator state of the log stream to form a rule broadcast stream;
step 3, merging and matching the log stream and the rule broadcast stream to form an event stream;
step 4, counting event streams, and starting a window calculation when the accumulated number of event streams reaches the number defined in the matching rule;
step 5, acquiring the logs and corresponding log times of all event streams in the window, extracting the maximum and minimum log times, calculating their difference, judging whether the difference meets the duration defined in the matching rule, and triggering an alarm if it does.
The invention needs to open only one window per matching rule, which avoids creating multiple windows for each time step. An alarm is triggered once the count condition is met, and the data with the minimum time is then removed from the window, so the same set of logs does not trigger the alarm repeatedly. Reducing the number of windows and the number of triggers reduces resource and performance consumption. In addition, the upstream actively pushes data downstream, so data is computed as soon as it arrives; the calculation is not bound to a time step, and millisecond-level monitoring can be achieved.
In some embodiments, in step 1, log data in Kafka is received in real time, and the log stream is formed through the built-in add-source method (addSource) for Kafka.
In some embodiments, in step 2, the matching rule pushed to Kafka by the server is received in real time, and the rule stream is formed through the built-in add-source method (addSource) for Kafka. The rule stream is broadcast into the operator state of the log stream to form the rule broadcast stream by calling the broadcast method (broadcast) of the rule stream, setting the broadcast state, and returning the rule broadcast stream. Different matching rules are defined by developers on the server side according to the application scenario.
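The broadcast pattern in step 2 can be pictured with a small plain-Java stand-in. This is a sketch under stated assumptions, not Flink code and not the patented implementation: a map plays the role of the broadcast state, the rule stream writes to it, and the log stream only reads from it. The Rule fields (id, keyword, hitCount, withinMillis) and all method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java stand-in for the broadcast pattern; this is not Flink code.
final class RuleBroadcastSketch {

    // Hypothetical rule shape: the field names are assumptions for illustration.
    static final class Rule {
        final String id;
        final String keyword;
        final int hitCount;
        final long withinMillis;

        Rule(String id, String keyword, int hitCount, long withinMillis) {
            this.id = id;
            this.keyword = keyword;
            this.hitCount = hitCount;
            this.withinMillis = withinMillis;
        }
    }

    // Plays the role of the broadcast state: the latest rule for each rule id.
    private final Map<String, Rule> ruleState = new LinkedHashMap<>();

    // Rule-stream side: store a new rule or overwrite an existing one.
    void onRule(Rule rule) {
        ruleState.put(rule.id, rule);
    }

    // Log-stream side: read the current rules and return the ids of those the log hits.
    List<String> onLog(String log) {
        List<String> hits = new ArrayList<>();
        for (Rule rule : ruleState.values()) {
            if (log.contains(rule.keyword)) {
                hits.add(rule.id);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        RuleBroadcastSketch operator = new RuleBroadcastSketch();
        System.out.println(operator.onLog("abnormal login from 10.0.0.1")); // no rule received yet
        operator.onRule(new Rule("r1", "abnormal login", 5, 180_000L));
        System.out.println(operator.onLog("abnormal login from 10.0.0.1")); // rule r1 is hit
    }
}
```

The point of the design is that rules can arrive or change at any time without restarting the log pipeline: a log is always matched against whatever rules are in the state when it arrives.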
In some embodiments, in step 3, the rule broadcast stream is processed by a broadcast process function (BroadcastProcessFunction): in processBroadcastElement, the received matching rule is stored in the broadcast state; the matching rule in the broadcast state is then obtained through the broadcast state method (ctx.getBroadcastState), each piece of log data of the log stream is matched against the matching rule, and the logs that satisfy the matching rule are filtered out to form the event stream. The matching logic supports rules defined with regular expressions, greater than, less than, equal to, not equal to, contains, and does not contain, which can be combined into compound conditions with AND and OR. The event stream contains the log data and the information of the matching rule.
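One way to picture the matching logic is as composable predicates over a log line. The following is a minimal sketch, assuming logs are plain strings and that numeric fields appear as "key=123"; these assumptions, and every name in the block, are illustrative and do not come from the patent. Only a subset of the listed operators is shown.

```java
import java.util.function.Predicate;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical condition builders; names and log format are assumptions.
final class MatchConditions {

    static Predicate<String> contains(String text) {
        return log -> log.contains(text);
    }

    static Predicate<String> notContains(String text) {
        return log -> !log.contains(text);
    }

    static Predicate<String> matchesRegex(String regex) {
        Pattern pattern = Pattern.compile(regex);
        return log -> pattern.matcher(log).find();
    }

    // Numeric comparison on a hypothetical "key=123" field embedded in the log line.
    static Predicate<String> numberGreaterThan(String key, long threshold) {
        Pattern pattern = Pattern.compile(Pattern.quote(key) + "=(\\d{1,18})");
        return log -> {
            Matcher m = pattern.matcher(log);
            return m.find() && Long.parseLong(m.group(1)) > threshold;
        };
    }

    public static void main(String[] args) {
        // AND combination, mirroring "a contains 'error' and a contains 'login'".
        Predicate<String> loginError = contains("error").and(contains("login"));
        // OR combination of a numeric condition and a regular expression.
        Predicate<String> slowOrFailed =
                numberGreaterThan("costMs", 500).or(matchesRegex("status=5\\d\\d"));

        System.out.println(loginError.test("error: login failed for user 42"));
        System.out.println(slowOrFailed.test("GET /orders status=200 costMs=750"));
    }
}
```

Predicate's and, or, and negate methods give the AND/OR composition directly, so a rule received from the rule stream could be compiled once into a single predicate and then applied to every log.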
In some embodiments, in step 4, a counter is used to count the different event streams, and a window processing function is used to perform the window calculation. Each matching rule has its own window; when the accumulated number in the counter reaches the number defined in the matching rule, the window processing function is triggered to perform the calculation over that window.
In some embodiments, in step 5, a window processing function obtains the logs of all event streams in the window and the corresponding log times, extracts the maximum and minimum log times, calculates their difference, judges whether the difference meets the duration defined in the matching rule, and triggers an alarm if it does.
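The check in step 5 reduces to comparing the span between the earliest and latest log time with the rule's duration. A minimal sketch follows, assuming log times are epoch-style milliseconds and that "meets the duration" means the span is less than or equal to the limit; both are assumptions, and the names are hypothetical.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;

// Hypothetical helper for the step 5 check; not the patent's code.
final class WindowSpanCheck {

    // True when (max log time - min log time) is within the rule's duration.
    // Treating the boundary as inclusive (<=) is an assumption.
    static boolean spanWithin(Collection<Long> logTimesMillis, long maxSpanMillis) {
        if (logTimesMillis.isEmpty()) {
            return false; // nothing in the window, so nothing to alarm on
        }
        long span = Collections.max(logTimesMillis) - Collections.min(logTimesMillis);
        return span <= maxSpanMillis;
    }

    public static void main(String[] args) {
        // Five hits spread over 77 seconds, checked against a 3-minute rule.
        System.out.println(spanWithin(
                Arrays.asList(181_000L, 182_000L, 241_000L, 245_000L, 258_000L), 180_000L));
    }
}
```

Because the window holds only as many logs as the rule's count, scanning it for the minimum and maximum is cheap and does not depend on the length of the time range.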
In some embodiments, the method for realizing millisecond-level monitoring alarms based on a real-time streaming framework further includes step 6: after each window calculation is completed, removing from the window the log with the minimum log time. Because the data with the minimum time is removed from the window, the same set of logs does not trigger the alarm repeatedly.
In one embodiment, a matching rule is set that fires when the keyword "abnormal login" is hit 5 times within 3 minutes, and logs containing "abnormal login" are printed among the received logs. The specific implementation process is as follows. Login anomalies are recorded, and logs printed, at the time points (minutes:seconds) 3:01, 3:02, 4:01, 4:05, 4:18, 5:01 and 8:02. Each log is matched against the matching rule: the log is bound to a variable a, and the matching logic of the rule (a contains 'error' and a contains 'login'), implemented in Java code such as a.contains("login"), is evaluated by traversing the conditions in a for loop. The logs at 3:01, 3:02, 4:01, 4:05, 4:18, 5:01 and 8:02 each form an event stream together with the matching rule, in that order. The event streams are counted. When the 4:18 event stream arrives, the accumulated number of event streams is 5, which meets (is greater than or equal to) the number defined in the matching rule (5 times); the maximum log time 4:18 and the minimum log time 3:01 are extracted and their difference is calculated as 77 seconds, which meets the duration defined in the matching rule (within 3 minutes), so an alarm is triggered. After the difference calculation is completed, the event stream at 3:01 is removed. The 5:01 event stream then arrives; the accumulated number is again 5, which also meets (is greater than or equal to) the number defined in the matching rule (5 times), and the difference between the maximum log time 5:01 and the minimum log time 3:02 is 119 seconds, which meets the defined duration (within 3 minutes), so an alarm is triggered, after which the event stream at 3:02 is removed. The 8:02 event stream then arrives; the accumulated number is 5, and the difference between the maximum log time 8:02 and the minimum log time 4:01 is 241 seconds, which does not meet the duration defined in the matching rule (within 3 minutes), so no alarm is triggered. After the difference calculation is completed, the event stream at 4:01 is removed.
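The walk-through above can be replayed with a short simulation of steps 4 to 6. This is a sketch under stated assumptions rather than the patented implementation: it is plain Java with no Flink dependency, it reads the time points as minutes:seconds, it treats "within 3 minutes" as a span of at most 180 seconds, and the class, method, and helper names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical simulation of the count-triggered window; not Flink code.
final class CountTriggeredWindow {
    private final int requiredCount;
    private final long maxSpanMillis;
    // Min-heap: the earliest log time is always at the head.
    private final PriorityQueue<Long> window = new PriorityQueue<>();

    CountTriggeredWindow(int requiredCount, long maxSpanMillis) {
        this.requiredCount = requiredCount;
        this.maxSpanMillis = maxSpanMillis;
    }

    // Adds one matched log time and returns true if an alarm fires.
    boolean offer(long logTimeMillis) {
        window.add(logTimeMillis);
        if (window.size() < requiredCount) {
            return false; // step 4: count not reached, no window calculation yet
        }
        long min = window.peek();
        long max = Collections.max(window);
        boolean alarm = (max - min) <= maxSpanMillis; // step 5: span check
        window.poll(); // step 6: evict the earliest log after every calculation
        return alarm;
    }

    // Parses "m:ss" into milliseconds (assumed reading of the example's time points).
    static long mmss(String time) {
        String[] parts = time.split(":");
        return (Long.parseLong(parts[0]) * 60 + Long.parseLong(parts[1])) * 1000;
    }

    static List<Boolean> replay(int requiredCount, long maxSpanMillis, String... times) {
        CountTriggeredWindow w = new CountTriggeredWindow(requiredCount, maxSpanMillis);
        List<Boolean> alarms = new ArrayList<>();
        for (String t : times) {
            alarms.add(w.offer(mmss(t)));
        }
        return alarms;
    }

    public static void main(String[] args) {
        // Prints [false, false, false, false, true, true, false]
        System.out.println(replay(5, 180_000L,
                "3:01", "3:02", "4:01", "4:05", "4:18", "5:01", "8:02"));
    }
}
```

The replay raises alarms on the arrivals at 4:18 and 5:01 and none at 8:02, matching the outcome described in the embodiment, while only ever holding five log times for the rule.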
Finally, it should be noted that the above embodiments are merely preferred embodiments of the present invention, intended to illustrate its technical solution and not to limit its scope of protection. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, and some or all of their technical features may be replaced by equivalents, without such modifications and substitutions departing from the spirit of the corresponding technical solutions. That is, insubstantial changes or refinements made within the main design concept and spirit of the present invention, where the technical problem solved remains the same as that of the present invention, all fall within its scope of protection. Likewise, direct or indirect application of the technical solution of the invention to other related technical fields falls within the scope of the invention.

Claims (6)

1. The method for realizing millisecond-level monitoring alarm based on the real-time flow frame is characterized by comprising the following steps:
step 1, receiving log data in real time to form a log stream;
step 2, constructing a matching rule, receiving the matching rule in real time, and forming a rule flow; broadcasting the rule stream into an operator state of the log stream to form a rule broadcast stream;
step 3, merging and matching the log stream and the rule broadcast stream to form an event stream;
step 4, counting event streams, and starting window calculation when the number of the event streams accumulated by counting meets the number defined in the matching rule;
step 5, acquiring logs and corresponding log time of all event streams of a window, extracting the maximum value and the minimum value of the log time, calculating the difference value of the maximum value and the minimum value, judging whether the difference value meets the time length defined in the matching rule, and triggering an alarm if the difference value meets the time length;
step 6, removing the log corresponding to the minimum log time value in the window after each window calculation is completed;
in the step 2, a matching rule is constructed, the matching rule pushed to the Kafka by the server is received in real time, and a rule stream is formed by a built-in adding source method of the Kafka; broadcasting the rule stream into an operator state of the log stream to form a rule broadcast stream, calling a broadcasting method of the rule stream, setting a broadcasting state, and returning the rule broadcast stream;
in the step 3, a rule broadcast stream is processed through a broadcast processing function, the received matching rule is stored in a broadcast state, the matching rule in the broadcast state is obtained through a broadcast state method, each piece of log data of the log stream is matched with the matching rule, and the log meeting the matching rule is screened out to form an event stream;
in the step 6, a remover is adopted to remove the log corresponding to the minimum log time value in the window.
2. The method for realizing millisecond monitoring and alarming based on real-time stream framework according to claim 1, wherein in the step 1, log data in Kafka is received in real time, and a log stream is formed by a built-in adding source method of Kafka.
3. The method for implementing millisecond monitoring alarm based on real time streaming framework of claim 1 wherein the event stream contains log data and information of the matching rule.
4. A method for implementing millisecond monitoring alarms based on a real time streaming framework according to claim 1, characterized in that in said step 4, different event streams are counted with a counter.
5. The method for realizing millisecond monitoring alarm based on real-time streaming framework according to claim 1, wherein in said step 4, window processing function is used for window calculation.
6. The method for realizing millisecond monitoring alarm based on real-time stream frame according to claim 1, wherein in step 5, the log and the corresponding log time of all event streams of the window are obtained by using a window processing function.
CN202311213401.0A 2023-09-20 2023-09-20 Method for realizing millisecond-level monitoring alarm based on real-time flow frame Active CN116974876B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311213401.0A CN116974876B (en) 2023-09-20 2023-09-20 Method for realizing millisecond-level monitoring alarm based on real-time flow frame

Publications (2)

Publication Number Publication Date
CN116974876A (en) 2023-10-31
CN116974876B (en) 2024-02-23

Family

ID=88481793

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311213401.0A Active CN116974876B (en) 2023-09-20 2023-09-20 Method for realizing millisecond-level monitoring alarm based on real-time flow frame

Country Status (1)

Country Link
CN (1) CN116974876B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107209673A (en) * 2015-08-05 2017-09-26 谷歌公司 Data flow adding window and triggering
CN113411208A (en) * 2021-05-28 2021-09-17 青岛海尔科技有限公司 System, device for distributed traffic management
CN113448752A (en) * 2021-06-24 2021-09-28 亿企赢网络科技有限公司 Index data acquisition method and device, electronic equipment and storage medium
CN114238415A (en) * 2021-12-24 2022-03-25 四川新网银行股份有限公司 Real-time rule engine control method, system and medium based on Flink
CN114579809A (en) * 2022-01-25 2022-06-03 北京北信源软件股份有限公司 Event analysis method and device, electronic equipment and storage medium
CN115129736A (en) * 2022-07-04 2022-09-30 东方合智数据科技(广东)有限责任公司 Rule engine-based rule event dynamic loading and updating method and related equipment
CN115641139A (en) * 2022-07-12 2023-01-24 浙江师范大学 Block chain consensus method based on weight plan behavior certification
CN116340114A (en) * 2023-03-22 2023-06-27 上海浦东发展银行股份有限公司 Stream processing log alarming method
CN116436772A (en) * 2023-06-08 2023-07-14 上海观安信息技术股份有限公司 Real-time alarm method and device, electronic equipment and storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8683074B2 (en) * 2011-05-16 2014-03-25 Microsoft Corporation Stream operator
US10977152B2 (en) * 2019-04-16 2021-04-13 Oracle International Corporation Rule-based continuous diagnosing and alerting from application logs
US11016826B2 (en) * 2019-05-31 2021-05-25 Digital Guardian Llc Systems and methods for multi-event correlation
US11689318B2 (en) * 2020-01-10 2023-06-27 California Institute Of Technology Systems and methods for communicating using random codewords located within a restricted subset of a multi-dimensional sphere

Also Published As

Publication number Publication date
CN116974876A (en) 2023-10-31

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant