CN112395156A - Fault warning method and device, storage medium and electronic equipment - Google Patents

Publication number
CN112395156A
Authority
CN
China
Prior art keywords
keyword
log
alarm
target
error
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011233566.0A
Other languages
Chinese (zh)
Inventor
王强
陈秀升
钟志雄
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Weimin Insurance Agency Co Ltd
Original Assignee
Weimin Insurance Agency Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Weimin Insurance Agency Co Ltd
Priority to CN202011233566.0A
Publication of CN112395156A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3006 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/301 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is a virtual computing platform, e.g. logically partitioned systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/302 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a software system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3051 Monitoring arrangements for monitoring the configuration of the computing system or of the computing system component, e.g. monitoring the presence of processing resources, peripherals, I/O links, software programs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3055 Monitoring arrangements for monitoring the status of the computing system or of the computing system component, e.g. monitoring if the computing system is on, off, available, not available
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3065 Monitoring arrangements determined by the means or processing involved in reporting the monitored data
    • G06F11/3072 Monitoring arrangements determined by the means or processing involved in reporting the monitored data where the reporting involves data filtering, e.g. pattern matching, time or event triggered, adaptive or policy-based reporting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/32 Monitoring with visual or acoustical indication of the functioning of the machine
    • G06F11/324 Display of status information
    • G06F11/327 Alarm or error message display

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Quality & Reliability (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The application provides a fault warning method and device, a storage medium and electronic equipment. The method comprises the following steps: acquiring log change information of an application system, wherein the log change information is used for indicating that a first error log changes in a target time period; matching a first keyword in the first error log against a second keyword, wherein the second keyword is a keyword for which alarms are to be excluded; and, when no first keyword in the first error log matches the second keyword, triggering a corresponding log alarm mechanism according to how often the first keyword occurs in the target time period. Because the first error log is determined from the log change information of the application system and the log alarm mechanism is triggered according to the keyword matching result, a problem can be detected as soon as it occurs, which improves alarm efficiency.

Description

Fault warning method and device, storage medium and electronic equipment
Technical Field
The present application relates to the field of Internet technologies, and in particular, to a fault warning method and device, a storage medium, and electronic equipment.
Background
In a production environment, application systems are deployed on numerous servers or containers (hereinafter collectively referred to as "servers"). While running, these systems output various logs that reflect system state, report how services are executing, and so on. By collecting and analyzing this log information, an application system can be analyzed, monitored, and alarmed on at the business level.
In the related art, monitoring and alarming can be based on the hardware resources and system performance of the server. The drawback is that a fault is identified and an alarm is raised only after problems have accumulated to a certain degree, so alarms are delayed. Fault alarm messages can also be obtained from client-side embedded tracking points (so-called buried points) that report data, but this approach usually requires deciding in advance where to place the tracking points, and the tracking points are intrusive to the code. In addition, current monitoring and alarm systems commonly suffer from alarm flooding, so effective alarms are not sensitive enough.
Therefore, the related art has problems of poor alarm timeliness and insufficient sensitivity of effective alarms.
Disclosure of Invention
The application provides a fault warning method and device, a storage medium and electronic equipment, which are used for at least solving the problem of poor warning timeliness in the related art.
According to an aspect of an embodiment of the present application, there is provided a method for warning of a fault, the method including:
acquiring log change information of an application system, wherein the log change information is used for indicating that a first error log changes in a target time period;
matching a first keyword in the first error log with a second keyword, wherein the second keyword is a keyword which needs to eliminate an alarm;
and under the condition that the first error log does not have a first keyword matched with the second keyword, triggering a corresponding log alarm mechanism according to the occurrence frequency of the first keyword in the target time period.
According to an aspect of an embodiment of the present application, there is provided a fault warning apparatus, including:
a first obtaining module, configured to obtain log change information of an application system, wherein the log change information is used for indicating that a first error log changes in a target time period;
the first matching module is used for matching a first keyword in the first error log with a second keyword, wherein the second keyword is a keyword which needs to exclude an alarm;
and the triggering module is used for triggering a corresponding log alarm mechanism according to the occurrence frequency of the first keyword in the target time period under the condition that the first keyword matched with the second keyword does not exist in the first error log.
Optionally, the first obtaining module includes:
a first obtaining unit, configured to obtain log change sub-information of multiple application systems, where the log change sub-information is used to indicate that a first error sub-log changes in a target time window, where the first error sub-log belongs to the first error log, and the target time window is a time window in the target time period;
the classification unit is used for classifying the application systems corresponding to the log change sub-information with the same keyword into the same application type, wherein the application type is used for indicating the category to which the application systems belong;
and the summarizing unit is used for summarizing the log change sub-information of the application systems under the same application type into the log change information.
Optionally, the apparatus further comprises:
the second acquisition unit is used for acquiring error reporting statistical information under each application type in a plurality of applications after the application systems corresponding to the log change sub-information with the same keyword are classified into the same application type, wherein the error reporting statistical information comprises an error reporting type, an error source and alarm time;
and the display unit is used for pushing the error reporting statistical information under each application type to a target information window matched with the application type in a plurality of information windows for displaying, wherein one information window in the plurality of information windows is used for displaying the error reporting statistical information under one application type.
Optionally, the triggering module includes:
a first sending unit, configured to send a first alarm message to an alarm receiving end when the frequency with which the first keyword appears in the target time window is greater than or equal to a target frequency, wherein the target frequency is the frequency value at which the first alarm message is sent.
Optionally, the triggering module further includes:
a first determining unit, configured to determine a duration of the target time window;
the storage unit is used for storing each first keyword appearing in the duration into a distributed cache;
the statistical unit is used for counting the number of the first keywords in the distributed cache;
and the second sending unit is used for sending the number serving as a second alarm message to the alarm receiving end.
Optionally, the storage unit includes:
a generating subunit, configured to generate the first keyword by combining, in sequence, the application name of the application system in which the first keyword is located, the sub-keyword included in the first keyword, the current time, the length of the target time window in which the first keyword is located, and the end time;
and the storage subunit is used for storing the first keyword into a distributed cache.
Optionally, the apparatus further comprises:
a first determining module, configured to determine that a target error log exists in a first error log before a first keyword in the first error log is matched with a second keyword, where the target error log does not include the keyword;
and the first sending module is used for sending a third alarm message according to the fault data represented in the target error log.
Optionally, the apparatus further comprises:
a second sending module, configured to, after the third alarm message is sent according to the fault data represented in the target error log, send a fourth alarm message to the alarm receiving end when the number of third alarm messages is greater than or equal to a target alarm threshold, where the fourth alarm message is used to instruct the log monitoring platform to stop sending the third alarm message, and the target alarm threshold is the maximum number of alarm messages to be received by the alarm receiving end.
Optionally, the apparatus further comprises:
a third sending module, configured to, after sending of the third alarm message has stopped, obtain the number of third alarm messages again once the time elapsed since sending stopped reaches a target time length, where reaching the target time length triggers the step of obtaining the number of third alarm messages again;
and a fourth sending module, configured to send the third alarm message according to the fault data represented in the target error log when the newly obtained number of third alarm messages is smaller than the target alarm threshold.
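The stop-and-resume behaviour of the second, third and fourth sending modules can be pictured with the following minimal sketch. The class name `AlarmThrottle`, its return values, and the use of plain numeric timestamps are hypothetical. The source does not say how the message count falls back below the threshold, so resetting the counter once the target time length has elapsed is an assumption.

```python
class AlarmThrottle:
    """Stops alarm delivery at a threshold and re-checks after a cooldown."""

    def __init__(self, threshold, cooldown_seconds):
        self.threshold = threshold        # target alarm threshold
        self.cooldown = cooldown_seconds  # target time length
        self.sent = 0                     # number of "third alarm messages" sent
        self.stopped_at = None            # time at which sending was stopped

    def offer(self, now):
        """Return "send", "stop" (the fourth alarm message) or "suppressed"."""
        if self.stopped_at is not None:
            if now - self.stopped_at < self.cooldown:
                return "suppressed"
            # Target time length reached: obtain the count again.
            # Assumption: the counter starts over at this point.
            self.stopped_at = None
            self.sent = 0
        if self.sent >= self.threshold:
            self.stopped_at = now
            return "stop"
        self.sent += 1
        return "send"
```

With a threshold of 2 and a cooldown of 60 seconds, the first two offers are sent, the third produces the stop message, and offers are suppressed until 60 seconds have passed since the stop.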
Optionally, the apparatus further comprises:
and the second matching module is used for not generating an alarm message under the condition that the first keyword matched with the second keyword exists in the first error log after triggering a corresponding log alarm mechanism according to the occurrence frequency of the first keyword in the target time period.
Optionally, the apparatus further comprises:
the second acquisition module is used for acquiring the trigger operation of the target fault switch before acquiring the log change information of the application system;
and the control module is used for controlling the sending operation or the closing operation of the alarm message according to the triggering operation and the function corresponding to the target fault switch.
According to yet another aspect of the embodiments of the present application, there is further provided a computer-readable storage medium, in which a computer program is stored, where the computer program is configured to execute the steps of any of the above-mentioned fault warning methods when running.
According to yet another aspect of embodiments of the present application, there is also provided an electronic device, including a memory and a processor, where the memory stores a computer program, and the processor is configured to execute the computer program to perform the steps of any of the above-mentioned fault warning methods.
According to yet another aspect of an embodiment of the present application, there is also provided a computer program product or a computer program comprising computer instructions stored in a computer readable storage medium; the processor of the computer device reads the computer instructions from the computer readable storage medium, and the processor executes the computer instructions to cause the computer device to perform the steps of any of the embodiments of the method for alerting of a fault described above.
In the embodiment of the application, a first error log is determined by acquiring log change information of an application system, and a first keyword in the first error log is matched against a second keyword. When no first keyword matches the second keyword, a corresponding log alarm mechanism is triggered according to how often the first keyword occurs in a target time period. Whether a fault has occurred can be determined simply by comparing keywords, without waiting for problems to accumulate, so a problem can be detected as soon as it occurs, which solves the problem of poor alarm timeliness in the related art. In addition, embedded tracking points do not need to be planned in advance, which reduces intrusiveness to the code.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. Those skilled in the art can obviously obtain other drawings from these drawings without inventive effort.
FIG. 1 is a schematic diagram of a hardware environment of an optional fault warning method according to an embodiment of the present application;
FIG. 2 is a schematic flow chart of an optional fault warning method according to an embodiment of the present application;
FIG. 3 is a schematic diagram of an optional alarm configuration page for log monitoring according to an embodiment of the present application;
FIG. 4 is a schematic diagram of an optional deployment architecture of a fault alarm system according to an embodiment of the present application;
FIG. 5 is a schematic diagram of an optional alarm interface for applications of the same type according to an embodiment of the present application;
FIG. 6 is an optional trend diagram of alarm data for log faults according to an embodiment of the present application;
FIG. 7 is a schematic diagram of an optional fault alarm handling process according to an embodiment of the present application;
FIG. 8 is a block diagram of an optional fault warning device according to an embodiment of the present application;
FIG. 9 is a block diagram of an optional electronic device according to an embodiment of the present application.
Detailed Description
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only partial embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that the terms "first," "second," and the like in the description and claims of this application and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The embodiment of the application provides a fault warning method that can be applied to real-time log monitoring and alarming. For example, a change in the content of a log record usually involves information such as errors and warnings. In that case, the information can be sent to a code engineer by short message or e-mail to indicate that the state of the log record has changed, so that the engineer can locate the problem accurately when a program behaves abnormally, which improves the efficiency of service analysis.
Optionally, in this embodiment of the present application, the method for warning of a fault may be applied to a hardware environment as shown in fig. 1. As shown in fig. 1, the terminal 102 may include a memory 104, a processor 106, and a display 108 (optional components). The terminal 102 may be communicatively coupled to a server 112 via a network 110, the server 112 may be configured to provide services (e.g., gaming services, application services, etc.) to the terminal or to clients installed on the terminal, and a database 114 may be provided on the server 112 or separate from the server 112 to provide data storage services to the server 112. Additionally, a processing engine 116 may be run in the server 112, and the processing engine 116 may be used to perform the steps performed by the server 112.
Alternatively, the terminal 102 may be, but is not limited to, a terminal capable of processing data, such as a mobile terminal (e.g., a mobile phone or a tablet computer), a notebook computer, or a PC (Personal Computer), and the network may include, but is not limited to, a wireless network or a wired network. The wireless network includes Bluetooth, WIFI (Wireless Fidelity), and other networks that enable wireless communication. The wired network may include, but is not limited to, wide area networks, metropolitan area networks, and local area networks. The server 112 may include, but is not limited to, any hardware device capable of performing computations.
In addition, in this embodiment, the above-mentioned fault warning method may also be applied, but not limited, to an independent processing device with a relatively high processing capability without data interaction. For example, the processing device may be, but is not limited to, a terminal device with a relatively high processing capability, that is, each operation in the above-mentioned fault warning method may be integrated into a separate processing device. The above is merely an example, and this is not limited in this embodiment.
In this embodiment of the present application, the above fault warning method may be run in a server. Specifically, fig. 2 is a schematic flow diagram of an optional fault warning method according to an embodiment of the present application. As shown in fig. 2, the flow of the method may include the following steps:
Step S201, the server obtains log change information of the application system, where the log change information is used to indicate that the first error log changes within a target time period.
In the embodiment of the application, LogMonitor is used as the log monitoring and alarm platform, which monitors logs in real time and raises alarms for error logs promptly. It should be noted that LogMonitor can monitor files or folders on a computer and check whether they have changed; when a file or folder changes, the platform can send an alarm notification. The fault warning method is explained below.
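The file-change check that a monitoring client performs can be pictured with a minimal sketch. The source does not describe the internals of LogMonitor, so the class name `LogTail`, the byte-offset polling approach, and the `ERROR` marker are all assumptions made for illustration.

```python
import os


class LogTail:
    """Tracks one log file and returns the lines appended since the last poll."""

    def __init__(self, path):
        self.path = path
        self.offset = 0  # number of bytes already consumed

    def poll(self):
        size = os.path.getsize(self.path)
        if size < self.offset:
            # The file shrank (truncated or rotated): start over.
            self.offset = 0
        with open(self.path, "rb") as f:
            f.seek(self.offset)
            data = f.read()
            self.offset = f.tell()
        # Note: a line that is still being written may come back incomplete;
        # a fuller client would buffer it until the newline arrives.
        return data.decode("utf-8", errors="replace").splitlines()


def error_lines(lines, marker="ERROR"):
    """Keep only the lines that look like error log entries (assumed marker)."""
    return [line for line in lines if marker in line]
```

A non-empty result from `error_lines(tail.poll())` plays the role of log change information indicating that the error log has changed.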
Optionally, a developer or an operation and maintenance engineer configures information about the application to be monitored on the LogMonitor platform. As shown in fig. 3, this information may include the application name, the alarm group id, the alarm type, the matching field, the frequency condition, the number of alarms, the trigger time period, and the like.
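One way to picture such a configuration entry is the small record below. The field names follow the items listed above, but their types, the example values, and the meaning given to each field in the comments are assumptions, since the configuration page of fig. 3 is not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class AlarmRule:
    app_name: str             # application name
    alarm_group_id: str       # group that receives the alarm
    alarm_type: str           # assumed values such as "message" or "phone"
    matching_field: str       # keyword to look for in error logs
    frequency_threshold: int  # frequency condition (occurrences per window)
    alarm_limit: int          # number of alarms (assumed to be an upper bound)
    trigger_period: tuple     # (start, end) as zero-padded "HH:MM" strings

    def is_active(self, hhmm):
        """True if the time of day falls inside the trigger time period.

        Zero-padded "HH:MM" strings compare correctly as text; periods that
        wrap past midnight are not handled in this sketch.
        """
        start, end = self.trigger_period
        return start <= hhmm <= end
```

For example, a rule with `trigger_period=("09:00", "18:00")` is active at `"15:10"` and inactive at `"20:30"`.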
As shown in fig. 4, each application server of the LogMonitor platform provided in the embodiment of the present application includes an application system, log information, and a log monitoring and warning client, where the relationship among the application system, the log information, and the log monitoring and warning client is as follows: the application system is used for outputting log information, and the log monitoring and warning client monitors the change condition of the log information.
When the log monitoring and warning client obtains log change information of the application system, this indicates that error log information exists in the current log information and that an alarm message needs to be sent. The log change information is used for indicating that the first error log changes in the target time period.
Step S202, matching the first keyword in the first error log with a second keyword, wherein the second keyword is a keyword for which an alarm needs to be excluded.
Optionally, after the client of the LogMonitor platform provided in the embodiment of the present application detects an error log, it can report the log to the server for rule analysis and processing, in which alarm reporting is controlled according to how often error keywords occur. In this embodiment, the first keyword in the first error log is obtained, where the first keyword is the keyword that triggered generation of the first error log. The first keyword may be a sub-keyword alone, for example, 404 Not Found (the resource cannot be found), or a keyword set that contains the sub-keyword together with other information such as the application name.
It is then checked whether any of the current first keywords matches a preset second keyword. The second keyword may be a single keyword, for example, 404 Not Found (the resource cannot be found), or a keyword set containing several keywords, for example, 404 Not Found, 502 Bad Gateway (web page error), and 500-13 Server is too busy. The second keyword is a keyword for which alarms are to be excluded; that is, a first keyword that matches the second keyword does not trigger an alarm message.
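The exclusion check can be sketched as follows. The function name `is_excluded`, the case-insensitive substring comparison, and the sample log lines are assumptions; the source only states that a first keyword matching a second keyword does not trigger an alarm.

```python
# Second keywords: entries for which no alarm should be raised (example values).
EXCLUDED_KEYWORDS = {"404 Not Found", "502 Bad Gateway"}


def is_excluded(log_line, excluded=EXCLUDED_KEYWORDS):
    """Return True if the line contains any keyword whose alarms are excluded."""
    lowered = log_line.lower()
    return any(keyword.lower() in lowered for keyword in excluded)


def lines_to_alarm(error_lines, excluded=EXCLUDED_KEYWORDS):
    """Keep only the error lines that are still candidates for an alarm."""
    return [line for line in error_lines if not is_excluded(line, excluded)]
```

Lines that survive `lines_to_alarm` are the ones whose keywords go on to the frequency check in step S203.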
Step S203, under the condition that the first error log does not have the first keyword matched with the second keyword, triggering a corresponding log alarm mechanism according to the occurrence frequency of the first keyword in the target time period.
Optionally, in the keyword matching process, if there is no first keyword matching with the second keyword in the first error log, the corresponding log alarm mechanism is triggered according to the occurrence frequency of the first keyword in the target time period. The log alarm mechanism is a mode for controlling alarm reporting based on the occurrence frequency of the error keywords.
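A frequency-based trigger of this kind can be sketched with a sliding window per keyword. The class name `FrequencyAlarm`, the use of numeric timestamps in seconds, and the in-memory storage are assumptions for illustration; the comparison against a threshold mirrors the "greater than or equal to a target frequency" condition stated earlier.

```python
from collections import defaultdict, deque


class FrequencyAlarm:
    """Counts keyword occurrences inside a sliding time window."""

    def __init__(self, window_seconds, threshold):
        self.window = window_seconds
        self.threshold = threshold
        self.events = defaultdict(deque)  # keyword -> timestamps in the window

    def record(self, keyword, timestamp):
        """Record one occurrence; return True if an alarm should be triggered."""
        timestamps = self.events[keyword]
        timestamps.append(timestamp)
        # Drop occurrences that have fallen out of the window.
        while timestamps and timestamps[0] <= timestamp - self.window:
            timestamps.popleft()
        return len(timestamps) >= self.threshold
```

With a 600-second window and a threshold of 3, the third occurrence within ten minutes returns `True`, while an isolated occurrence much later returns `False` again.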
Optionally, as shown in fig. 4, the deployment architecture of the fault alarm system further includes a log monitoring alarm server cluster, which has two main functions: alarm configuration and alarm processing. Alarm processing may include four stages of processing logic: log collection, rule analysis, log analysis, and alarm sending. More specifically, after the log collection module in the log monitoring alarm server cluster collects an error log, the error log is passed to the rule analysis module, and log analysis is performed according to the result of rule analysis to determine how often the keywords in the error log occur in the target time period. A corresponding alarm mechanism is then triggered. The alarm message may be sent to a client application program used by developers and displayed on the client through its interface. Developers can also be notified by telephone alarm: when an alarm is sent, it is determined whether a telephone alarm rule is triggered, and if the telephone alarm condition is met, the relevant developers are notified by telephone. The telephone alarm rule is an alarm rule set separately by the log monitoring alarm server cluster and is used to give notification by telephone when the situation is urgent and serious.
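The four processing stages can be pictured as a single function. This is only a sketch: the function name `process`, the channel names `"app"` and `"phone"`, and the idea that the telephone alarm rule is a second, higher count threshold are assumptions, because the source does not define the telephone alarm condition beyond "urgent and serious".

```python
from collections import Counter


def process(error_lines, keywords, alarm_threshold, phone_threshold):
    """Return (keyword, count, channels) for every keyword that should alarm."""
    counts = Counter()
    for line in error_lines:          # log collection
        for keyword in keywords:      # rule analysis: which rules match
            if keyword in line:
                counts[keyword] += 1
    alarms = []
    for keyword, count in counts.items():   # log analysis: occurrence frequency
        if count >= alarm_threshold:
            channels = ["app"]
            if count >= phone_threshold:    # assumed telephone alarm rule
                channels.append("phone")
            # Alarm sending: returned to the caller rather than delivered here.
            alarms.append((keyword, count, channels))
    return alarms
```

Returning the alarms instead of sending them keeps the sketch free of any messaging API, which the source does not name.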
In the embodiment of the application, a first error log is determined by acquiring log change information of an application system, and a first keyword in the first error log is matched against a second keyword. When no first keyword matches the second keyword, a corresponding log alarm mechanism is triggered according to how often the first keyword occurs in a target time period. Whether a fault has occurred can be determined simply by comparing keywords, without waiting for problems to accumulate, so a problem can be detected as soon as it occurs, which solves the problem of poor alarm timeliness in the related art. In addition, embedded tracking points do not need to be planned in advance, which reduces intrusiveness to the code.
As an alternative embodiment, the obtaining log change information of the application system includes:
acquiring log change sub-information of a plurality of application systems, wherein the log change sub-information is used for indicating that a first error sub-log changes in a target time window, the first error sub-log belongs to the first error log, and the target time window is a time window in a target time period;
classifying application systems with the same keywords in the log change sub-information into the same application type, wherein the application type is used for indicating the category to which the application systems belong;
and summarizing the log change sub-information of the application systems under the same application type into log change information.
Optionally, the LogMonitor platform includes a plurality of application servers, and each application server hosts a plurality of application systems. To collect error logs from these application systems more effectively and to manage hundreds of alarm messages, the embodiment of the present application classifies the application systems, groups application systems of the same type together, and, when alarm messages are later reported, publishes each alarm message in the group to which its application system belongs.
In this embodiment of the present application, a time window (i.e., a sliding window) is taken as a unit, and application systems having the same keyword are collected in the same time window and classified into the same type, where the same keyword may be one keyword or multiple keywords, and the number of the same keyword is not specifically limited in this embodiment of the present application.
The keywords are stored in a distributed cache. Each keyword is stored in a format made up of the application name of the application system in which the keyword is located, the sub-keyword it includes, the current time, the length of the target time window in which the keyword is located, and the end time. For example, a first keyword may be stored as: logcount#kfp#xxxx#2019-03-19 11:03#600#2019-03-19 11:13#404 Not Found, where logcount#kfp denotes the log statistics name, xxxx denotes the application name, 2019-03-19 11:03 denotes the current time, 600 denotes the target time window length, 2019-03-19 11:13 denotes the end time, and 404 Not Found (the resource cannot be found) denotes the sub-keyword. It should be noted that, in the embodiment of the present application, the first keyword may contain only a sub-keyword, or may contain the sub-keyword together with other information (such as the application name or the target time window length).
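Building and reading back such a cache key can be sketched as below. The helper names are hypothetical, the field order follows the example key rather than the prose list, timestamps are assumed to be written as `YYYY-MM-DD HH:MM`, the window length is assumed to be in seconds (600 seconds matches the ten minutes between 11:03 and 11:13), and a `Counter` stands in for the distributed cache.

```python
from collections import Counter
from datetime import datetime, timedelta

TIME_FORMAT = "%Y-%m-%d %H:%M"  # assumed timestamp layout


def build_cache_key(app_name, sub_keyword, now, window_seconds,
                    prefix="logcount#kfp"):
    """Join the key fields with '#', computing the end time from the window."""
    end = now + timedelta(seconds=window_seconds)
    return "#".join([prefix, app_name, now.strftime(TIME_FORMAT),
                     str(window_seconds), end.strftime(TIME_FORMAT),
                     sub_keyword])


def parse_cache_key(key):
    """Split a key back into its fields; the sub-keyword may itself contain '#'."""
    stat, tag, app, start, length, end, sub = key.split("#", 6)
    return {
        "statistics_name": stat + "#" + tag,
        "app_name": app,
        "start": datetime.strptime(start, TIME_FORMAT),
        "window_seconds": int(length),
        "end": datetime.strptime(end, TIME_FORMAT),
        "sub_keyword": sub,
    }


# Stand-in for the distributed cache: one counter per key.
cache = Counter()


def record_occurrence(key):
    """Increment and return the number of occurrences stored under the key."""
    cache[key] += 1
    return cache[key]
```

Because the window start and end are part of the key, occurrences in different time windows are counted under different keys.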
Illustratively, an application server of the LogMonitor platform acquires log change sub-information of a plurality of application systems, where the log change sub-information is used to indicate that a first error sub-log changes within a target time window, the log change information includes the log change sub-information, the first error log includes the first error sub-log, and the target time window is a time window within a target time period. For example, the target time period may be set to 14:00-16:00 and the step size of the time window may be set to 30 minutes; the target time window may be any time window within the target time period, so 15:00-15:30 may be selected as the target time window. The log changes of the plurality of application systems during 15:00-15:30 are then obtained. Because a log change means that error log information has been generated, the keyword information in the log change sub-information is acquired at this point, and the application systems having the same keyword are classified into the same application type. The log change sub-information of the application systems under the same application type is then acquired; since the log change information includes the log change sub-information, the log change information can be obtained by performing operations such as splicing and deduplication on the plurality of pieces of log change sub-information.
Illustratively, error reporting statistical information under each application type among a plurality of applications is acquired, where the error reporting statistical information includes an error reporting type, an error source, and an alarm time; the error reporting statistical information under each application type is then pushed for display to the target information window, among a plurality of information windows, that matches the application type, where each information window is used to display the error reporting statistical information under one application type. As shown in fig. 5, which is a screenshot of an alarm message received by a user and displayed in an enterprise communication application, the current application system belongs to the "cooperation service PRD" service type. After the log change sub-information under this service type is obtained, a corresponding log alarm mechanism is selected for the obtained first error sub-log according to the keyword matching result and the keyword occurrence frequency, and the alarm messages under the same application type are then sent to the corresponding service group that has been established. This facilitates the classification of alarm messages, reduces the cases in which alarm messages of unrelated service types are sent to the group, and relieves background personnel of the burden of checking every alarm message.
According to the method and the device, the application systems which acquire the same keyword in the target time window are classified into the same application type, so that the acquired error logs under the same application type generally have similar characteristics, the later-stage characteristic statistics of the error logs is facilitated, and meanwhile, the alarm messages sent by the application systems under the same application type can be put into the same alarm prompt group, so that the alarm messages can be conveniently checked and managed.
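The classification and summarization described in this embodiment can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the `LogChange` record, its fields, and the function name `summarize_by_keyword` are hypothetical, the window end is treated as exclusive, and deduplication is assumed to mean dropping identical records.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class LogChange:
    """Hypothetical piece of log change sub-information."""
    app_name: str
    keyword: str
    timestamp: datetime
    message: str


def summarize_by_keyword(changes, window_start, window_end):
    """Group sub-information from the target time window by shared keyword.

    Each keyword stands for one application type. The list under a keyword
    is the spliced, deduplicated log change information for that type.
    """
    grouped = defaultdict(list)
    for change in changes:
        # Only changes inside the target time window are considered.
        if not (window_start <= change.timestamp < window_end):
            continue
        # Deduplicate: skip a record identical to one already collected.
        if change not in grouped[change.keyword]:
            grouped[change.keyword].append(change)
    return dict(grouped)
```

The application systems belonging to one type can then be read off as the distinct `app_name` values under that keyword.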
As an alternative embodiment, triggering the corresponding log alarm mechanism according to the occurrence frequency of the first keyword in the target time period includes:
and under the condition that the frequency of the first keyword appearing in the target time window is greater than or equal to the target frequency, sending a first alarm message to an alarm receiving end, wherein the target frequency is used for representing the frequency value for sending the first alarm message.
Optionally, in this embodiment of the present application, a target frequency may be set in advance. The frequency with which the first keyword appears in the target time window is counted and compared with the target frequency, and when that frequency is greater than or equal to the target frequency, a first alarm message is sent to the alarm receiving end of the application server, where the target frequency is used to represent the frequency value for sending the first alarm message. For example, with a target frequency of 9, an alarm message (the first alarm message) may be sent to the alarm receiving end when the first keyword A occurs 9 or more times within a target time window of 15:00-15:30 (i.e., within 30 minutes).
In order to reduce the number of alarm messages sent, the embodiment of the present application may set in advance a target threshold for sending the alarm message, and an alarm message is sent to the server only when the frequency of the first keyword appearing in the target time window is greater than or equal to the target threshold, which reduces the storage pressure on the server, reduces communication interaction, and saves network traffic.
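The frequency comparison can be illustrated with a short sketch. The function name `first_alarm_keywords` and the representation of the target time window as a plain list of observed keywords are assumptions for illustration only.

```python
from collections import Counter


def first_alarm_keywords(keywords_in_window, target_frequency):
    """Return the keywords whose count in the window reaches the target.

    `keywords_in_window` holds one entry per keyword occurrence observed
    within the target time window (hypothetical input shape).
    """
    counts = Counter(keywords_in_window)
    # A first alarm message is due when count >= target frequency.
    return {kw: n for kw, n in counts.items() if n >= target_frequency}
```

With a target frequency of 9, a keyword seen 9 times in the window is returned and a keyword seen 8 times is not.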
As an alternative embodiment, triggering the corresponding log alarm mechanism according to the occurrence frequency of the first keyword in the target time period further includes:
determining the duration of a target time window;
storing each first keyword appearing in the duration into a distributed cache;
counting the number of first keywords in the distributed cache;
and sending the number as a second alarm message to an alarm receiving end.
Alternatively, as shown in fig. 6, the values on the ordinate represent the number of alarm messages sent, such as 0, 50, 100, 150, 200, etc., and the values on the abscissa represent various time points of sending alarm messages, such as 00:00, 02:00, 04:00, 06:00, etc., from which the number of alarm messages sent by various application systems in different time periods can be derived, and the whole data development trend of the alarm messages can also be derived according to a plurality of lines displayed in the graph, so as to be used for the later error log analysis and backtracking. Further, since the various application systems are grouped according to application type, viewing of statistical views by application system dimension is supported.
In the embodiment of the present application, the corresponding log alarm mechanism is triggered according to the occurrence frequency of the first keyword in the target time window. Illustratively, the step size (i.e., the duration) of the target time window is first set, for example to one hour. Timing then starts at the first time at which an occurrence of the keyword is detected, and this first time is the starting time point of the target time window; the ending time point of the target time window is then determined according to the step size. For example, if the starting time point at which the keyword (i.e., the first keyword) is detected is 14:00, the ending time point of the current target time window is determined to be 15:00 according to the one-hour step size. The total number of occurrences of the keyword during 14:00-15:00 is then counted and stored in the distributed cache, and this number is sent to the alarm receiving end as an alarm message (i.e., the second alarm message).
According to the embodiment of the present application, the total number of keyword occurrences within the target time window is sent to the platform server as the alarm message, so that the occurrence frequency of the keyword is clearly known, the number of alarm messages sent is reduced, alarm flooding is avoided, and network traffic is saved.
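The window construction and counting in this example can be sketched as follows. The function name `second_alarm` and its return shape are hypothetical, and treating the ending time point as exclusive is an assumption, since the text does not say whether an occurrence at exactly 15:00 is counted.

```python
from datetime import datetime, timedelta


def second_alarm(occurrences, step):
    """Count keyword occurrences in the window opened by the first one.

    `occurrences` is an iterable of detection times for one keyword and
    `step` is the window duration. Returns (start, end, total), where
    `total` is the payload of the second alarm message, or None when the
    keyword was never detected.
    """
    times = sorted(occurrences)
    if not times:
        return None
    start = times[0]            # the first detection opens the window
    end = start + step          # the ending time point follows from the step
    total = sum(1 for t in times if start <= t < end)  # end is exclusive (assumed)
    return start, end, total
```

For detections at 14:00, 14:20, 14:59 and 15:10 with a one-hour step, the window is 14:00-15:00 and the reported total is 3.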
As an alternative embodiment, before matching the first key and the second key in the first error log, the method further comprises:
determining that a target error log exists in the first error log, wherein the target error log does not contain keywords;
and sending a third alarm message according to the fault data represented in the target error log.
Optionally, the obtained first error log may contain a target error log, i.e., an error log that does not include any error keyword. The first error log may be a single error log or a set of error logs: when the first error log is a single error log, the first error log itself (i.e., the target error log) does not include any error keyword; when the first error log is a set of error logs, there is an error log in the set (i.e., the target error log) that does not include any error keyword. In either case, every item of fault data represented in the target error log is sent to the platform server as an alarm message (the third alarm message).
According to the method and the device, each fault data in the target error log which does not contain the keywords is sent to the platform server as one warning message, so that the problems can be sensed and found at the first time when the problems occur, and the warning sensitivity is improved.
As an alternative embodiment, after sending the third warning message according to the fault data indicated in the target error log, the method further includes:
and sending a fourth alarm message to the alarm receiving end under the condition that the number of the third alarm messages is greater than or equal to a target alarm threshold, wherein the fourth alarm message is used for indicating the log monitoring platform to stop sending the third alarm message, and the target alarm threshold is the maximum alarm message value received by the alarm receiving end.
Optionally, in the foregoing embodiment, when too many third alarm messages are sent continuously, receiving and storage pressure is usually placed on the platform server. Therefore, in this embodiment of the application, a target alarm threshold is set, where the target alarm threshold is used to control the number of third alarm messages sent, and the value of the target alarm threshold is the maximum alarm message value received by the alarm receiving end.
Illustratively, the number of the third alarm messages is compared with the target alarm threshold, and a fourth alarm message is sent to the alarm receiving end when the number of the third alarm messages is greater than or equal to the target alarm threshold, wherein the fourth alarm message is used for instructing the log monitoring platform to stop sending the third alarm message. For example, when the target error log contains too much fault data and the number of third alarm messages sent within 1 minute exceeds the target alarm threshold, a notification to stop sending the third alarm message (i.e., the fourth alarm message) needs to be sent to the log monitoring alarm server cluster of the platform, so that the log monitoring alarm server cluster no longer sends the third alarm message.
According to the embodiment of the application, the number of the third alarm messages is controlled by setting a target alarm threshold, so that the pressure of the log monitoring platform for receiving the alarm messages can be reduced.
As an optional embodiment, after stopping sending the third warning message, the method further includes:
under the condition that the time for stopping sending the third warning message reaches the target time length, acquiring the quantity of the third warning message again, wherein the target time length is used for triggering and executing the step of acquiring the quantity of the third warning message again;
and under the condition that the number of the third alarm messages acquired again is smaller than the target alarm threshold value, sending the third alarm messages according to fault data represented in the target error log.
Optionally, when the time for which sending of the third alarm message has been stopped reaches the target duration, the number of third alarm messages is acquired again. If the re-acquired number is smaller than the target alarm threshold, i.e., the number of third alarm messages has fallen below the target alarm threshold, the third alarm message can be sent normally again. The target duration is used to trigger execution of the step of acquiring the number of third alarm messages again.
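The stop-and-resume behaviour of the third alarm message can be sketched as a small state machine. This is one possible reading rather than the implementation of the embodiment: the class name `ThirdAlarmThrottle`, the 60-second counting window (suggested by the "within 1 minute" example), the use of numeric timestamps in seconds, and the choice to stay stopped when the re-acquired number is still at or above the threshold are all assumptions.

```python
from collections import deque


class ThirdAlarmThrottle:
    """Hypothetical throttle for third alarm messages."""

    def __init__(self, target_alarm_threshold, target_duration, counting_window=60.0):
        self.target_alarm_threshold = target_alarm_threshold
        self.target_duration = target_duration      # seconds to wait after stopping
        self.counting_window = counting_window      # seconds over which sends are counted
        self._sent_times = deque()
        self._stopped_at = None

    def _recent_count(self, now):
        # Drop send times that have left the counting window.
        while self._sent_times and now - self._sent_times[0] >= self.counting_window:
            self._sent_times.popleft()
        return len(self._sent_times)

    def on_fault(self, now):
        """Return 'third', 'fourth' or 'suppressed' for one item of fault data."""
        if self._stopped_at is not None:
            if now - self._stopped_at < self.target_duration:
                return "suppressed"                 # still within the stop period
            # Target duration reached: acquire the number again.
            if self._recent_count(now) >= self.target_alarm_threshold:
                return "suppressed"                 # assumed: remain stopped
            self._stopped_at = None                 # resume normal sending
        if self._recent_count(now) >= self.target_alarm_threshold:
            self._stopped_at = now
            return "fourth"                         # notice to stop sending
        self._sent_times.append(now)
        return "third"
```

With a threshold of 3 and a target duration of 120 seconds, three faults in quick succession each yield a third alarm message, the next yields the fourth alarm message, later faults are suppressed, and a fault arriving after the target duration is sent normally again.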
According to the embodiment of the application, the number of the third warning messages of the warning receiving end is obtained in real time, and the sending mode of the third warning messages is adjusted in time, so that the receiving capacity of the platform server can be balanced, the warning condition can be fed back in time, and the purpose of improving warning timeliness is achieved.
As an alternative embodiment, after triggering the corresponding log alarm mechanism according to the occurrence frequency of the first keyword within the target time period, the method further includes:
in the case where there is a first keyword matching the second keyword in the first error log, no warning message is generated.
Optionally, in the embodiment of the present application, by using a keyword exclusion method, when a first keyword matched with a second keyword exists in a first error log, an alarm message is not generated.
Illustratively, the storage format of the first keyword is: logcount#kfp#xxxx#2019-03-19 11:03#600#2019-03-19 11:13#404 Not Found. The second keyword may be a subset of the keywords that trigger a server alarm, e.g., 404 Not Found, 502 Bad Gateway, 500-13 Server is too busy, etc. The type of the second keyword may be configured as needed, which is not limited in this embodiment of the application.
When the first keyword in the first error log is matched against the second keyword, the sub-keyword of the first keyword and the second keyword contain the same keyword, 404 Not Found, so no alarm message is generated for the first error log.
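The exclusion check can be sketched as follows. The function name `should_alarm`, the use of exact equality between the sub-keyword and a second keyword, and the assumption that the sub-keyword is the last `#`-separated field (and itself contains no `#`) are illustrative choices, not details stated in the embodiment.

```python
# Example second keywords taken from the text; the set is configurable.
EXCLUDED_KEYWORDS = frozenset({"404 Not Found", "502 Bad Gateway"})


def should_alarm(first_keyword, second_keywords=EXCLUDED_KEYWORDS):
    """Return False when the first keyword matches an excluded keyword.

    The first keyword may be a bare sub-keyword or a full '#'-separated
    record whose last field is the sub-keyword (assumed layout).
    """
    sub_keyword = first_keyword.rsplit("#", 1)[-1]
    return sub_keyword not in second_keywords
```

A full record ending in 404 Not Found is excluded from alarming, while a keyword that is not in the configured set still proceeds to the frequency-based mechanisms.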
It should be noted that the second keyword set in the embodiment of the present application is usually a plurality of keywords that cause a log to have a higher error probability, so in order to reduce the number of alarm messages and reduce the statistical pressure of background staff on the alarm messages, when an error log having the same keyword as the second keyword is matched, the alarm message is not generated any more.
According to the embodiment of the present application, alarms are excluded by matching against the second keyword, which reduces the number of alarm messages and relieves the statistical pressure that alarm messages place on background staff.
As an optional embodiment, before obtaining the log change information of the application system, the method further includes:
acquiring a trigger operation performed on a target fault switch;
and controlling to execute sending operation or closing operation on the alarm message according to the triggering operation and the function corresponding to the target fault switch.
Optionally, as shown in fig. 3, a global control switch (i.e., a target failure switch) is disposed on an alarm configuration page of the monitoring log, and is used to perform alarm control on the entire LogMonitor platform, and no alarm message will be sent after the global control switch is turned off.
Illustratively, when the user performs a left-right sliding operation on the global control switch, the LogMonitor platform acquires an operation performed by the user on the global control switch, and triggers a function corresponding to the global control switch according to a sliding direction, for example, when the acquired operation is that the user slides the global control switch button to the left, no alarm message is sent, and only when the user slides the global control switch button to the right, the LogMonitor platform can perform an operation of sending the alarm message.
In addition, in each of the above embodiments, the mentioned platform, LogMonitor platform, and log monitoring platform are all the same platform.
According to the embodiment of the application, the global control switch is arranged, so that the global control requirement of a user for receiving the alarm message can be met, and the user experience is improved.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present application is not limited by the order of acts described, as some steps may occur in other orders or concurrently depending on the application. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required in this application.
As an alternative embodiment, as shown in fig. 7, fig. 7 is a schematic diagram of an alternative fault alarm processing flow according to an embodiment of the present application, and the specific steps are as follows:
step S701, receiving an error log;
step S702, acquiring alarm configuration;
step S703, analyzing the error log;
step S704, determining whether the keyword (i.e., the first keyword) in the error log can be matched with a preset keyword (i.e., the second keyword); if not, executing step S705, otherwise, not processing;
step S705, judging whether the keywords in the error log accord with the keyword frequency convergence rule; if not, executing step S706, otherwise, executing step S707;
step S706, judging whether the alarm quantity exceeds a target alarm threshold value; if the alarm is over, sending a notice for stopping sending the alarm, otherwise, sending a common alarm;
step S707, judging whether the keyword in the current error log appears for the first time, if so, sending an alarm of the first appearance, otherwise, executing step S708;
It should be noted that, in the embodiment of the present application, each keyword in the error log is stored in the distributed cache, so that whether the keyword in the current error log appears for the first time can be determined according to the stored keyword in the distributed cache.
step S708, judging whether the occurrence frequency of the keywords in the current error log exceeds a target frequency threshold, if so, sending the total times of the occurrence frequency of the keywords, and otherwise, normally sending an alarm.
It should be noted that the keyword frequency convergence rule refers to whether a keyword exists in an error log, if yes, a corresponding log alarm mechanism may be executed according to the keyword frequency, and if not, an alarm is normally sent, and the sending number of the alarm is controlled by comparing the alarm number with a target alarm threshold.
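The decision flow of steps S704 to S708 can be sketched as a single function. This is a reading of the flow under stated assumptions, not the implementation of the embodiment: the names `AlarmConfig`, `AlarmState` and `process_error_log` and the returned action strings are hypothetical, an in-memory `Counter` stands in for the distributed cache, a log without a keyword is represented by `None`, and "exceeds" is implemented as greater than or equal to, following the earlier embodiments.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AlarmConfig:
    excluded_keywords: frozenset      # second keywords
    target_frequency: int
    target_alarm_threshold: int


@dataclass
class AlarmState:
    keyword_counts: Counter = field(default_factory=Counter)  # stand-in for the cache
    common_alarms_sent: int = 0


def process_error_log(keyword: Optional[str], config: AlarmConfig, state: AlarmState) -> str:
    # S704: a keyword matching a preset (second) keyword is not processed.
    if keyword is not None and keyword in config.excluded_keywords:
        return "ignored"
    # S705: the convergence rule applies only when the log has a keyword.
    if keyword is None:
        # S706: cap the number of common alarms.
        if state.common_alarms_sent >= config.target_alarm_threshold:
            return "stop_notice"
        state.common_alarms_sent += 1
        return "common_alarm"
    # S707: the cache tells whether this keyword appears for the first time.
    state.keyword_counts[keyword] += 1
    if state.keyword_counts[keyword] == 1:
        return "first_occurrence_alarm"
    # S708: at or above the target frequency, report the total count instead.
    if state.keyword_counts[keyword] >= config.target_frequency:
        return "total_count_alarm"
    return "common_alarm"
```

Returning an action name rather than sending a message keeps the sketch free of any transport detail, which the flow in fig. 7 does not specify.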
Through the above description of the embodiments, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but the former is a better implementation mode in many cases. Based on such understanding, the technical solution of the present application or portions thereof that contribute to the prior art may be embodied in the form of a software product, where the computer software product is stored in a storage medium (e.g., a ROM (Read-Only Memory)/RAM (Random Access Memory), a magnetic disk, an optical disk), and includes several instructions for enabling a terminal device (which may be a mobile phone, a computer, a server, or a network device) to execute the method for warning about a failure in various embodiments of the present application.
According to another aspect of the embodiment of the application, a fault warning device for implementing the fault warning method is further provided. Fig. 8 is a schematic diagram of an alternative fault warning device according to an embodiment of the present application, and as shown in fig. 8, the device may include:
a first obtaining module 801, configured to obtain log change information of an application system, where the log change information is used to indicate that a first error log changes within a target time period;
a first matching module 802, connected to the first obtaining module 801, configured to match a first keyword in the first error log with a second keyword, where the second keyword is a keyword from which an alarm needs to be excluded;
the triggering module 803 is connected to the first matching module 802, and configured to trigger a corresponding log alarm mechanism according to the occurrence frequency of the first keyword in the target time period when the first error log does not have the first keyword matched with the second keyword.
It should be noted that the first obtaining module 801 in this embodiment may be configured to execute the step S201, the first matching module 802 in this embodiment may be configured to execute the step S202, and the triggering module 803 in this embodiment may be configured to execute the step S203.
Through the above modules, the log change information in the application system is obtained, the first error log is determined, and the first keyword in the first error log is matched with the second keyword. When no first keyword matching the second keyword exists, a corresponding log alarm mechanism is triggered according to the occurrence frequency of the first keyword in the target time period. Problems do not need to accumulate, and whether a fault has occurred can be determined by keyword comparison alone, so that a problem can be sensed and found as soon as it occurs, which solves the problem of poor alarm timeliness in the related art. Meanwhile, instrumentation points (embedded points) do not need to be determined in advance, which reduces intrusiveness to the code.
As an alternative embodiment, the first obtaining module includes:
the first acquisition unit is used for acquiring log change sub-information of a plurality of application systems, wherein the log change sub-information is used for indicating that a first error sub-log changes in a target time window, the first error sub-log belongs to the first error log, and the target time window is a time window in a target time period;
the classification unit is used for classifying application systems corresponding to the log change sub-information with the same keyword into the same application type, wherein the application type is used for indicating the category to which the application systems belong;
and the summarizing unit is used for summarizing the log change sub-information of the application systems under the same application type into the log change information.
As an alternative embodiment, the apparatus further comprises:
the second acquisition unit is used for acquiring error reporting statistical information under each application type in a plurality of applications after classifying the application systems corresponding to the log change sub-information with the same keyword into the same application type, wherein the error reporting statistical information comprises an error reporting type, an error source and alarm time;
and the display unit is used for pushing the error reporting statistical information under each application type to a target information window matched with the application type in a plurality of information windows for displaying, wherein one information window in the plurality of information windows is used for displaying the error reporting statistical information under one application type.
As an alternative embodiment, the triggering module comprises:
the first sending unit is used for sending a first alarm message to an alarm receiving end under the condition that the frequency of the first keyword appearing in the target time window is greater than or equal to the target frequency, wherein the target frequency is used for representing the frequency value of sending the first alarm message.
As an optional embodiment, the triggering module further includes:
a first determining unit, configured to determine a duration of a target time window;
the storage unit is used for storing each first keyword appearing in the duration into the distributed cache;
the statistical unit is used for counting the number of the first keywords in the distributed cache;
and the second sending unit is used for sending the number as a second alarm message to the alarm receiving end.
As an alternative embodiment, the storage unit comprises:
the generating subunit is used for generating the first keyword in the sequence of the application name of the application system where the first keyword is located, the sub-keywords contained in the first keyword, the current time, the length of the target time window where the first keyword is located, and the ending time;
and the storage subunit is used for storing the first keyword into the distributed cache.
As an alternative embodiment, the apparatus further comprises:
the first determining module is used for determining that a target error log exists in the first error log before the first keyword in the first error log is matched with the second keyword, wherein the target error log does not contain the keywords;
and the first sending module is used for sending a third warning message according to the fault data represented in the target error log.
As an alternative embodiment, the apparatus further comprises:
and the second sending module is used for sending a fourth alarm message to the alarm receiving end under the condition that the quantity of the third alarm messages is greater than or equal to a target alarm threshold value after the third alarm message is sent according to the fault data represented in the target error log, wherein the fourth alarm message is used for indicating the log monitoring platform to stop sending the third alarm message, and the target alarm threshold value is the maximum alarm message value received by the alarm receiving end.
As an alternative embodiment, the apparatus further comprises:
the third sending module is used for obtaining the number of the third warning messages again under the condition that the time for stopping sending the third warning messages reaches the target time length after the third warning messages are stopped being sent, wherein the target time length is used for triggering and executing the step of obtaining the number of the third warning messages again;
and the fourth sending module is used for sending the third alarm message according to the fault data represented in the target error log under the condition that the number of the third alarm messages obtained again is smaller than the target alarm threshold value.
As an alternative embodiment, the apparatus further comprises:
and the second matching module is used for not generating an alarm message under the condition that the first keyword matched with the second keyword exists in the first error log after triggering a corresponding log alarm mechanism according to the occurrence frequency of the first keyword in the target time period.
As an alternative embodiment, the apparatus further comprises:
the second acquisition module is used for acquiring the trigger operation of the target fault switch before acquiring the log change information of the application system;
and the control module is used for controlling the sending operation or the closing operation of the alarm message according to the triggering operation and the function corresponding to the target fault switch.
It should be noted here that the modules described above are the same as the examples and application scenarios implemented by the corresponding steps, but are not limited to the disclosure of the above embodiments. It should be noted that the modules described above as a part of the apparatus may be operated in a hardware environment as shown in fig. 1, and may be implemented by software, or may be implemented by hardware, where the hardware environment includes a network environment.
According to another aspect of the embodiments of the present application, there is also provided an electronic device for implementing the method for alarming a fault, where the electronic device may be a server, a terminal, or a combination thereof.
Fig. 9 is a block diagram of an alternative electronic device according to an embodiment of the present application. As shown in fig. 9, the electronic device includes a processor 901, a communication interface 902, a memory 903, and a communication bus 904, where the processor 901, the communication interface 902, and the memory 903 communicate with each other through the communication bus 904, where,
a memory 903 for storing a computer program;
the processor 901 is configured to implement the following steps when executing the computer program stored in the memory 903:
s1, obtaining log change information of the application system, wherein the log change information is used for indicating that the first error log changes in a target time period;
s2, matching the first keyword in the first error log with a second keyword, wherein the second keyword is a keyword for which an alarm needs to be excluded;
s3, when the first error log does not have the first keyword matched with the second keyword, triggering a corresponding log alarm mechanism according to the occurrence frequency of the first keyword in the target time period.
Alternatively, in this embodiment, the communication bus may be a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown in FIG. 9, but this does not indicate only one bus or one type of bus.
The communication interface is used for communication between the electronic equipment and other equipment.
The memory may include RAM, and may also include non-volatile memory (non-volatile memory), such as at least one disk memory. Alternatively, the memory may be at least one memory device located remotely from the processor.
As an example, as shown in fig. 9, the memory 903 may include, but is not limited to, the first obtaining module 801, the first matching module 802, and the triggering module 803 of the above fault warning device. In addition, it may also include, but is not limited to, other module units of the above fault warning device, which are not described in detail in this example.
The processor may be a general-purpose processor, which may include but is not limited to a CPU (Central Processing Unit), an NP (Network Processor), and the like; it may also be a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field Programmable Gate Array) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
In addition, the electronic device further includes a display, which is used for displaying the fault alarm result.
Optionally, the specific examples in this embodiment may refer to the examples described in the above embodiments, and this embodiment is not described herein again.
It can be understood by those skilled in the art that the structure shown in fig. 9 is only an illustration, and the device implementing the above fault warning method may be a terminal device, and the terminal device may be a terminal device such as a smart phone (e.g., an Android phone, an iOS phone, etc.), a tablet computer, a palm computer, a Mobile Internet Device (MID), a PAD, and the like. Fig. 9 does not limit the structure of the electronic device. For example, the terminal device may also include more or fewer components (e.g., network interfaces, display devices, etc.) than shown in FIG. 9, or have a different configuration than shown in FIG. 9.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by a program instructing hardware associated with the terminal device, where the program may be stored in a computer-readable storage medium, and the storage medium may include: flash disk, ROM, RAM, magnetic or optical disk, and the like.
According to still another aspect of an embodiment of the present application, there is also provided a storage medium. Optionally, in this embodiment, the storage medium may be used to store program code for executing the fault warning method.
Optionally, in this embodiment, the storage medium may be located on at least one of a plurality of network devices in a network shown in the above embodiment.
Optionally, in this embodiment, the storage medium is configured to store program code for performing the following steps:
S1, obtaining log change information of the application system, wherein the log change information is used for indicating that the first error log changes in a target time period;
S2, matching the first keyword in the first error log with a second keyword, wherein the second keyword is a keyword for which an alarm needs to be excluded;
S3, when the first error log does not contain a first keyword that matches the second keyword, triggering a corresponding log alarm mechanism according to the occurrence frequency of the first keyword in the target time period.
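The three steps above can be illustrated with a minimal Python sketch. It is not the implementation of the application; it only shows one possible reading of the steps. The function names, the regular expression used to extract first keywords, and the rule that an error entry is skipped whenever one of its first keywords matches an excluded (second) keyword are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical rule for extracting "first keywords": tokens ending in
# "Exception" or "Error". The application does not specify this rule.
KEYWORD_PATTERN = re.compile(r"\b\w+(?:Exception|Error)\b")


def extract_first_keywords(entry):
    """Return the first keywords found in one error-log entry."""
    return KEYWORD_PATTERN.findall(entry)


def select_alarms(changed_entries, excluded_keywords, target_frequency):
    """Return {keyword: count} for keywords that should raise an alarm.

    changed_entries   -- error-log entries that changed in the target time period (S1)
    excluded_keywords -- second keywords, for which alarms must be excluded (S2)
    target_frequency  -- assumed threshold; an alarm is raised at count >= threshold (S3)
    """
    counts = Counter()
    for entry in changed_entries:
        first_keywords = extract_first_keywords(entry)
        # S2: if any first keyword matches a second keyword, raise no alarm
        # for this entry (assumed per-entry interpretation).
        if any(kw in excluded_keywords for kw in first_keywords):
            continue
        counts.update(first_keywords)
    # S3: keep only keywords whose frequency reaches the threshold.
    return {kw: n for kw, n in counts.items() if n >= target_frequency}


entries = [
    "NullPointerException in OrderService",
    "NullPointerException in PayService",
    "TimeoutException during heartbeat probe",
    "TimeoutException during heartbeat probe",
    "NullPointerException in OrderService",
]
alarms = select_alarms(entries, {"TimeoutException"}, target_frequency=3)
```

In this sketch the two `TimeoutException` entries are ignored because that keyword is on the exclusion list, and `NullPointerException` is reported because it occurs three times in the period.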
Optionally, for specific examples in this embodiment, reference may be made to the examples described in the above embodiments; details are not repeated here.
Optionally, in this embodiment, the storage medium may include, but is not limited to, various media capable of storing program code, such as a USB flash drive, a ROM, a RAM, a removable hard disk, a magnetic disk, or an optical disk.
According to yet another aspect of an embodiment of the present application, there is also provided a computer program product or a computer program comprising computer instructions stored in a computer readable storage medium; the processor of the computer device reads the computer instructions from the computer readable storage medium, and the processor executes the computer instructions to cause the computer device to perform the steps of any of the embodiments of the method for alerting of a fault described above.
The above-mentioned serial numbers of the embodiments of the present application are merely for description and do not represent the merits of the embodiments.
If the integrated unit in the above embodiments is implemented in the form of a software functional unit and sold or used as a separate product, it may be stored in the above computer-readable storage medium. Based on such understanding, the technical solution of the present application in essence, or the part of it that contributes over the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions for causing one or more computer devices (which may be personal computers, servers, network devices, or the like) to execute all or part of the steps of the method described in the embodiments of the present application.
In the above embodiments of the present application, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed client may be implemented in other manners. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one type of division of logical functions, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution provided in the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The foregoing is only a preferred embodiment of the present application. It should be noted that those skilled in the art can make several improvements and modifications without departing from the principle of the present application, and these improvements and modifications should also be considered to fall within the protection scope of the present application.

Claims (14)

1. A method for alarming a fault, the method comprising:
acquiring log change information of an application system, wherein the log change information is used for indicating that a first error log changes in a target time period;
matching a first keyword in the first error log with a second keyword, wherein the second keyword is a keyword for which an alarm needs to be excluded;
and in the case that the first error log does not contain a first keyword that matches the second keyword, triggering a corresponding log alarm mechanism according to the occurrence frequency of the first keyword in the target time period.
2. The method of claim 1, wherein obtaining log change information for an application system comprises:
acquiring log change sub-information of a plurality of application systems, wherein the log change sub-information is used for indicating that a first error sub-log changes in a target time window, the first error sub-log belongs to the first error log, and the target time window is a time window in the target time period;
classifying the application systems corresponding to the log change sub-information with the same keyword into the same application type, wherein the application type is used for indicating the category to which the application systems belong;
and summarizing the log change sub-information of the application systems under the same application type into the log change information.
3. The method of claim 2, wherein after the classifying of the application systems corresponding to the log change sub-information with the same keyword into the same application type, the method further comprises:
acquiring error reporting statistical information under each application type in a plurality of applications, wherein the error reporting statistical information comprises an error reporting type, an error source and alarm time;
and pushing the error reporting statistical information under each application type to a target information window matched with the application type in a plurality of information windows for displaying, wherein one information window in the plurality of information windows is used for displaying the error reporting statistical information under one application type.
4. The method of claim 2, wherein the triggering a corresponding log alarm mechanism according to the occurrence frequency of the first keyword in the target time period comprises:
and sending a first alarm message to an alarm receiving end under the condition that the frequency of the first keyword appearing in the target time window is greater than or equal to a target frequency, wherein the target frequency is used for representing the frequency value for sending the first alarm message.
5. The method of claim 2, wherein triggering a corresponding log alarm mechanism according to the frequency of occurrence of the first keyword within the target time period further comprises:
determining the duration of the target time window;
storing each first keyword occurring within the duration into a distributed cache;
counting the number of the first keywords in the distributed cache;
and sending the number as a second alarm message to an alarm receiving end.
6. The method of claim 5, wherein storing each of the first keywords occurring within the duration into a distributed cache comprises:
generating the first keyword according to the sequence of: the application name of the application system where the first keyword is located, the sub-keywords contained in the first keyword, the current time, the length of the target time window where the first keyword is located, and the end time;
and storing the first keyword into the distributed cache.
7. The method of claim 1, wherein prior to the matching of a first keyword in the first error log with a second keyword, the method further comprises:
determining that a target error log exists in the first error log, wherein the target error log does not contain the keyword;
and sending a third alarm message according to the fault data represented in the target error log.
8. The method of claim 7, wherein after the sending of the third alarm message according to the fault data represented in the target error log, the method further comprises:
and sending a fourth alarm message to an alarm receiving end under the condition that the number of the third alarm messages is greater than or equal to a target alarm threshold, wherein the fourth alarm message is used for indicating a log monitoring platform to stop sending the third alarm messages, and the target alarm threshold is the maximum number of alarm messages received by the alarm receiving end.
9. The method of claim 8, wherein after the stopping of sending the third alarm message, the method further comprises:
under the condition that the time for which sending of the third alarm message has been stopped reaches a target time length, acquiring the number of the third alarm messages again, wherein the target time length is used for triggering execution of the step of acquiring the number of the third alarm messages again;
and sending the third alarm message according to the fault data represented in the target error log under the condition that the number of the third alarm messages acquired again is smaller than the target alarm threshold.
10. The method of claim 1, wherein after triggering a corresponding log alarm mechanism according to the frequency of occurrence of the first keyword within the target time period, the method further comprises:
and not generating an alarm message when the first keyword matched with the second keyword exists in the first error log.
11. The method of claim 1, wherein prior to obtaining log change information for an application system, the method further comprises:
acquiring a trigger operation performed on a target fault switch;
and controlling to execute sending operation or closing operation on the alarm message according to the triggering operation and the function corresponding to the target fault switch.
12. A fault warning device, characterized in that it comprises:
a first acquisition module, configured to acquire log change information of an application system, wherein the log change information is used for indicating that a first error log changes in a target time period;
a first matching module, configured to match a first keyword in the first error log with a second keyword, wherein the second keyword is a keyword for which an alarm needs to be excluded;
and a triggering module, configured to trigger a corresponding log alarm mechanism according to the occurrence frequency of the first keyword in the target time period under the condition that the first error log does not contain a first keyword that matches the second keyword.
13. A computer-readable storage medium, in which a computer program is stored, wherein the computer program is configured to execute a method for warning of a fault as claimed in any one of claims 1 to 11 when the computer program is run.
14. An electronic device comprising a memory and a processor, characterized in that the memory has stored therein a computer program, the processor being arranged to execute the method of warning of a fault as claimed in any one of claims 1 to 11 by means of the computer program.
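As an illustration of the time-window counting recited in claims 4 to 6, the following Python sketch counts keyword occurrences per window and reports when the target frequency is reached. It is only a sketch under stated assumptions: an in-memory dictionary stands in for the distributed cache, the names `window_key` and `WindowCounter` are hypothetical, and the cache key is built from the application name, the sub-keyword, the window length, and the window end time (the raw current time is left out so that all occurrences in one window share a key, which is a simplification of claim 6).

```python
from collections import defaultdict


def window_key(app_name, sub_keyword, now, window_seconds):
    """Build a cache key identifying one keyword in one time window.

    The key layout is an assumption for illustration only.
    """
    # End time of the fixed window that contains `now` (seconds since epoch).
    window_end = (int(now) // window_seconds + 1) * window_seconds
    return f"{app_name}:{sub_keyword}:{window_seconds}:{window_end}"


class WindowCounter:
    """Counts keyword occurrences per time window.

    A plain dict stands in for the distributed cache; a real deployment
    would use a shared cache so that several collectors see the same counts.
    """

    def __init__(self, window_seconds, target_frequency):
        self.window_seconds = window_seconds
        self.target_frequency = target_frequency
        self._cache = defaultdict(int)

    def record(self, app_name, sub_keyword, now):
        """Store one occurrence; return True if an alarm should be sent."""
        key = window_key(app_name, sub_keyword, now, self.window_seconds)
        self._cache[key] += 1
        # Claim 4: alarm when the frequency is >= the target frequency.
        return self._cache[key] >= self.target_frequency


counter = WindowCounter(window_seconds=60, target_frequency=3)
results = [
    counter.record("order", "NullPointerException", now=0),
    counter.record("order", "NullPointerException", now=10),
    counter.record("order", "NullPointerException", now=59),
    counter.record("order", "NullPointerException", now=60),  # next window
]
```

Here the third occurrence falls in the same 60-second window as the first two and reaches the target frequency, while the fourth starts a new window and its count restarts from one.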
CN202011233566.0A 2020-11-06 2020-11-06 Fault warning method and device, storage medium and electronic equipment Pending CN112395156A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011233566.0A CN112395156A (en) 2020-11-06 2020-11-06 Fault warning method and device, storage medium and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011233566.0A CN112395156A (en) 2020-11-06 2020-11-06 Fault warning method and device, storage medium and electronic equipment

Publications (1)

Publication Number Publication Date
CN112395156A true CN112395156A (en) 2021-02-23

Family

ID=74599094

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011233566.0A Pending CN112395156A (en) 2020-11-06 2020-11-06 Fault warning method and device, storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN112395156A (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113449196A (en) * 2021-07-16 2021-09-28 北京天眼查科技有限公司 Information generation method and device, electronic equipment and readable storage medium
CN113449196B (en) * 2021-07-16 2024-04-19 北京金堤科技有限公司 Information generation method and device, electronic equipment and readable storage medium
CN113297183A (en) * 2021-07-21 2021-08-24 国网汇通金财(北京)信息科技有限公司 Alarm analysis method and device for time window
CN113900902A (en) * 2021-10-21 2022-01-07 挂号网(杭州)科技有限公司 Log processing method and device, electronic equipment and storage medium
CN114666210A (en) * 2022-05-23 2022-06-24 江苏金融租赁股份有限公司 Alarm method and device based on big data log analysis
CN114666210B (en) * 2022-05-23 2022-08-16 江苏金融租赁股份有限公司 Alarm method and device based on big data log analysis
CN115348161A (en) * 2022-08-16 2022-11-15 中国电信股份有限公司 Log alarm information generation method and device, electronic equipment and storage medium
CN116737516A (en) * 2023-06-12 2023-09-12 无锡摩芯半导体有限公司 Method for early warning of vehicle gauge chip
CN116737516B (en) * 2023-06-12 2024-01-30 无锡摩芯半导体有限公司 Method for early warning of vehicle gauge chip
CN116634205A (en) * 2023-07-19 2023-08-22 深圳市华曦达科技股份有限公司 Smart television box and log management method, device and system thereof

Similar Documents

Publication Publication Date Title
CN112395156A (en) Fault warning method and device, storage medium and electronic equipment
CN110661659B (en) Alarm method, device and system and electronic equipment
US11586972B2 (en) Tool-specific alerting rules based on abnormal and normal patterns obtained from history logs
CN110224858B (en) Log-based alarm method and related device
CN103220173B (en) A kind of alarm monitoring method and supervisory control system
CN102938710B (en) For supervisory control system and the method for large-scale server
CN103001824B (en) A kind of supervisory control system and method for supervising monitoring multiple servers
CN112311617A (en) Configured data monitoring and alarming method and system
CN104796273A (en) Method and device for diagnosing root of network faults
CN112152823B (en) Website operation error monitoring method and device and computer storage medium
US9658908B2 (en) Failure symptom report device and method for detecting failure symptom
CN111240876B (en) Fault positioning method and device for micro-service, storage medium and terminal
CN112231271A (en) Data migration integrity verification method, device and equipment and computer readable medium
CN105743730A (en) Method and system used for providing real-time monitoring for webpage service of mobile terminal
CN110765189A (en) Exception management method and system for Internet products
CN114363151A (en) Fault detection method and device, electronic equipment and storage medium
CN111130944B (en) System monitoring method and system
CN109818808B (en) Fault diagnosis method and device and electronic equipment
CN105825641A (en) Service alarm method and apparatus
CN110609761B (en) Method and device for determining fault source, storage medium and electronic equipment
CN108984362A (en) Log collection method and device, storage medium, electronic equipment
CN113835961B (en) Alarm information monitoring method, device, server and storage medium
CN105827447A (en) Service alarm method and apparatus
US10296967B1 (en) System, method, and computer program for aggregating fallouts in an ordering system
CN111880959A (en) Abnormity detection method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination