CN117155768A - Method and device for monitoring and processing system full link abnormality - Google Patents

Method and device for monitoring and processing system full link abnormality

Publication number
CN117155768A
CN117155768A CN202311022542.4A CN202311022542A CN117155768A CN 117155768 A CN117155768 A CN 117155768A CN 202311022542 A CN202311022542 A CN 202311022542A CN 117155768 A CN117155768 A CN 117155768A
Authority
CN
China
Prior art keywords
abnormal
monitoring
anomaly
information
internal flow
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311022542.4A
Other languages
Chinese (zh)
Inventor
谢凤
王雷
戴稳成
冯轲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Coocaa Network Technology Co Ltd
Original Assignee
Shenzhen Coocaa Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Coocaa Network Technology Co Ltd
Priority to CN202311022542.4A
Publication of CN117155768A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0677 Localisation of faults
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3006 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3055 Monitoring arrangements for monitoring the status of the computing system or of the computing system component, e.g. monitoring if the computing system is on, off, available, not available
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0631 Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0681 Configuration of triggering conditions
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0823 Errors, e.g. transmission errors

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Theoretical Computer Science (AREA)
  • Computing Systems (AREA)
  • Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Environmental & Geological Engineering (AREA)
  • Mathematical Physics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The application discloses a system full-link anomaly monitoring and processing method and device. The method comprises the following steps: acquiring internal flow data of the system, identifying abnormal scenarios in the system based on the internal flow data, and setting a corresponding monitoring mechanism for each abnormal scenario; performing anomaly monitoring on the full link of the system, and judging, based on the monitoring mechanisms corresponding to the abnormal scenarios, whether anomaly information is detected; when anomaly information is detected, pushing it to a designated terminal through a unified message interface and Kafka message middleware; and, based on the pushed anomaly information, querying the corresponding exception log and automatically locating the fault point at which the anomaly occurred. By means of monitoring and messaging, the application notifies service personnel of system anomalies in real time so that they can be handled promptly. When a fault occurs, the required log can be queried quickly, which shortens fault location time, improves availability and provides convenience for users.

Description

Method and device for monitoring and processing system full link abnormality
Technical Field
The application relates to the technical field of Internet and computer technology, and in particular to a system full-link anomaly monitoring and processing method, a device, an intelligent terminal and a storage medium.
Background
With the development of the Internet and the continuous improvement of living standards, Internet-connected computer systems are becoming more and more widely used.
Such systems often need to be monitored. In the prior art, however, the full-link data of a system cannot be monitored comprehensively and in real time, so full-link anomaly monitoring of the system is difficult to achieve and fault points cannot be located quickly and accurately.
Accordingly, there is a need for improvement and development in the art.
Disclosure of Invention
The application aims to solve the above technical problems of the prior art and provides a system full-link anomaly monitoring and processing method, a device, an intelligent terminal and a storage medium.
The technical scheme adopted by the application to solve the problems is as follows:
A system full-link anomaly monitoring and processing method comprises:
acquiring internal flow data of the system, identifying abnormal scenarios in the system based on the internal flow data, and setting a corresponding monitoring mechanism for each abnormal scenario in the system;
performing anomaly monitoring on the full link of the system, and judging, based on the monitoring mechanisms corresponding to the abnormal scenarios in the system, whether anomaly information is detected;
when anomaly information is detected, pushing the anomaly information to a designated terminal through a unified message interface and Kafka message middleware;
based on the pushed anomaly information, querying the corresponding exception log and automatically locating the fault point at which the anomaly occurred.
In the system full-link anomaly monitoring and processing method, the step of acquiring internal flow data of the system, identifying abnormal scenarios in the system based on the internal flow data, and setting a corresponding monitoring mechanism for each abnormal scenario comprises:
acquiring and determining the internal flows of the system that need to be monitored, wherein the internal flows cover the modules, components or functions of the system;
defining, for each internal flow of the system, the abnormal scenarios that may occur;
collecting internal flow data of the system according to the different abnormal scenarios, and setting different monitoring metrics, thresholds and alarm mechanisms based on that data, thereby completing the setting of the corresponding monitoring mechanisms, which detect the abnormal scenarios and trigger the corresponding alarms in time.
In the system full-link anomaly monitoring and processing method, the same step further comprises:
testing and verifying the effectiveness of the monitoring mechanism by simulating abnormal scenarios and generating test data.
In the system full-link anomaly monitoring and processing method, the same step further comprises:
as the system evolves and improves, covering newly appearing abnormal scenarios and adjusting monitoring metrics and thresholds where needed, and periodically reviewing and updating the monitoring mechanisms corresponding to the abnormal scenarios in the system.
In the system full-link anomaly monitoring and processing method, the step of performing anomaly monitoring on the full link of the system and judging, based on the monitoring mechanisms corresponding to the abnormal scenarios, whether anomaly information is detected comprises:
collecting monitoring data of all links and components in the system;
analyzing the collected monitoring data according to the monitoring metrics corresponding to the determined abnormal scenarios;
judging, according to the analysis result of the monitoring data, whether anomaly information is detected;
triggering a corresponding alarm when the monitoring data meets the condition defined by an abnormal scenario.
In the system full-link anomaly monitoring and processing method, the step of pushing the anomaly information to a designated terminal through a unified message interface and Kafka message middleware when anomaly information is detected comprises:
presetting a unified message interface for receiving anomaly information and sending it to the Kafka message middleware;
when anomaly information is detected, encapsulating it into a message object and sending the message object to the Kafka message middleware through the message interface;
receiving the anomaly information from the Kafka message middleware at the designated terminal;
when the anomaly information is received, performing anomaly handling and troubleshooting according to a preset solution for that anomaly information.
In the system full-link anomaly monitoring and processing method, the step of querying the corresponding exception log based on the pushed anomaly information and automatically locating the fault point at which the anomaly occurred comprises:
receiving the pushed anomaly information;
according to the received anomaly information, using a corresponding query tool or statement to query the corresponding exception log in a storage system;
when the exception log is found, parsing it, including extracting key information, analyzing the log content and identifying anomaly patterns;
locating the fault point of the anomaly according to the parsed exception log.
A system full-link anomaly monitoring and processing apparatus, wherein the apparatus comprises:
an acquisition and monitoring mechanism setting module, used for acquiring internal flow data of the system, identifying abnormal scenarios in the system based on the internal flow data, and setting a corresponding monitoring mechanism for each abnormal scenario in the system;
a judging module, used for performing anomaly monitoring on the full link of the system and judging, based on the monitoring mechanisms corresponding to the abnormal scenarios in the system, whether anomaly information is detected;
a monitoring exception handling module, used for pushing the anomaly information to the designated terminal through a unified message interface and Kafka message middleware when anomaly information is detected;
a query and fault location module, used for querying the corresponding exception log based on the pushed anomaly information and automatically locating the fault point at which the anomaly occurred.
An intelligent terminal comprising a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs comprising instructions for performing any one of the methods described above.
A non-transitory computer readable storage medium, wherein instructions in the storage medium, when executed by a processor of an electronic device, enable the electronic device to perform any one of the methods described above.
The beneficial effects of the application are as follows. The application provides a system full-link anomaly monitoring and processing method and device in which the abnormal scenarios in a system are enumerated according to the internal flows of the system; different monitoring mechanisms are designed for different abnormal scenarios; when an anomaly is detected, a message is pushed outward through a unified message interface; and the anomaly information is published and consumed through Kafka message middleware. By means of monitoring and messaging, the application notifies service personnel of system anomalies in real time so that they can be handled promptly. When a fault occurs, the required log can be queried quickly, which shortens fault location time, improves availability and provides convenience for users.
Drawings
In order to illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings required for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1 is a flow chart of a system full link anomaly monitoring processing method according to an embodiment of the present application.
Fig. 2 is a schematic diagram of the processing flow of a data full-link monitoring platform that applies the system full-link anomaly monitoring processing method according to an embodiment of the present application.
Fig. 3 is a schematic block diagram of a system full-link anomaly monitoring processing apparatus according to an embodiment of the present application.
Fig. 4 is a schematic block diagram of an internal structure of an intelligent terminal according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
It should be noted that, if directional indications (such as up, down, left, right, front, and rear … …) are included in the embodiments of the present application, the directional indications are merely used to explain the relative positional relationship, movement conditions, etc. between the components in a specific posture (as shown in the drawings), and if the specific posture is changed, the directional indications are correspondingly changed.
Improving user experience has been an aspect of interest and importance in internet products. The application provides a system all-link abnormality monitoring and processing method, which informs service personnel of system abnormality in real time by means of monitoring, information and the like, processes the abnormality in time, can quickly inquire a required log when a fault exists, shortens the fault positioning time, further improves the availability and provides convenience for users.
Exemplary method
As shown in fig. 1, a system full-link anomaly monitoring processing method in embodiment 1 of the present application includes the following steps:
Step S100, acquiring internal flow data of the system, identifying abnormal scenarios in the system based on the internal flow data, and setting a corresponding monitoring mechanism for each abnormal scenario in the system;
In the embodiment of the application, the internal flow data of the system is acquired, the abnormal scenarios in the system are identified based on that data, and a corresponding monitoring mechanism is set for each abnormal scenario. The core of this step is to enumerate the abnormal scenarios in the system by walking through its internal flows.
Specifically, step S100 includes:
S101, acquiring and determining the internal flows of the system that need to be monitored, wherein the internal flows cover the modules, components or functions of the system; this may be accomplished through discussions with the relevant teams and stakeholders to ensure that the selected flows are critical and need to be monitored.
S102, defining abnormal scenarios: for each internal flow of the system, the abnormal scenarios that may occur are defined; these may be already known, or may be derived from past experience or from analysis of similar systems. Examples include system crashes, performance degradation and data loss.
S103, collecting internal flow data of the system according to the different abnormal scenarios, and setting different monitoring metrics, thresholds and alarm mechanisms based on that data, thereby completing the setting of the corresponding monitoring mechanisms, which detect the abnormal scenarios and trigger the corresponding alarms in time.
Regarding the collection of internal flow data: the data to be collected is the internal flow data that helps identify the abnormal scenarios. This may include system logs, performance metrics, error reports and the like. Working with the relevant teams, it should be ensured that the system is able to generate and record this data.
Regarding the setting of the monitoring mechanism: a corresponding monitoring mechanism is set based on the internal flow data of the system, which may be accomplished with a monitoring tool or platform. Different monitoring metrics, thresholds and alarm mechanisms can be set for different abnormal scenarios. For example, a corresponding alarm is triggered when the response time of the system exceeds a threshold or when the number of error logs exceeds the expected value.
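As an illustration only, a per-scenario monitoring mechanism of this kind can be sketched as a small table of rules, each pairing a metric with a threshold. The class name `MonitoringRule`, the metric names and the threshold values below are assumptions made for the sketch and do not come from the application.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class MonitoringRule:
    """One monitoring mechanism: a metric, a threshold and the scenario it detects."""
    scenario: str     # abnormal scenario this rule detects (hypothetical name)
    metric: str       # monitoring metric to inspect (hypothetical name)
    threshold: float  # an alarm is raised when the metric exceeds this value

    def is_violated(self, value: float) -> bool:
        return value > self.threshold


# Hypothetical rules for two of the scenarios mentioned in the text.
RULES = [
    MonitoringRule("slow_response", "response_time_ms", 500.0),
    MonitoringRule("error_burst", "error_log_count", 20.0),
]


def evaluate(sample: Dict[str, float], rules: List[MonitoringRule]) -> List[str]:
    """Return the scenarios whose thresholds are exceeded by this sample."""
    return [
        rule.scenario
        for rule in rules
        if rule.metric in sample and rule.is_violated(sample[rule.metric])
    ]


triggered = evaluate({"response_time_ms": 750.0, "error_log_count": 3.0}, RULES)
```

Keeping the rules as data rather than code makes it straightforward to add a scenario or change a threshold without touching the evaluation logic.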
In the embodiment of the application, testing and verification are also performed: the effectiveness of the monitoring mechanism is tested and verified, for example in a real environment, by simulating abnormal scenarios and generating test data. This confirms that the monitoring mechanism can accurately detect the abnormal scenarios and trigger the corresponding alarms in time.
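One possible way to carry out such a verification is to generate synthetic monitoring data with a known injected anomaly and check that the detector reports exactly that point. The function names, the baseline of 200 ms and the threshold of 500 ms are hypothetical values chosen for this sketch.

```python
import random
from typing import List


def simulate_response_times(n: int, baseline_ms: float, spike_at: int,
                            spike_ms: float, seed: int = 0) -> List[float]:
    """Generate n response times around a baseline and inject one spike."""
    rng = random.Random(seed)  # fixed seed keeps the test data reproducible
    samples = [baseline_ms + rng.uniform(-20.0, 20.0) for _ in range(n)]
    samples[spike_at] = spike_ms  # the simulated abnormal scenario
    return samples


def detect(samples: List[float], threshold_ms: float) -> List[int]:
    """Return the indexes of samples that exceed the threshold."""
    return [i for i, value in enumerate(samples) if value > threshold_ms]


samples = simulate_response_times(n=50, baseline_ms=200.0, spike_at=30, spike_ms=900.0)
alerts = detect(samples, threshold_ms=500.0)
```

If `alerts` contains anything other than the injected index, either the threshold or the detector needs adjusting before the mechanism is relied upon.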
In the embodiment of the application, periodic review and updating are also performed. As the system evolves and improves, new abnormal scenarios may appear, or existing monitoring metrics and thresholds may need to be adjusted. The monitoring mechanisms corresponding to the abnormal scenarios in the system are therefore reviewed periodically and adjusted, modified and updated as required, so that they stay consistent with the actual condition of the system.
Step S200, performing anomaly monitoring on the full link of the system, and judging, based on the monitoring mechanisms corresponding to the abnormal scenarios in the system, whether anomaly information is detected;
In the application, anomaly monitoring covers the full link of the system: the monitoring mechanisms set in step S100 are applied to the full link, and whether anomaly information is detected is judged on that basis.
Specifically, step S200 includes:
S201, collecting monitoring data of all links and components in the system;
In the embodiment of the application, once the possible abnormal scenarios have been determined according to the characteristics and requirements of the system, the corresponding monitoring metrics are defined, for example response time exceeding a threshold, error rate exceeding the expected value, or a service being unavailable; a corresponding monitoring mechanism is then set according to each abnormal scenario and its monitoring metrics. Regarding the collection of monitoring data: it must be ensured that every link and component in the full link of the system can transmit monitoring data. This may be achieved by integrating a monitoring agent or tool into the system. The monitoring agent is responsible for collecting and forwarding monitoring data so that the data can flow along the full link.
S202, analyzing the monitoring data of all links and components in the collected system according to the monitoring indexes corresponding to the determined abnormal scene;
In this step, the collected monitoring data is analyzed to judge whether any of it fails to meet the monitoring metrics. In particular, monitoring data of all links and components in the system is collected, stored centrally, and analyzed in real time or periodically using a monitoring tool or platform. By analyzing the monitoring data, abnormal scenarios can be detected and it can be determined whether anomaly information is present.
S203, judging whether abnormal information is monitored according to an analysis result of the monitoring data;
The monitoring data is analyzed against the monitoring metrics of the monitoring mechanism corresponding to each abnormal scenario, and it is judged whether there is anomaly information that fails to meet those metrics.
S204, when the monitoring data accords with the condition defined by the abnormal scene, triggering a corresponding alarm.
In the embodiment of the application, whether abnormal information is monitored is judged according to the analysis result of the monitoring data. And if the monitoring data accords with the condition defined by the abnormal scene, triggering a corresponding alarm. The alert may be a notification sent to the relevant team or stakeholder through the monitoring tool.
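Steps S201 to S204 can be pictured as a collect, aggregate, judge and alert loop. The following is a minimal sketch under stated assumptions: the `Sample` and `Alert` types, the component names and the limits (5% error rate, 500 ms average latency) are all hypothetical, and a real deployment would read samples from a monitoring agent rather than a list.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Sample:
    component: str     # link or component that produced the sample
    ok: bool           # whether the request succeeded
    latency_ms: float  # observed latency


@dataclass(frozen=True)
class Alert:
    component: str
    reason: str


def analyse(samples: Iterable[Sample], max_error_rate: float = 0.05,
            max_avg_latency_ms: float = 500.0) -> List[Alert]:
    """Group samples by component (S201/S202) and raise alerts (S203/S204)."""
    grouped: Dict[str, List[Sample]] = defaultdict(list)
    for sample in samples:
        grouped[sample.component].append(sample)

    alerts: List[Alert] = []
    for component, items in sorted(grouped.items()):
        error_rate = sum(1 for s in items if not s.ok) / len(items)
        avg_latency = sum(s.latency_ms for s in items) / len(items)
        if error_rate > max_error_rate:
            alerts.append(Alert(component, f"error rate {error_rate:.2f} exceeds {max_error_rate:.2f}"))
        if avg_latency > max_avg_latency_ms:
            alerts.append(Alert(component, f"average latency {avg_latency:.0f} ms exceeds {max_avg_latency_ms:.0f} ms"))
    return alerts


samples = [
    Sample("gateway", True, 120.0),
    Sample("gateway", True, 140.0),
    Sample("order-service", True, 200.0),
    Sample("order-service", False, 220.0),
    Sample("order-service", True, 180.0),
]
alerts = analyse(samples)
```

Here only `order-service` produces an alert, because one of its three requests failed while `gateway` stayed within both limits.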
Step S300, when anomaly information is detected, pushing the anomaly information to a designated terminal through a unified message interface and Kafka message middleware;
In the application, when anomaly information is detected, it is encapsulated as a message, sent to the Kafka message middleware through the unified message interface, and pushed to the designated terminal.
Specifically, step S300 includes:
S301, presetting a unified message interface for receiving anomaly information and sending it to the Kafka message middleware;
For example, a unified message interface is configured to receive anomaly information and send it to the Kafka message middleware. The message interface may be implemented using the API (application programming interface) or SDK (software development kit) of the message queue.
The Kafka message middleware is also installed and configured in the system. This includes setting up the Kafka cluster, creating topics (Topic), and configuring producers (Producer) and consumers (Consumer).
S302, when abnormal information is monitored, the abnormal information is packaged into a message object and is sent to a Kafka message middleware through a message interface; the message object may contain detailed information of the exception, such as the type of exception, time of occurrence, scope of influence, etc.
In practice, a producer may be created in the system for sending anomaly information to the Kafka message middleware. The producer needs to be configured with the connection information of the Kafka cluster, the topic name and so on. The producer then sends the anomaly information to the specified topic of the Kafka message middleware by calling the send method of the producer API or SDK.
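The message object and the unified interface described above might look roughly like the following sketch. The names `AnomalyMessage` and `AnomalyPublisher`, the field set and the topic name `system-anomalies` are assumptions. To keep the example runnable without a broker, the transport is an injected callable backed by an in-memory list; with a real Kafka client, that callable would wrap the producer's send method.

```python
import json
from dataclasses import asdict, dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class AnomalyMessage:
    """Message object carrying the details of one anomaly (hypothetical fields)."""
    anomaly_type: str  # type of the anomaly
    occurred_at: str   # time of occurrence, ISO 8601
    scope: str         # affected component or scope of influence
    detail: str        # free-text description


class AnomalyPublisher:
    """Unified message interface: every monitor publishes through this class."""

    def __init__(self, send: Callable[[str, bytes], None], topic: str) -> None:
        self._send = send    # transport; assumed to wrap a Kafka producer in production
        self._topic = topic

    def publish(self, message: AnomalyMessage) -> None:
        payload = json.dumps(asdict(message), ensure_ascii=False).encode("utf-8")
        self._send(self._topic, payload)


# In-memory stand-in for the Kafka producer so the sketch runs without a broker.
sent: List[Tuple[str, bytes]] = []
publisher = AnomalyPublisher(send=lambda topic, payload: sent.append((topic, payload)),
                             topic="system-anomalies")
publisher.publish(AnomalyMessage(
    anomaly_type="ResponseTimeout",
    occurred_at="2023-08-14T10:00:02",
    scope="order-service",
    detail="response time exceeded threshold",
))
```

Routing every monitor through one `publish` method keeps the message format in a single place, which is the point of a unified interface.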
S303, receiving abnormal information from the Kafka message middleware through a designated terminal;
In a specific implementation, a consumer may be installed and configured on the designated terminal to receive anomaly information from the Kafka message middleware. The consumer needs to be configured with the connection information of the Kafka cluster, the topic name and so on.
The consumer receives the anomaly information from the specified topic of the Kafka message middleware by calling the receive method of the consumer API or SDK.
S304, when abnormal information is received, performing abnormal processing and fault removal according to a preset abnormal information solution.
Specifically, once the consumer receives the anomaly information, it can be pushed to a designated terminal or channel, such as a mobile phone, email or an instant messaging tool, by calling the push method of that terminal's API or SDK.
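On the consuming side, the handling of received messages can be sketched as decoding each payload and fanning it out to the configured notification channels. This is an assumption-laden sketch: the payloads are supplied as a plain list instead of being polled from Kafka, and the channel names `email` and `im` and their notifier functions are hypothetical.

```python
import json
from typing import Callable, Dict, Iterable, List

Notifier = Callable[[dict], None]


def consume(payloads: Iterable[bytes], notifiers: Dict[str, Notifier]) -> int:
    """Decode each anomaly message and forward it to every notification channel.

    Returns the number of messages handled; malformed payloads are skipped.
    """
    handled = 0
    for raw in payloads:
        try:
            anomaly = json.loads(raw.decode("utf-8"))
        except (UnicodeDecodeError, json.JSONDecodeError):
            continue  # a malformed message must not stop the consumer
        for notify in notifiers.values():
            notify(anomaly)
        handled += 1
    return handled


# Hypothetical channels that only record what they would have sent.
email_outbox: List[str] = []
im_outbox: List[str] = []
notifiers = {
    "email": lambda a: email_outbox.append(f"[{a['anomaly_type']}] {a['scope']}"),
    "im": lambda a: im_outbox.append(a["anomaly_type"]),
}

payloads = [
    b'{"anomaly_type": "ResponseTimeout", "scope": "order-service"}',
    b"not json",
]
handled = consume(payloads, notifiers)
```

Skipping malformed payloads rather than raising is a design choice for the sketch; a production consumer might instead route them to a dead-letter topic.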
In a further embodiment, the method may further include: when the anomaly information is received, the relevant team or stakeholder on the designated terminal may conduct anomaly handling and troubleshooting. And according to the description and the influence range of the abnormal information, adopting corresponding measures to repair the problems and restore normal operation.
In a further embodiment, the method may further include: when the monitoring mechanism triggers an alarm, handling the abnormal scenario promptly. This includes investigating the cause of the anomaly, repairing the problem and notifying the relevant teams or stakeholders, so that the abnormal scenario is properly handled and its impact on the system is reduced.
Continuous improvement is then carried out based on feedback from the monitoring mechanism and on experience in handling abnormal scenarios. This may include optimizing monitoring metrics and thresholds and improving the handling process for abnormal scenarios. Continuous improvement raises the stability and reliability of the system.
Step S400, based on the pushed abnormal information, inquiring the corresponding abnormal log, and automatically positioning the fault point of the abnormal information.
Specifically, step S400 includes:
S401, receiving the pushed anomaly information;
S402, according to the received anomaly information, using a corresponding query tool or statement to query the corresponding exception log in a storage system;
S403, when the exception log is found, parsing it, including extracting key information, analyzing the log content and identifying anomaly patterns;
S404, locating the fault point of the anomaly according to the parsed exception log.
That is, in step S400 of the present application, a mechanism for receiving the anomaly information is established first, which may be a message queue, a log collection tool or another means. The mechanism receives anomaly information in real time and stores it in a queryable storage system.
And then, according to the received abnormal information, using a corresponding query tool or statement to query the corresponding abnormal log in the storage system. The query may be based on keywords, time stamps, etc. in the anomaly information.
Once the exception log has been found, it needs to be parsed. This may include extracting key information, analyzing the log content, identifying anomaly patterns and so forth. The purpose of parsing the exception log is to find the specific cause and location of the anomaly.
And then, according to the analyzed abnormal log, the fault point of the abnormal information can be positioned. This may be a specific code line, function call, configuration file, etc. The purpose of locating the fault point is to further analyze and resolve the anomaly problem.
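The query, parse and locate sequence can be illustrated with a small sketch over in-memory log lines. The log format (ISO timestamp, level, service, message ending in `at File:line`), the sample lines and the function names are all assumptions for this example; a real system would query a log store instead of a list.

```python
import re
from datetime import datetime, timedelta
from typing import List, Optional, Tuple

# Hypothetical log lines in an assumed format.
LOG_LINES = [
    "2023-08-14T10:00:01 INFO  order-service request accepted",
    "2023-08-14T10:00:02 ERROR order-service NullPointerException at OrderService.java:42",
    "2023-08-14T10:05:00 ERROR pay-service TimeoutException at PayClient.java:17",
]

# Extracts "File:line" from the tail of an error line.
FAULT_POINT = re.compile(r"at (?P<file>[\w.]+):(?P<line>\d+)")


def query_logs(lines: List[str], keyword: str, around: datetime,
               window: timedelta) -> List[str]:
    """Return lines containing the keyword whose timestamp is within the window."""
    hits = []
    for line in lines:
        timestamp = datetime.fromisoformat(line.split(" ", 1)[0])
        if abs(timestamp - around) <= window and keyword in line:
            hits.append(line)
    return hits


def locate_fault(log_line: str) -> Optional[Tuple[str, int]]:
    """Parse a log line and return (file, line number) of the fault point, if any."""
    match = FAULT_POINT.search(log_line)
    return (match.group("file"), int(match.group("line"))) if match else None


hits = query_logs(LOG_LINES, "NullPointerException",
                  around=datetime(2023, 8, 14, 10, 0, 0),
                  window=timedelta(seconds=30))
fault = locate_fault(hits[0])
```

The keyword and the timestamp would come from the pushed anomaly message, which is why the message object needs to carry both.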
The fault location of the embodiment of the application can be automated. To this end, techniques such as machine learning and natural language processing may be used to analyze and process the exception log. This may include training models, building rules or using keyword matching to automatically identify and locate the fault point at which the anomaly occurred.
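Of the techniques listed, keyword matching is the simplest to illustrate. The sketch below maps regular-expression patterns to a probable cause; the patterns and cause descriptions are hypothetical examples, not rules taken from the application, and a trained model could replace the rule table behind the same function signature.

```python
import re
from typing import List, Optional, Tuple

# Hypothetical rule table: first matching pattern wins.
CAUSE_RULES: List[Tuple["re.Pattern[str]", str]] = [
    (re.compile(r"connection refused|connect timed out", re.IGNORECASE),
     "network or downstream service unreachable"),
    (re.compile(r"OutOfMemoryError|heap space", re.IGNORECASE),
     "memory exhaustion"),
    (re.compile(r"NullPointerException", re.IGNORECASE),
     "code defect: unchecked null value"),
]


def classify(log_text: str) -> Optional[str]:
    """Return the probable cause for a log excerpt, or None if no rule matches."""
    for pattern, cause in CAUSE_RULES:
        if pattern.search(log_text):
            return cause
    return None


cause = classify("java.lang.OutOfMemoryError: Java heap space")
```

Returning `None` for unmatched logs makes the gap explicit, so unrecognized anomalies can be escalated to a person and later turned into new rules.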
Optionally, in further embodiments of the present application, feedback and processing may be performed on the located fault point: for example, once a fault point is located, the location results may be fed back to the corresponding team or developer so that they can further analyze and resolve the anomaly problem. Feedback and communication may be via mail, instant messaging tools, or other means.
Through the above steps, the application realizes a flow in which pushed anomaly information drives log querying and automatic fault point location. The key is to establish a mechanism for receiving and storing the anomaly information, and to use appropriate tools and techniques to query and parse the exception log and locate the fault. In this way, an anomaly can be located quickly and corresponding measures can be taken to handle and repair it.
In this embodiment, as shown in fig. 2, the system full-link anomaly monitoring processing method starts anomaly monitoring of the anomaly sources through a timing task: timing-task anomaly monitoring (xxljobhandlemonitor) checks whether an anomaly has occurred and pushes the corresponding message. Specifically, the anomaly sources may be interfaces and services. Global anomaly monitoring (globalExceptionHandler) is also performed: exceptions that are not caught at runtime are captured globally and the corresponding message is pushed. When a try-catch block captures an exception, the exception publisher can push an exception notification (pushNotify (Exception)) and publish the exception message (ExceptionMsgPush). The anomaly information is then encapsulated as a message, sent to the Kafka message middleware through the unified message interface, and delivered to a designated terminal such as an exception consumer, so that the corresponding exception log can be queried from the monitored anomaly information and the fault point of the anomaly can be located automatically.
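The capture-then-publish pattern described for fig. 2 can be sketched with a decorator that wraps a monitored function, pushes a notification when an exception escapes, and then re-raises it. This is an illustrative analogue, not the implementation behind the identifiers in the figure: `push_notify` merely mirrors the name pushNotify, the in-memory `notifications` list stands in for the message middleware, and `sync_orders` is a hypothetical scheduled task.

```python
import functools
import traceback
from typing import Callable, List

# In-memory stand-in for the exception publisher's outgoing messages.
notifications: List[dict] = []


def push_notify(exc: BaseException, source: str) -> None:
    """Record an exception notification (assumed to be sent to Kafka in production)."""
    notifications.append({
        "source": source,
        "type": type(exc).__name__,
        "message": str(exc),
        "trace": "".join(traceback.format_exception(type(exc), exc, exc.__traceback__)),
    })


def monitored(source: str) -> Callable:
    """Decorator: publish any exception raised by the wrapped function, then re-raise."""
    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except Exception as exc:
                push_notify(exc, source)
                raise  # the caller still sees the original exception
        return wrapper
    return decorator


@monitored("scheduled-task")
def sync_orders(batch_size: int) -> int:
    """Hypothetical scheduled task that fails on invalid input."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return batch_size


try:
    sync_orders(0)
except ValueError:
    pass  # the notification has already been recorded by the decorator
```

Re-raising after publishing keeps the monitoring layer transparent: business code behaves exactly as before, and the notification carries the stack trace needed for the later log query.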
Therefore, the embodiment of the application achieves comprehensive real-time monitoring and processing of system full-link data, covers anomalies across the full link of the system, and helps to locate fault points quickly and accurately.
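The three capture paths described for fig. 2 (timed-task monitoring, global capture, and explicit try-catch reporting) can be illustrated with a minimal Python sketch. It is not the implementation of the application: the class and function names below are hypothetical, and an in-memory list stands in for the Kafka message middleware so that the example runs without a broker.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ExceptionPublisher:
    """Hypothetical stand-in for the exception publisher.

    Messages are collected in memory instead of being sent to Kafka.
    """
    sent: List[dict] = field(default_factory=list)

    def push_notify(self, source: str, exc: BaseException) -> None:
        # Every capture path produces the same message shape.
        self.sent.append({
            "source": source,
            "type": type(exc).__name__,
            "detail": str(exc),
        })


def timed_task_monitor(check: Callable[[], None], publisher: ExceptionPublisher) -> None:
    """One tick of the timed-task monitor: run a check, push on failure."""
    try:
        check()
    except Exception as exc:
        publisher.push_notify("timed_task", exc)


def global_exception_handler(entry_point: Callable[[], None], publisher: ExceptionPublisher) -> None:
    """Outermost wrapper: captures whatever the business code did not handle."""
    try:
        entry_point()
    except Exception as exc:
        publisher.push_notify("global", exc)


def business_call(publisher: ExceptionPublisher) -> str:
    """Business code that catches an expected exception and reports it itself."""
    try:
        return str(1 // 0)
    except ZeroDivisionError as exc:
        publisher.push_notify("try_catch", exc)
        return "fallback"


def failing_check() -> None:
    raise TimeoutError("interface did not respond")


def failing_entry_point() -> None:
    raise KeyError("missing_field")


publisher = ExceptionPublisher()
timed_task_monitor(failing_check, publisher)
global_exception_handler(failing_entry_point, publisher)
result = business_call(publisher)
```

After the three calls, `publisher.sent` holds one message per capture path, which is the property the unified push step relies on: downstream consumers see a single message shape regardless of where the anomaly was caught.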
Exemplary apparatus
As shown in fig. 3, an embodiment of the present application provides a system all-link anomaly monitoring processing apparatus, which includes:
the acquiring and monitoring mechanism setting module 310 is configured to acquire internal flow data of the system, confirm abnormal scenes in the system based on the internal flow data of the system, and set corresponding monitoring mechanisms for the abnormal scenes in the system;
the judging module 320 is configured to perform anomaly monitoring on the system full link, and judge whether anomaly information is monitored based on a monitoring mechanism corresponding to each anomaly scene in the system;
the monitoring exception handling module 330 is configured to push the exception information, when exception information is monitored, to a designated terminal through a unified message interface and the Kafka message middleware;
the query and fault location module 340 is configured to query a corresponding exception log based on the pushed exception information, and automatically locate a fault point where the exception information appears, as described above.
Based on the above embodiment, the present application further provides an intelligent terminal, and a functional block diagram thereof may be shown in fig. 4. The intelligent terminal may comprise a processor, a memory, a network interface, a database, and a display screen connected by a system bus. The processor of the intelligent terminal is used for providing computing and control capabilities. The memory of the intelligent terminal comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The network interface of the intelligent terminal is used for communicating with an external terminal through network connection. The computer program, when executed by a processor, implements a system full link anomaly monitoring processing method. The database of the intelligent terminal is used for storing abnormal information.
It will be appreciated by those skilled in the art that the schematic block diagram shown in fig. 4 is merely a block diagram of a portion of the structure associated with the present inventive arrangements and is not limiting of the smart terminal to which the present inventive arrangements are applied, and that a particular smart terminal may include more or fewer components than shown, or may combine some of the components, or have a different arrangement of components.
In one embodiment, a smart terminal is provided that includes a memory, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs comprising instructions for:
acquiring system internal flow data, confirming abnormal scenes in the system based on the system internal flow data, and setting corresponding monitoring mechanisms for the abnormal scenes in the system;
performing anomaly monitoring on the system full link, and judging whether anomaly information is monitored or not based on monitoring mechanisms corresponding to various anomaly scenes in the system;
when abnormal information is monitored, pushing the abnormal information to a designated terminal through a unified message interface and the Kafka message middleware;
based on the pushed abnormal information, the corresponding abnormal log is queried, and the fault point of the abnormal information is automatically positioned, specifically as described above.
The step of acquiring the internal flow data of the system, confirming abnormal scenes in the system based on the internal flow data of the system, and setting corresponding monitoring mechanisms for the abnormal scenes in the system comprises the following steps:
acquiring and determining the internal flow of a system to be monitored, wherein the internal flow comprises various modules, components or functions of the system;
defining an abnormal scene which can occur for each system internal flow;
and collecting system internal flow data according to the different abnormal scenes, and setting different monitoring indexes, thresholds and alarm mechanisms based on the system internal flow data, thereby completing the setting of the corresponding monitoring mechanisms, which detect the abnormal scenes and trigger the corresponding alarms in time.
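As an illustration of the sub-steps above, the following Python sketch represents each monitoring mechanism as a record holding an abnormal scene, a monitoring index, a threshold, and an alarm channel. The flow names, metrics, threshold values, and channels are all hypothetical examples and are not taken from the application.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class MonitorRule:
    scene: str        # abnormal scene, qualified by the internal flow it belongs to
    metric: str       # monitoring index
    threshold: float  # value above which the scene counts as abnormal
    alarm: str        # alarm channel to use when the rule fires


def build_rules(flows: Dict[str, List[dict]]) -> List[MonitorRule]:
    """Turn per-flow scene definitions into a flat list of monitoring rules."""
    rules = []
    for flow, scenes in flows.items():
        for s in scenes:
            rules.append(
                MonitorRule(f"{flow}:{s['scene']}", s["metric"], s["threshold"], s["alarm"])
            )
    return rules


# Hypothetical internal flows and the abnormal scenes defined for each of them.
flows = {
    "order_service": [
        {"scene": "slow_response", "metric": "latency_ms", "threshold": 500.0, "alarm": "im"},
        {"scene": "high_error_rate", "metric": "error_rate", "threshold": 0.05, "alarm": "mail"},
    ],
    "payment_interface": [
        {"scene": "timeout", "metric": "timeout_count", "threshold": 3.0, "alarm": "im"},
    ],
}

rules = build_rules(flows)
```

Keeping the rules as plain data is one possible design choice: it lets thresholds be reviewed and adjusted without touching the code that evaluates them.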
The step of acquiring the internal flow data of the system, confirming abnormal scenes in the system based on the internal flow data of the system, and setting corresponding monitoring mechanisms for the abnormal scenes in the system further comprises:
testing and verifying the effectiveness of the monitoring mechanism by simulating abnormal scenes to generate test data.
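One way to carry out such a verification is sketched below in Python, under the assumption that the mechanism under test is a simple latency threshold. The detector, the 500 ms threshold, and the data generator are hypothetical; the sketch only shows the idea of generating normal and abnormal test data and counting false alarms and missed detections.

```python
import random
from typing import List


def is_abnormal(latency_ms: float, threshold_ms: float = 500.0) -> bool:
    """Monitoring mechanism under test: flag latencies above the threshold."""
    return latency_ms > threshold_ms


def simulate_latencies(n: int, abnormal: bool, rng: random.Random,
                       threshold_ms: float = 500.0) -> List[float]:
    """Generate test data for a normal or an abnormal scene."""
    if abnormal:
        # Simulated fault: every sample exceeds the threshold by at least 1 ms.
        return [threshold_ms + rng.uniform(1.0, 1000.0) for _ in range(n)]
    # Simulated healthy traffic: samples stay at or below the threshold.
    return [rng.uniform(0.0, threshold_ms) for _ in range(n)]


rng = random.Random(0)  # fixed seed so the verification run is repeatable
normal = simulate_latencies(100, abnormal=False, rng=rng)
faulty = simulate_latencies(100, abnormal=True, rng=rng)

false_alarms = sum(is_abnormal(v) for v in normal)
missed = sum(not is_abnormal(v) for v in faulty)
```

A mechanism passes this check when both counters are zero; a non-zero count points at a threshold that is too tight or too loose for the simulated scene.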
The step of acquiring the internal flow data of the system, confirming abnormal scenes in the system based on the internal flow data of the system, and setting corresponding monitoring mechanisms for the abnormal scenes in the system further comprises:
according to the evolution and improvement of the system, adding new abnormal scenes or adjusting monitoring indexes and thresholds as needed, and periodically checking and updating the monitoring mechanisms corresponding to the abnormal scenes in the system.
The step of performing anomaly monitoring on the system full link and judging whether abnormal information is monitored based on the monitoring mechanism corresponding to each abnormal scene in the system comprises the following steps:
collecting monitoring data of all links and components in the system;
analyzing the monitoring data of all links and components in the collected system according to the monitoring index corresponding to the determined abnormal scene;
judging whether abnormal information is monitored according to the analysis result of the monitoring data;
when the monitoring data accords with the condition defined by the abnormal scene, a corresponding alarm is triggered.
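The collect, analyze, judge, and alarm sequence above can be sketched in Python as follows. The component names, metric names, sample values, and thresholds are hypothetical and serve only to show how collected monitoring data is compared against the conditions defined for each abnormal scene.

```python
from typing import Dict, List

# Hypothetical thresholds: monitoring index -> value above which an alarm fires.
RULES: Dict[str, float] = {"latency_ms": 500.0, "error_rate": 0.05}


def analyze(samples: Dict[str, Dict[str, float]], rules: Dict[str, float]) -> List[dict]:
    """Compare each component's metrics with the rules and collect alarms."""
    alarms = []
    for component, metrics in samples.items():
        for metric, value in metrics.items():
            threshold = rules.get(metric)
            # An alarm is triggered only when the data meets the abnormal condition.
            if threshold is not None and value > threshold:
                alarms.append({
                    "component": component,
                    "metric": metric,
                    "value": value,
                    "threshold": threshold,
                })
    return alarms


# Monitoring data collected from every link and component (illustrative values).
samples = {
    "gateway": {"latency_ms": 120.0, "error_rate": 0.01},
    "order_service": {"latency_ms": 830.0, "error_rate": 0.02},
    "payment_interface": {"latency_ms": 95.0, "error_rate": 0.20},
}

alarms = analyze(samples, RULES)
```

Here the judgment "abnormal information is monitored" corresponds to `alarms` being non-empty, and each entry carries enough context (component, metric, value) to be pushed onward as anomaly information.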
The step of pushing the abnormal information to a designated terminal through a unified message interface and the Kafka message middleware when abnormal information is monitored comprises the following steps:
presetting a unified message interface for receiving abnormal information and sending the abnormal information to the Kafka message middleware;
when abnormal information is monitored, the abnormal information is packaged into a message object and is sent to the Kafka message middleware through a message interface;
receiving anomaly information from the Kafka message middleware through the designated terminal;
when the abnormal information is received, performing abnormal processing and fault removal according to a preset abnormal information solution.
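The push path above can be sketched in Python as follows. To keep the example runnable without a broker, a `queue.Queue` stands in for a Kafka topic; this is a deliberate substitution, not a Kafka client. The message fields, the preset solution table, and all names are hypothetical.

```python
import json
import queue
from dataclasses import asdict, dataclass
from typing import Dict, Optional


@dataclass
class ExceptionMessage:
    """Message object into which the anomaly information is packaged."""
    service: str
    exception_type: str
    detail: str
    trace_id: str


class UnifiedMessageInterface:
    """Single entry point for anomaly messages; the queue stands in for a Kafka topic."""

    def __init__(self, topic: queue.Queue) -> None:
        self._topic = topic

    def send(self, message: ExceptionMessage) -> None:
        # Serialize once here so every producer emits the same wire format.
        self._topic.put(json.dumps(asdict(message)).encode("utf-8"))


# Hypothetical preset solutions, keyed by exception type.
PRESET_SOLUTIONS: Dict[str, str] = {
    "TimeoutError": "retry the call and check the downstream service",
    "ConnectionError": "fail over to the standby node",
}


def consume_one(topic: queue.Queue) -> Optional[dict]:
    """Designated-terminal side: take one message and attach the preset action."""
    try:
        raw = topic.get_nowait()
    except queue.Empty:
        return None
    payload = json.loads(raw.decode("utf-8"))
    payload["action"] = PRESET_SOLUTIONS.get(
        payload["exception_type"], "escalate to the on-call developer"
    )
    return payload


topic = queue.Queue()
interface = UnifiedMessageInterface(topic)
interface.send(ExceptionMessage("order_service", "TimeoutError",
                                "interface did not respond", "trace-001"))
handled = consume_one(topic)
empty = consume_one(topic)
```

In a real deployment the `send` and `consume_one` bodies would be replaced by calls to an actual Kafka producer and consumer; the envelope and the lookup of a preset solution would stay the same.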
The step of inquiring the corresponding abnormal log based on the pushed abnormal information and automatically positioning the fault point of the abnormal information comprises the following steps:
receiving pushed abnormal information;
according to the received abnormal information, using a corresponding query tool or statement to query a corresponding abnormal log in a storage system;
when the abnormal log is found, parsing the abnormal log, including extracting key information, analyzing the log content and identifying the abnormal pattern;
and locating the fault point of the abnormal information according to the parsed abnormal log, specifically as described above.
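The query, parse, and locate sequence above might look like the following Python sketch. The log lines, their layout, the trace identifier used as the query key, and the rule "the most frequent error location is the fault point" are all assumptions made for illustration; the application does not prescribe a log format or a specific locating rule.

```python
import re
from collections import Counter
from typing import List, Optional

# Hypothetical log store; a real system would query a log storage service instead.
LOG_STORE = [
    "2023-08-14 10:00:01 INFO  trace=trace-001 order_service request accepted",
    "2023-08-14 10:00:02 ERROR trace=trace-001 payment_interface TimeoutError at PaymentClient.pay:88",
    "2023-08-14 10:00:02 ERROR trace=trace-002 order_service ValueError at OrderParser.parse:17",
    "2023-08-14 10:00:03 ERROR trace=trace-001 payment_interface TimeoutError at PaymentClient.pay:88",
]

# Key information extracted from an error line: component, error type, code location.
LINE_PATTERN = re.compile(
    r"ERROR\s+trace=(?P<trace>\S+)\s+(?P<component>\S+)\s+(?P<error>\w+) at (?P<location>\S+)"
)


def query_logs(store: List[str], trace_id: str) -> List[str]:
    """Query step: select the log lines that carry the pushed trace identifier."""
    token = f"trace={trace_id}"
    return [line for line in store if token in line.split()]


def locate_fault(lines: List[str]) -> Optional[dict]:
    """Parse step: extract key fields and pick the most frequent error location."""
    hits = Counter()
    for line in lines:
        match = LINE_PATTERN.search(line)
        if match:
            hits[(match.group("component"), match.group("error"), match.group("location"))] += 1
    if not hits:
        return None
    (component, error, location), count = hits.most_common(1)[0]
    return {"component": component, "error": error,
            "location": location, "occurrences": count}


fault = locate_fault(query_logs(LOG_STORE, "trace-001"))
```

For the pushed identifier `trace-001`, the sketch reports the payment interface timeout as the fault point, because that error location recurs in the matching lines while the informational line is ignored.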
Those skilled in the art will appreciate that all or part of the methods described above may be implemented by a computer program stored on a non-transitory computer-readable storage medium, and that the program, when executed, may comprise the steps of the method embodiments described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), among others.
In summary, the application discloses a system full-link anomaly monitoring processing method, a device, an intelligent terminal and a storage medium. The method comprises: acquiring system internal flow data, confirming abnormal scenes in the system based on the system internal flow data, and setting corresponding monitoring mechanisms for the abnormal scenes in the system; performing anomaly monitoring on the system full link, and judging whether abnormal information is monitored based on the monitoring mechanisms corresponding to the abnormal scenes in the system; when abnormal information is monitored, pushing the abnormal information to a designated terminal through a unified message interface and the Kafka message middleware; and, based on the pushed abnormal information, querying the corresponding abnormal log and automatically locating the fault point of the abnormal information. The application notifies service personnel of system anomalies in real time by means such as monitoring and messages, so that anomalies are handled in time; when a fault occurs, the required logs can be queried quickly, which shortens fault location time, improves availability and provides convenience for users.
It is to be understood that the application is not limited in its application to the examples described above, but is capable of modification and variation in light of the above teachings by those skilled in the art, and that all such modifications and variations are intended to be included within the scope of the appended claims.

Claims (10)

1. A system full-link anomaly monitoring processing method, characterized by comprising the following steps:
acquiring system internal flow data, confirming abnormal scenes in the system based on the system internal flow data, and setting corresponding monitoring mechanisms for the abnormal scenes in the system;
performing anomaly monitoring on the system full link, and judging whether anomaly information is monitored or not based on monitoring mechanisms corresponding to various anomaly scenes in the system;
when abnormal information is monitored, pushing the abnormal information to a designated terminal through a unified message interface and through a kafka message middleware;
based on the pushed abnormal information, inquiring the corresponding abnormal log, and automatically positioning the fault point of the abnormal information.
2. The method for processing system all-link anomaly monitoring according to claim 1, wherein the steps of obtaining the system internal flow data, confirming the anomaly scene in the system based on the system internal flow data, and setting the corresponding monitoring mechanism for each anomaly scene in the system comprise:
acquiring and determining the internal flow of a system to be monitored, wherein the internal flow comprises various modules, components or functions of the system;
defining an abnormal scene which can occur for each system internal flow;
and collecting system internal flow data according to different abnormal scenes, setting different monitoring indexes, thresholds and alarm mechanisms based on the system internal flow data, and finishing setting corresponding monitoring mechanisms for detecting the abnormal scenes and triggering corresponding alarms in time.
3. The method for processing system all-link anomaly monitoring according to claim 2, wherein the steps of obtaining the system internal flow data, confirming the anomaly scene in the system based on the system internal flow data, and setting the corresponding monitoring mechanism for each anomaly scene in the system comprise:
and testing and verifying the effectiveness of the monitoring mechanism by simulating an abnormal scene to generate test data.
4. The method for processing system all-link anomaly monitoring according to claim 2, wherein the steps of obtaining the system internal flow data, confirming the anomaly scene in the system based on the system internal flow data, and setting the corresponding monitoring mechanism for each anomaly scene in the system comprise:
according to the evolution and improvement of the system, adding new abnormal scenes or adjusting monitoring indexes and thresholds as needed, and periodically checking and updating the monitoring mechanisms corresponding to the abnormal scenes in the system.
5. The method for monitoring and processing the anomaly of the system full link according to claim 1, wherein the step of monitoring the anomaly of the system full link based on the monitoring mechanism corresponding to each anomaly scene in the system, and determining whether the anomaly information is monitored comprises:
collecting monitoring data of all links and components in the system;
analyzing the monitoring data of all links and components in the collected system according to the monitoring index corresponding to the determined abnormal scene;
judging whether abnormal information is monitored according to the analysis result of the monitoring data;
when the monitoring data accords with the condition defined by the abnormal scene, a corresponding alarm is triggered.
6. The system full link anomaly monitoring processing method according to claim 1, wherein when anomaly information is monitored, pushing the anomaly information to a designated terminal through a unified message interface and through a kafka message middleware comprises:
presetting a unified message interface for receiving abnormal information and sending the abnormal information to the Kafka message middleware;
when abnormal information is monitored, the abnormal information is packaged into a message object and is sent to the Kafka message middleware through a message interface;
receiving anomaly information from the Kafka message middleware through the designated terminal;
when the abnormal information is received, performing abnormal processing and fault removal according to a preset abnormal information solution.
7. The system full-link anomaly monitoring and processing method according to claim 1, wherein the step of querying a corresponding anomaly log based on the pushed anomaly information to automatically locate a fault point of the anomaly information comprises:
receiving pushed abnormal information;
according to the received abnormal information, using a corresponding query tool or statement to query a corresponding abnormal log in a storage system;
when the abnormal log is queried, analyzing the abnormal log, including extracting key information, analyzing log content and identifying an abnormal mode;
and locating fault points of the abnormal information according to the analyzed abnormal log.
8. A system full link anomaly monitoring and processing device, the device comprising:
the acquisition and monitoring mechanism setting module is used for acquiring the internal flow data of the system, confirming abnormal scenes in the system based on the internal flow data of the system, and setting corresponding monitoring mechanisms for the abnormal scenes in the system;
the judging module is used for carrying out anomaly monitoring on the whole link of the system and judging whether anomaly information is monitored or not based on a monitoring mechanism corresponding to each anomaly scene in the system;
the monitoring exception handling module is used for pushing the exception information to the appointed terminal through a unified message interface and through kafka message middleware when the exception information is monitored;
the query and fault positioning module is used for querying the corresponding abnormal log based on the pushed abnormal information and automatically positioning the fault point of the abnormal information.
9. An intelligent terminal comprising a memory, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs comprising instructions for performing the method of any of claims 1-7.
10. A non-transitory computer readable storage medium, wherein instructions in the storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the method of any one of claims 1-7.
CN202311022542.4A 2023-08-14 2023-08-14 Method and device for monitoring and processing system full link abnormality Pending CN117155768A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311022542.4A CN117155768A (en) 2023-08-14 2023-08-14 Method and device for monitoring and processing system full link abnormality

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311022542.4A CN117155768A (en) 2023-08-14 2023-08-14 Method and device for monitoring and processing system full link abnormality

Publications (1)

Publication Number Publication Date
CN117155768A true CN117155768A (en) 2023-12-01

Family

ID=88907124

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311022542.4A Pending CN117155768A (en) 2023-08-14 2023-08-14 Method and device for monitoring and processing system full link abnormality

Country Status (1)

Country Link
CN (1) CN117155768A (en)

Similar Documents

Publication Publication Date Title
CN111221743B (en) Automatic test method and system
CN109284269B (en) Abnormal log analysis method and device, storage medium and server
CN109684847B (en) Automatic repairing method, device, equipment and storage medium for script loopholes
CN112506894A (en) Service chain log processing method and device based on link tracking and computer equipment
US20160274997A1 (en) End user monitoring to automate issue tracking
CN109408371A (en) Software defect analyzes input method, device, computer equipment and storage medium
CN109324961B (en) System automatic test method, device, computer equipment and storage medium
CN110851471A (en) Distributed log data processing method, device and system
CN114201408A (en) Regression testing method, device, computer equipment and storage medium
CN108665237B (en) Method for establishing automatic inspection model and positioning abnormity based on business system
CN111143185A (en) Log-based fault analysis method and device
CN110347565B (en) Application program abnormity analysis method and device and electronic equipment
CN115952081A (en) Software testing method, device, storage medium and equipment
CN115080299A (en) Software fault feedback processing method, device, medium and equipment
CN115048257A (en) System service function verification method and device, computer equipment and storage medium
CN112632330A (en) Method and device for routing inspection of ATM equipment, computer equipment and storage medium
CN112235128B (en) Transaction path analysis method, device, server and storage medium
CN112214517A (en) Stream data processing method and device, electronic device and storage medium
CN112100035A (en) Page abnormity detection method, system and related device
CN110069382B (en) Software monitoring method, server, terminal device, computer device and medium
CN116610967A (en) Bank system abnormality detection method, device and equipment based on clustering
CN117155768A (en) Method and device for monitoring and processing system full link abnormality
CN115269424A (en) Automatic regression testing method, device, equipment and storage medium for production flow
CN114416420A (en) Equipment problem feedback method and system
CN110362464B (en) Software analysis method and equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination