CN114881112A - System anomaly detection method, device, equipment and medium - Google Patents


Publication number
CN114881112A
Authority
CN
China
Prior art keywords
abnormal
event
tested
model
tracking data
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210344526.6A
Other languages
Chinese (zh)
Inventor
饶琛琳
梁玫娟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Youtejie Information Technology Co ltd
Original Assignee
Beijing Youtejie Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by Beijing Youtejie Information Technology Co ltd
Priority to CN202210344526.6A
Publication of CN114881112A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3065 Monitoring arrangements determined by the means or processing involved in reporting the monitored data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/29 Graphical models, e.g. Bayesian networks
    • G06F18/295 Markov models or related models, e.g. semi-Markov models; Markov random fields; Networks embedding Markov models

Abstract

The invention discloses a system anomaly detection method, apparatus, device, and medium. The method comprises: acquiring real-time log data generated by a system to be tested during operation, and extracting tracking data to be tested from the real-time log data; inputting the tracking data to be tested into a pre-trained anomaly detection model to obtain an event detection result for the system to be tested; and, if the event detection result indicates an abnormal event, inputting the event detection result of the abnormal event into a pre-trained abnormal pattern recognition model to obtain an abnormal pattern of the system to be tested. This technical solution allows system anomalies to be detected efficiently and accurately, improving both the accuracy and the efficiency of system anomaly detection.

Description

System anomaly detection method, device, equipment and medium
Technical Field
The present invention relates to the field of anomaly detection technologies, and in particular, to a method, an apparatus, a device, and a medium for system anomaly detection.
Background
Log data is the basis for many enterprise applications such as troubleshooting, monitoring, security, compliance, and electronic evidence collection, and it also has great analytical value. However, with the arrival of the big data era, the volume and variety of log data keep growing and tend to exceed what humans can comprehend.
In the prior art, detecting a system anomaly usually requires an experienced worker to analyze log files, trace the event chain, filter out noise, and finally diagnose the root cause of the anomaly. Relying on staff experience to analyze log data in this way is highly limited, and the detection process is time-consuming and labor-intensive, which reduces the efficiency of system anomaly detection. How to detect system anomalies efficiently and accurately is therefore a problem to be solved.
Disclosure of Invention
The invention provides a system anomaly detection method, a system anomaly detection device, system anomaly detection equipment and a system anomaly detection medium, which can solve the problems of low accuracy and low efficiency in system anomaly detection.
According to an aspect of the present invention, there is provided a system anomaly detection method, including:
acquiring real-time log data generated by a system to be tested during operation, and extracting tracking data to be tested from the real-time log data;
inputting the tracking data to be tested into a pre-trained anomaly detection model, and obtaining an event detection result for the system to be tested;
and, if the event detection result indicates an abnormal event, inputting the event detection result of the abnormal event into a pre-trained abnormal pattern recognition model to obtain an abnormal pattern of the system to be tested.
According to another aspect of the present invention, there is provided a system anomaly detection apparatus, including:
a data acquisition module, configured to acquire real-time log data generated by a system to be tested during operation and to extract tracking data to be tested from the real-time log data;
a result acquisition module, configured to input the tracking data to be tested into a pre-trained anomaly detection model and to obtain an event detection result for the system to be tested;
and a pattern acquisition module, configured to, if the event detection result indicates an abnormal event, input the event detection result of the abnormal event into a pre-trained abnormal pattern recognition model to obtain an abnormal pattern of the system to be tested.
According to another aspect of the present invention, there is provided an electronic apparatus including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores a computer program executable by the at least one processor, the computer program being executable by the at least one processor to enable the at least one processor to perform the system anomaly detection method of any of the embodiments of the present invention.
According to another aspect of the present invention, there is provided a computer-readable storage medium storing computer instructions for causing a processor to implement the system anomaly detection method according to any one of the embodiments of the present invention when the computer instructions are executed.
According to the technical solution of the embodiments of the present invention, the tracking data to be tested, extracted from the real-time log data that the system to be tested generates during operation, is input into the pre-trained anomaly detection model to obtain the event detection result of the system to be tested. When that result is an abnormal event, it is input into the pre-trained abnormal pattern recognition model to obtain the abnormal pattern of the system to be tested. Staff can thus learn promptly whether the system has failed and which category the fault belongs to, which improves the efficiency and accuracy of system anomaly detection and reduces staff workload.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present invention, nor do they necessarily limit the scope of the invention. Other features of the present invention will become apparent from the following description.
Drawings
To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of a system anomaly detection method according to a first embodiment of the present invention;
Fig. 2a is a flowchart of a system anomaly detection method according to a second embodiment of the present invention;
Fig. 2b is a schematic flow chart of a system anomaly detection method according to the second embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a system anomaly detection apparatus according to a third embodiment of the present invention;
Fig. 4 is a schematic structural diagram of an electronic device implementing the system anomaly detection method according to the embodiments of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "target," "original," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Embodiment One
Fig. 1 is a flowchart of a system anomaly detection method according to Embodiment One of the present invention. This embodiment is applicable to detecting anomalies in a system. The method may be executed by a system anomaly detection apparatus, which may be implemented in hardware and/or software and may be configured in an electronic device. As shown in Fig. 1, the method includes:
S110, acquiring real-time log data generated by the system to be tested during operation, and extracting the tracking data to be tested from the real-time log data.
The system to be tested may refer to a system on which anomaly detection needs to be performed, such as a microservice system or a cloud computing system. The log data may be data representing the internal working state of the system to be tested, so that the state of the system can be judged by analyzing it. The tracking data may refer to the traces in the log data that record all messages exchanged between components and clients.
Optionally, acquiring the real-time log data generated by the system to be tested during operation and extracting the tracking data to be tested from it includes: acquiring the real-time log data generated by the system to be tested during operation, and extracting from it tracking data to be tested that contains span sequences, where each span sequence includes a message sender, a service application program interface, and a status code. Each piece of tracking data to be tested consists of a plurality of span sequences, and the three attributes of each span sequence (message sender, service application program interface, and status code) form its feature sequence. Span sequences can therefore be compared by comparing their feature sequences, which provides an effective basis for subsequent work.
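The extraction step can be illustrated with a minimal Python sketch. The log line format (`sender=... api=... status=...`), the `Span` class, and the helper names are assumptions made for illustration only; the patent does not specify a log format or a data structure.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Span:
    """One span, holding the three attributes named in the text."""
    sender: str       # message sender
    api: str          # service application program interface
    status_code: int  # status code

    def feature(self) -> Tuple[str, str, int]:
        # The three attributes together form the comparable feature.
        return (self.sender, self.api, self.status_code)


def parse_span(line: str) -> Span:
    # Hypothetical line format: "sender=<s> api=<a> status=<code>"
    fields = dict(part.split("=", 1) for part in line.split())
    return Span(fields["sender"], fields["api"], int(fields["status"]))


def extract_trace(log_lines: List[str]) -> List[Span]:
    # Keep only the lines that carry span information; skip other log noise.
    return [parse_span(line) for line in log_lines if "sender=" in line]


log = [
    "sender=gateway api=/orders status=200",
    "heartbeat ok",
    "sender=orders api=/inventory status=500",
]
trace = extract_trace(log)
features = [span.feature() for span in trace]
```

Because `Span` is frozen and hashable, two spans compare equal exactly when all three attributes match, which is the comparison the paragraph above relies on.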
S120, inputting the tracking data to be tested into a pre-trained anomaly detection model, and obtaining an event detection result for the system to be tested.
The anomaly detection model may be a model that judges, from the tracking data to be tested, whether the system to be tested has a fault. The event detection result may refer to the result obtained after anomaly detection is performed on the system to be tested. For example, if the system to be tested is operating normally, the event detection result may be a normal event; if the system to be tested has a fault, the event detection result may first be set to an abnormal event, and the specific result of the abnormal event, such as missing or error, may then be determined from the parameters of the abnormal event.
S130, if the event detection result indicates an abnormal event, inputting the event detection result of the abnormal event into a pre-trained abnormal pattern recognition model, and obtaining an abnormal pattern of the system to be tested.
The abnormal pattern may refer to the specific fault category of the system to be tested, for example an abnormal attribute in a span sequence. The abnormal pattern recognition model may be a model that determines this specific fault category, so that the category can be presented directly to staff and the system can be repaired promptly. When several faults of the same category occur at the same time, staff can obtain a solution by analyzing one fault of that category, which greatly improves the reliability and recoverability of the system to be tested.
It should be noted that, in the embodiment of the present invention, after the event detection result is obtained from the anomaly detection model and the abnormal pattern is obtained from the abnormal pattern recognition model, both may also be provided to staff, so that staff can obtain the results in time and adjust the parameters of the anomaly detection model or the abnormal pattern recognition model accordingly.
According to the technical solution of this embodiment of the present invention, the tracking data to be tested, extracted from the real-time log data that the system to be tested generates during operation, is input into the pre-trained anomaly detection model to obtain the event detection result of the system to be tested. When that result is an abnormal event, it is input into the pre-trained abnormal pattern recognition model to obtain the abnormal pattern of the system to be tested. Staff can thus learn promptly whether the system has failed and which category the fault belongs to, which improves the efficiency and accuracy of system anomaly detection and reduces staff workload.
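The two-stage flow of S120 and S130 can be sketched as follows. The two models are passed in as plain callables; `toy_detector` and `toy_recognizer` are hypothetical stand-ins used only to make the sketch runnable and are not the trained models described in the patent.

```python
from typing import Callable, List, Optional, Tuple

# A trace is represented here as a list of (sender, api, status_code) tuples.
Trace = List[Tuple[str, str, int]]


def detect_system_anomaly(
    trace: Trace,
    anomaly_detector: Callable[[Trace], str],
    pattern_recognizer: Callable[[str], str],
) -> Tuple[str, Optional[str]]:
    """Run detection first; run pattern recognition only for abnormal events."""
    result = anomaly_detector(trace)           # S120: event detection result
    if result == "normal":
        return result, None                    # no abnormal pattern to report
    return result, pattern_recognizer(result)  # S130: abnormal pattern


def toy_detector(trace: Trace) -> str:
    # Hypothetical rule: any 5xx status code counts as an "error" result.
    return "error" if any(code >= 500 for _, _, code in trace) else "normal"


def toy_recognizer(result: str) -> str:
    # Hypothetical mapping from a detection result to a pattern label.
    return {"error": "status-code anomaly"}.get(result, "unknown pattern")


healthy = detect_system_anomaly([("gateway", "/orders", 200)], toy_detector, toy_recognizer)
faulty = detect_system_anomaly([("orders", "/inventory", 500)], toy_detector, toy_recognizer)
```

Keeping the recognizer behind the abnormal-event check mirrors the conditional wording of S130: normal traces never reach the second model.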
Embodiment Two
Fig. 2a is a flowchart of a system anomaly detection method according to Embodiment Two of the present invention. This embodiment builds on the embodiment above. Specifically, a training process for the anomaly detection model is added before the tracking data to be tested is input into the pre-trained anomaly detection model, and may include: injecting a set type of fault into the system to be tested to obtain abnormal tracking data; filtering out, based on the normal training model, the span sequences that are identical between the abnormal tracking data and the normal tracking data, to obtain the number of identical span sequences and the difference span sequences; and training the target verification model according to the number of identical span sequences and the difference span sequences to obtain the anomaly detection model.
A training process for the abnormal pattern recognition model is likewise added before the event detection result of the abnormal event is input into the pre-trained abnormal pattern recognition model, and may include: obtaining a feature vector of the event detection result of each abnormal event, and clustering the event detection results of the abnormal events by the number of occurrences of error-type and missing-type span sequences to obtain an abnormal pattern clustering model.
As shown in fig. 2a, the method comprises:
S210, acquiring real-time log data generated by the system to be tested during operation, and extracting the tracking data to be tested from the real-time log data.
S220, injecting a set type fault into the system to be tested to obtain abnormal tracking data.
Injecting a set type of fault into the system to be tested means performing the fault injection on a system that is operating normally, not on a system that is already in a fault state. The set type of fault may refer to a preset type of fault, for example a system-level fault such as network delay, excessive Central Processing Unit (CPU) load, or an increased packet loss rate, or an application-level fault. In the embodiment of the present invention, a single fault type may be injected alone or several fault types may be injected in combination, which is not limited here. The abnormal tracking data may refer to the tracking data generated after a fault is injected into the system to be tested.
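As a small illustration of injecting fault types alone or in combination, the sketch below enumerates injection plans. The fault names and the `max_combined` limit are hypothetical, and the sketch only builds the plan list; it does not call any real fault injection tool.

```python
from itertools import combinations
from typing import List, Sequence, Tuple

# Hypothetical labels for the system-level faults mentioned in the text.
FAULT_TYPES = ["network_delay", "cpu_overload", "packet_loss"]


def fault_plans(fault_types: Sequence[str], max_combined: int = 2) -> List[Tuple[str, ...]]:
    """List every single fault and every combination up to max_combined faults."""
    plans: List[Tuple[str, ...]] = []
    for size in range(1, max_combined + 1):
        plans.extend(combinations(fault_types, size))
    return plans


plans = fault_plans(FAULT_TYPES)
```

Each tuple in `plans` describes one injection run, after which the abnormal tracking data for that run would be collected.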
In an optional embodiment, before injecting a set type of fault into the system to be tested to obtain the abnormal tracking data, the method further includes: obtaining normal tracking data generated when the system to be tested works normally, extracting the normal features of each piece of normal tracking data, and constructing a normal training model from the normal features. The normal tracking data may refer to the tracking data generated when the system to be tested operates normally. The normal training model may refer to a model that judges whether the system is normal from the features of the tracking data. Constructing the normal training model therefore makes it possible to judge whether the system to be tested is normal and provides an effective basis for subsequent work.
It is worth noting that, in the embodiment of the present invention, fault injection or the collection of tracking data (call-chain data) may be performed on mirrored online traffic of the normally operating system to be tested, which ensures that the original system to be tested can continue to operate normally.
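One very simple reading of a normal training model is a set of every span feature observed during normal operation. The `NormalModel` class below is a hypothetical sketch under that assumption; the patent does not state how the normal features are stored or modeled.

```python
from typing import Iterable, List, Set, Tuple

SpanFeature = Tuple[str, str, int]  # (sender, api, status_code)


class NormalModel:
    """Remembers every span feature seen in normal tracking data (assumed design)."""

    def __init__(self) -> None:
        self.known: Set[SpanFeature] = set()

    def fit(self, normal_traces: Iterable[List[SpanFeature]]) -> "NormalModel":
        # Collect the features of every span in every normal trace.
        for trace in normal_traces:
            self.known.update(trace)
        return self

    def is_known(self, span: SpanFeature) -> bool:
        return span in self.known


model = NormalModel().fit([
    [("gateway", "/orders", 200), ("orders", "/inventory", 200)],
    [("gateway", "/orders", 200), ("orders", "/payment", 200)],
])
```

A set-based model ignores ordering and frequency; a probabilistic model over span transitions would retain that information at the cost of more training data.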
S230, filtering out, based on the normal training model, the span sequences that are identical between the abnormal tracking data and the normal tracking data, to obtain the number of identical span sequences and the difference span sequences.
The number of identical span sequences may refer to the number of span sequences that are the same in the abnormal tracking data and the normal tracking data. A difference span sequence may refer to a span sequence that differs between the abnormal tracking data and the normal tracking data.
Whether two span sequences are identical or different may be determined by whether the values of their attributes are the same.
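The filtering step can be sketched in Python as a comparison of attribute tuples. Splitting the difference into spans seen only in the abnormal trace and spans seen only in the normal trace is an assumption made for illustration; the text only speaks of difference span sequences in general.

```python
from typing import List, Tuple

SpanFeature = Tuple[str, str, int]  # (sender, api, status_code)


def filter_identical(
    abnormal: List[SpanFeature], normal: List[SpanFeature]
) -> Tuple[int, List[SpanFeature], List[SpanFeature]]:
    """Count spans shared by both traces and collect the spans that differ."""
    normal_set = set(normal)
    abnormal_set = set(abnormal)
    # Spans whose three attribute values all match a normal span.
    same_count = sum(1 for span in abnormal if span in normal_set)
    # Spans that appear only in the abnormal trace.
    new_spans = [span for span in abnormal if span not in normal_set]
    # Spans that appear only in the normal trace.
    missing_spans = [span for span in normal if span not in abnormal_set]
    return same_count, new_spans, missing_spans


normal = [
    ("gateway", "/orders", 200),
    ("orders", "/inventory", 200),
    ("orders", "/payment", 200),
]
abnormal = [
    ("gateway", "/orders", 200),
    ("orders", "/inventory", 500),
]
same_count, new_spans, missing_spans = filter_identical(abnormal, normal)
```

Here the changed status code alone is enough to make the `/inventory` span count as different, since identity is decided on all attribute values together.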
S240, training the target verification model according to the number of identical span sequences and the difference span sequences to obtain the anomaly detection model.
The target verification model may refer to a model, such as a hidden Markov model or a variable-order Markov model, that detects through probability calculation whether the system to be tested has a fault, based on the number of identical span sequences and the difference span sequences.
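To make the idea of a probability calculation concrete, the sketch below estimates conditional probabilities with a first-order Markov chain over span labels. This is a deliberate simplification: the text names hidden Markov and variable-order Markov models, and the function names and the single-letter span labels here are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List


def fit_transitions(traces: Iterable[List[str]]) -> Dict[str, Counter]:
    """Count how often each span label is followed by each other label."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for trace in traces:
        for prev, nxt in zip(trace, trace[1:]):
            counts[prev][nxt] += 1
    return counts


def conditional_probability(counts: Dict[str, Counter], prev: str, nxt: str) -> float:
    """Estimate P(nxt | prev); unseen contexts yield 0.0."""
    following = counts.get(prev, Counter())
    total = sum(following.values())
    return following[nxt] / total if total else 0.0


counts = fit_transitions([["A", "B", "C"], ["A", "B", "D"], ["A", "C"]])
p_b_given_a = conditional_probability(counts, "A", "B")
p_c_given_b = conditional_probability(counts, "B", "C")
p_unseen = conditional_probability(counts, "Z", "A")
```

A variable-order model would condition on a longer and adaptive history instead of only the previous span, but the estimated quantity is the same kind of conditional probability.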
In an optional embodiment, training the target verification model according to the number of identical span sequences and the difference span sequences to obtain the anomaly detection model includes: if the number of identical span sequences is zero, the anomaly detection model outputs an abnormal event and determines from the difference span sequences whether the event detection result of the abnormal event is an error or missing. Specifically, when at least one span sequence is identical between the abnormal tracking data and the normal tracking data, the case is labeled as a normal event. When the number of identical span sequences is zero, the abnormal tracking data is entirely inconsistent with the normal tracking data and the case is labeled as an abnormal event; the specific fault result can then be obtained from the features of each difference span sequence. The resulting anomaly detection model provides an effective basis for subsequent work.
In another optional embodiment, determining from the difference span sequences whether the event detection result of the abnormal event is an error or missing includes: if the abnormal tracking data contains the difference span sequence and the conditional probability of the variable-order Markov model is lower than a first set threshold, the event detection result of the abnormal event is an error; if the abnormal tracking data does not contain the difference span sequence and the conditional probability of the variable-order Markov model is higher than a second set threshold, the event detection result of the abnormal event is missing. Here, an error may refer to a new span sequence appearing in the abnormal tracking data, and missing may refer to a span sequence that is present in the normal tracking data but absent from the abnormal tracking data. The first set threshold may refer to a predetermined threshold for evaluating whether the event detection result is an error, for example 20%. The second set threshold may refer to a predetermined threshold for evaluating whether the event detection result is missing, for example 80%. The first and second set thresholds may be changed according to the accuracy actually required of the system to be tested. By subdividing abnormal cases in this way, staff can see more directly whether the event detection result of an abnormal event is an error or missing.
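The decision rules of the two optional embodiments can be combined into one small function. The thresholds use the example values of 20% and 80% from the text; the function name and the "undetermined" fallback for cases that match neither rule are assumptions, since the text does not say what happens in those cases.

```python
FIRST_THRESHOLD = 0.20   # example value from the text for the error rule
SECOND_THRESHOLD = 0.80  # example value from the text for the missing rule


def classify_event(
    same_count: int,
    contains_difference: bool,
    cond_prob: float,
    first: float = FIRST_THRESHOLD,
    second: float = SECOND_THRESHOLD,
) -> str:
    """Apply the zero-identical rule, then the error / missing threshold rules."""
    if same_count > 0:
        return "normal"        # at least one identical span sequence
    if contains_difference and cond_prob < first:
        return "error"         # difference span present and improbable
    if not contains_difference and cond_prob > second:
        return "missing"       # expected span absent although probable
    return "undetermined"      # assumed fallback, not specified in the text


examples = [
    classify_event(2, True, 0.05),
    classify_event(0, True, 0.05),
    classify_event(0, False, 0.95),
    classify_event(0, True, 0.50),
]
```

Exposing the thresholds as parameters reflects the remark that they may be tuned to the accuracy required of the system to be tested.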
S250, inputting the tracking data to be tested into the pre-trained anomaly detection model, and obtaining an event detection result for the system to be tested.
S260, obtaining a feature vector of the event detection result of each abnormal event, and clustering the event detection results of the abnormal events by the number of occurrences of error-type and missing-type span sequences to obtain an abnormal pattern clustering model.
The feature vector may refer to a vector recording the number of times that error-type and missing-type span sequences occur. For example, the dimension of the feature vector may equal twice the number d of attributes across all feature sequences, that is, the vector has two parts. The first d dimensions give the number of occurrences of error span sequences under abnormal conditions, and the last d dimensions give the number of occurrences of missing span sequences under abnormal conditions. For example, if the span sequence includes attributes A, B, and C and the feature vector is [1, 1, 0, 0, 2, 3], the anomaly detection model has output event detection results for two error-type abnormal events, caused by one A error and one B error, and for five missing-type abnormal events, namely two B missing and three C missing. Clustering the event detection results of all abnormal events groups similar results together, so that system fault symptoms can be detected with high precision and the quality of fault clustering is improved.
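The feature vector from the example above can be built as follows, together with a simple grouping step. The text does not name a clustering algorithm, so the greedy distance-threshold grouping, the `radius` value, and the function names are assumptions chosen to keep the sketch dependency-free.

```python
import math
from typing import List, Sequence, Tuple


def feature_vector(events: Sequence[Tuple[str, str]], attributes: Sequence[str]) -> List[int]:
    """Build a 2*d vector: first d slots count errors, last d slots count missing."""
    index = {attr: i for i, attr in enumerate(attributes)}
    d = len(attributes)
    vec = [0] * (2 * d)
    for kind, attr in events:
        offset = 0 if kind == "error" else d
        vec[offset + index[attr]] += 1
    return vec


def cluster(vectors: Sequence[Sequence[float]], radius: float = 1.5) -> List[int]:
    """Greedy grouping: join the first center within radius, else start a new group."""
    centers: List[List[float]] = []
    labels: List[int] = []
    for vec in vectors:
        for i, center in enumerate(centers):
            if math.dist(vec, center) <= radius:
                labels.append(i)
                break
        else:
            centers.append(list(vec))
            labels.append(len(centers) - 1)
    return labels


events = [("error", "A"), ("error", "B")] + [("missing", "B")] * 2 + [("missing", "C")] * 3
vec = feature_vector(events, ["A", "B", "C"])
labels = cluster([vec, [1, 1, 0, 0, 2, 2], [0, 0, 3, 0, 0, 0]])
```

With attributes A, B, and C, the seven events yield the vector from the text, and the second vector, which differs by a single missing C, lands in the same group as the first.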
S270, if the event detection result indicates an abnormal event, inputting the event detection result of the abnormal event into the pre-trained abnormal pattern recognition model, and obtaining an abnormal pattern of the system to be tested.
It should be noted that, in the embodiment of the present invention, the anomaly detection model and the abnormal pattern recognition model may each be trained in the step immediately before the model is used, or both may be trained before anomaly detection is performed on the system to be tested, which is not limited here. In addition, if the abnormal pattern of the system to be tested cannot be determined by the abnormal pattern recognition model, the process should preferentially return to the code change stage.
According to the technical solution of this embodiment of the present invention, the anomaly detection model and the abnormal pattern recognition model are trained in advance. The tracking data to be tested, extracted from the real-time log data that the system to be tested generates during operation, is input into the pre-trained anomaly detection model to obtain the event detection result of the system to be tested. When that result is an abnormal event, it is input into the pre-trained abnormal pattern recognition model to obtain the abnormal pattern of the system to be tested. Staff can thus learn promptly whether the system has failed and which category the fault belongs to, which improves the efficiency and accuracy of system anomaly detection and reduces staff workload.
Fig. 2b is a schematic flow chart of a system anomaly detection method according to Embodiment Two of the present invention. Specifically, normal tracking data of the system to be tested is collected and a normal training model is built from it. A fault injection tool is then used to inject a set type of fault into the system to be tested to obtain abnormal tracking data, and the anomaly detection model is built from the abnormal tracking data and the normal tracking data. The event detection results of the abnormal events output by the anomaly detection model are clustered to construct the abnormal pattern recognition model and obtain the abnormal pattern of the system to be tested. Finally, the event detection result output by the anomaly detection model and the abnormal pattern output by the abnormal pattern recognition model are visualized, completing anomaly detection for the system to be tested.
Embodiment Three
Fig. 3 is a schematic structural diagram of a system anomaly detection device according to a third embodiment of the present invention. As shown in fig. 3, the apparatus includes: a data acquisition module 310, a result acquisition module 320, a mode acquisition module 330;
the data acquisition module 310 is configured to acquire real-time log data generated by a system to be tested during operation, and to extract tracking data to be tested from the real-time log data;
the result acquisition module 320 is configured to input the tracking data to be tested into a pre-trained anomaly detection model, and to obtain an event detection result for the system to be tested;
the mode acquisition module 330 is configured to, if the event detection result indicates an abnormal event, input the event detection result of the abnormal event into a pre-trained abnormal pattern recognition model, and to obtain an abnormal pattern of the system to be tested.
According to the technical solution of this embodiment of the present invention, the tracking data to be tested, extracted from the real-time log data that the system to be tested generates during operation, is input into the pre-trained anomaly detection model to obtain the event detection result of the system to be tested. When that result is an abnormal event, it is input into the pre-trained abnormal pattern recognition model to obtain the abnormal pattern of the system to be tested. Staff can thus learn promptly whether the system has failed and which category the fault belongs to, which improves the efficiency and accuracy of system anomaly detection and reduces staff workload.
Optionally, the data obtaining module 310 may be specifically configured to obtain real-time log data generated by the system to be tested during operation, and extract the trace data to be tested including a span sequence from the real-time log data, where the span sequence includes a message sender, a service application program interface, and a status code.
Optionally, the system anomaly detection apparatus may further include an anomaly detection model construction module, which may specifically include an anomaly tracking data acquisition unit, a data screening unit, and an anomaly detection model construction unit;
the abnormal tracking data acquisition unit is used for injecting a set type of fault into the system to be tested, before the tracking data to be tested is input into the pre-trained anomaly detection model, to obtain abnormal tracking data;
the data screening unit is used for filtering the same span sequences between the abnormal tracking data and the normal tracking data based on the normal training model to obtain the number of the same span sequences and the difference span sequences;
and the anomaly detection model construction unit is used for training the target verification model according to the number of the same span sequences and the difference span sequences to obtain an anomaly detection model.
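The screening step can be sketched as follows. The text does not say how identical span sequences are matched, so this sketch assumes exact equality of whole span sequences and represents the normal side simply as a collection of known-normal sequences; the function name `filter_span_sequences` is hypothetical.

```python
def filter_span_sequences(abnormal_sequences, normal_sequences):
    """Split abnormal trace data into a count of sequences that also occur in
    normal data and a list of the sequences that do not (difference sequences)."""
    normal = {tuple(seq) for seq in normal_sequences}
    same_count = 0
    difference = []
    for seq in abnormal_sequences:
        key = tuple(seq)
        if key in normal:
            same_count += 1          # identical to a normal sequence: filtered out
        else:
            difference.append(key)   # kept for training the detection model
    return same_count, difference


normal_data = [("gateway:/order:200", "order:/pay:200")]
abnormal_data = [
    ("gateway:/order:200", "order:/pay:200"),  # unaffected by the injected fault
    ("gateway:/order:200", "order:/pay:500"),  # status code changed by the fault
]
same_count, difference = filter_span_sequences(abnormal_data, normal_data)
```

Here `same_count` and `difference` correspond to the two quantities that the text passes on to training.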
Optionally, the system anomaly detection apparatus may further include a normal training model construction module, configured to acquire normal tracking data generated when the system to be tested normally works before a set type of fault is injected into the system to be tested to obtain the abnormal tracking data, extract normal features of each normal tracking data, and construct a normal training model according to the normal features.
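One possible reading of the normal training model is sketched below. The text does not define the normal features; because a variable-order Markov model is mentioned later, this sketch assumes the features are counts of the next span given up to `max_order` preceding spans, with back-off to shorter contexts. The class name `NormalModel` and the parameter `max_order` are hypothetical.

```python
from collections import Counter, defaultdict


class NormalModel:
    """Toy variable-order Markov model over span symbols (illustrative only)."""

    def __init__(self, max_order=2):
        self.max_order = max_order
        self.counts = defaultdict(Counter)  # context tuple -> next-symbol counts

    def fit(self, sequences):
        for seq in sequences:
            seq = tuple(seq)
            for i, symbol in enumerate(seq):
                for order in range(self.max_order + 1):
                    if i - order < 0:
                        break  # not enough history for this order
                    self.counts[seq[i - order:i]][symbol] += 1
        return self

    def cond_prob(self, context, symbol):
        """P(symbol | context), backing off to shorter contexts when unseen."""
        context = tuple(context)
        context = context[-self.max_order:] if self.max_order else ()
        while True:
            seen = self.counts.get(context)
            if seen:
                return seen[symbol] / sum(seen.values())
            if not context:
                return 0.0  # model was never fitted
            context = context[1:]  # drop the oldest span and retry


model = NormalModel(max_order=2).fit(
    [("A", "B", "C"), ("A", "B", "C"), ("A", "B", "D")]
)
```

With this toy data, `model.cond_prob(("A", "B"), "C")` is 2/3, because "C" follows the context ("A", "B") in two of the three normal sequences.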
Optionally, the anomaly detection model construction unit may be specifically configured so that, if the number of identical span sequences is zero, the anomaly detection model outputs an abnormal event and determines, according to the difference span sequences, whether the event detection result of the abnormal event is an error or missing.
Optionally, the anomaly detection model construction unit may be specifically configured to determine that the event detection result of the abnormal event is an error if the abnormal tracking data contains a difference span sequence and the conditional probability of the variable-order Markov model is lower than a first set threshold; and to determine that the event detection result of the abnormal event is missing if the abnormal tracking data does not contain a difference span sequence and the conditional probability of the variable-order Markov model is higher than a second set threshold.
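The two-threshold rule above can be written down directly. This is a minimal sketch: the threshold values 0.1 and 0.9 are placeholders, since the text only calls them the first and second set thresholds, and the "undetermined" result for inputs matching neither condition is an assumption, because the text does not cover that case.

```python
def classify_abnormal_event(has_difference_sequence, cond_prob,
                            first_threshold=0.1, second_threshold=0.9):
    """Label an abnormal event as 'error' or 'missing'.

    has_difference_sequence: whether the abnormal tracking data contains a
        difference span sequence.
    cond_prob: conditional probability reported by the variable-order
        Markov model for the observed span.
    """
    if has_difference_sequence and cond_prob < first_threshold:
        return "error"
    if not has_difference_sequence and cond_prob > second_threshold:
        return "missing"
    return "undetermined"  # case not specified in the text


examples = [
    classify_abnormal_event(True, 0.02),
    classify_abnormal_event(False, 0.97),
    classify_abnormal_event(True, 0.50),
]
```

The third call shows an input that satisfies neither condition; how such events should be handled would need to be decided by the implementer.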
Optionally, the system anomaly detection apparatus may further include an abnormal pattern clustering model construction unit, configured to obtain, before the event detection result of the abnormal event is input into the pre-trained abnormal pattern recognition model, a feature vector of the event detection result of each abnormal event, and to cluster the event detection results of the abnormal events by the number of occurrences of error-type span sequences and missing-type span sequences, so as to obtain an abnormal pattern clustering model.
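The clustering step might look like the sketch below. The text names the features (occurrence counts of error-type and missing-type span sequences) but not the clustering algorithm, so plain k-means with a deterministic initialisation is used here as a stand-in; `feature_vector` and `kmeans` are hypothetical helper names.

```python
from collections import Counter


def feature_vector(labels):
    """Turn an event's per-sequence labels into (error count, missing count)."""
    counts = Counter(labels)
    return (counts["error"], counts["missing"])


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=20):
    """Minimal k-means; the first k distinct points seed the centroids."""
    centroids = []
    for p in points:
        if p not in centroids:
            centroids.append(p)
        if len(centroids) == k:
            break
    if len(centroids) < k:
        raise ValueError("need at least k distinct points")
    assignment = [0] * len(points)
    for _ in range(iterations):
        assignment = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        updated = []
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                updated.append(tuple(sum(d) / len(members) for d in zip(*members)))
            else:
                updated.append(centroids[c])  # keep an empty cluster's centroid
        if updated == centroids:
            break  # converged
        centroids = updated
    return assignment, centroids


events = [
    ["error"] * 3,
    ["error"] * 4,
    ["missing"] * 3,
    ["missing"] * 4 + ["error"],
]
vectors = [feature_vector(e) for e in events]
assignment, centroids = kmeans(vectors, k=2)
```

In this toy data the error-dominated events and the missing-dominated events end up in separate clusters, which is the kind of grouping an abnormal pattern would correspond to.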
The system anomaly detection device provided by the embodiment of the invention can execute the system anomaly detection method provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method.
Example four
FIG. 4 shows a schematic block diagram of an electronic device 410 that may be used to implement an embodiment of the invention. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices (e.g., helmets, glasses, watches, etc.), and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the inventions described and/or claimed herein.
As shown in FIG. 4, the electronic device 410 includes at least one processor 420 and a memory communicatively coupled to the at least one processor 420, such as a Read Only Memory (ROM) 430, a Random Access Memory (RAM) 440, etc., where the memory stores computer programs executable by the at least one processor, and the processor 420 may perform various suitable actions and processes according to the computer programs stored in the ROM 430 or loaded from the storage unit 490 into the RAM 440. Various programs and data required for the operation of the electronic device 410 may also be stored in the RAM 440. The processor 420, the ROM 430 and the RAM 440 are connected to each other through a bus 450. An input/output (I/O) interface 460 is also connected to the bus 450.
Various components in the electronic device 410 are connected to the I/O interface 460, including: an input unit 470 such as a keyboard, a mouse, etc.; an output unit 480 such as various types of displays, speakers, and the like; a storage unit 490, such as a magnetic disk, optical disk, or the like; and a communication unit 4100 such as a network card, a modem, a wireless communication transceiver, and the like. The communication unit 4100 allows the electronic device 410 to exchange information/data with other devices through a computer network such as the internet and/or various telecommunication networks.
Processor 420 may be a variety of general and/or special purpose processing components with processing and computing capabilities. Some examples of processor 420 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various processors running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, or the like. Processor 420 performs the various methods and processes described above, such as the system anomaly detection method.
The method comprises the following steps:
acquiring real-time log data generated by a system to be tested in work, and extracting tracking data to be tested from the real-time log data;
inputting the tracking data to be detected into a pre-trained anomaly detection model, and acquiring an event detection result of the system to be detected;
and if the event detection result of the system to be tested is determined to be an abnormal event, inputting the event detection result of the abnormal event into a pre-trained abnormal pattern recognition model to obtain an abnormal pattern of the system to be tested.
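The control flow of the three steps above can be sketched as one function. The models are passed in as plain callables because the text does not fix their interfaces; the convention that the detection model returns `None` for a normal event, and the stub models in the demonstration, are assumptions for illustration only.

```python
def detect_system_anomaly(log_lines, extract, detection_model, pattern_model):
    """Run extraction, anomaly detection and, only for abnormal events,
    abnormal pattern recognition."""
    tracking_data = extract(log_lines)             # step 1: tracking data to be tested
    detection_result = detection_model(tracking_data)  # step 2: event detection result
    if detection_result is None:                   # assumed convention: None = normal
        return {"abnormal": False, "pattern": None}
    pattern = pattern_model(detection_result)      # step 3: abnormal pattern
    return {"abnormal": True, "pattern": pattern}


# Stub components standing in for the pre-trained models (hypothetical).
def stub_extract(lines):
    return [line.split() for line in lines]


def stub_detection_model(tracking_data):
    bad = [span for span in tracking_data if span[-1] == "500"]
    return {"type": "error", "spans": bad} if bad else None


def stub_pattern_model(detection_result):
    return "pattern-" + detection_result["type"]


healthy = detect_system_anomaly(
    ["gateway /order 200"], stub_extract, stub_detection_model, stub_pattern_model
)
faulty = detect_system_anomaly(
    ["gateway /order 200", "order /pay 500"],
    stub_extract, stub_detection_model, stub_pattern_model,
)
```

The pattern recognition model is only invoked on the abnormal branch, which mirrors the conditional third step of the method.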
In some embodiments, the system anomaly detection method may be implemented as a computer program tangibly embodied in a computer-readable storage medium, such as the storage unit 490. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 410 via the ROM 430 and/or the communication unit 4100. When the computer program is loaded into the RAM 440 and executed by the processor 420, one or more steps of the system anomaly detection method described above may be performed. Alternatively, in other embodiments, the processor 420 may be configured to perform the system anomaly detection method by any other suitable means (e.g., by way of firmware).
Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
A computer program for implementing the methods of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be performed. A computer program can execute entirely on a machine, partly on a machine, as a stand-alone software package, partly on a machine and partly on a remote machine, or entirely on a remote machine or server.
In the context of the present invention, a computer-readable storage medium may be a tangible medium that can contain, or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. A computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer readable storage medium may be a machine readable signal medium. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on an electronic device having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the electronic device. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), blockchain networks, and the internet.
The computing system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or a cloud host, which is a host product in a cloud computing service system and overcomes the drawbacks of difficult management and weak service scalability found in traditional physical hosts and VPS services.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present invention may be executed in parallel, sequentially, or in different orders, and are not limited herein as long as the desired results of the technical solution of the present invention can be achieved.
The above-described embodiments should not be construed as limiting the scope of the invention. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A method for system anomaly detection, comprising:
acquiring real-time log data generated by a system to be tested in work, and extracting tracking data to be tested from the real-time log data;
inputting the tracking data to be detected into a pre-trained anomaly detection model, and acquiring an event detection result of the system to be detected;
and if the event detection result of the system to be tested is determined to be an abnormal event, inputting the event detection result of the abnormal event into a pre-trained abnormal pattern recognition model to obtain an abnormal pattern of the system to be tested.
2. The method of claim 1, wherein the obtaining real-time log data generated by a system under test during operation and extracting trace data under test from the real-time log data comprises:
acquiring real-time log data generated by the system to be tested in work, and extracting tracking data to be tested containing a span sequence from the real-time log data, wherein the span sequence comprises a message sender, a service application program interface and a status code.
3. The method of claim 1, further comprising, prior to inputting the trace data to be tested into a pre-trained anomaly detection model:
injecting a set type fault into a system to be tested to obtain abnormal tracking data;
filtering the same span sequence between the abnormal tracking data and the normal tracking data based on the normal training model to obtain the number of the same span sequences and a difference span sequence;
and training the target verification model according to the number of the same span sequences and the difference span sequences to obtain the anomaly detection model.
4. The method of claim 3, further comprising, before injecting a set type of fault into the system under test to obtain the anomaly tracking data:
acquiring normal tracking data generated when the system to be tested works normally, extracting normal features of each piece of normal tracking data, and constructing a normal training model according to the normal features.
5. The method of claim 3, wherein training the target verification model according to the number of the same span sequences and the difference span sequences to obtain the anomaly detection model comprises:
and if the number of the same span sequences is zero, the abnormal detection model outputs an abnormal event, and judges that the event detection result of the abnormal event is error or missing according to the difference span sequences.
6. The method according to claim 5, wherein the determining whether the event detection result of the abnormal event is an error or missing according to the difference span sequence comprises:
if the abnormal tracking data comprise the difference span sequence and the conditional probability of the variable-order Markov model is lower than a first set threshold, the event detection result of the abnormal event is an error;
and if the abnormal tracking data does not contain the difference span sequence and the conditional probability of the variable-order Markov model is higher than a second set threshold, the event detection result of the abnormal event is missing.
7. The method of claim 1, further comprising, before inputting the event detection result of the abnormal event into a pre-trained abnormal pattern recognition model:
and acquiring a feature vector of an event detection result of each abnormal event, and clustering the event detection result of each abnormal event by using the occurrence times of the span sequence of the error type and the span sequence of the missing type to obtain an abnormal pattern clustering model.
8. A system abnormality detection device, characterized by comprising:
the data acquisition module is used for acquiring real-time log data generated by a system to be detected in work and extracting tracking data to be detected from the real-time log data;
the result acquisition module is used for inputting the tracking data to be detected into a pre-trained anomaly detection model and acquiring an event detection result of the system to be detected;
and the pattern acquisition module is used for inputting the event detection result of the abnormal event into a pre-trained abnormal pattern recognition model to acquire the abnormal pattern of the system to be tested if the event detection result of the system to be tested is determined to be an abnormal event.
9. An electronic device, characterized in that the electronic device comprises:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the system anomaly detection method of any one of claims 1-7.
10. A computer-readable storage medium storing computer instructions for causing a processor to implement the system anomaly detection method of any one of claims 1-7 when executed.
CN202210344526.6A 2022-03-31 2022-03-31 System anomaly detection method, device, equipment and medium Pending CN114881112A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210344526.6A CN114881112A (en) 2022-03-31 2022-03-31 System anomaly detection method, device, equipment and medium

Publications (1)

Publication Number Publication Date
CN114881112A true CN114881112A (en) 2022-08-09

Family

ID=82668628

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210344526.6A Pending CN114881112A (en) 2022-03-31 2022-03-31 System anomaly detection method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN114881112A (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103235933A (en) * 2013-04-15 2013-08-07 东南大学 Vehicle abnormal behavior detection method based on Hidden Markov Model
US20170147927A1 (en) * 2015-11-23 2017-05-25 International Business Machines Corporation Detection Algorithms for Distributed Emission Sources of Abnormal Events
CN113228006A (en) * 2018-12-17 2021-08-06 华为技术有限公司 Apparatus and method for detecting anomalies in successive events and computer program product thereof
CN111782472A (en) * 2020-06-30 2020-10-16 平安科技(深圳)有限公司 System abnormality detection method, device, equipment and storage medium
CN112905380A (en) * 2021-03-22 2021-06-04 上海海事大学 System anomaly detection method based on automatic monitoring log
CN114138973A (en) * 2021-12-03 2022-03-04 大连海事大学 Log sequence anomaly detection method based on contrast countertraining
CN114118295A (en) * 2021-12-07 2022-03-01 苏州浪潮智能科技有限公司 Anomaly detection model training method, anomaly detection device and medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
镰刀韭菜: "DeepLog: Anomaly Detection and Diagnosis from System Logs through Deep Learning", 《CSDN》 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115793588A (en) * 2022-12-21 2023-03-14 广州市智慧农业服务股份有限公司 Data acquisition method and system based on industrial Internet of things
CN115793588B (en) * 2022-12-21 2023-09-08 广东暨通信息发展有限公司 Data acquisition method and system based on industrial Internet of things

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20220809
