CN111858242A - System log anomaly detection method and device, electronic equipment and storage medium - Google Patents

System log anomaly detection method and device, electronic equipment and storage medium

Publication number
CN111858242A
Authority
CN
China
Prior art keywords
vector
system log
neural network
network model
log
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010664669.6A
Other languages
Chinese (zh)
Other versions
CN111858242B (en)
Inventor
庆隆阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN202010664669.6A
Publication of CN111858242A
Application granted
Publication of CN111858242B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3065 Monitoring arrangements determined by the means or processing involved in reporting the monitored data
    • G06F11/3072 Monitoring arrangements determined by the means or processing involved in reporting the monitored data where the reporting involves data filtering, e.g. pattern matching, time or event triggered, adaptive or policy-based reporting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3034 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a storage system, e.g. DASD based or network based
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The application discloses a system log anomaly detection method and apparatus, an electronic device, and a computer-readable storage medium. The method comprises: acquiring an original system log and parsing it into structured data; extracting a message count vector and a flow state vector from the structured data, where the message count vector represents the message type characteristics of logs containing the same common identifier and the flow state vector represents system behavior characteristics within a preset time window; splicing the message count vector and the flow state vector into feature vectors and labeling the log state of each feature vector; and training a neural network model based on the feature vectors and the corresponding labels, so as to perform system log anomaly detection using the trained neural network model. The method improves the efficiency of system log anomaly detection.

Description

System log anomaly detection method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method and an apparatus for detecting system log anomalies, an electronic device, and a computer-readable storage medium.
Background
In the daily operation of a storage cluster, the volume of system logs generated by the cluster grows rapidly. These logs record information closely related to specific events across the whole system. The log content covers not only information about normal operation but also the abnormal information produced by the cluster system. For the stability and security of the cluster's operating state, the abnormal information in the system log receives the most attention. At present, developers usually detect anomalies in the system log by checking the logs manually with keyword search and rule matching.
Therefore, how to improve the efficiency of system log anomaly detection is a technical problem to be solved by those skilled in the art.
Disclosure of Invention
The application aims to provide a system log anomaly detection method and apparatus, an electronic device, and a computer-readable storage medium that improve the efficiency of system log anomaly detection.
In order to achieve the above object, the present application provides a system log anomaly detection method, including:
acquiring an original system log, and parsing the original system log into structured data;
extracting a message count vector and a flow state vector from the structured data; the message count vector represents the message type characteristics of logs containing the same common identifier, and the flow state vector represents system behavior characteristics within a preset time window;
splicing the message count vector and the flow state vector into feature vectors, and labeling the log state of each feature vector;
and training a neural network model based on the feature vectors and the corresponding labels, so as to perform system log anomaly detection using the trained neural network model.
Wherein parsing the original system log into structured data comprises:
extracting an execution path from the original system log, so as to parse the original system log into structured data.
Wherein extracting a message count vector in the structured data comprises:
determining the common identifier contained in each log in the structured data, and classifying all the logs based on the common identifier;
creating a message count vector for each category; each dimension of the message count vector corresponds one-to-one to a message type, and the value of each dimension is the number of logs of the corresponding message type in the category.
Wherein extracting a flow state vector in the structured data comprises:
determining the state variable type contained in each log in the structured data, and classifying all the logs based on the state variable type;
creating a flow state vector for each category; each dimension of the flow state vector corresponds one-to-one to a value of the state variable type, and the value of each dimension is the number of times that value occurs in the category.
Wherein training a neural network model based on the feature vectors and corresponding labels comprises:
dividing all the feature vectors into a training set and a test set according to a preset proportion; wherein the training set and the test set each comprise a plurality of feature vectors and corresponding labels;
initializing weights and biases of all layers in the neural network model, training the neural network model using the training set, and adjusting the weights and biases of each layer by back propagation based on an error-minimization criterion, to obtain a trained neural network model;
and testing the trained neural network model by using the test set.
Wherein the initializing weights and biases of all layers in the neural network model comprises:
and initializing weights and biases of all layers in the neural network model by utilizing a Gaussian function.
Wherein performing system log anomaly detection using the trained neural network model comprises:
acquiring a system log to be detected, and parsing the system log to be detected into target structured data;
extracting a target message count vector and a target flow state vector from the target structured data, and splicing the target message count vector and the target flow state vector into a target feature vector;
and inputting the target feature vector into the trained neural network model, so as to obtain a detection result for the system log to be detected.
In order to achieve the above object, the present application provides a system log anomaly detection apparatus, including:
an acquisition module, configured to acquire an original system log and parse the original system log into structured data;
an extraction module, configured to extract a message count vector and a flow state vector from the structured data; the message count vector represents the message type characteristics of logs containing the same common identifier, and the flow state vector represents system behavior characteristics within a preset time window;
a splicing module, configured to splice the message count vector and the flow state vector into feature vectors and label the log state of each feature vector;
a training module, configured to train a neural network model based on the feature vectors and the corresponding labels;
and a detection module, configured to perform system log anomaly detection using the trained neural network model.
To achieve the above object, the present application provides an electronic device including:
a memory for storing a computer program;
and a processor, configured to implement the steps of the above system log anomaly detection method when executing the computer program.
To achieve the above object, the present application provides a computer-readable storage medium having stored thereon a computer program, which when executed by a processor, implements the steps of the above system log anomaly detection method.
According to the above scheme, the system log anomaly detection method provided by the application comprises: acquiring an original system log, and parsing the original system log into structured data; extracting a message count vector and a flow state vector from the structured data, where the message count vector represents the message type characteristics of logs containing the same common identifier and the flow state vector represents system behavior characteristics within a preset time window; splicing the message count vector and the flow state vector into feature vectors, and labeling the log state of each feature vector; and training a neural network model based on the feature vectors and the corresponding labels, so as to perform system log anomaly detection using the trained neural network model.
In this system log anomaly detection method, the original system log undergoes structuring and feature extraction to form feature vectors, and each feature vector is labeled with its log state, that is, positive and negative samples are labeled. The neural network model is trained on the feature vectors and the corresponding labels, and the trained model is applied to the detection of cluster system logs. This can improve detection speed and detection accuracy, improve the efficiency with which developers locate system anomalies, and better maintain the stability and security of the cluster system. The application also discloses a system log anomaly detection apparatus, an electronic device, and a computer-readable storage medium, which achieve the same technical effects.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present application, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts. The accompanying drawings, which are included to provide a further understanding of the disclosure and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description serve to explain the disclosure without limiting the disclosure. In the drawings:
FIG. 1 is a flow diagram illustrating a method for system log anomaly detection in accordance with an exemplary embodiment;
FIG. 2 is a flow diagram illustrating another method of system log anomaly detection in accordance with an illustrative embodiment;
FIG. 3 is a block diagram illustrating a system log anomaly detection apparatus according to an exemplary embodiment;
FIG. 4 is a block diagram illustrating an electronic device in accordance with an exemplary embodiment.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The embodiment of the application discloses a system log anomaly detection method that improves the efficiency of system log anomaly detection.
Referring to fig. 1, which shows a flowchart of a system log anomaly detection method according to an exemplary embodiment, the method includes:
S101: acquiring an original system log, and parsing the original system log into structured data;
The original system log is the record of events generated while network devices, systems, and service programs are running; each log line records a description of a related operation, such as date, time, user, and action. The original system log is unstructured text data and can be obtained directly from a log file. It needs to be parsed before anomaly detection: parsing extracts a set of event templates, from which the original log can be organized into structured data.
As a preferred embodiment, the step of parsing the original system log into structured data may comprise: extracting an execution path from the original system log, so as to parse the original system log into structured data. In a specific implementation, the log may be parsed by a logkey method. Each log consists of a constant part and a variable part: the constant is the message printed directly by the system program source code, and the variable is generally a timestamp or a parameter value. The constant message common to all similar log entries is the logkey, which indicates the message type. Normal log output follows a certain flow and order, generally called the execution path of the log, and a logkey sequence can represent that execution path. Extracting logkeys from the log is therefore an effective log parsing method.
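The logkey idea above can be sketched in a few lines. This is only an illustrative sketch, not the parser used in the application: the regular expressions, the `<*>` wildcard, the function names `extract_logkey` and `execution_path`, and the sample log lines are all assumptions made for the example.

```python
import re

# Hypothetical patterns for variable fields; real logs need patterns tuned to their format.
_VARIABLE_PATTERNS = [
    re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}(?::\d+)?\b"),  # IPv4 address, optional port
    re.compile(r"\b0x[0-9a-fA-F]+\b"),                    # hexadecimal values
    re.compile(r"\b\d+\b"),                               # plain integers
]


def extract_logkey(message: str) -> str:
    """Replace variable fields with a wildcard so only the constant part remains."""
    for pattern in _VARIABLE_PATTERNS:
        message = pattern.sub("<*>", message)
    return message


def execution_path(messages):
    """Map an ordered list of log messages to its logkey sequence."""
    return [extract_logkey(m) for m in messages]


# Made-up log messages, for illustration only.
logs = [
    "Received block 1001 of size 4096 from 10.0.0.5",
    "Received block 1002 of size 8192 from 10.0.0.7",
    "Deleting block 1001",
]
for key in execution_path(logs):
    print(key)
```

The first two messages differ only in their parameter values, so they collapse to the same logkey, which is what lets a logkey stand for a message type.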
S102: extracting a message count vector and a flow state vector from the structured data; the message count vector represents the message type characteristics of logs containing the same common identifier, and the flow state vector represents system behavior characteristics within a preset time window;
S103: splicing the message count vector and the flow state vector into feature vectors, and labeling the log state of each feature vector;
Feature extraction takes the parsed structured data and extracts features that represent log attributes, finally forming a set of numerical feature vectors. In a specific implementation, a message count vector and a flow state vector are extracted for each log, respectively. The message count vector focuses on detecting errors, from different angles, based on the interleaved sequence behavior of different modules of the system, and the flow state vector captures system behavior characteristics within a preset time window.
It should be noted that each log includes a common identifier. Logs containing the same common identifier together convey information about that identifier, and grouping these messages yields a message count vector, which is similar to the execution path. The message count vector is extracted as follows: determine the common identifier contained in each log in the structured data, and classify all logs based on the common identifier, so that logs containing the same common identifier belong to the same category; create a message count vector for each category. Each dimension of the message count vector corresponds one-to-one to a message type, and the value of each dimension is the number of logs of the corresponding message type in the category.
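A minimal sketch of the message count vector, assuming the structured data has already been reduced to (identifier, message type) pairs. The record layout, the identifier names such as `blk_1`, and the function name `message_count_vectors` are hypothetical.

```python
from collections import Counter, defaultdict


def message_count_vectors(records, message_types):
    """Build one count vector per common identifier.

    records: iterable of (identifier, message_type) pairs (assumed layout).
    message_types: fixed, ordered list that defines the vector dimensions.
    """
    counts = defaultdict(Counter)
    for identifier, message_type in records:
        counts[identifier][message_type] += 1
    return {
        identifier: [counter[t] for t in message_types]
        for identifier, counter in counts.items()
    }


# Made-up structured records, for illustration only.
records = [
    ("blk_1", "receive"), ("blk_1", "receive"), ("blk_1", "delete"),
    ("blk_2", "receive"),
]
types = ["receive", "delete", "verify"]
vectors = message_count_vectors(records, types)
print(vectors)  # {'blk_1': [2, 1, 0], 'blk_2': [1, 0, 0]}
```

Fixing the order of `message_types` up front keeps every vector the same length, so the vectors can later be spliced with other features.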
State variables occur in most log messages. During normal execution of the system, the relative frequency of each value of a state variable within a preset time window is usually constant, but it changes significantly when the system has a problem. The flow state vector is extracted as follows: determine the state variable type contained in each log in the structured data, classify all logs based on the state variable type, and create a flow state vector for each category. Each flow state vector represents one state variable type within a preset time window; each dimension corresponds one-to-one to a different value of that state variable, and the value of the dimension is the number of times that state variable value appears in the preset time window.
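The counting described above might look like the following sketch for a single state variable type. The (timestamp, value) record layout, the 60-second window, the state names, and the function name `flow_state_vectors` are assumptions for illustration.

```python
from collections import Counter, defaultdict


def flow_state_vectors(records, state_values, window_seconds):
    """Count state variable values per time window.

    records: iterable of (timestamp_seconds, state_value) for ONE state
             variable type (assumed layout).
    state_values: fixed, ordered list that defines the vector dimensions.
    Returns {window_index: count vector}.
    """
    windows = defaultdict(Counter)
    for timestamp, value in records:
        windows[int(timestamp // window_seconds)][value] += 1
    return {
        w: [counter[v] for v in state_values]
        for w, counter in sorted(windows.items())
    }


# Made-up records, for illustration only.
records = [(0, "OPEN"), (12, "OPEN"), (30, "CLOSED"), (65, "ERROR"), (70, "OPEN")]
vectors = flow_state_vectors(records, ["OPEN", "CLOSED", "ERROR"], window_seconds=60)
print(vectors)  # {0: [2, 1, 0], 1: [1, 0, 1]}
```

Comparing the vectors of consecutive windows exposes the shift in relative frequency that the text associates with a system problem.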
The message count vector and the flow state vector complement each other as indicators of log anomalies, so the two are spliced into a single feature vector. After the feature vector is extracted, the log state corresponding to it needs to be labeled as either normal or abnormal, that is, positive and negative samples are labeled.
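Splicing and labeling can be sketched as plain concatenation plus a class label. The label encoding (0 for normal, 1 for abnormal), the way the two vectors are paired, and the name `build_sample` are assumptions; the text does not fix any of them.

```python
NORMAL, ABNORMAL = 0, 1  # hypothetical label encoding


def build_sample(message_count_vector, flow_state_vector, is_abnormal):
    """Concatenate the two vectors and attach the log state label."""
    feature_vector = list(message_count_vector) + list(flow_state_vector)
    label = ABNORMAL if is_abnormal else NORMAL
    return feature_vector, label


# Made-up vectors, for illustration only.
sample = build_sample([2, 1, 0], [1, 0, 1], is_abnormal=True)
print(sample)  # ([2, 1, 0, 1, 0, 1], 1)
```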
S104: training a neural network model based on the feature vectors and the corresponding labels, so as to perform system log anomaly detection using the trained neural network model.
In this step, the neural network model is trained using the feature vector and the label corresponding to each log, and the trained neural network model can then be used to detect anomalies in system logs. Specifically, the step of training the neural network model based on the feature vectors and the corresponding labels may include: dividing all the feature vectors into a training set and a test set according to a preset proportion, wherein the training set and the test set each comprise a plurality of feature vectors and corresponding labels; initializing weights and biases of all layers in the neural network model, training the neural network model using the training set, and adjusting the weights and biases of each layer by back propagation based on an error-minimization criterion, to obtain a trained neural network model; and testing the trained neural network model using the test set.
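The division into training and test sets could be done as in this sketch. The 0.8 default fraction, the fixed seed, the shuffling step, and the name `split_dataset` are illustrative assumptions rather than requirements of the method.

```python
import random


def split_dataset(samples, train_fraction=0.8, seed=0):
    """Shuffle labeled samples and split them by a preset proportion."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Ten made-up (feature_vector, label) pairs, for illustration only.
samples = [([i, i + 1], i % 2) for i in range(10)]
train, test = split_dataset(samples)
print(len(train), len(test))  # 8 2
```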
In a specific implementation, the training set and the test set are divided according to a preset proportion; for example, the ratio of the training set to the test set may be 8:2. The training set is used to train the model, and the test set is used to verify the quality of the trained model. The neural network model is preferably a CNN model, which is first trained with cross optimization on the training set. Training a CNN model requires a massive number of samples, usually counted in units of ten thousand. If a CNN model is trained directly when the number of samples is insufficient, it tends to overfit, converge slowly, and give a poor final classification result. To counter these effects, the structure of the CNN needs to be optimized. To improve the nonlinear mapping capability of the network and address gradient diffusion or vanishing gradients, a ReLU activation function can be used in place of a traditional activation function.
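The effect of swapping in ReLU can be illustrated numerically. The sketch below takes the sigmoid as the "traditional" activation, which is an assumption since the text does not name one, and compares how a gradient factor shrinks across a hypothetical chain of 10 layers.

```python
import math


def relu(x):
    return max(0.0, x)


def relu_grad(x):
    # Derivative of ReLU: 1 for positive inputs, 0 otherwise.
    return 1.0 if x > 0 else 0.0


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    # Derivative of the sigmoid; its maximum is 0.25, reached at x = 0.
    s = sigmoid(x)
    return s * (1.0 - s)


depth = 10  # hypothetical number of stacked layers
sigmoid_chain = sigmoid_grad(0.0) ** depth  # best case for the sigmoid
relu_chain = relu_grad(1.0) ** depth        # active ReLU units pass gradient unchanged
print(sigmoid_chain, relu_chain)
```

Even in the sigmoid's best case the chained factor falls below one millionth after 10 layers, while active ReLU units keep it at 1.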
The neural network model of this embodiment comprises an input layer, a hidden layer, an output layer, alternately connected convolutional and pooling layers, and a fully connected layer. Because the data set used for log model training falls far short of the massive scale, the conventional convolutional layer C5 can be removed, which reduces the amount of computation and the number of model parameters to be trained and mitigates overfitting when the number of samples is insufficient. In addition to this improvement, a dropout strategy can be added to the last fully connected layer of the model: in each training iteration, neurons of the model are put in a dormant state with a set probability, while the training parameters still follow the weight-sharing idea of the neural network. During training, the weights and biases of each layer are adjusted step by step by back propagation based on an error-minimization criterion, and batch training is chosen as the training mode to avoid back searching. Preferably, the weights and biases of all layers in the neural network model can be initialized using a Gaussian function. After the model has been trained, its effect needs to be tested on the test set to verify the quality of the model.
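Two of the ingredients named above, Gaussian initialization and dropout, can be sketched with the standard library alone. The zero mean, the 0.01 standard deviation, the 0.5 drop probability, the layer sizes, and the use of "inverted" dropout (rescaling the surviving activations) are all assumptions; the text only says that a Gaussian function and a set probability are used.

```python
import random


def gaussian_init(n_in, n_out, std=0.01, rng=None):
    """Draw weights and biases for one layer from a zero-mean Gaussian."""
    rng = rng or random.Random(0)
    weights = [[rng.gauss(0.0, std) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.gauss(0.0, std) for _ in range(n_out)]
    return weights, biases


def dropout(activations, drop_prob, rng, training=True):
    """Inverted dropout: silence each unit with probability drop_prob while
    training and rescale the survivors; pass values through at test time."""
    if not training or drop_prob == 0.0:
        return list(activations)
    keep = 1.0 - drop_prob
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(42)
weights, biases = gaussian_init(n_in=6, n_out=4, rng=rng)
hidden = [1.0, 2.0, 3.0, 4.0]
dropped = dropout(hidden, drop_prob=0.5, rng=rng)
print(dropped)
print(dropout(hidden, drop_prob=0.5, rng=rng, training=False))
```

At test time the function returns the activations unchanged, so no rescaling is needed when the trained model is used for detection.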
In the system log anomaly detection method provided by the embodiment of the application, the original system log undergoes structuring and feature extraction to form feature vectors, and each feature vector is labeled with its log state, that is, positive and negative samples are labeled. The neural network model is trained on the feature vectors and the corresponding labels, and the trained model is applied to the detection of cluster system logs. This can improve detection speed and detection accuracy, improve the efficiency with which developers locate system anomalies, and better maintain the stability and security of the cluster system.
The following describes an anomaly detection process of a system log, specifically:
Referring to fig. 2, which shows a flowchart of another system log anomaly detection method according to an exemplary embodiment, the method includes:
S201: acquiring a system log to be detected, and parsing the system log to be detected into target structured data;
S202: extracting a target message count vector and a target flow state vector from the target structured data, and splicing the target message count vector and the target flow state vector into a target feature vector;
S203: inputting the target feature vector into the trained neural network model, so as to obtain a detection result for the system log to be detected.
The purpose of this embodiment is to perform anomaly detection on the system log to be detected using the trained neural network model. In a specific implementation, the system log to be detected is first structured, that is, parsed into target structured data. Next, the target message count vector and the target flow state vector of the system log to be detected are extracted from the target structured data, and the two vectors are spliced into a target feature vector. Finally, the target feature vector is input into the trained neural network model to obtain the detection result for the system log to be detected.
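The detection steps can be sketched end to end as below. The trained model is replaced by a hypothetical stand-in (`toy_model`) that scores the last feature dimension, and the 0.5 decision threshold is likewise an assumption; neither comes from the application.

```python
def detect(feature_vector, model, threshold=0.5):
    """Run the model on a target feature vector and map its score to a state."""
    score = model(feature_vector)
    return "abnormal" if score >= threshold else "normal"


def toy_model(feature_vector):
    # Stand-in for the trained network: the score grows with the count in the
    # last dimension (imagined here as an ERROR state count).
    return min(1.0, feature_vector[-1] / 5.0)


# Made-up target vectors, for illustration only.
target_message_count = [2, 1, 0]
target_flow_state = [1, 0, 4]
target_feature = target_message_count + target_flow_state  # splicing step
print(detect(target_feature, toy_model))  # abnormal
```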
In the following, a system log anomaly detection device provided by the embodiment of the present application is introduced, and a system log anomaly detection device described below and a system log anomaly detection method described above may be referred to each other.
Referring to fig. 3, which shows a block diagram of a system log anomaly detection apparatus according to an exemplary embodiment, the apparatus includes:
an obtaining module 301, configured to obtain an original system log and parse the original system log into structured data;
an extracting module 302, configured to extract a message count vector and a flow state vector from the structured data; the message count vector represents the message type characteristics of logs containing the same common identifier, and the flow state vector represents system behavior characteristics within a preset time window;
a splicing module 303, configured to splice the message count vector and the flow state vector into feature vectors and label the log state of each feature vector;
a training module 304, configured to train a neural network model based on the feature vectors and the corresponding labels;
and a detection module 305, configured to perform system log anomaly detection using the trained neural network model.
The system log anomaly detection apparatus provided by the embodiment of the application structures the original system log and extracts features from it to form feature vectors, and labels each feature vector with its log state, that is, labels positive and negative samples. The neural network model is trained on the feature vectors and the corresponding labels, and the trained model is applied to the detection of cluster system logs. This can improve detection speed and detection accuracy, improve the efficiency with which developers locate system anomalies, and better maintain the stability and security of the cluster system.
On the basis of the foregoing embodiment, as a preferred implementation, the obtaining module 301 includes:
an acquisition unit, configured to acquire an original system log;
and a parsing unit, configured to extract an execution path from the original system log so as to parse the original system log into structured data.
On the basis of the foregoing embodiment, as a preferred implementation, the extraction module 302 includes:
a first extraction unit, configured to determine the common identifier contained in each log in the structured data, classify all the logs based on the common identifier, and create a message count vector for each category; each dimension of the message count vector corresponds one-to-one to a message type, and the value of each dimension is the number of logs of the corresponding message type in the category;
a second extraction unit, configured to determine the state variable type contained in each log in the structured data, classify all the logs based on the state variable type, and create a flow state vector for each category; each dimension of the flow state vector corresponds one-to-one to a value of the state variable type, and the value of each dimension is the number of times that value occurs in the category.
On the basis of the above embodiment, as a preferred implementation, the training module 304 includes:
a dividing unit, configured to divide all the feature vectors into a training set and a test set according to a preset proportion, wherein the training set and the test set each comprise a plurality of feature vectors and corresponding labels;
a training unit, configured to initialize weights and biases of all layers in the neural network model, train the neural network model using the training set, and adjust the weights and biases of each layer by back propagation based on an error-minimization criterion, to obtain the trained neural network model;
and a test unit, configured to test the trained neural network model using the test set.
On the basis of the above embodiment, as a preferred implementation, the training unit is specifically a unit that initializes the weights and biases of all layers in the neural network model using a Gaussian function, trains the neural network model using the training set, and adjusts the weights and biases of each layer by back propagation based on an error-minimization criterion, to obtain a trained neural network model.
On the basis of the foregoing embodiment, as a preferred implementation, the detection module 305 includes:
an acquisition unit, configured to acquire a system log to be detected and parse the system log to be detected into target structured data;
a third extraction unit, configured to extract a target message count vector and a target flow state vector from the target structured data, and splice the target message count vector and the target flow state vector into a target feature vector;
and a detection unit, configured to input the target feature vector into the trained neural network model so as to obtain a detection result for the system log to be detected.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
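As a rough illustration of how the modules fit together, the sketch below wires the five roles into one small class. It is only one possible arrangement: the class name `LogAnomalyDetector`, the callable-based interfaces, the 0.5 threshold, and the toy stand-ins used in the usage lines are hypothetical and are not defined by the application.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple

Vector = List[float]


@dataclass
class LogAnomalyDetector:
    parse: Callable[[Sequence[str]], list]            # role of obtaining module 301
    extract: Callable[[list], Tuple[Vector, Vector]]  # role of extracting module 302
    train: Callable[[List[Tuple[Vector, int]]], Callable[[Vector], float]]  # module 304
    model: Optional[Callable[[Vector], float]] = None

    def featurize(self, raw_lines: Sequence[str]) -> Vector:
        counts, states = self.extract(self.parse(raw_lines))
        return list(counts) + list(states)            # role of splicing module 303

    def fit(self, labelled_logs: List[Tuple[Sequence[str], int]]) -> None:
        samples = [(self.featurize(lines), label) for lines, label in labelled_logs]
        self.model = self.train(samples)

    def detect(self, raw_lines: Sequence[str], threshold: float = 0.5) -> bool:
        # Role of detection module 305; True means "abnormal".
        if self.model is None:
            raise RuntimeError("fit() must be called before detect()")
        return self.model(self.featurize(raw_lines)) >= threshold


# Toy stand-ins, for illustration only.
detector = LogAnomalyDetector(
    parse=lambda lines: [line.split() for line in lines],
    extract=lambda rows: ([float(len(rows))],
                          [float(sum(r[0] == "ERROR" for r in rows))]),
    train=lambda samples: (lambda v: 1.0 if v[1] > 0 else 0.0),
)
detector.fit([(["INFO start", "INFO stop"], 0), (["ERROR disk"], 1)])
print(detector.detect(["INFO start", "ERROR disk"]))  # True
```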
The present application further provides an electronic device. Referring to fig. 4, which shows a structural diagram of an electronic device 400 provided in an embodiment of the present application, the electronic device 400 may include a processor 11 and a memory 12. The electronic device 400 may also include one or more of a multimedia component 13, an input/output (I/O) interface 14, and a communication component 15.
The processor 11 is configured to control the overall operation of the electronic device 400, so as to complete all or part of the steps in the above-mentioned system log anomaly detection method. The memory 12 is used to store various types of data to support operation of the electronic device 400, such as instructions for any application or method operating on the electronic device 400 and application-related data, such as contact data, transmitted and received messages, pictures, audio, video, and so forth. The memory 12 may be implemented by any type of volatile or non-volatile memory device or combination thereof, such as Static Random Access Memory (SRAM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Erasable Programmable Read-Only Memory (EPROM), Programmable Read-Only Memory (PROM), Read-Only Memory (ROM), magnetic memory, flash memory, magnetic disk or optical disk. The multimedia component 13 may include a screen and an audio component. The screen may be, for example, a touch screen, and the audio component is used for outputting and/or inputting audio signals. For example, the audio component may include a microphone for receiving external audio signals. The received audio signal may further be stored in the memory 12 or transmitted via the communication component 15. The audio component also includes at least one speaker for outputting audio signals. The I/O interface 14 provides an interface between the processor 11 and other interface modules, such as a keyboard, a mouse, or buttons. These buttons may be virtual buttons or physical buttons. The communication component 15 is used for wired or wireless communication between the electronic device 400 and other devices. The wireless communication may be, for example, Wi-Fi, Bluetooth, Near Field Communication (NFC), 2G, 3G or 4G, or a combination of one or more of them; accordingly, the communication component 15 may include a Wi-Fi module, a Bluetooth module, and an NFC module.
In an exemplary embodiment, the electronic device 400 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, for performing the above-described system log anomaly detection method.
In another exemplary embodiment, a computer readable storage medium comprising program instructions which, when executed by a processor, implement the steps of the above-described system log anomaly detection method is also provided. For example, the computer readable storage medium may be the memory 12 described above including program instructions that are executable by the processor 11 of the electronic device 400 to perform the above-described system log anomaly detection method.
The embodiments in this specification are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be understood by reference to one another. Because the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is brief, and the relevant points can be found in the description of the method. It should be noted that those skilled in the art may make several improvements and modifications to the present application without departing from its principle, and such improvements and modifications also fall within the scope of the claims of the present application.
It is further noted that, in the present specification, relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.

Claims (10)

1. A system log anomaly detection method is characterized by comprising the following steps:
acquiring an original system log, and analyzing the original system log into structured data;
extracting a message count vector and a flow state vector from the structured data; the message count vector represents the message type characteristics of the logs containing the same universal identifier, and the flow state vector represents the system behavior characteristics in a preset time window;
splicing the message count vector and the flow state vector into feature vectors, and labeling the log state of each feature vector;
and training a neural network model based on the feature vectors and the corresponding labels so as to perform system log anomaly detection by using the trained neural network model.
2. The method of detecting anomalies in a system log according to claim 1, wherein parsing the original system log into structured data includes:
extracting an execution path from the original system log so as to parse the original system log into structured data.
3. The method of detecting anomalies in logs of a system according to claim 1, characterized in that extracting a message count vector in said structured data comprises:
determining a universal identifier contained in each log in the structured data, and classifying all the logs based on the universal identifier;
creating a message count vector for each category; wherein the dimensions of the message count vector correspond one-to-one to the message types, and the value of each dimension is the number of logs of the corresponding message type in the category.
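By way of illustration only, the grouping and counting recited in claim 3 might look like the sketch below. The record layout (`identifier` and `message_type` keys), the sample identifiers, and the message type names are hypothetical and are not taken from the specification.

```python
from collections import Counter, defaultdict


def message_count_vectors(records, message_types):
    """Classify logs by universal identifier and count message types per category."""
    counts = defaultdict(Counter)
    for record in records:
        counts[record["identifier"]][record["message_type"]] += 1
    # One vector per category; dimension i counts logs of message_types[i].
    return {
        identifier: [counter[message_type] for message_type in message_types]
        for identifier, counter in counts.items()
    }


records = [
    {"identifier": "blk_1", "message_type": "E1"},
    {"identifier": "blk_1", "message_type": "E2"},
    {"identifier": "blk_1", "message_type": "E1"},
    {"identifier": "blk_2", "message_type": "E3"},
]
vectors = message_count_vectors(records, ["E1", "E2", "E3"])
# vectors == {"blk_1": [2, 1, 0], "blk_2": [0, 0, 1]}
```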
4. The method of detecting anomalies in logs of a system according to claim 1, wherein extracting flow state vectors in the structured data includes:
determining a state variable type contained in each log in the structured data, and classifying all the logs based on the state variable types;
creating a flow state vector for each category; wherein the dimensions of the flow state vector correspond one-to-one to the values of the state variable type, and the value of each dimension is the number of occurrences of the corresponding value in the category.
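By way of illustration only, the construction recited in claim 4 might be sketched as follows. The record layout (`timestamp`, `state_variable`, `value`), the window parameters, and the example state variables are assumptions; restricting the count to one window reflects the statement in claim 1 that the flow state vector characterizes behavior within a preset time window.

```python
from collections import Counter, defaultdict


def flow_state_vectors(records, value_space, window_start, window_seconds):
    """Count, per state variable type, how often each value occurs inside one window."""
    window_end = window_start + window_seconds
    counts = defaultdict(Counter)
    for record in records:
        if window_start <= record["timestamp"] < window_end:
            counts[record["state_variable"]][record["value"]] += 1
    # One vector per state variable type; dimension i counts values[i].
    return {
        variable: [counts[variable][value] for value in values]
        for variable, values in value_space.items()
    }


records = [
    {"timestamp": 0, "state_variable": "status", "value": "OK"},
    {"timestamp": 3, "state_variable": "phase", "value": "INIT"},
    {"timestamp": 5, "state_variable": "status", "value": "FAILED"},
    {"timestamp": 8, "state_variable": "status", "value": "OK"},
    {"timestamp": 12, "state_variable": "status", "value": "OK"},  # outside the window
]
value_space = {"status": ["OK", "FAILED"], "phase": ["INIT", "RUN"]}
vectors = flow_state_vectors(records, value_space, window_start=0, window_seconds=10)
# vectors == {"status": [2, 1], "phase": [1, 0]}
```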
5. The method of detecting anomalies in logs of a system according to claim 1, wherein said training a neural network model based on said feature vectors and corresponding labels comprises:
dividing all the feature vectors into a training set and a test set according to a preset proportion; wherein the training set and the test set each comprise a plurality of feature vectors and corresponding labels;
initializing weights and biases of all layers in the neural network model, training the neural network model by using the training set, and adjusting the weights and biases of the layers according to an error back-propagation criterion to obtain a trained neural network model;
and testing the trained neural network model by using the test set.
6. The method of claim 5, wherein the initializing weights and biases of all layers in the neural network model comprises:
initializing the weights and biases of all layers in the neural network model by using a Gaussian function.
7. The method for detecting anomalies in a system log according to any one of claims 1 to 6, wherein performing system log anomaly detection by using the trained neural network model comprises:
acquiring a system log to be detected, and parsing the system log to be detected into target structured data;
extracting a target message count vector and a target flow state vector from the target structured data, and splicing the target message count vector and the target flow state vector into a target feature vector;
and inputting the target feature vector into the trained neural network model so as to obtain a detection result of the system log to be detected.
8. A system log anomaly detection apparatus, comprising:
an acquisition module, configured to acquire an original system log and parse the original system log into structured data;
an extraction module, configured to extract a message count vector and a flow state vector from the structured data; the message count vector represents the message type characteristics of the logs containing the same universal identifier, and the flow state vector represents the system behavior characteristics in a preset time window;
a splicing module, configured to splice the message count vector and the flow state vector into feature vectors and label the log state of each feature vector;
a training module, configured to train a neural network model based on the feature vectors and the corresponding labels;
and a detection module, configured to perform system log anomaly detection by using the trained neural network model.
9. An electronic device, comprising:
a memory for storing a computer program;
a processor for implementing the steps of the system log anomaly detection method according to any one of claims 1 to 7 when executing said computer program.
10. A computer-readable storage medium, having stored thereon a computer program which, when being executed by a processor, carries out the steps of the system log anomaly detection method according to any one of claims 1 to 7.
CN202010664669.6A 2020-07-10 2020-07-10 System log abnormality detection method and device, electronic equipment and storage medium Active CN111858242B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010664669.6A CN111858242B (en) 2020-07-10 2020-07-10 System log abnormality detection method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010664669.6A CN111858242B (en) 2020-07-10 2020-07-10 System log abnormality detection method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111858242A true CN111858242A (en) 2020-10-30
CN111858242B CN111858242B (en) 2023-05-30

Family

ID=72984012

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010664669.6A Active CN111858242B (en) 2020-07-10 2020-07-10 System log abnormality detection method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111858242B (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112364284A (en) * 2020-11-23 2021-02-12 北京八分量信息科技有限公司 Method, device and related product for detecting abnormity based on context
CN112434245A (en) * 2020-11-23 2021-03-02 北京八分量信息科技有限公司 Method and device for judging abnormal behavior event based on UEBA (unified extensible architecture), and related product
CN112738098A (en) * 2020-12-28 2021-04-30 北京天融信网络安全技术有限公司 Anomaly detection method and device based on network behavior data
CN113282433A (en) * 2021-06-10 2021-08-20 中国电信股份有限公司 Cluster anomaly detection method and device and related equipment
CN113468035A (en) * 2021-07-15 2021-10-01 创新奇智(重庆)科技有限公司 Log anomaly detection method and device, training method and device and electronic equipment
CN113709125A (en) * 2021-08-18 2021-11-26 北京明略昭辉科技有限公司 Method and device for determining abnormal flow, storage medium and electronic equipment
CN115333973A (en) * 2022-08-05 2022-11-11 武汉联影医疗科技有限公司 Equipment abnormality detection method and device, computer equipment and storage medium
CN115426254A (en) * 2022-08-26 2022-12-02 中国银行股份有限公司 Method and device for establishing and identifying system log abnormity identification network
CN115801447A (en) * 2023-01-09 2023-03-14 北京安帝科技有限公司 Flow analysis method and device based on industrial safety and electronic equipment

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110321371A (en) * 2019-07-01 2019-10-11 腾讯科技(深圳)有限公司 Daily record data method for detecting abnormality, device, terminal and medium

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112364284A (en) * 2020-11-23 2021-02-12 北京八分量信息科技有限公司 Method, device and related product for detecting abnormity based on context
CN112434245A (en) * 2020-11-23 2021-03-02 北京八分量信息科技有限公司 Method and device for judging abnormal behavior event based on UEBA (unified extensible architecture), and related product
CN112738098A (en) * 2020-12-28 2021-04-30 北京天融信网络安全技术有限公司 Anomaly detection method and device based on network behavior data
CN113282433A (en) * 2021-06-10 2021-08-20 中国电信股份有限公司 Cluster anomaly detection method and device and related equipment
CN113468035A (en) * 2021-07-15 2021-10-01 创新奇智(重庆)科技有限公司 Log anomaly detection method and device, training method and device and electronic equipment
CN113468035B (en) * 2021-07-15 2023-09-29 创新奇智(重庆)科技有限公司 Log abnormality detection method, device, training method, device and electronic equipment
CN113709125A (en) * 2021-08-18 2021-11-26 北京明略昭辉科技有限公司 Method and device for determining abnormal flow, storage medium and electronic equipment
CN115333973A (en) * 2022-08-05 2022-11-11 武汉联影医疗科技有限公司 Equipment abnormality detection method and device, computer equipment and storage medium
CN115426254A (en) * 2022-08-26 2022-12-02 中国银行股份有限公司 Method and device for establishing and identifying system log abnormity identification network
CN115801447A (en) * 2023-01-09 2023-03-14 北京安帝科技有限公司 Flow analysis method and device based on industrial safety and electronic equipment
CN115801447B (en) * 2023-01-09 2023-04-21 北京安帝科技有限公司 Industrial safety-based flow analysis method and device and electronic equipment

Also Published As

Publication number Publication date
CN111858242B (en) 2023-05-30

Similar Documents

Publication Publication Date Title
CN111858242B (en) System log abnormality detection method and device, electronic equipment and storage medium
CN109697162B (en) Software defect automatic detection method based on open source code library
US8453027B2 (en) Similarity detection for error reports
US20150347923A1 (en) Error classification in a computing system
Guerrouj et al. The influence of app churn on app success and stackoverflow discussions
CN110263538B (en) Malicious code detection method based on system behavior sequence
CN110442712B (en) Risk determination method, risk determination device, server and text examination system
WO2018235252A1 (en) Analysis device, log analysis method, and recording medium
CN115328756A (en) Test case generation method, device and equipment
CN112416778A (en) Test case recommendation method and device and electronic equipment
CN113688630B (en) Text content auditing method, device, computer equipment and storage medium
CN111447224A (en) Web vulnerability scanning method and vulnerability scanner
CN116611074A (en) Security information auditing method, device, storage medium and apparatus
CN116346456A (en) Business logic vulnerability attack detection model training method and device
CN109933648B (en) Real user comment distinguishing method and device
CN107341110B (en) Tool for modifying and affecting range of software test positioning patch and implementation method
CN116167336B (en) Sensor data processing method based on cloud computing, cloud server and medium
US20210056395A1 (en) Automatic testing of web pages using an artificial intelligence engine
CN112087473A (en) Document downloading method and device, computer readable storage medium and computer equipment
Periyasamy et al. Prediction of future vulnerability discovery in software applications using vulnerability syntax tree (PFVD-VST).
CN112464237B (en) Static code security diagnosis method and device
CN113010339A (en) Method and device for automatically processing fault in online transaction test
CN114662099A (en) AI model-based application malicious behavior detection method and device
CN114676428A (en) Application program malicious behavior detection method and device based on dynamic characteristics
CN115879446B (en) Text processing method, deep learning model training method, device and equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant