CN111611218A - Distributed abnormal log automatic identification method based on deep learning

Distributed abnormal log automatic identification method based on deep learning

Info

Publication number
CN111611218A
CN111611218A
Authority
CN
China
Prior art keywords
log
vector
word
model
knowledge
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010333973.2A
Other languages
Chinese (zh)
Inventor
玄跻峰
许宜森
张玉虎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University WHU
Original Assignee
Wuhan University WHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan University WHU
Priority to CN202010333973.2A
Publication of CN111611218A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/1805 Append-only file systems, e.g. using logs or journals to store data
    • G06F16/1815 Journaling file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Databases & Information Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a deep-learning-based method for automatically identifying abnormal logs in distributed systems, comprising the following steps: 1) acquire log file data and preprocess it; 2) based on the preprocessed logs, train a word2vec model to obtain a word vector for each word in the logs; 3) convert the sentences in the log text into sentence vectors using the obtained word vectors; 4) input the sentence vectors into a long short-term memory (LSTM) neural network model and train it to obtain a binary classification model; 5) preprocess a new log file, input it into the trained LSTM model, and judge whether each input log is an abnormal log. The method establishes a deep-neural-network classification model that identifies abnormal logs automatically, converting the originally manual identification of abnormal logs into automatic identification; this reduces the error risk of manual identification and cuts the labor and time costs of manually inspecting logs.

Description

Distributed abnormal log automatic identification method based on deep learning
Technical Field
The invention relates to data mining technology, and in particular to a deep-learning-based method for automatically identifying distributed abnormal logs.
Background
Modern software is increasingly complex and large in scale, which drives up software maintenance costs. The widespread use of distributed and heterogeneous software systems makes it extremely difficult to manually monitor the operational status of the software and discover operational failures. Logs are an indispensable form of output during software operation. To find faults in a distributed system as early as possible and reduce the potential risk of downtime, many distributed systems record their runtime state through real-time log output, providing a data basis for maintenance personnel.
In modern distributed systems, maintenance personnel can manually check the runtime state of the software from the logs output by the system, and discover and analyze where faults lie. However, many distributed systems run around the clock and output a huge amount of log data every day, which makes manually analyzing the entire log very difficult.
To find faults and potential risks in software operation from the logs, maintenance personnel manually define, based on a set of normal logs, the log features that correct logs should match. A new log can then be matched against these features to decide whether it was output during normal program execution, that is, whether it reflects abnormal behavior. If it does not match, an operational fault or potential risk in the software is indicated, and further manual analysis can proceed from it. However, manually defining correct log features is time-consuming and error-prone, mainly because (1) logs themselves are complex, so manually defined log features are often incomplete; and (2) the continuous integration practices of modern software change software versions frequently, so the log feature definitions must be revised frequently. For these two reasons, the approach of manually defining log features and then identifying abnormal logs consumes considerable labor and time in practice.
Disclosure of Invention
The technical problem to be solved by the invention is to provide, in view of the defects of the prior art, a deep-learning-based method for automatically identifying distributed abnormal logs, which reduces the error risk caused by manually identifying abnormal logs.
The technical scheme adopted by the invention to solve this technical problem is as follows: a deep-learning-based distributed abnormal log automatic identification method, comprising the following steps:
1) acquiring normal and abnormal log sets and preprocessing them: intercepting the timestamp of each log, sorting the log messages in the log file by their timestamp strings, and then filtering the timestamp strings out of each log;
2) based on the preprocessed log, training by using a word2vec model to obtain a word vector of each word in the log;
3) converting sentences in the log text into sentence vectors by using the obtained word vectors;
4) inputting the sentence vectors into a long short-term memory neural network model and training to obtain a binary classification model;
5) preprocessing a new log file, extracting word vectors, converting the sentences in the log file into sentence vectors, inputting the sentence vectors into the trained long short-term memory (LSTM) neural network model, and judging whether the input log is an abnormal log; the new log file is one whose log messages occur later than those of the training log files.
according to the scheme, in the step 2), word vectors of each word in the log are obtained by using word2vec model training, the word2vec model training mode uses a skip-gram or CBOW word model calculation mode, and a negative sampling model is adopted for training to obtain the word vectors.
According to the scheme, the training process of the long short-term memory neural network model in step 4) is as follows:
4.1) each neural unit's input vector X is a sentence vector, and the sentence vectors are input into the long short-term memory neural network model sequentially in time order;
4.2) after the input vector of each neural unit is processed by the forget gate, input gate, and output gate, the knowledge information is stored in a knowledge base C; the knowledge processed by the current neural unit is output to h_{t+1}, and meanwhile the h_{t+1} knowledge output by the previous neural unit is input into the next neural unit;
the activation function of the forget gate is a sigmoid function, and the data remaining after the current vector's forgetting is taken as the vector inner product of the weight and the knowledge base, realizing partial forgetting of old knowledge;
the input gate operates on the combination of the input sentence vector and the output vector of the previous neural unit, specifically: first, the memory weight is obtained from the sigmoid function of the current vector as a vector inner product; second, the knowledge is obtained as the tanh value of the current vector; third, the latest, partially forgotten knowledge is obtained as the vector inner product of the memory weight and the knowledge; finally, the new knowledge is merged into the knowledge base;
the output gate produces the current neural unit's output by taking the tanh value of the knowledge base vector and computing its inner product with the weight of the input gate;
4.3) each neural unit yields an h_{t+1} vector, and all h_{t+1} vectors are input into an average pooling layer;
4.4) the vector from the average pooling layer is input into a regression classification layer, and the averaged vector is classified with a regression classification method, yielding the binary classification model of the long short-term memory neural network.
The invention has the following beneficial effects:
the invention establishes the classification model of automatic identification of the abnormal logs based on the deep neural network, automatically generates the classification model of the abnormal logs based on the long-term and short-term memory neural network model, converts the original manual identification of the abnormal logs into the automatic identification of the abnormal logs, reduces the error risk caused by the manual identification of the abnormal logs and reduces the labor and time costs of the manual identification of the logs.
Drawings
The invention will be further described with reference to the accompanying drawings and examples, in which:
FIG. 1 is a flow chart of a method of an embodiment of the present invention;
FIG. 2 is a schematic diagram of a skip-gram model according to an embodiment of the invention;
FIG. 3 is a schematic diagram of a CBOW model according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the LSTM neural unit model structure according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of LSTM classification according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the following embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
As shown in FIG. 1, a method for automatically identifying distributed abnormal logs based on deep learning includes:
(1) Model training stage.
First, the original logs are preprocessed so that they meet the input requirements of the word2vec model: the timestamp of each log is intercepted, the log messages in the log file are sorted by their timestamp strings, and the timestamp strings are then filtered out of each log.
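A minimal sketch of this preprocessing step is given below; the timestamp pattern and the file path are assumptions, since real distributed-system logs use varied formats.

    import re

    # Assumed timestamp format, e.g. "2020-04-24 12:30:05,123" at the start of a line;
    # real logs may require a different pattern.
    TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}(?:[.,]\d+)?)\s*")

    def preprocess(lines):
        """Sort log messages by their timestamp strings, then strip the timestamps."""
        stamped = []
        for line in lines:
            match = TIMESTAMP.match(line)
            if match:  # keep only lines that carry a timestamp
                stamped.append((match.group(1), line[match.end():].rstrip()))
        stamped.sort(key=lambda pair: pair[0])      # order messages by timestamp string
        return [message for _, message in stamped]  # timestamps filtered out

    with open("raw.log", encoding="utf-8") as f:    # placeholder input path
        messages = preprocess(f)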
Based on the preprocessed logs, a word2vec model is trained to obtain a word vector for each word in the logs; the training uses the skip-gram or CBOW word model and adopts negative sampling. During word2vec training, a dictionary is built to represent all distinct words, each word in the dictionary is represented by an N-dimensional vector whose elements are 0 or 1, and these vectors are used in the computations during model training.
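One way to realize this training step is with the gensim library, as sketched below; the library choice and the window and negative-sample counts are assumptions (only the skip-gram/CBOW choice with negative sampling, and the 100-dimensional vectors mentioned later, come from the text).

    from gensim.models import Word2Vec

    # Tokenize each preprocessed log message; whitespace splitting is an assumption.
    corpus = [message.split() for message in messages]

    # sg=1 selects skip-gram (sg=0 would select CBOW); negative=5 enables negative
    # sampling with five noise words per positive pair.
    w2v = Word2Vec(corpus, vector_size=100, window=5, sg=1, negative=5, min_count=1)

    print(w2v.wv["ERROR"])  # 100-dimensional word vector, if "ERROR" occurs in the logs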
In the skip-gram model, the probabilities of the neighboring words are trained from a given word in a sentence (the central word w(t)) using a negative sampling model. The maximum distance from a neighboring word to the central word is called the window, as shown in FIG. 2.
In FIG. 2, w denotes a word and t denotes the position of the word w in a sentence. From the central word w(t), the probabilities of the words at positions t-2, t-1, t+1, and t+2 are computed by maximum likelihood estimation.
In the CBOW model, the opposite of the skip-gram model, the probability of the central word is trained from its neighboring words with a negative sampling model, as shown in FIG. 3.
Training the negative sampling model (taking the skip-gram model as an example): first assume that the probability of the central word w_c appearing in the same window as an in-window word w_o, and the probabilities of the central word not co-occurring with each out-of-window word w_k, are mutually independent; the probability calculation model is then formula (1):

P(w_o | w_c) = P(D=1 | w_o, w_c) · ∏_{k=1}^{K} P(D=0 | w_k, w_c) = σ(u_o · v_c) · ∏_{k=1}^{K} σ(−u_k · v_c)  (1)

where D is a flag for whether w_o and w_c lie within the same window (1 means they share a window, 0 means they do not), σ is the sigmoid function, u_o, u_k, and v_c are the vectors of w_o, w_k, and w_c, and K is the number of negative samples. That is, the probability of the in-window word w_o is computed from the central word w_c.
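A worked sketch of formula (1) under these definitions follows; NumPy and the variable names are illustrative assumptions.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def window_probability(v_c, u_o, negatives):
        """Formula (1): probability that w_o co-occurs with the central word w_c
        while each sampled out-of-window word w_k does not, assuming independence."""
        p = sigmoid(u_o @ v_c)        # P(D=1 | w_o, w_c)
        for u_k in negatives:         # K negative samples
            p *= sigmoid(-u_k @ v_c)  # P(D=0 | w_k, w_c)
        return p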
After the text data is input into the word2vec model, the model first counts the frequency of each word in the text, then computes a probability for each word according to formula (1), and finally outputs the vector representations of all words. For the word vectors obtained by word2vec training, the larger the cosine similarity between two different word vectors, the more similar the semantics of the two words.
The dimension of the word vectors may be set according to the amount of data and is typically set to 100. Using the obtained word vectors, the sentences in the log text are converted into sentence vectors, and the sentence vectors are input into the long short-term memory neural network model and trained to obtain the binary classification model.
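The text does not spell out how word vectors are composed into a sentence vector; averaging them, as in the sketch below, is one common choice and is assumed here.

    import numpy as np

    def sentence_vector(message, w2v, dim=100):
        """Average the word vectors of a log message; zero vector if no word is known."""
        vectors = [w2v.wv[word] for word in message.split() if word in w2v.wv]
        return np.mean(vectors, axis=0) if vectors else np.zeros(dim, dtype=np.float32)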
Training of the long short-term memory neural network model (as shown in FIG. 4): the sentence vectors are input into the long short-term memory neural network model sequentially in time order.
When the long short-term memory neural network model is trained, the vectors are input into the neural unit, and the knowledge information processed by the input gate, output gate, and forget gate is stored in the knowledge base C. The knowledge processed by the current neural unit is output to h_{t+1}, and this h_{t+1} is input into the next neural unit. The activation function of the forget gate is the sigmoid function, whose value range [0, 1] serves as the weight parameter: 0 represents complete forgetting, 1 represents complete remembering, and values in between represent partial forgetting. The data remaining after the current vector's forgetting is taken as the vector inner product of the weight and the knowledge base, realizing partial forgetting of old knowledge.
The input gate operates on the combination of the input sentence vector and the output vector of the previous neural unit. First, the memory weight is obtained from the sigmoid function of the current vector as a vector inner product. Second, the knowledge is obtained as the tanh value of the current vector. Third, the latest, partially forgotten knowledge is obtained as the vector inner product of the memory weight and the knowledge. Finally, the new knowledge is merged into the knowledge base. The output gate produces the current neural unit's output by taking the tanh value of the knowledge base vector and computing its inner product with the weight of the input gate.
As shown in FIG. 4, each neural unit yields an h_{t+1} vector. All h_{t+1} vectors are then input into an average pooling layer for further processing, as shown in FIG. 5.
The role of the average pooling layer in FIG. 5 is to pool all h_{t+1} vectors by averaging adjacent h_{t+1} vectors.
Finally, the vector from the average pooling layer is input into a regression classification layer, and the averaged vector is classified into two classes with a regression classification method.
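One plausible realization of the LSTM, average-pooling, and regression-classification layers is sketched below in PyTorch; the framework, the hidden size, and the optimizer are assumptions, as the text fixes only the LSTM, average pool, and two-class regression pipeline.

    import torch
    import torch.nn as nn

    class LogClassifier(nn.Module):
        """LSTM over a sequence of sentence vectors -> mean pooling -> 2-class logits."""
        def __init__(self, input_dim=100, hidden_dim=64):
            super().__init__()
            self.lstm = nn.LSTM(input_dim, hidden_dim, batch_first=True)
            self.classifier = nn.Linear(hidden_dim, 2)  # regression classification layer

        def forward(self, x):                # x: (batch, seq_len, input_dim)
            outputs, _ = self.lstm(x)        # the h_{t+1} output of every neural unit
            pooled = outputs.mean(dim=1)     # average-value pooling over time steps
            return self.classifier(pooled)   # logits for normal vs. abnormal

    model = LogClassifier()
    criterion = nn.CrossEntropyLoss()  # trained with labeled normal/abnormal sequences
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)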
(2) Model application stage.
At this stage, a new log file is preprocessed, word vectors are extracted, the sentences in the log file are converted into sentence vectors, and the sentence vectors are input into the trained long short-term memory neural network model; a new log file is one whose log timestamps fall after those of the training logs. The trained long short-term memory neural network model judges whether each input log is an abnormal log, and the logs judged abnormal are finally output to a designated file for the operations and maintenance staff to review.
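Continuing the sketches above, the application stage might look as follows; the checkpoint name, the label convention (class 1 = abnormal), and treating each message as a length-1 sequence are assumptions.

    model.load_state_dict(torch.load("lstm_log_classifier.pt"))  # hypothetical checkpoint
    model.eval()

    with open("new.log", encoding="utf-8") as f:  # placeholder path for the new log file
        new_messages = preprocess(f)

    abnormal = []
    with torch.no_grad():
        for message in new_messages:
            vec = torch.as_tensor(sentence_vector(message, w2v), dtype=torch.float32)
            logits = model(vec.view(1, 1, -1))    # one message as a length-1 sequence
            if logits.argmax(dim=1).item() == 1:  # class 1 = abnormal (assumed)
                abnormal.append(message)

    with open("abnormal.log", "w", encoding="utf-8") as out:  # designated output file
        out.write("\n".join(abnormal))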
Software in the operating environment of a distributed system generates a large amount of log data every day. Depending on the enterprise architecture, the log data generated by the software is stored at different locations in the distributed system. The log data on the log server records the running state of the software, including software error information, correct-operation information, and interaction information.
The application scenario of the method mainly comprises (1) a distributed system, (2) a log server, and (3) a workstation. The data generated by the system can be analyzed on the server storing the distributed data, and a corresponding software working model extracted. Based on this model, maintenance personnel judge whether log information subsequently output by a program contains errors: if the program's state log information is not covered by the model established for the program, the program behaved abnormally at runtime.
The abnormal log classification model is generated automatically on the basis of the long short-term memory neural network model: the method trains all log information into numeric vectors with the word2vec model, inputs these vectors into the long short-term memory neural network model, and obtains the binary classification model by training.
The invention converts the originally manual identification of abnormal logs into automatic identification, reducing the error risk caused by manual identification and the labor and time costs of manual log inspection. The beneficial effects of the key technical points are as follows:
(1) the process of manually searching a log set for abnormal logs is simplified, and automatic log classification is realized;
(2) a classification model of the logs generated during software operation is established from massive log data;
(3) the text file output is convenient for maintenance personnel to check and understand manually;
(4) errors in manually screening abnormal logs are reduced;
(5) the labor and time costs of frequently updating log features are reduced.
It will be understood that modifications and variations can be made by persons skilled in the art in light of the above teachings and all such modifications and variations are intended to be included within the scope of the invention as defined in the appended claims.

Claims (3)

1. A distributed abnormal log automatic identification method based on deep learning is characterized by comprising the following steps:
1) acquiring normal and abnormal log sets and preprocessing them: intercepting the timestamp of each log, sorting the log messages in the log file by their timestamp strings, and then filtering the timestamp strings out of each log;
2) based on the preprocessed log, training by using a word2vec model to obtain a word vector of each word in the log;
3) converting sentences in the log text into sentence vectors by using the obtained word vectors;
4) inputting the sentence vectors into a long short-term memory neural network model and training to obtain a binary classification model;
5) preprocessing a new log file, extracting word vectors, converting the sentences in the log file into sentence vectors, inputting the sentence vectors into the trained long short-term memory neural network model, and judging whether the input log is an abnormal log; the new log file is one whose log messages occur later than those of the training log files.
2. The deep-learning-based distributed abnormal log automatic identification method of claim 1, wherein in step 2) the word vector of each word in the log is obtained by word2vec model training, the word2vec training uses the skip-gram or CBOW word model, and negative sampling is adopted to obtain the word vectors.
3. The deep-learning-based distributed abnormal log automatic identification method of claim 1, wherein the training process of the long short-term memory neural network model in step 4) is as follows:
4.1) each neural unit's input vector X is a sentence vector, and the sentence vectors are input into the long short-term memory neural network model sequentially in time order;
4.2) after the input vector of each neural unit is processed by the forget gate, input gate, and output gate, the knowledge information is stored in a knowledge base C; the knowledge processed by the current neural unit is output to h_{t+1}, and meanwhile the h_{t+1} knowledge output by the previous neural unit is input into the next neural unit;
the activation function of the forget gate is a sigmoid function, and the data remaining after the current vector's forgetting is taken as the vector inner product of the weight and the knowledge base, realizing partial forgetting of old knowledge;
the input gate operates on the combination of the input sentence vector and the output vector of the previous neural unit, specifically: first, the memory weight is obtained from the sigmoid function of the current vector as a vector inner product; second, the knowledge is obtained as the tanh value of the current vector; third, the latest, partially forgotten knowledge is obtained as the vector inner product of the memory weight and the knowledge; finally, the new knowledge is merged into the knowledge base;
the output gate produces the current neural unit's output by taking the tanh value of the knowledge base vector and computing its inner product with the weight of the input gate;
4.3) each neural unit yields an h_{t+1} vector, and all h_{t+1} vectors are input into an average pooling layer;
4.4) the vector from the average pooling layer is input into a regression classification layer, and the averaged vector is classified with a regression classification method, yielding the binary classification model of the long short-term memory neural network.
CN202010333973.2A 2020-04-24 2020-04-24 Distributed abnormal log automatic identification method based on deep learning Pending CN111611218A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010333973.2A CN111611218A (en) 2020-04-24 2020-04-24 Distributed abnormal log automatic identification method based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010333973.2A CN111611218A (en) 2020-04-24 2020-04-24 Distributed abnormal log automatic identification method based on deep learning

Publications (1)

Publication Number Publication Date
CN111611218A (en) 2020-09-01

Family

ID=72194717

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010333973.2A Pending CN111611218A (en) 2020-04-24 2020-04-24 Distributed abnormal log automatic identification method based on deep learning

Country Status (1)

Country Link
CN (1) CN111611218A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112487406A (en) * 2020-12-02 2021-03-12 中国电子科技集团公司第三十研究所 Network behavior analysis method based on machine learning
CN112698977A (en) * 2020-12-29 2021-04-23 下一代互联网重大应用技术(北京)工程研究中心有限公司 Server fault positioning method, device, equipment and medium
CN112711665A (en) * 2021-01-18 2021-04-27 武汉大学 Log anomaly detection method based on density weighted integration rule
CN113239684A (en) * 2021-06-04 2021-08-10 清华大学 Method and device for automatically identifying abnormal log based on partial mark
CN113468035A (en) * 2021-07-15 2021-10-01 创新奇智(重庆)科技有限公司 Log anomaly detection method and device, training method and device and electronic equipment
US20230004750A1 (en) * 2021-06-30 2023-01-05 International Business Machines Corporation Abnormal log event detection and prediction
CN116069540A (en) * 2023-02-24 2023-05-05 北京关键科技股份有限公司 Acquisition, analysis and processing method and device for running state of software and hardware parts of system

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106156003A (en) * 2016-06-30 2016-11-23 北京大学 A kind of question sentence understanding method in question answering system
US20170068709A1 (en) * 2015-09-09 2017-03-09 International Business Machines Corporation Scalable and accurate mining of control flow from execution logs across distributed systems
CN106815639A (en) * 2016-12-27 2017-06-09 东软集团股份有限公司 The abnormal point detecting method and device of flow data
CN107239445A (en) * 2017-05-27 2017-10-10 中国矿业大学 The method and system that a kind of media event based on neutral net is extracted
US20180033144A1 (en) * 2016-09-21 2018-02-01 Realize, Inc. Anomaly detection in volumetric images
CN107885853A (en) * 2017-11-14 2018-04-06 同济大学 A kind of combined type file classification method based on deep learning
CN108399201A (en) * 2018-01-30 2018-08-14 武汉大学 A kind of Web user access path prediction technique based on Recognition with Recurrent Neural Network
US10049321B2 (en) * 2014-04-04 2018-08-14 Knowmtech, Llc Anti-hebbian and hebbian computing with thermodynamic RAM
CN108763542A (en) * 2018-05-31 2018-11-06 中国华戎科技集团有限公司 A kind of Text Intelligence sorting technique, device and computer equipment based on combination learning
US10416264B2 (en) * 2016-11-22 2019-09-17 Hyperfine Research, Inc. Systems and methods for automated detection in magnetic resonance images
CN110427298A (en) * 2019-07-10 2019-11-08 武汉大学 A kind of Automatic Feature Extraction method of distributed information log
CN110502389A (en) * 2019-07-01 2019-11-26 无锡天脉聚源传媒科技有限公司 A kind of server exception monitoring method, system, device and storage medium

Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10049321B2 (en) * 2014-04-04 2018-08-14 Knowmtech, Llc Anti-hebbian and hebbian computing with thermodynamic RAM
US20170068709A1 (en) * 2015-09-09 2017-03-09 International Business Machines Corporation Scalable and accurate mining of control flow from execution logs across distributed systems
CN106156003A (en) * 2016-06-30 2016-11-23 北京大学 A kind of question sentence understanding method in question answering system
US20180033144A1 (en) * 2016-09-21 2018-02-01 Realize, Inc. Anomaly detection in volumetric images
US10416264B2 (en) * 2016-11-22 2019-09-17 Hyperfine Research, Inc. Systems and methods for automated detection in magnetic resonance images
CN106815639A (en) * 2016-12-27 2017-06-09 东软集团股份有限公司 The abnormal point detecting method and device of flow data
CN107239445A (en) * 2017-05-27 2017-10-10 中国矿业大学 The method and system that a kind of media event based on neutral net is extracted
CN107885853A (en) * 2017-11-14 2018-04-06 同济大学 A kind of combined type file classification method based on deep learning
CN108399201A (en) * 2018-01-30 2018-08-14 武汉大学 A kind of Web user access path prediction technique based on Recognition with Recurrent Neural Network
CN108763542A (en) * 2018-05-31 2018-11-06 中国华戎科技集团有限公司 A kind of Text Intelligence sorting technique, device and computer equipment based on combination learning
CN110502389A (en) * 2019-07-01 2019-11-26 无锡天脉聚源传媒科技有限公司 A kind of server exception monitoring method, system, device and storage medium
CN110427298A (en) * 2019-07-10 2019-11-08 武汉大学 A kind of Automatic Feature Extraction method of distributed information log

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Min Du et al., "DeepLog: Anomaly Detection and Diagnosis from System Logs", Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security. *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112487406A (en) * 2020-12-02 2021-03-12 中国电子科技集团公司第三十研究所 Network behavior analysis method based on machine learning
CN112698977A (en) * 2020-12-29 2021-04-23 下一代互联网重大应用技术(北京)工程研究中心有限公司 Server fault positioning method, device, equipment and medium
CN112698977B (en) * 2020-12-29 2024-03-29 赛尔网络有限公司 Method, device, equipment and medium for positioning server fault
CN112711665A (en) * 2021-01-18 2021-04-27 武汉大学 Log anomaly detection method based on density weighted integration rule
CN112711665B (en) * 2021-01-18 2022-04-15 武汉大学 Log anomaly detection method based on density weighted integration rule
CN113239684A (en) * 2021-06-04 2021-08-10 清华大学 Method and device for automatically identifying abnormal log based on partial mark
US20230004750A1 (en) * 2021-06-30 2023-01-05 International Business Machines Corporation Abnormal log event detection and prediction
CN113468035A (en) * 2021-07-15 2021-10-01 创新奇智(重庆)科技有限公司 Log anomaly detection method and device, training method and device and electronic equipment
CN113468035B (en) * 2021-07-15 2023-09-29 创新奇智(重庆)科技有限公司 Log abnormality detection method, device, training method, device and electronic equipment
CN116069540A (en) * 2023-02-24 2023-05-05 北京关键科技股份有限公司 Acquisition, analysis and processing method and device for running state of software and hardware parts of system

Similar Documents

Publication Publication Date Title
CN111611218A (en) Distributed abnormal log automatic identification method based on deep learning
US20220405592A1 (en) Multi-feature log anomaly detection method and system based on log full semantics
CN108256074B (en) Verification processing method and device, electronic equipment and storage medium
CN111309912A (en) Text classification method and device, computer equipment and storage medium
CN112270379A (en) Training method of classification model, sample classification method, device and equipment
CN111860981B (en) Enterprise national industry category prediction method and system based on LSTM deep learning
CN112036185B (en) Method and device for constructing named entity recognition model based on industrial enterprise
CN113468317B (en) Resume screening method, system, equipment and storage medium
CN113986864A (en) Log data processing method and device, electronic equipment and storage medium
US20210004603A1 (en) Method and apparatus for determining (raw) video materials for news
CN112579414A (en) Log abnormity detection method and device
CN115359799A (en) Speech recognition method, training method, device, electronic equipment and storage medium
CN113672732A (en) Method and device for classifying business data
CN113806198B (en) System state diagnosis method based on deep learning
CN115170027A (en) Data analysis method, device, equipment and storage medium
CN112579777B (en) Semi-supervised classification method for unlabeled text
CN112417852B (en) Method and device for judging importance of code segment
CN110866172B (en) Data analysis method for block chain system
CN115758211B (en) Text information classification method, apparatus, electronic device and storage medium
CN113407716B (en) Human behavior text data set construction and processing method based on crowdsourcing
CN114610613A (en) Online real-time micro-service call chain abnormity detection method
CN113900935A (en) Automatic defect identification method and device, computer equipment and storage medium
CN112698977B (en) Method, device, equipment and medium for positioning server fault
CN113872794B (en) IT operation and maintenance platform system based on cloud resource support and operation and maintenance method thereof
CN118093785A (en) Distributed collaboration-oriented avionic fault knowledge fusion method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
Application publication date: 20200901