CN115277180B - Block chain log anomaly detection and tracing system - Google Patents

Info

Publication number
CN115277180B
Authority
CN
China
Prior art keywords
log
model
template
sequence
time sequence
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210882913.5A
Other languages
Chinese (zh)
Other versions
CN115277180A (en)
Inventor
牛伟纳
张小松
廖旭涵
赵丽睿
周孝笑
朱宇坤
张然
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00: Network architectures or network communication protocols for network security
    • H04L63/14: Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408: Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425: Traffic logging, e.g. anomaly detection
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06: Management of faults, events, alarms or notifications
    • H04L41/069: Management of faults, events, alarms or notifications using logs of notifications; Post-processing of notifications
    • H04L41/14: Network analysis or design
    • H04L41/142: Network analysis or design using statistical or mathematical methods
    • H04L2463/00: Additional details relating to network architectures or network communication protocols for network security covered by H04L63/00
    • H04L2463/146: Tracing the source of attacks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00: Machine learning
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Mathematical Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Artificial Intelligence (AREA)
  • Algebra (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computer Hardware Design (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention relates to the field of blockchain applications and provides a blockchain log anomaly detection and tracing system. It addresses the lack of data anomaly detection in current blockchain architectures and performs data detection safely, reliably, and with high accuracy. The system extracts templates from the data logs and counts their quantity features. It trains models on feature representations of the logs, where the features are divided into quantity features and time-sequence features. A log sequence to be detected is first processed by the data processing module and then fed to the models trained by the quantity and time-sequence model training module. Each model outputs a value between 0 and 1; these are recorded as the time-sequence deviation degree and the quantity deviation degree, and a final comprehensive deviation degree is calculated from both. Logs whose deviation exceeds the threshold are written to a table and given a threat mark, and the log sequence to which each threat log belongs is given as the tracing output. If a false alarm is found during audit, marking it lets the system adjust the threshold dynamically, which improves accuracy.

Description

Block chain log anomaly detection and tracing system
Technical Field
The invention belongs to the field of block chain data security, and provides a block chain log anomaly detection and tracing system.
Background
Blockchain technology is one of the most popular technologies today and is widely used in finance, supply chains, and many other scenarios. It is generally implemented in one of three forms: public chains, alliance chains (also called consortium chains), and private chains. Early blockchain applications were mainly public chains: anyone can take part in supervision, so the authenticity of on-chain information is strongest, but the very large number of participants makes operation inefficient. An enterprise deploying a blockchain at small scale typically chooses a private chain, which has few participants but is so centralized that it generally suits only single-center industries. The alliance chain combines the advantages of both and is the form chosen by most applications today. It is supervised by several principal participating organizations, each of which independently controls which individuals it authorizes to join the blockchain network; once registered, those individuals take part in supervision as members of that organization. On-chain information is transparent to all participants, and operations such as adding data are supervised collectively, which makes the chain traceable and tamper-resistant. At present, blockchain applications in China and abroad mainly aim to guarantee that auditable data is tamper-proof, trustworthy, complete, and traceable.
Logs are the most representative kind of auditable data. They record runtime information such as the parameters of a running system, and by auditing them, periodically or when abnormal behavior occurs, developers can discover, locate, and fix problems in time. Existing logging systems have shortcomings, however. If the system is deliberately attacked, the attacker can tamper with the log records, so developers cannot locate errors from the falsified records, and repairing the system becomes harder. In addition, the widely used approach to log anomaly detection is for developers to rely on their domain knowledge and search by keyword or regular expression, taking log severity levels into account. This approach depends heavily on manual work and becomes harder as systems grow larger and more complex.
A log is time-ordered text data consisting of a timestamp and a text message, and it records the running state of a service in real time. Logs also exhibit quantitative correspondences: for example, every file that is opened should later be closed. If logs appear in the wrong order or such correspondences do not hold, an anomaly may have occurred. However, log specifications are currently not uniform, different types of devices print different log formats, and log data is unstructured, so logs are hard to process automatically in batches. These problems make log analysis very difficult.
A log parser based on a fixed-depth tree first preprocesses the raw log message with simple regular expressions derived from domain knowledge, then searches for a log group by following specially designed rules encoded in the internal nodes of the tree. If a matching log group is found, the log message is matched to the log event stored in that group; otherwise a new log group is created. In the end every log belongs to some log group, which amounts to classifying the logs, and the pattern common to logs of the same form is extracted as the template.
Disclosure of Invention
The invention discloses an automated blockchain log anomaly detection and tracing scheme based on an alliance chain. Conventional blockchain applications do not perform automated anomaly detection on alliance-chain data; detection is done manually, based on experience. As the variety of data on the alliance chain grows, the data becomes increasingly complex and relying on manual detection alone is not feasible. A sound, high-accuracy log anomaly detection technique is therefore needed to improve the system's degree of automation. The log anomaly detection and tracing scheme addresses the lack of data anomaly detection in current blockchain architectures and performs data detection safely, reliably, and with high accuracy.
To solve the above technical problems, the invention adopts the following technical scheme:
A blockchain log anomaly detection and tracing system, comprising:
A data processing module: extracts templates from the data logs, where a log template consists of a constant part and variables; structures the unstructured log data into template logs that are easy to analyze; and counts the quantity features of each template, namely the number of occurrences of each word in the template and the number of occurrences of each word combination;
A quantity and time-sequence model training module: trains models on feature representations of the logs. The features are divided into quantity features and time-sequence features: the quantity features are the word and word-combination occurrence counts in the template, and the time-sequence feature is the order of the logs;
A deviation degree calculation module: a log sequence to be detected is first processed by the data processing module and then fed to the models trained by the quantity and time-sequence model training module. Each model outputs a value between 0 and 1; these are recorded as the time-sequence deviation degree and the quantity deviation degree, and the final deviation degree is calculated from both;
An anomaly tracing module: writes logs whose deviation exceeds the threshold to a table, gives them threat marks, and outputs the log sequence to which each threat log belongs as the tracing result. If a false alarm is found during audit, marking it lets the system adjust the threshold dynamically, which improves accuracy.
In the above technical scheme, the data processing module uses the Drain log template extractor together with multidimensional feature combination to output statistical features, specifically:
1) The Drain log template extractor extracts templates from the existing logs of the blockchain network;
2) For each template extracted by Drain, the number of occurrences of each word and of each word combination in the template is counted;
3) When a node uploads a log to the blockchain network, Drain assigns the log to the corresponding template and the quantity features are counted.
In the above technical scheme, the quantity and time-sequence model training module:
obtains the time-sequence features and the quantity features of a log sequence separately;
trains an attention-based GRU model on the time-sequence features to obtain the time-sequence model;
trains a gradient boosting decision tree on the quantity features to obtain the quantity model;
saves the time-sequence model and the quantity model with the highest accuracy reached during training.
Training the attention-based GRU model comprises the following steps:
A. Logs are text data and the extracted templates are text as well, so a semantic conversion is needed before the text is fed to the model: the input log template text is converted into a log template vector using semantic vectors pre-trained with GloVe. A program cannot process raw text directly but can process numeric vectors. The GloVe vocabulary maps each word to a numeric vector, so words are converted by table lookup.
B. Using a sliding window, batches of log template vectors are converted into log template sequence vectors;
C. The log template sequence vectors are fed to the model so that it learns the time-sequence features;
D. The training result is saved as the time-sequence model.
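Steps A and B can be illustrated with a minimal sketch. It assumes a toy two-dimensional vocabulary in place of real 300-dimensional GloVe vectors, builds a template vector by summing word vectors, and slices a batch of template vectors into overlapping windows. The function names, the toy vocabulary, and the window size are hypothetical and are not taken from the patent.

```python
def template_vector(words, vocab, dim):
    """Sum the vocabulary vectors of the words in one log template.

    Words missing from the vocabulary contribute a zero vector (an assumption).
    """
    vec = [0.0] * dim
    for word in words:
        for i, value in enumerate(vocab.get(word, [0.0] * dim)):
            vec[i] += value
    return vec


def sliding_windows(vectors, window):
    """Turn a list of template vectors into overlapping sequences of length `window`."""
    return [vectors[i:i + window] for i in range(len(vectors) - window + 1)]


# Toy stand-in for a GloVe lookup table (real vectors would have 300 dimensions).
toy_vocab = {"open": [1.0, 0.0], "file": [0.0, 1.0], "close": [1.0, 1.0]}

templates = [["open", "file"], ["close", "file"], ["open", "file"], ["close", "file"]]
vectors = [template_vector(t, toy_vocab, dim=2) for t in templates]
sequences = sliding_windows(vectors, window=3)
```

Each element of `sequences` is one input sample for the time-sequence model; with four templates and a window of three, two samples result.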
In the above technical scheme, the anomaly tracing module works as follows:
1) A threshold is set and the deviation value is compared against it; a deviation value exceeding the threshold is marked as abnormal (labeled 1);
2) Data that was flagged in error can be marked as a false alarm (labeled 0). Whether to adjust the threshold is decided by checking whether, within a certain period, very few anomalies have deviation values below that of the false alarm; if so, the threshold is too low and needs to be raised;
3) The log data to be detected is processed by the data processing module and then fed to the models trained by the quantity and time-sequence model training module to obtain a deviation value;
4) Whether the log is marked as abnormal is decided according to the threshold;
5) If it is abnormal, the log sequence related to the anomaly is output as the tracing result.
In the above technical scheme, the tracing output process for an anomaly is:
1) The one-to-one correspondence between the original text of each log under test and its vector is cached in memory;
2) If the log is marked as abnormal, the original text of the log vector and the related log sequence information are obtained by table lookup;
3) Otherwise, the cache is cleared and the next log is examined.
By adopting the above technical scheme, the invention has the following beneficial effects:
1. Log data is stored on the alliance chain, which ensures the credibility and reliability of log data audits.
2. By combining the strengths of deep learning and machine learning, log anomalies are detected automatically from the time-sequence and quantity features of the logs.
3. The attention mechanism applied to the time-sequence features, the screening of quantity features, and the final weighted calculation of the comprehensive deviation degree effectively remove irrelevant features and influences and improve the accuracy of the results.
Drawings
FIG. 1 shows the basic flow of blockchain log anomaly detection and tracing;
FIG. 2 shows the log anomaly detection process;
FIG. 3 shows the architecture of the blockchain log anomaly detection and tracing system.
Detailed Description
The present invention will be described in further detail with reference to the following examples in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the particular embodiments described herein are illustrative only and are not intended to limit the invention, i.e., the embodiments described are merely some, but not all, of the embodiments of the invention.
The detailed description of the embodiments of the invention is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be made by a person skilled in the art without making any inventive effort, are intended to be within the scope of the present invention.
It is noted that relational terms such as "first" and "second", and the like, are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The blockchain log anomaly detection and tracing system is implemented by the following modules:
Data processing module: extracts log templates from the unstructured logs and counts the number of occurrences of the fixed words in each template and of combinations of those fixed words.
Quantity and time-sequence model training module: converts the log sequence into numeric vectors and feeds them to the time-sequence model for training. At the same time, the quantity features counted by the data processing module are fed to the quantity model for training, and the training results are saved.
Deviation degree calculation module: feeds the log sequence under test to the models trained by the quantity and time-sequence model training module, calculates the time-sequence deviation degree and the quantity deviation degree separately, and then calculates the comprehensive deviation degree.
Anomaly tracing module: marks a log as abnormal or not according to its deviation value and, if abnormal, outputs the related log sequence as the tracing result; an operator can adjust the deviation threshold dynamically through feedback.
The main flow across the four modules comprises the following steps:
A. The data processing module extracts log templates from the large volume of unstructured log data on the blockchain network and counts, for each template, the number of occurrences of the fixed words and of the fixed-word combinations.
B. The log text sequence is converted into numeric vectors through the GloVe vocabulary and cached temporarily. The sequences carrying time-sequence information are fed to the time-sequence model, the quantity features counted for the corresponding templates are fed to the quantity model, the two models are trained separately, and the training result with the highest accuracy is saved.
C. The log sequence under test is processed by the data processing module and then fed to the trained models of the quantity and time-sequence model training module. The time-sequence deviation degree t and the quantity deviation degree n are calculated separately, weights w1 and w2 are assigned to their influence on the final deviation, and the comprehensive deviation degree is calculated as y = w1·t + w2·n.
D. Whether the log is marked as abnormal is decided by whether the comprehensive deviation degree y exceeds the set threshold m. If it does, the abnormal log sequence is output using the log association information cached in memory; otherwise the cache is cleared.
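Steps C and D reduce to a weighted sum followed by a threshold test. The sketch below shows that arithmetic only. The patent does not state the weight values or any constraint on them; the defaults and the requirement that the weights be non-negative and sum to 1 (which keeps y between 0 and 1) are assumptions, as are the function names.

```python
def combined_deviation(t, n, w1=0.5, w2=0.5):
    """Weighted sum y = w1*t + w2*n of the two model outputs.

    t and n are the time-sequence and quantity deviation degrees, each in
    [0, 1]. The default weights are placeholders; requiring them to be
    non-negative and to sum to 1 is an assumption that keeps y in [0, 1].
    """
    for name, value in (("t", t), ("n", n)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    if w1 < 0 or w2 < 0 or abs(w1 + w2 - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    return w1 * t + w2 * n


def is_abnormal(y, m):
    """Step D: flag the log when y exceeds the threshold m."""
    return y > m


y = combined_deviation(t=1.0, n=0.5, w1=0.5, w2=0.5)
flag = is_abnormal(y, m=0.6)
```

With these illustrative inputs the two deviation degrees are averaged and the result is compared with the threshold m; in practice the weights would be tuned on labeled data.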
Further, in the processing of step B, the log templates are parsed first, using the Drain log parser. Common variable information is first replaced with masks; the logs are then classified and aggregated by log length and prefix similarity, which finally yields the log templates.
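The masking and length-grouping stage can be sketched as follows. The patent only says that common variable information is replaced with a mask, so the two regular expressions, the placeholder names, and the function names here are hypothetical choices made for illustration.

```python
import re

# Hypothetical mask rules: block identifiers and IPv4 address:port pairs.
MASK_RULES = [
    (re.compile(r"blk_-?\d+"), "[ID]"),
    (re.compile(r"\d{1,3}(?:\.\d{1,3}){3}:\d+"), "[IPANDPORT]"),
]


def mask_variables(line):
    """Replace obvious variable fields with placeholder tokens."""
    for pattern, token in MASK_RULES:
        line = pattern.sub(token, line)
    return line


def group_by_length(lines):
    """Bucket masked logs by their number of whitespace-separated tokens."""
    buckets = {}
    for line in lines:
        tokens = mask_variables(line).split()
        buckets.setdefault(len(tokens), []).append(tokens)
    return buckets


raw = "Receiving block blk_-354458 src:/10.250.19.102:39325 dest:/10.250.19.102:50010"
masked = mask_variables(raw)
buckets = group_by_length([raw, "Deleting block blk_-99 file /tmp/x"])
```

Both sample lines have five tokens after masking, so they land in the same length bucket and would be separated only by the later prefix and similarity checks.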
Specifically, our system combines both the time-sequence features and the quantity features of the log sequence. The time-sequence feature is the order in which logs are executed, for example that a file is created, then written, and then deleted. The quantity feature captures numeric correspondences, for example that a file opened a certain number of times is closed the same number of times.
On this basis, we use classical, high-accuracy deep learning and machine learning methods:
1) For time-sequence training, a GRU (Gated Recurrent Unit) algorithm with an attention mechanism is used. It can focus on particular parts of the input sequence according to their influence, ignore irrelevant sequence noise, and so train a good time-sequence model.
2) For quantity training, a gradient boosting decision tree is used to screen the one-dimensional and two-dimensional quantity features, removing irrelevant features and finally yielding a quantity model that is effective for different anomalies.
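The patent names an attention-based GRU but does not give its equations. For reference only, one common convention for the GRU update and a simple additive attention pooling over a window of length L are written out below. The attention form, the sigmoid output layer, and all parameter symbols (W, U, b, v, w_o, b_o) are assumptions rather than details from the source; the time step is indexed by k to avoid a clash with the deviation degree t.

```latex
\begin{aligned}
z_k &= \sigma(W_z x_k + U_z h_{k-1} + b_z) && \text{(update gate)}\\
r_k &= \sigma(W_r x_k + U_r h_{k-1} + b_r) && \text{(reset gate)}\\
\tilde{h}_k &= \tanh\bigl(W_h x_k + U_h (r_k \odot h_{k-1}) + b_h\bigr) && \text{(candidate state)}\\
h_k &= (1 - z_k) \odot h_{k-1} + z_k \odot \tilde{h}_k && \text{(new state)}\\[4pt]
e_k &= v^{\top} \tanh(W_a h_k + b_a), \qquad
\alpha_k = \frac{\exp(e_k)}{\sum_{j=1}^{L} \exp(e_j)} && \text{(attention weights)}\\
c &= \sum_{k=1}^{L} \alpha_k h_k, \qquad
t = \sigma(w_o^{\top} c + b_o) \in (0, 1) && \text{(time-sequence deviation degree)}
\end{aligned}
```

Here x_k is the k-th log template vector in the window. Positions with small attention weight contribute little to c, which is one way to realize the "ignore irrelevant sequence noise" behavior described above.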
In step D, the threshold is initially set manually to a low value so that as much anomaly information as possible is captured. The threshold is then adjusted dynamically from personnel feedback by observing the false alarm rate: among the logs whose deviation values are greater than or equal to the threshold and less than or equal to the deviation value of a reported false alarm, the number of false alarms is compared with the number of true anomalies. If the false alarms far outnumber the anomalies, the threshold is updated to the deviation value of that false alarm.
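A minimal sketch of this feedback rule follows. The patent does not quantify "far more", so the `ratio` and `min_false_alarms` parameters are hypothetical tuning knobs, and the function name and the shape of the review records are assumptions as well.

```python
def adjust_threshold(threshold, false_alarm_dev, reviewed, ratio=3.0, min_false_alarms=3):
    """Raise the threshold to a false alarm's deviation value when justified.

    reviewed: list of (deviation, label) pairs from auditors, where label 1
    means a confirmed anomaly and label 0 means a false alarm.
    Only records with threshold <= deviation <= false_alarm_dev are counted.
    """
    band = [label for dev, label in reviewed if threshold <= dev <= false_alarm_dev]
    false_alarms = band.count(0)
    anomalies = band.count(1)
    # "Far more" is modeled as: enough false alarms, and more than
    # `ratio` times the number of confirmed anomalies (both assumptions).
    if false_alarms >= min_false_alarms and false_alarms > ratio * anomalies:
        return false_alarm_dev
    return threshold


reviewed = [(0.32, 0), (0.35, 0), (0.38, 0), (0.40, 0), (0.36, 1), (0.90, 1)]
raised = adjust_threshold(0.30, 0.40, reviewed, ratio=3.0)
kept = adjust_threshold(0.30, 0.40, reviewed, ratio=5.0)
```

In the sample data four false alarms and one confirmed anomaly fall inside the band, so the threshold moves up to the false alarm's deviation value under the looser ratio and stays put under the stricter one.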
The technical scheme of the invention is further described as follows:
1. Extraction of log templates
Log specifications are currently not uniform, different types of devices print different log formats, and log data is unstructured. To address this and make the logs easier for personnel to analyze, a log parser is used to extract logs into log templates, which makes the categories of the processed logs clearer and reduces the processing difficulty. The main steps of the log parser are as follows.
1. Preprocessing: obvious variable parts are replaced with masks using regular expressions.
2. Classification by log length: logs are classified by the number of tokens in the original log.
3. Classification by tree depth: logs are further classified down to a preset depth, usually set to 4 and fine-tuned for the actual scenario; the depth affects both the number of nodes traversed during search and the accuracy.
4. Log group classification: after the classification above, the similarity

$$\mathrm{simSeq} = \frac{\sum_{i=1}^{n} \mathrm{equ}\bigl(seq_1(i),\, seq_2(i)\bigr)}{n}, \qquad \mathrm{equ}(t_1, t_2) = \begin{cases} 1 & \text{if } t_1 = t_2 \\ 0 & \text{otherwise} \end{cases}$$

is computed, where $seq_1(i)$ denotes the i-th token of log sequence 1 and $seq_2(i)$ denotes the i-th token of log sequence 2. The similarity determines whether the sequence belongs to an existing class; if not, a new class is added. Here $t_1$ and $t_2$ are the tokens at the same position in the two sequences, and n is the length of the longer of the two sequences.
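The similarity test in step 4 can be sketched as below. The similarity threshold `st`, the `[*]` wildcard used when a template is generalized, and the function names are hypothetical; the patent gives no threshold value, and the sketch compares only templates of equal length on the assumption that the length classification of step 2 has already been applied.

```python
def sim_seq(seq1, seq2):
    """Fraction of positions at which two token sequences agree."""
    n = max(len(seq1), len(seq2))
    if n == 0:
        return 1.0
    matches = sum(1 for t1, t2 in zip(seq1, seq2) if t1 == t2)
    return matches / n


def assign_group(tokens, groups, st=0.5):
    """Return the index of the best-matching group, adding a new one if needed.

    groups: list of template token lists, modified in place.
    st: hypothetical similarity threshold for joining an existing group.
    """
    best_idx, best_sim = -1, -1.0
    for idx, template in enumerate(groups):
        if len(template) != len(tokens):
            continue  # logs of different length never share a group here
        sim = sim_seq(tokens, template)
        if sim > best_sim:
            best_idx, best_sim = idx, sim
    if best_idx >= 0 and best_sim >= st:
        # Positions that differ become wildcards in the shared template.
        groups[best_idx] = [a if a == b else "[*]"
                            for a, b in zip(groups[best_idx], tokens)]
        return best_idx
    groups.append(list(tokens))
    return len(groups) - 1


groups = []
first = assign_group(["open", "file", "a.txt"], groups)
second = assign_group(["open", "file", "b.txt"], groups)
third = assign_group(["close", "socket", "now"], groups)
```

The first two logs agree in two of three positions and merge into one template with a wildcard in the last position, while the third shares no tokens and starts a new group.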
2. Lifetime of log data:
The purpose of log anomaly detection is to trace anomalous logs to their source, locate threats, and investigate them in time. During anomaly detection, however, logs are converted into numeric vectors, which are not human-readable, and because the number of logs processed is large, not all log sequence relations can be kept. How the caching period is set and which content is cached are therefore of great importance for log anomaly detection. The caching process is as follows:
Step one: in the data processing module, each original log is given a sequence number and a cache table is created whose entries have the form log sequence number - log template. In later intermediate steps the sequence number stands in for the original log, which improves time and space efficiency.
Step two: in the quantity and time-sequence model training module, the original log text is converted through the GloVe vocabulary into a numeric vector, and a cache table is created whose entries have the form log sequence number - numeric vector.
Step three: in the deviation degree calculation module, the deviation degree of the log under test is calculated and a cache table entry of the form log sequence number - comprehensive deviation degree is created.
Step four: in the anomaly tracing module, if the deviation degree exceeds the threshold, the log sequence is returned according to the log sequence number and the window size originally set by the system, and the cache tables are cleared.
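The four caching steps can be sketched with plain dictionaries keyed by log sequence number. The class and method names are hypothetical, and two details are assumptions: the raw text is kept alongside the template so that readable logs can be returned, and the traced sequence is taken to be the window of logs ending at the flagged one.

```python
class TraceCache:
    """In-memory tables keyed by log sequence number (illustrative only)."""

    def __init__(self, window):
        self.window = window      # window size set by the system
        self.raw = {}             # sequence number -> original log text
        self.templates = {}       # sequence number -> log template
        self.vectors = {}         # sequence number -> numeric vector
        self.deviations = {}      # sequence number -> comprehensive deviation

    def add(self, seq_no, raw_log, template, vector):
        self.raw[seq_no] = raw_log
        self.templates[seq_no] = template
        self.vectors[seq_no] = vector

    def set_deviation(self, seq_no, y):
        self.deviations[seq_no] = y

    def trace(self, seq_no, threshold):
        """Return the window of raw logs ending at seq_no if it is abnormal.

        The cache is cleared in both cases; None is returned for normal logs.
        """
        traced = None
        if self.deviations[seq_no] > threshold:
            first = seq_no - self.window + 1
            traced = [self.raw[s] for s in range(first, seq_no + 1) if s in self.raw]
        self.clear()
        return traced

    def clear(self):
        for table in (self.raw, self.templates, self.vectors, self.deviations):
            table.clear()


cache = TraceCache(window=3)
for i, line in enumerate(["open a", "write a", "delete a", "delete a"], start=1):
    cache.add(i, line, template=line.split()[0] + " [*]", vector=[float(i)])
cache.set_deviation(4, 0.9)
result = cache.trace(4, threshold=0.5)
```

Because only sequence numbers travel between modules, the readable text is looked up once, at tracing time, and then discarded.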
3. Formats of log data
Four basic data formats are used in log anomaly processing:
1) Raw log data: Receiving block blk_-354458 src:/10.250.19.102:39325 dest:/10.250.19.102:50010.
2) Log template: Receiving block [ID] src: [*] dest:/[IPANDPORT].
3) GloVe vocabulary: Receiving: [300-dimensional numeric vector], block: [300-dimensional numeric vector], src: [300-dimensional numeric vector], dest: [300-dimensional numeric vector]. The vector for a log line is formed by adding the vectors of its words.
4) Quantity features: Receiving: 1, block: 1, src: 1, dest: 1, Receiving-block: 1, Receiving-src: 1, …
These are expressed as the vector [1, 1, 1, …] and normalized by dividing by n, the sum of all occurrence counts, which gives the vector [1/n, 1/n, 1/n, …]. Here Receiving-block is the combination of the words Receiving and block, and Receiving-src is the combination of the words Receiving and src.
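The quantity features of format 4 can be computed with a short sketch. It strips the bracketed placeholders from a template, counts each remaining fixed word, counts each unordered pair of distinct fixed words once, and normalizes by the total. How word combinations are counted is not specified in the patent, so the once-per-pair rule and the function names are assumptions.

```python
import re
from collections import Counter
from itertools import combinations


def fixed_words(template):
    """Return the fixed (non-placeholder) words of a log template, in order."""
    without_placeholders = re.sub(r"\[[^\]]*\]", " ", template)
    return re.findall(r"[A-Za-z]+", without_placeholders)


def quantity_features(template):
    """Normalized counts of single words and of word pairs in one template."""
    counts = Counter(fixed_words(template))
    distinct = list(counts)  # distinct words in order of first appearance
    for w1, w2 in combinations(distinct, 2):
        counts[f"{w1}-{w2}"] += 1  # assumption: each pair is counted once
    total = sum(counts.values())
    return {name: value / total for name, value in counts.items()}


features = quantity_features("Receiving block [ID] src: [*] dest:/[IPANDPORT]")
```

For this template there are four fixed words and six pairs, so n is 10 and every entry of the normalized vector is 1/10.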
Examples
The specific data execution process is as follows:
Step one: the original log Receiving block blk_-354458 src:/10.250.19.102:39325 dest:/10.250.19.102:50010 is input to the data processing module and classified, giving the log template Receiving block [ID] src: [*] dest:/[IPANDPORT]. The quantity features Receiving: 1, block: 1, src: 1, dest: 1, Receiving-block: 1, Receiving-src: 1, … are counted and normalized to give the vector [1/n, 1/n, 1/n, 1/n, 1/n, 1/n, …].
Step two: the original log Receiving block blk_-354458 src:/10.250.19.102:39325 dest:/10.250.19.102:50010 is converted by table lookup (GloVe vocabulary) into a 300-dimensional numeric vector, and a log sequence is obtained. The log sequence is input to the time-sequence model and the quantity vector obtained in step one is input to the quantity model, giving a time-sequence deviation degree (0 to 1) and a quantity deviation degree (0 to 1) respectively, from which the comprehensive deviation degree (0 to 1) is then calculated.
During the training phase the model fits a very complex function from the input log template vectors and their labels, for example 0 for a normal log and 1 for an abnormal one. After training, inputting a log template vector amounts to evaluating the trained function, whose output is a value between 0 and 1.
Step three: whether the comprehensive deviation degree reaches the threshold is checked; if so, the log is marked as abnormal and the other original logs related to the input log instance are output.

Claims (4)

1. The utility model provides a block chain log anomaly detection and traceability system which characterized in that:
and a data processing module: extracting a template from the data log, wherein the log template comprises a quantitative part and a variable, unstructured log data is structured into a template log which is easy to analyze, and according to the template, the number characteristics are counted, wherein the number characteristics are the number of word occurrences in the template and the number of word occurrences in combination;
the quantity time sequence model training module: training a model through characteristic representation of a log, wherein the characteristic is divided into a number characteristic and a time sequence characteristic, the number characteristic is the number of times of word occurrence and the number of times of word combination occurrence in the template, and the time sequence characteristic is the sequence of the log;
and the deviation degree calculating module is used for: for a log sequence to be detected, firstly, processing data through a data processing module, combining a model trained by a quantity time sequence model training module after the data processing, outputting a numerical value of 0-1 by the model, respectively recording the numerical value as a time sequence model deviation degree and a quantity deviation degree, and comprehensively calculating the final deviation degree;
an anomaly tracing module: writing the logs exceeding the deviation threshold value into a table, giving threat marks, giving a log sequence to which the threat logs belong as tracing output, and if abnormal false alarms are found during audit, marking the abnormal false alarms to enable a system to dynamically adjust the threshold value, so that the accuracy is increased;
wherein, in the quantity time-sequence model training module:
the time-sequence features and the quantity features of a log sequence are acquired separately;
the time-sequence features are used to train an attention-based GRU model to obtain a time-sequence model;
the quantity features are used to train a gradient boosting decision tree to obtain a quantity model;
the time-sequence model and the quantity model with the highest precision during training are saved;
and wherein training the attention-based GRU model comprises the following steps:
A. since the logs are text data and the extracted templates are likewise text templates, semantic conversion is required before the text is input into the model, and the input log template text is converted into log template vectors using semantic vectors trained with GloVe;
B. batches of log template vectors are converted into log template sequence vectors using a sliding window;
C. the log template sequence vectors are input into the model so that the model learns the time-sequence features;
D. the training result is saved to obtain the time-sequence model.
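Step B of claim 1 can be illustrated with a minimal Python sketch. The claim does not fix the window length or stride, so the `window` and `stride` parameters, the function name `sliding_windows`, and the two-dimensional sample vectors below are all assumptions made for illustration; real GloVe vectors would have far more dimensions.

```python
from typing import List, Sequence

Vector = List[float]


def sliding_windows(vectors: Sequence[Vector], window: int, stride: int = 1) -> List[List[Vector]]:
    """Group consecutive template vectors into fixed-length, overlapping sequences."""
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    # Each start index yields one sequence of `window` consecutive vectors.
    return [
        [list(v) for v in vectors[start:start + window]]
        for start in range(0, len(vectors) - window + 1, stride)
    ]


# Hypothetical 2-dimensional template vectors for five consecutive log entries.
template_vectors = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8], [0.9, 1.0]]
sequences = sliding_windows(template_vectors, window=3)
print(len(sequences))
```

Each resulting sequence preserves the order of the logs, which is the time-sequence feature the model of step C is meant to learn.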
2. The blockchain log anomaly detection and tracing system of claim 1, wherein the data processing module uses the Drain log template extractor together with multidimensional feature combination to output statistical features, specifically comprising the following steps:
1) the Drain log template extractor extracts templates from the existing logs of the blockchain network;
2) for the templates extracted by Drain, the number of occurrences of each word and the number of occurrences of each word combination in the templates are counted separately;
3) when a node in the blockchain network uploads a log, Drain is used to assign the log to its corresponding template, and the quantity features are counted.
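The counting in step 2) can be sketched as follows. The claim does not define what a "word combination" is, so this sketch assumes it means a pair of adjacent words; the function name `quantity_features`, the sample templates, and the choice to skip the `<*>` variable placeholder are likewise assumptions, not part of the claimed system.

```python
from collections import Counter
from typing import Iterable, Tuple

WILDCARD = "<*>"  # placeholder assumed to mark the variable part of a template


def quantity_features(templates: Iterable[str]) -> Tuple[Counter, Counter]:
    """Count single words and adjacent word pairs over a stream of templates."""
    word_counts: Counter = Counter()
    pair_counts: Counter = Counter()
    for template in templates:
        # Keep only the constant words; dropping wildcards makes the words
        # on either side of a variable count as adjacent.
        words = [w for w in template.split() if w != WILDCARD]
        word_counts.update(words)
        pair_counts.update(zip(words, words[1:]))
    return word_counts, pair_counts


# Hypothetical templates, one per incoming log entry.
templates = [
    "peer <*> connected",
    "block <*> committed by peer <*>",
    "peer <*> connected",
]
word_counts, pair_counts = quantity_features(templates)
print(word_counts["peer"], pair_counts[("peer", "connected")])
```

The two counters together form one possible quantity-feature representation that could be passed to the gradient boosting decision tree of claim 1.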
3. The blockchain log anomaly detection and tracing system of claim 1, wherein the anomaly tracing module performs the following steps:
1) a threshold is set against which the deviation value is judged, and a deviation value exceeding the threshold is marked as abnormal;
2) data found to be a false alarm can be marked as such, and whether to adjust the threshold is decided according to whether, within a certain period, the number of anomalies with deviation values below that of the false alarm is particularly small; if it is, the threshold is too low and needs to be raised;
3) the log data to be detected is processed by the data processing module and then input into the models trained by the quantity time-sequence model training module to obtain a deviation value;
4) whether to mark the log as abnormal is judged according to the threshold;
5) if the log is abnormal, the log sequence related to the anomaly is output as the tracing result.
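The threshold logic of claim 3 might look like the sketch below. The claims do not say how the two deviation degrees are combined, what counts as "particularly small", or how far the threshold is raised; the weighted average in `combined_deviation`, the `small_count` parameter, and raising the threshold to the false alarm's own deviation value are all assumptions chosen for illustration. The "certain period" is simplified to whatever the caller keeps in `recent_anomalies`.

```python
from dataclasses import dataclass, field
from typing import List


def combined_deviation(seq_dev: float, qty_dev: float, seq_weight: float = 0.5) -> float:
    """Assumed combination rule: weighted average of the two deviation degrees."""
    return seq_weight * seq_dev + (1.0 - seq_weight) * qty_dev


@dataclass
class ThresholdJudge:
    threshold: float = 0.5
    small_count: int = 2  # hypothetical cut-off for "particularly small"
    recent_anomalies: List[float] = field(default_factory=list)

    def is_abnormal(self, deviation: float) -> bool:
        """Steps 1) and 4): mark deviations above the threshold as abnormal."""
        abnormal = deviation > self.threshold
        if abnormal:
            self.recent_anomalies.append(deviation)
        return abnormal

    def report_false_alarm(self, deviation: float) -> None:
        """Step 2): raise the threshold if few anomalies lie below the false alarm."""
        below = [d for d in self.recent_anomalies if d < deviation]
        if len(below) <= self.small_count:
            self.threshold = max(self.threshold, deviation)


judge = ThresholdJudge(threshold=0.5)
first = judge.is_abnormal(combined_deviation(0.9, 0.7))  # well above the threshold
second = judge.is_abnormal(0.55)                         # barely above the threshold
judge.report_false_alarm(0.55)                           # an auditor marks it as a false alarm
print(judge.threshold)
```

After the false alarm is reported, a later log with the same deviation value of 0.55 is no longer flagged, while clearly deviating logs still are.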
4. The blockchain log anomaly detection and tracing system of claim 3, wherein tracing and outputting an anomaly comprises:
1) the one-to-one correspondence between the original text of each log to be detected and its vector is cached in memory;
2) if the log is marked as abnormal, the original log text corresponding to the log vector and the related log sequence information are obtained by table lookup;
3) otherwise, the cache is emptied and the next log is detected.
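The cache-and-lookup flow of claim 4 can be sketched with an in-memory dictionary. The class name `TraceCache`, its method names, and the sample log text are hypothetical; the claim only requires a one-to-one mapping from vector to original text, a table lookup on anomaly, and clearing the cache otherwise.

```python
from typing import Dict, List, Optional, Sequence, Tuple

VectorKey = Tuple[float, ...]


class TraceCache:
    """In-memory table mapping a log vector to its raw text and log sequence."""

    def __init__(self) -> None:
        self._table: Dict[VectorKey, Tuple[str, List[str]]] = {}

    def remember(self, vector: Sequence[float], raw_text: str, sequence: Sequence[str]) -> None:
        # Step 1): cache the correspondence between the vector and the original text.
        self._table[tuple(vector)] = (raw_text, list(sequence))

    def resolve(self, vector: Sequence[float], abnormal: bool) -> Optional[Tuple[str, List[str]]]:
        if abnormal:
            # Step 2): look up the original text and its log sequence.
            return self._table.get(tuple(vector))
        # Step 3): nothing to trace, so empty the cache before the next log.
        self._table.clear()
        return None


cache = TraceCache()
window = ["peer 10.0.0.7 connected", "block 42 committed by peer 10.0.0.7"]
cache.remember([0.3, 0.9], window[1], window)
traced = cache.resolve([0.3, 0.9], abnormal=True)
print(traced[0])
cleared = cache.resolve([0.3, 0.9], abnormal=False)
```

Converting the vector to a tuple makes it hashable, so it can serve directly as the dictionary key for the table lookup.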
CN202210882913.5A 2022-07-26 2022-07-26 Block chain log anomaly detection and tracing system Active CN115277180B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210882913.5A CN115277180B (en) 2022-07-26 2022-07-26 Block chain log anomaly detection and tracing system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210882913.5A CN115277180B (en) 2022-07-26 2022-07-26 Block chain log anomaly detection and tracing system

Publications (2)

Publication Number Publication Date
CN115277180A CN115277180A (en) 2022-11-01
CN115277180B true CN115277180B (en) 2023-04-28

Family

ID=83768725

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210882913.5A Active CN115277180B (en) 2022-07-26 2022-07-26 Block chain log anomaly detection and tracing system

Country Status (1)

Country Link
CN (1) CN115277180B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115794465B (en) * 2022-11-10 2023-12-19 上海鼎茂信息技术有限公司 Log abnormality detection method and system
CN116074092B (en) * 2023-02-07 2024-02-20 电子科技大学 Attack scene reconstruction system based on heterogram attention network
CN116405326B (en) * 2023-06-07 2023-10-20 厦门瞳景智能科技有限公司 Information security management method and system based on block chain

Citations (9)

Publication number Priority date Publication date Assignee Title
CN111209168A (en) * 2020-01-14 2020-05-29 中国人民解放军陆军炮兵防空兵学院郑州校区 Log sequence anomaly detection framework based on nLSTM-self attention
CN111930903A (en) * 2020-06-30 2020-11-13 山东师范大学 System anomaly detection method and system based on deep log sequence analysis
CN113434357A (en) * 2021-05-17 2021-09-24 中国科学院信息工程研究所 Log abnormity detection method and device based on sequence prediction
CN114020726A (en) * 2021-11-26 2022-02-08 中国电力科学研究院有限公司 Log auditing method, system, equipment and medium based on multivariate log data analysis
EP3979080A1 (en) * 2020-09-30 2022-04-06 Mastercard International Incorporated Methods and systems for predicting time of server failure using server logs and time-series data
WO2022087389A1 (en) * 2020-10-23 2022-04-28 Coinbase Crypto Services, LLC Blockchain orchestrator computer system
CN114610515A (en) * 2022-03-10 2022-06-10 电子科技大学 Multi-feature log anomaly detection method and system based on log full semantics
CN114676021A (en) * 2022-04-28 2022-06-28 中国工商银行股份有限公司 Job log monitoring method and device, computer equipment and storage medium
CN114741369A (en) * 2022-04-28 2022-07-12 浙江大学滨江研究院 System log detection method of graph network based on self-attention mechanism

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
CN112560015A (en) * 2020-12-17 2021-03-26 北京百度网讯科技有限公司 Password updating method, device, equipment and storage medium of electronic equipment

Non-Patent Citations (3)

Title
Nivedita Mishra, Sharnil Pandya. Internet of Things Applications, Security Challenges, Attacks, Intrusion Detection, and Future Visions: A Systematic Review. IEEE. 2021, full text. *
Xinqiang Li, Weina Niu, Xiaosong Zhang, Runzi Zhang, Zhenqi Yu, Zimu Li. Improving Performance of Log Anomaly Detection with Semantic and Time Features Based on BiLSTM-Attention. IEEE. 2022, full text. *
王青文. Research on Anomaly Detection Algorithms for Bus Time-Series Data. China Excellent Master's Theses Full-text Database. 2022, full text. *

Also Published As

Publication number Publication date
CN115277180A (en) 2022-11-01

Similar Documents

Publication Publication Date Title
CN115277180B (en) Block chain log anomaly detection and tracing system
CN109697162B (en) Software defect automatic detection method based on open source code library
CN106357618B (en) Web anomaly detection method and device
CN111639497B (en) Abnormal behavior discovery method based on big data machine learning
CN110351301B (en) HTTP request double-layer progressive anomaly detection method
CN112491796B (en) Intrusion detection and semantic decision tree quantitative interpretation method based on convolutional neural network
CN111460167A (en) Method for positioning pollution discharge object based on knowledge graph and related equipment
CN108763931A (en) Leak detection method based on Bi-LSTM and text similarity
CN111881983B (en) Data processing method and device based on classification model, electronic equipment and medium
CN108470022B (en) Intelligent work order quality inspection method based on operation and maintenance management
CN111798312A (en) Financial transaction system abnormity identification method based on isolated forest algorithm
CN114124482B (en) Access flow anomaly detection method and equipment based on LOF and isolated forest
CN110011990B (en) Intelligent analysis method for intranet security threats
CN114201374A (en) Operation and maintenance time sequence data anomaly detection method and system based on hybrid machine learning
CN114844840A (en) Distributed external network flow data detection method based on calculation likelihood ratio
CN112131249A (en) Attack intention identification method and device
CN113407644A (en) Enterprise industry secondary industry multi-label classifier based on deep learning algorithm
CN113779590B (en) Source code vulnerability detection method based on multidimensional characterization
CN114285587B (en) Domain name identification method and device and domain name classification model acquisition method and device
CN117370548A (en) User behavior risk identification method, device, electronic equipment and medium
CN116756659A (en) Intelligent operation and maintenance management method, device, equipment and storage medium
CN115842645A (en) UMAP-RF-based network attack traffic detection method and device and readable storage medium
CN115618085A (en) Interface data exposure detection method based on dynamic label
CN113259398B (en) Account security detection method based on mail log data
CN111882135B (en) Internet of things equipment intrusion detection method and related device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant