CN114726589A - Alarm data fusion method - Google Patents

Info

Publication number
CN114726589A
Authority
CN
China
Prior art keywords
alarm
preset
similarity
time window
sub
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210267375.9A
Other languages
Chinese (zh)
Inventor
陶星宇
黄义杰
高翔
肖华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu Paienjie Network Security Co ltd
Nanjing Polytechnic Institute
Original Assignee
Jiangsu Paienjie Network Security Co ltd
Nanjing Polytechnic Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu Paienjie Network Security Co ltd and Nanjing Polytechnic Institute
Priority to CN202210267375.9A
Publication of CN114726589A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416 Event detection, e.g. attack signature detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0604 Management of faults, events, alarms or notifications using filtering, e.g. reduction of information by using priority, element types, position or time
    • H04L41/0622 Management of faults, events, alarms or notifications using filtering, e.g. reduction of information by using priority, element types, position or time based on time
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0631 Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis
    • H04L41/064 Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis involving time analysis
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/04 Processing captured monitoring data, e.g. for logfile generation

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Theoretical Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Biology (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses an alarm data fusion method, comprising: preprocessing the obtained alarm data into a preset format to obtain all alarm sequences, and combining the alarm sequences into an alarm time window set made up of sub-time window sets according to a preset time difference; performing similarity calculations for multiple attributes on each sub-time window set; substituting the calculated attribute similarities into a preset judgment matrix and calculating the eigenvalues and corresponding eigenvectors of the judgment matrix; fusing the alarm data of any sub-time window set that reaches a preset similarity threshold and adding the fused data to a fused data set; adding any sub-time window set that does not reach the preset similarity threshold directly to the fused data set; and combining the fused data sets of all sub-time window sets into a reduced alarm data set for output. The invention addresses the problem that alarm data generally contain a large number of redundant or false alarms, and helps identify key security events.

Description

Alarm data fusion method
Technical Field
The invention relates to the technical field of network security, in particular to an alarm data fusion method.
Background
As network security has grown in importance, intrusion detection has become a research hotspot across computer science. Since it was first proposed, intrusion detection has continued to develop and mature, with detection technologies classified by detection mechanism and by detection data source. Related products have also become richer, including host-based IDS, network-based IDS and distributed IDS. Researchers at home and abroad have also studied intrusion detection methods extensively. Traditional security protection systems process large numbers of alarms inefficiently, have a high error rate and easily overlook key alarm information. Alarm fusion technology was proposed to reduce redundant alarms and false alarms in the alarm data generated by an IDS and to provide valuable alarm data for the subsequent alarm correlation analysis stage. Its core idea is to merge highly similar alarm data in order to reduce redundant and false alarm data.
Disclosure of Invention
1. The technical problem to be solved is as follows:
aiming at the technical problem, the invention provides an alarm data fusion method, which is used for carrying out similarity calculation on attributes of repeated and low-level data in a large amount of alarm data generated by an attack event and adopting
2. The technical scheme is as follows:
An alarm data fusion method, characterized in that: the obtained alarm data are preprocessed into a preset format to obtain all alarm sequences; all alarm sequences are divided according to alarm time: an alarm whose time difference from the previous alarm is smaller than a preset interval threshold is assigned to the same time window i-1 as the previous alarm; if the time difference is greater than or equal to the preset interval threshold, the alarm is taken as the starting point of the next time window, giving the current sub-time window i; on this basis, all alarm sequences are divided into n sub-time window sets, and the n sub-time window sets are combined into an alarm time window set;
carrying out multiple attribute similarity calculation on the sub-time window set; the attribute similarity comprises calculation of IP addresses, port numbers, detection occurrence time and attack type similarity; substituting the calculated similarity of various attributes into a preset judgment matrix, calculating the eigenvalue of the judgment matrix and the corresponding eigenvector, and solving the maximum eigenvalue and the corresponding eigenvector of the judgment matrix; fusing the alarm data of the sub-time window set reaching the preset similarity threshold value, and then inputting the fused data into a fused data set; if the sub-time window set does not reach the preset similarity threshold value, directly inputting the sub-time window set into the fusion data set;
and combining the fused data sets of all the sub time window sets into a reduced alarm data set for output.
Further, the preprocessing specifically comprises extracting key attributes of alarm data from the original data set; converting the format of the original data into a unified sequence according to the intrusion detection message exchange format to obtain all alarm sequences; the key attributes include a feature string, an alarm category, an alarm date, an alarm timestamp, a source IP, a source port, a destination IP, and a destination port.
Further, the similarity to the IP address in the attribute similarity calculation is calculated as:
Figure BDA0003552370180000021
In formula (1), l is the number of consecutive identical bits of the two IP addresses, l ∈ [1, 32], and ε is a preset IP similarity threshold;
the port similarity is calculated as:
Figure BDA0003552370180000022
In formula (2), alert.port denotes the port number of an alarm, and alert1.port is the port number of alarm 1;
the detection occurrence time similarity is as follows:
Figure BDA0003552370180000023
In formula (3), Tmin is a preset minimum alarm time threshold and Tmax is a preset maximum alarm time threshold; the time interval is alert1.time - alert2.time, i.e. the time difference between two consecutive alarms;
the attack type similarity is calculated as:
Figure BDA0003552370180000024
In formula (4), type denotes the alarm type.
Further, the preset judgment matrix is $A = (a_{ij})_{n \times n}$, where $a_{ij}$ is the preset importance of key attribute i relative to attribute j and is an integer in the interval [1, 9]. The values 1, 3, 5, 7 and 9 indicate increasing relative importance, from equally important through more important and very important to absolutely important, and 2, 4, 6 and 8 lie between two adjacent judgments.
3. Has the advantages that:
The invention provides an alarm data fusion method that addresses the problem that alarm data generally contain a large number of redundant or false alarms, which makes key security events hard to identify. Because the attributes of alarm data are related to one another and each attribute field differs in relative importance, a similarity matrix between alarm data is constructed using the attribute similarity calculation method, replacing the traditional similarity measure in spectral clustering, so that better clustering is achieved while the relationships among alarm data are preserved. The method achieves better clustering fusion without destroying the relationships between alarms, reduces information loss, improves the fusion rate and reduces the false alarm rate of the alarm data.
Drawings
FIG. 1 is a flow chart of the present invention.
Detailed Description
The present invention will be described in detail with reference to the accompanying drawings.
As shown in FIG. 1, an alarm data fusion method is characterized in that: the obtained alarm data are preprocessed into a preset format to obtain all alarm sequences; all alarm sequences are divided according to alarm time: an alarm whose time difference from the previous alarm is smaller than a preset interval threshold is assigned to the same time window i-1 as the previous alarm; if the time difference is greater than or equal to the preset interval threshold, the alarm is taken as the starting point of the next time window, giving the current sub-time window i; on this basis, all alarm sequences are divided into n sub-time window sets, and the n sub-time window sets are combined into an alarm time window set.
When a port is under a DoS attack, a large number of identical or similar alarms are generated within a short time. In general, alarms triggered by one complete, continuous attack have short trigger intervals and are concentrated in time, so this method can effectively separate the alarms triggered by one attack event from those triggered by different attack events.
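The window division described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `split_into_windows`, the parameter `gap_threshold` and the use of plain numeric timestamps are assumptions made for the example.

```python
from typing import List, Sequence


def split_into_windows(timestamps: Sequence[float],
                       gap_threshold: float) -> List[List[float]]:
    """Group alarm timestamps into sub-time windows (hypothetical helper)."""
    windows: List[List[float]] = []
    for t in sorted(timestamps):
        # A gap smaller than the threshold keeps the alarm in the same
        # window as the previous alarm; otherwise the alarm becomes the
        # starting point of a new sub-time window.
        if windows and t - windows[-1][-1] < gap_threshold:
            windows[-1].append(t)
        else:
            windows.append([t])
    return windows


if __name__ == "__main__":
    print(split_into_windows([0.0, 0.5, 1.2, 10.0, 10.3, 25.0], 2.0))
```

A burst of alarms from one sustained attack ends up in a single window, while alarms separated by a quiet period of at least `gap_threshold` start a new one.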
Carrying out multiple attribute similarity calculation on the sub-time window set; the attribute similarity comprises calculation of IP addresses, port numbers, detection occurrence time and attack type similarity; substituting the calculated similarity of various attributes into a preset judgment matrix, calculating the eigenvalue of the judgment matrix and the corresponding eigenvector, and solving the maximum eigenvalue and the corresponding eigenvector of the judgment matrix; fusing the alarm data of the sub-time window set reaching the preset similarity threshold value, and then inputting the fused data into a fused data set; if the sub-time window set does not reach the preset similarity threshold value, directly inputting the sub-time window set into the fusion data set;
and combining the fused data sets of all the sub time window sets into a reduced alarm data set for output.
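The fuse-or-pass-through step can be sketched as below. The text does not specify how a sub-time window is judged to reach the threshold or what a fused alarm looks like, so this sketch makes two assumptions: a window qualifies when every pair of consecutive alarms scores at least `threshold`, and the fused record is the first alarm annotated with a count and the last alarm time. The names `fuse_window`, `reduce_alarms`, `count` and `last_time` are hypothetical, and the similarity function is supplied by the caller.

```python
from typing import Callable, Dict, List

Alarm = Dict[str, object]
Similarity = Callable[[Alarm, Alarm], float]


def fuse_window(window: List[Alarm], similarity: Similarity,
                threshold: float) -> List[Alarm]:
    """Fuse one sub-time window, or pass it through unchanged."""
    if len(window) < 2:
        return list(window)
    scores = [similarity(a, b) for a, b in zip(window, window[1:])]
    if min(scores) >= threshold:
        # Assumed fusion rule: keep the first alarm as the representative.
        fused = dict(window[0])
        fused["count"] = len(window)
        fused["last_time"] = window[-1]["time"]
        return [fused]
    # Below the threshold: the window enters the fused data set as-is.
    return list(window)


def reduce_alarms(windows: List[List[Alarm]], similarity: Similarity,
                  threshold: float) -> List[Alarm]:
    """Combine the fused data of all windows into one reduced alarm list."""
    reduced: List[Alarm] = []
    for window in windows:
        reduced.extend(fuse_window(window, similarity, threshold))
    return reduced
```

Passing the similarity function in as a parameter keeps the sketch independent of how the per-attribute similarities are weighted.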
When calculating attribute similarity, the attributes differ in data type and meaning, so a different similarity calculation method is needed for each attribute. The four attributes for which similarity is calculated are the IP address, port number, detection occurrence time and attack type.
Further, the preprocessing specifically comprises extracting key attributes of alarm data from the original data set; converting the format of the original data into a unified sequence according to the intrusion detection message exchange format to obtain all alarm sequences; the key attributes include a feature string, an alarm category, an alarm date, an alarm timestamp, a source IP, a source port, a destination IP, and a destination port.
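A unified alarm record holding the eight key attributes might look like the sketch below. It does not parse the intrusion detection message exchange format itself; the raw field names (`sig`, `class`, `ts`, `src`, `sport`, `dst`, `dport`) and the `normalize` helper are hypothetical stand-ins for whatever a given IDS emits.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Alarm:
    signature: str    # feature string
    category: str     # alarm category
    date: str         # alarm date, ISO format (UTC)
    timestamp: float  # alarm timestamp, seconds since the epoch
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


def normalize(raw: dict) -> Alarm:
    """Map one raw record (assumed field names) to the unified form."""
    ts = float(raw["ts"])
    return Alarm(
        signature=raw["sig"].strip(),
        category=raw["class"].strip().lower(),
        date=datetime.fromtimestamp(ts, tz=timezone.utc).date().isoformat(),
        timestamp=ts,
        src_ip=raw["src"],
        src_port=int(raw["sport"]),
        dst_ip=raw["dst"],
        dst_port=int(raw["dport"]),
    )
```

Sorting a list of such records by `timestamp` yields the alarm sequence that the time window division operates on.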
Further, the similarity to the IP address in the attribute similarity calculation is calculated as:
Figure BDA0003552370180000031
In formula (1), l is the number of consecutive identical bits of the two IP addresses, l ∈ [1, 32], and ε is a preset IP similarity threshold.
The role of l is to measure the likelihood that two IP addresses belong to the same subnet. If two IP addresses are in the same subnet, they have high similarity; the larger the value of l, the more likely the attacks come from the same attack source or are aimed at the same attack target, since the IP addresses of one attack source are similar and the IP addresses of one attack target are similar.
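The quantity l can be computed as in the sketch below, which assumes that "consecutive identical bits" means the shared leading bits of two IPv4 addresses (consistent with the subnet interpretation). The helper name `common_prefix_bits` is hypothetical, and the mapping from l and ε to a similarity value is deliberately left out because formula (1) is not reproduced in the text.

```python
import ipaddress


def common_prefix_bits(ip_a: str, ip_b: str) -> int:
    """Number of leading bits shared by two IPv4 addresses."""
    a = int(ipaddress.IPv4Address(ip_a))
    b = int(ipaddress.IPv4Address(ip_b))
    # XOR leaves a 1 at every differing bit; the highest set bit marks
    # where the shared prefix ends.
    diff = a ^ b
    return 32 - diff.bit_length()


if __name__ == "__main__":
    print(common_prefix_bits("192.168.1.10", "192.168.1.200"))
```

Note that this helper returns 0 when the very first bit differs, which falls outside the stated range l ∈ [1, 32]; a caller would need to decide how to treat that case.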
The port similarity is calculated as:
$$\mathrm{Sim}_{port}(alert_1, alert_2) = \begin{cases} 1, & alert_1.port = alert_2.port \\ 0, & \text{otherwise} \end{cases} \quad (2)$$
In formula (2), alert.port denotes the port number of an alarm, and alert1.port is the port number of alarm 1. If the port numbers are the same, the similarity is 1; otherwise, the similarity is 0.
The detection occurrence time similarity is as follows:
Figure BDA0003552370180000042
In formula (3), Tmin is a preset minimum alarm time threshold and Tmax is a preset maximum alarm time threshold; the time interval is alert1.time - alert2.time, i.e. the time difference between two consecutive alarms. The similarity of the time attribute is calculated from the difference between the two alarm times.
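Formula (3) itself is not reproduced in the text, so the following is only one plausible shape for a time similarity built from Tmin and Tmax, not the patented expression. It assumes full similarity for intervals up to Tmin, zero similarity from Tmax onward, and a linear decay in between; the function name and this piecewise form are assumptions.

```python
def time_similarity(t1: float, t2: float,
                    t_min: float, t_max: float) -> float:
    """Assumed piecewise-linear time similarity in [0, 1]."""
    if t_max <= t_min:
        raise ValueError("t_max must be greater than t_min")
    interval = abs(t1 - t2)  # time difference between two alarms
    if interval <= t_min:
        return 1.0
    if interval >= t_max:
        return 0.0
    # Linear decay from 1 at t_min down to 0 at t_max (assumption).
    return (t_max - interval) / (t_max - t_min)
```

With `t_min=2` and `t_max=10`, two alarms 6 seconds apart would score 0.5 under this assumed form.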
The attack type similarity is calculated as:
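$$\mathrm{Sim}_{type}(alert_1, alert_2) = \begin{cases} 1, & alert_1.type = alert_2.type \\ 0, & \text{otherwise} \end{cases} \quad (4)$$
In formula (4), type denotes the alarm type. If the attack types are the same, the similarity is 1; otherwise, the similarity is 0.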
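The port and attack type similarities are both exact-match indicators, as the text states: 1 when the two values are equal and 0 otherwise. A minimal sketch, with hypothetical function names:

```python
def port_similarity(port_a: int, port_b: int) -> int:
    """1 if the two port numbers are identical, else 0."""
    return 1 if port_a == port_b else 0


def type_similarity(type_a: str, type_b: str) -> int:
    """1 if the two alarm types are identical, else 0."""
    return 1 if type_a == type_b else 0
```

Unlike the IP and time similarities, these two measures are binary, so their influence on the overall score depends entirely on the weight each attribute receives.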
Further, the preset judgment matrix is $A = (a_{ij})_{n \times n}$, where $a_{ij}$ is the preset importance of key attribute i relative to attribute j and is an integer in the interval [1, 9]. The values 1, 3, 5, 7 and 9 indicate increasing relative importance, from equally important through more important and very important to absolutely important, and 2, 4, 6 and 8 lie between two adjacent judgments.
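Finding the maximum eigenvalue and its eigenvector for a judgment matrix can be sketched with power iteration in plain Python. Two things here are assumptions rather than statements from the text: that the lower triangle holds the reciprocals of the upper triangle (the usual convention for pairwise comparison matrices), and that the normalized eigenvector is used as the attribute weights in a weighted sum of the four similarities. The example matrix values are illustrative only.

```python
from typing import List, Sequence, Tuple


def principal_eigen(matrix: List[List[float]],
                    iterations: int = 100) -> Tuple[float, List[float]]:
    """Largest eigenvalue and its eigenvector (normalized to sum to 1)."""
    n = len(matrix)
    vec = [1.0 / n] * n
    eigenvalue = 0.0
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vec[j] for j in range(n))
                   for i in range(n)]
        # vec sums to 1, so the sum of A*vec estimates the eigenvalue.
        eigenvalue = sum(product)
        vec = [p / eigenvalue for p in product]
    return eigenvalue, vec


def weighted_similarity(weights: Sequence[float],
                        sims: Sequence[float]) -> float:
    """Assumed combination rule: weighted sum of attribute similarities."""
    return sum(w * s for w, s in zip(weights, sims))


# Illustrative 4x4 matrix for (IP, port, time, type): IP is judged
# twice as important as each of the other three attributes.
JUDGMENT = [
    [1.0, 2.0, 2.0, 2.0],
    [0.5, 1.0, 1.0, 1.0],
    [0.5, 1.0, 1.0, 1.0],
    [0.5, 1.0, 1.0, 1.0],
]

if __name__ == "__main__":
    lam, w = principal_eigen(JUDGMENT)
    print(lam, w, weighted_similarity(w, [1.0, 1.0, 0.5, 1.0]))
```

For a perfectly consistent matrix such as this one, the maximum eigenvalue equals n, which gives a quick sanity check on the chosen judgments.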
Although the present invention has been described with reference to the preferred embodiments, it should be understood that various changes and modifications can be made therein by those skilled in the art without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (4)

1. An alarm data fusion method is characterized in that: preprocessing the obtained alarm data into a preset format, namely all alarm sequences; dividing all alarm sequences according to alarm time, and dividing a previous alarm with a time difference smaller than a preset interval threshold value into a previous time window i-1; if the time difference is larger than or equal to a preset interval threshold, dividing the alarm to the starting point of the next alarm time to obtain the current sub-time window i; on the basis, all alarm sequences are divided into n sub-time window sets, and the n sub-time window sets are combined into an alarm time window set;
carrying out multiple attribute similarity calculation on the sub-time window set; the attribute similarity comprises calculation of IP addresses, port numbers, detection occurrence time and attack type similarity; substituting the calculated similarity of various attributes into a preset judgment matrix, calculating the eigenvalue of the judgment matrix and the corresponding eigenvector, and solving the maximum eigenvalue and the corresponding eigenvector of the judgment matrix; fusing the alarm data of the sub-time window set reaching the preset similarity threshold value, and then inputting the fused data into a fused data set; if the sub-time window set does not reach the preset similarity threshold value, the sub-time window set is directly input into the fusion data set;
and combining the fused data sets of all the sub time window sets into a reduced alarm data set for output.
2. The alarm data fusion method according to claim 1, characterized in that: the preprocessing specifically comprises extracting key attributes of alarm data from an original data set; converting the format of the original data into a unified sequence according to the intrusion detection message exchange format to obtain all alarm sequences; the key attributes include a feature string, an alarm category, an alarm date, an alarm timestamp, a source IP, a source port, a destination IP, and a destination port.
3. The alarm data fusion method according to claim 1, characterized in that: the similarity calculation of the IP address in the attribute similarity calculation comprises the following steps:
Figure FDA0003552370170000011
in formula (1), l is the number of consecutive identical bits of the two IP addresses, l ∈ [1, 32], and ε is a preset IP similarity threshold;
the port similarity is calculated as:
Figure FDA0003552370170000012
in formula (2), alert.port denotes the port number of an alarm, and alert1.port is the port number of alarm 1;
the detection occurrence time similarity is as follows:
Figure FDA0003552370170000013
in formula (3), Tmin is a preset minimum alarm time threshold and Tmax is a preset maximum alarm time threshold; the time interval is alert1.time - alert2.time, i.e. the time difference between two consecutive alarms;
the attack type similarity is calculated as:
Figure FDA0003552370170000021
in formula (4), type denotes the alarm type.
4. The alarm data fusion method according to claim 1, characterized in that: the preset judgment matrix is $A = (a_{ij})_{n \times n}$, where $a_{ij}$ is the preset importance of key attribute i relative to attribute j and is an integer in the interval [1, 9]; the values 1, 3, 5, 7 and 9 indicate increasing relative importance, from equally important through more important and very important to absolutely important, and 2, 4, 6 and 8 lie between two adjacent judgments.
CN202210267375.9A 2022-03-17 2022-03-17 Alarm data fusion method Pending CN114726589A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210267375.9A CN114726589A (en) 2022-03-17 2022-03-17 Alarm data fusion method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210267375.9A CN114726589A (en) 2022-03-17 2022-03-17 Alarm data fusion method

Publications (1)

Publication Number Publication Date
CN114726589A true CN114726589A (en) 2022-07-08

Family

ID=82236834

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210267375.9A Pending CN114726589A (en) 2022-03-17 2022-03-17 Alarm data fusion method

Country Status (1)

Country Link
CN (1) CN114726589A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116701610A (en) * 2023-08-03 2023-09-05 成都大成均图科技有限公司 Effective alarm condition identification method and device based on emergency multisource alarm

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103532949A (en) * 2013-10-14 2014-01-22 刘胜利 Self-adaptive trojan communication behavior detection method on basis of dynamic feedback
CN107517216A (en) * 2017-09-08 2017-12-26 瑞达信息安全产业股份有限公司 A kind of network safety event correlating method
CN108833139A (en) * 2018-05-22 2018-11-16 桂林电子科技大学 A kind of OSSEC alert data polymerization divided based on category attribute
CN110688892A (en) * 2019-08-20 2020-01-14 武汉烽火众智数字技术有限责任公司 Portrait identification alarm method and system based on data fusion technology
US10610160B1 (en) * 2014-04-17 2020-04-07 Cerner Innovation, Inc. Stream-based alarm filtering
US20200322368A1 (en) * 2019-04-03 2020-10-08 Deutsche Telekom Ag Method and system for clustering darknet traffic streams with word embeddings
CN111814897A (en) * 2020-07-20 2020-10-23 辽宁大学 Time series data classification method based on multi-level shape
WO2021098021A1 (en) * 2019-11-20 2021-05-27 珠海格力电器股份有限公司 Data anomaly statistical alarm method and device, and electronic equipment
CN113420802A (en) * 2021-06-04 2021-09-21 桂林电子科技大学 Alarm data fusion method based on improved spectral clustering
CN113422763A (en) * 2021-06-04 2021-09-21 桂林电子科技大学 Alarm correlation analysis method constructed based on attack scene
CN114024830A (en) * 2021-11-05 2022-02-08 哈尔滨理工大学 Grubbs-based alarm correlation method

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
李洪成;吴晓平;: "基于自扩展时间窗的告警多级聚合与关联方法", 工程科学与技术, no. 01 *
李洪敏;张建平;黄晓芳;卢敏;: "基于序列模式的多步攻击挖掘算法的研究", 兵工自动化, no. 09 *
段祥雯;杨兵;张怡;: "防网络攻击警报信息实时融合处理技术研究与实现", 信息网络安全, no. 07 *

Similar Documents

Publication Publication Date Title
CN108076040B (en) APT attack scene mining method based on killer chain and fuzzy clustering
Gogoi et al. MLH-IDS: a multi-level hybrid intrusion detection method
US9824195B2 (en) Calculating consecutive matches using parallel computing
CN113422763B (en) Alarm correlation analysis method constructed based on attack scene
CN113420802B (en) Alarm data fusion method based on improved spectral clustering
CN112333195B (en) APT attack scene reduction detection method and system based on multi-source log correlation analysis
CN111709022B (en) Hybrid alarm association method based on AP clustering and causal relationship
CN113364787B (en) Botnet flow detection method based on parallel neural network
US7017185B1 (en) Method and system for maintaining network activity data for intrusion detection
CN114726589A (en) Alarm data fusion method
CN113064932A (en) Network situation assessment method based on data mining
US11140123B2 (en) Community detection based on DNS querying patterns
Yu et al. Design of DDoS attack detection system based on intelligent bee colony algorithm
CN116132311B (en) Network security situation awareness method based on time sequence
CN107124410A (en) Network safety situation feature clustering method based on machine deep learning
CN111629027A (en) Trusted file storage processing method based on block chain
Yang et al. Alerts analysis and visualization in network-based intrusion detection systems
CN114024830A (en) Grubbs-based alarm correlation method
CN111901137A (en) Method for mining multi-step attack scene by using honeypot alarm log
CN111556014B (en) Network attack intrusion detection method adopting full-text index
Ismail et al. Enhanced Recursive Feature Elimination for IoT Intrusion Detection Systems
CN111049801B (en) Firewall strategy detection method
CN113850222A (en) Method for realizing vehicle-mounted bus signal classification and monitoring by adopting support vector machine
CN114422389B (en) High-speed real-time network data monitoring method based on hash and hardware acceleration
Katkar et al. Experiments on detection of Denial of Service attacks using REPTree

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination