CN110581850A

CN110581850A - Gene detection method based on network flow

Info

Publication number: CN110581850A
Application number: CN201910849042.5A
Authority: CN
Inventors: 李春光; 章丽娟; 刘旭; 胡漪逸; 孟凯强; 王亚龙; 赵治博; 朱晓贝; 李维超
Original assignee: Henan Rongpan Network Technology Co Ltd
Current assignee: Henan Rongpan Network Technology Co Ltd
Priority date: 2019-09-09
Filing date: 2019-09-09
Publication date: 2019-12-17

Abstract

The invention provides a gene detection method based on network flow, belonging to the technical field of network communication. The method comprises the following steps: network-based traffic data analysis; detection by the intrusion detection system; further mining of intrusion data; analysis of common network attack modes and their characteristics; suppression of network attack behavior; and checking the task manager of the host. The method adopts a high-performance machine-learning tree model trained on network malicious genes to judge whether network traffic is malicious, analyzes malicious traffic comprehensively, and provides effective support for identifying and discovering network security threats. It records the original network traffic in full, providing raw data for abnormal traffic mining, behavior analysis, tracing and evidence collection, and similar functions. By adopting accurate application-layer protocol analysis, it can greatly improve detection accuracy and efficiency and adds anomaly detection capability based on protocol analysis.

Description

Gene detection method based on network flow

Technical Field

The invention belongs to the technical field of network communication, and particularly relates to a gene detection method based on network flow.

Background

Today the Internet plays an increasingly important role in creating and promoting new business channels as enterprise networks develop. Commercial demand has driven enterprises and governments worldwide to build highly sophisticated information networks. These complex networks combine many technologies that are not inherently related to one another, such as distributed data storage systems, encryption, decryption and authentication technologies, voice and video over IP (VoIP), remote and wireless access, and web services. Enterprise networks have also become more accessible: most organizations allow partners on external networks to access the internal network, allow consumers to interact with the internal network during e-commerce transactions, and allow employees to reach the company's management systems through virtual private networks.

Computer networks have penetrated every corner of daily life, and network crime is spreading with them. As applications continue to expand, security becomes increasingly important, and higher requirements are placed on network security.

In fact, the current state of information and network security is severe, and the following problems have long troubled network users. Worm viruses, DoS/DDoS (denial of service/distributed denial of service) attacks, and network abuse are currently among the major threats to Internet security. The spread of worm viruses such as Code Red II and Nimda wastes a great deal of network resources and can even prevent services from operating normally. Distributed denial of service (DDoS) attacks are a highly destructive attack method that has emerged in recent years, and the combination of viruses and hacking techniques has made the economic impact of viruses one of the foremost network security problems. One form of network abuse is that some users inside the network frequently exchange large files (mainly files unrelated to work, such as audio and video files) over services such as the company's FTP service, occupying bandwidth needed for company services and reducing the effective utilization of resources.

However, the existing network detection method cannot detect and process network security problems such as distributed denial of service (DDoS) attacks, network probing, resource abuse, and the like by monitoring network traffic in real time.

Therefore, it is necessary to invent a gene detection method based on network traffic.

Disclosure of Invention

In order to solve the above technical problems, the invention provides a network traffic gene detection method, which addresses the inability of existing network detection methods to detect and handle network security problems such as distributed denial of service (DDoS) attacks, network probing, and resource abuse by monitoring network traffic in real time.

The average message length of normal traffic is relatively stable, whereas a scanning attack generates a large number of short messages. Once a large-scale scanning attack occurs, the average message length changes noticeably, so it can be used as a measure. Flood-type DoS attacks are similar to scanning attacks: they typically produce a large number of specific flows that alter the statistical behavior of the overall network traffic. Traffic statistics research shows that, besides the average message length, other statistical characteristics of network traffic, such as the average TCP message length, the average UDP message length, the proportion of TCP traffic in total traffic, the proportion of UDP traffic in total traffic, and the ratio of each TCP and UDP application's traffic to total traffic, have stable values in a large-scale network. When abnormal behavior occurs, these values change noticeably, so they can be selected as anomaly detection metrics for judging whether abnormal behavior has occurred.
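The statistical measures described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Packet` record, the function names, and the 50% relative-change tolerance are assumptions introduced here purely for illustration, and the sample packet lengths are invented.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Packet:
    protocol: str  # e.g. "TCP" or "UDP"
    length: int    # packet length in bytes


def traffic_stats(packets: Iterable[Packet]) -> Dict[str, float]:
    """Compute average lengths and traffic shares for TCP and UDP."""
    pkts = list(packets)
    total_bytes = sum(p.length for p in pkts)
    if not pkts or total_bytes == 0:
        raise ValueError("need at least one packet with non-zero length")
    stats = {"avg_len": total_bytes / len(pkts)}
    for proto in ("TCP", "UDP"):
        lengths = [p.length for p in pkts if p.protocol == proto]
        key = proto.lower()
        stats[f"avg_{key}_len"] = sum(lengths) / len(lengths) if lengths else 0.0
        stats[f"{key}_share"] = sum(lengths) / total_bytes
    return stats


def deviating_metrics(baseline: Dict[str, float],
                      current: Dict[str, float],
                      tolerance: float = 0.5) -> List[str]:
    """List metrics whose relative change from the baseline exceeds tolerance.

    The tolerance value is an assumed threshold, not one given in the text.
    """
    flagged = []
    for name, base in baseline.items():
        if base and abs(current[name] - base) / base > tolerance:
            flagged.append(name)
    return flagged


# Invented sample data: a stable baseline, then the same traffic plus many
# short TCP packets of the kind a scanning attack would generate.
normal = [Packet("TCP", 800)] * 8 + [Packet("UDP", 200)] * 2
scan = normal + [Packet("TCP", 48)] * 90
print(deviating_metrics(traffic_stats(normal), traffic_stats(scan)))
```

With this sample data the average message length and the average TCP message length are flagged, while the UDP measures stay within tolerance, which matches the behavior the paragraph above describes for a scanning attack.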

A gene detection method based on network flow specifically comprises the following steps:

Step one: network-based traffic data analysis;

Step two: detecting intrusion detection system information;

Step three: further mining of intrusion data;

Step four: common network attack modes and characteristic analysis;

Step five: network attack behaviors are restrained;

Step six: the task manager of the host is checked.

Preferably, in the step one, the network-based traffic analysis technology is mainly a traffic analysis technology based on SNMP and NetFlow.

Preferably, in step two, the detection technologies in the intrusion detection system mainly include misuse detection and anomaly detection, together with the statistical analysis process used in anomaly detection, expert-system-based technology, and statistical-analysis-based technology.

Preferably, in step three, data mining technology is applied in the intrusion detection system. Data mining is the process of using analysis tools to extract hidden, potentially useful information and knowledge from a large amount of data; it mainly comprises three processing modes: association analysis, sequence analysis, and cluster analysis.

Preferably, in step four, the common network attack modes and the characteristics of the various attack techniques are mainly those of DDoS attacks, network scanning attacks, and worm attacks.

Preferably, in step five, the network attack behavior is suppressed by an Access Control List (ACL).

Preferably, in step six, the task manager of the host is checked to find non-compliant traffic applications, and virus scanning and removal are performed.

Compared with the prior art, the invention has the following beneficial effects. The gene detection method based on network flow is widely applicable in the technical field of network communication. Through traffic statistics research, the invention finds that, besides the average message length, other statistical characteristics of network traffic change noticeably when abnormal behavior occurs, so these statistics can be selected as anomaly detection metrics for judging whether abnormal traffic behavior has occurred. The invention also adopts a high-performance machine-learning tree model trained on network malicious genes to judge whether network traffic is malicious, analyzes malicious traffic comprehensively, and provides effective support for identifying and discovering infected hosts, vulnerability attacks, worm attacks, Trojan threats, SCADA attacks, SQL injection, unknown network security threats, and the like. It records the original network traffic in full, providing raw data for abnormal traffic mining, behavior analysis, tracing and evidence collection, and similar functions. The system supports retrospective data analysis and retrieval, can examine traffic from different time periods, presents behavior characteristics across dimensions and levels such as protocol, host, domain name, and country, and gives users efficient and accurate analysis and mining of network security events. The method uses big data and machine learning algorithms combined with network gene anomaly detection, and supports abnormal traffic behavior analysis through forward and reverse contrast analysis of the high-frequency genes of malicious traffic and of normal traffic. By adopting accurate application-layer protocol analysis, it can greatly improve detection accuracy and efficiency and adds anomaly detection capability based on protocol analysis.
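The contrast analysis of high-frequency genes mentioned above can be sketched in Python under strong assumptions. The text does not define what a "gene" is or how the contrast is computed, so this sketch assumes that a gene is a fixed-length byte n-gram of a payload, that "high-frequency" means appearing in at least a given fraction of payloads, and that the forward and reverse results are the two set differences. All names, the n-gram length, the support threshold, and the sample payloads are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Set, Tuple


def ngrams(payload: bytes, n: int) -> List[bytes]:
    """Split a payload into overlapping byte n-grams (the assumed 'genes')."""
    return [payload[i:i + n] for i in range(len(payload) - n + 1)]


def high_frequency_genes(payloads: Iterable[bytes],
                         n: int = 4,
                         min_support: float = 0.6) -> Set[bytes]:
    """Return genes present in at least min_support of the payloads."""
    items = list(payloads)
    if not items:
        return set()
    counts: Counter = Counter()
    for payload in items:
        counts.update(set(ngrams(payload, n)))  # count once per payload
    return {g for g, c in counts.items() if c / len(items) >= min_support}


def contrast_genes(malicious: Iterable[bytes],
                   normal: Iterable[bytes],
                   n: int = 4,
                   min_support: float = 0.6) -> Tuple[Set[bytes], Set[bytes]]:
    """Return (malicious-only, normal-only) high-frequency genes."""
    mal = high_frequency_genes(malicious, n, min_support)
    nor = high_frequency_genes(normal, n, min_support)
    return mal - nor, nor - mal


# Invented payloads for illustration only.
bad = [b"GET /cmd.exe?a", b"GET /cmd.exe?b", b"GET /cmd.exe?c"]
good = [b"GET /index.html", b"GET /about.html", b"GET /index.php"]
forward, reverse = contrast_genes(bad, good)
print(b"cmd." in forward, b"GET " in forward)
```

Genes common to both classes, such as the request prefix, cancel out, leaving only the genes that distinguish one class from the other. A trained tree model, as the text describes, could then take such genes as input features; that step is not shown here.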

Drawings

FIG. 1 is a flow chart of a gene detection method based on network traffic.

FIG. 2 is a relational diagram based on the network flow gene detection technique.

FIG. 3 is a diagram of TCP connections.

FIG. 4 is a statistical diagram of the packet size distribution.

FIG. 5 is a task manager diagram.

Detailed Description

The invention is further described below with reference to the accompanying drawings:

As shown in FIGS. 1 and 2, the method comprises the following steps:

S101: network-based traffic data analysis;

S102: detecting intrusion detection system information;

S103: further mining of intrusion data;

S104: common network attack modes and characteristic analysis;

S105: network attack behaviors are restrained;

S106: the task manager of the host is checked.

In this embodiment, specifically, in S101, the network-based traffic analysis technology is mainly based on SNMP and NetFlow.

In this embodiment, specifically, in S102, the detection technologies in the intrusion detection system mainly include misuse detection and anomaly detection, together with the statistical analysis process used in anomaly detection, expert-system-based technology, and statistical-analysis-based technology.

In this embodiment, specifically, in S103, data mining technology is applied in the intrusion detection system. Data mining is the process of using analysis tools to extract hidden, potentially useful information and knowledge from a large amount of data; it mainly comprises three processing modes: association analysis, sequence analysis, and cluster analysis.

In this embodiment, specifically, in S104, the common network attack modes and the characteristics of the various attack techniques are mainly those of DDoS attacks, network scanning attacks, and worm attacks.

In this embodiment, specifically, in S105, the network attack behavior is suppressed through an Access Control List (ACL).

In this embodiment, specifically, in S106, the task manager of the host is checked to find non-compliant traffic applications, and virus scanning and removal are performed.
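Step S105 suppresses attack traffic through an ACL but does not say how the ACL entries are produced. As one possible illustration, the Python sketch below turns a list of offending host addresses into deny entries. The Cisco IOS-style extended ACL syntax, the ACL number 101, and the function name are assumptions; the patent does not name a device vendor or rule format.

```python
import ipaddress
from typing import Iterable, List


def build_deny_acl(offending_hosts: Iterable[str],
                   acl_number: int = 101) -> List[str]:
    """Build extended-ACL lines that drop all IP traffic from the given hosts.

    Assumes Cisco IOS-style syntax with a numbered extended ACL (100-199).
    Invalid IPv4 addresses raise ValueError.
    """
    if not 100 <= acl_number <= 199:
        raise ValueError("extended ACL number must be in the range 100-199")
    hosts = sorted({ipaddress.IPv4Address(h) for h in offending_hosts})
    lines = [f"access-list {acl_number} deny ip host {h} any" for h in hosts]
    # Without a final permit, the implicit deny would block all other traffic.
    lines.append(f"access-list {acl_number} permit ip any any")
    return lines


for line in build_deny_acl(["10.159.58.6", "10.159.36.15"]):
    print(line)
```

The generated lines would still have to be applied to an interface by an administrator or an automation tool; that step depends on the device and is outside this sketch.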

The invention is further illustrated by the following examples and figures. Embodiments include, but are not limited to, the following Example 1:

A scanning attack was carried out from one machine. The traffic over a 20-minute period is shown in FIG. 3. Between minutes 3 and 7 a sudden peak appears, which is a possible signature of a scanning attack. First, the traffic is unidirectional TCP traffic whose destination port is port 80.

Such a large volume of one-way traffic directed at port 80 is abnormal, because port 80 is normally used by the HTTP protocol, which exchanges requests and responses.

Another feature is the packet size. As shown in FIG. 4, 57.6% of the packets are smaller than 64 bytes. Packets of this size have almost no room for data beyond the headers, and the only TCP applications that normally use such small packets are interactive ones such as telnet and ssh. From this, the traffic appears to be a TCP SYN attack.
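A packet size distribution of the kind shown in FIG. 4 can be computed with a short Python sketch. The bucket boundaries and the function name are assumptions chosen for illustration; only the 64-byte boundary comes from the text. The sample sizes are invented so that the share of packets below 64 bytes equals the 57.6% quoted above.

```python
from bisect import bisect_right
from typing import Dict, Iterable, Sequence


def size_distribution(sizes: Iterable[int],
                      edges: Sequence[int] = (64, 128, 256, 512, 1024, 1500)
                      ) -> Dict[str, float]:
    """Return the fraction of packets falling into each size bucket."""
    data = list(sizes)
    if not data:
        raise ValueError("at least one packet size is required")
    labels = ([f"<{edges[0]}"]
              + [f"{lo}-{hi - 1}" for lo, hi in zip(edges, edges[1:])]
              + [f">={edges[-1]}"])
    counts = [0] * len(labels)
    for size in data:
        # bisect_right places a size equal to an edge into the higher bucket.
        counts[bisect_right(edges, size)] += 1
    return {label: c / len(data) for label, c in zip(labels, counts)}


# Invented sample: 576 small packets and 424 full-size packets.
dist = size_distribution([48] * 576 + [1500] * 424)
print(f"{dist['<64']:.1%} of packets are smaller than 64 bytes")
```

A monitoring system could compare the `<64` share against its usual value and treat a sharp rise as one indicator of a SYN-based attack, alongside the other features discussed in this example.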

In addition, Table 1 confirms that this is a scanning attack, because the traffic fans out from a single source address to many destination addresses.

Start time	End time	Protocol	Source IP	Source port	Source interface	Destination IP	Destination port	Destination interface	Size	Packets
13:11:45	15:05:20	TCP	10.159.58.6	2115	0	10.159.254.6	80	0	48	102
13:11:46	20:45:55	TCP	10.159.58.6	2115	0	10.159.254.7	80	0	48	26
13:11:46	16:42:20	TCP	10.159.58.6	2115	0	10.159.254.8	80	0	48	30
13:11:47	16:42:20	TCP	10.159.58.6	2115	0	10.159.254.9	80	0	48	30
13:11:48	13:54:10	TCP	10.159.58.6	2115	0	10.159.254.10	80	0	48	20
13:11:49	13:54:10	TCP	10.159.58.6	2115	0	10.159.254.11	80	0	48	28
13:11:50	04:26:31	TCP	10.159.58.6	2115	0	10.159.254.12	80	0	48	6
13:11:52	04:26:31	TCP	10.159.58.6	2115	0	10.159.254.13	80	0	48	10
13:11:53	03:12:04	TCP	10.159.58.6	2115	0	10.159.254.14	80	0	48	46
13:11:54	09:40:20	TCP	10.159.58.6	2115	0	10.159.254.15	80	0	48	82

Table 1: Data flow details
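The fan-out pattern in Table 1 can be checked mechanically. The following Python sketch counts distinct destination addresses per source, protocol, and destination port, and reports sources that reach many targets. The `Flow` record, the function name, and the threshold of 10 targets are assumptions for illustration; the flow values are taken from Table 1, plus one unrelated flow for contrast.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Set, Tuple


class Flow(NamedTuple):
    protocol: str
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    size: int


def find_scanners(flows: Iterable[Flow],
                  min_targets: int = 10) -> Dict[Tuple[str, str, int], int]:
    """Map (source IP, protocol, destination port) to its number of distinct
    destination IPs, keeping only entries with at least min_targets."""
    targets: Dict[Tuple[str, str, int], Set[str]] = defaultdict(set)
    for f in flows:
        targets[(f.src_ip, f.protocol, f.dst_port)].add(f.dst_ip)
    return {key: len(dsts) for key, dsts in targets.items()
            if len(dsts) >= min_targets}


# Flows from Table 1: one source, port 80, destinations .6 through .15.
table1 = [Flow("TCP", "10.159.58.6", 2115, f"10.159.254.{i}", 80, 48)
          for i in range(6, 16)]
# One ordinary flow that should not be reported.
other = [Flow("TCP", "10.159.36.200", 1821, "10.159.254.15", 13667, 40)]
print(find_scanners(table1 + other))
```

Only the source 10.159.58.6 is reported, with 10 distinct targets on TCP port 80. In practice the threshold would need tuning to the network, since busy legitimate clients can also contact many servers.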

During gene detection, the system generated an alarm indicating that host 10.159.40.208 in the network was exhibiting attack behavior. Checking the flow table (Table 2) showed that the host was sending a large number of SNMP packets to network devices. Because this computer is not one of the company's network management servers, such packets should not appear under normal conditions. The computer was then inspected, and SolarWinds was found to have been installed on it without authorization.

Start time	End time	Protocol	Source IP	Source port	Source interface	Destination IP	Destination port	Destination interface	Size
09:04:41	09:04:42	UDP	10.159.40.208	162	0	10.159.3.6	161	0	256
09:04:41	09:04:42	UDP	10.159.40.208	162	0	10.159.3.6	161	0	256
09:04:41	09:04:42	TCP	10.159.40.208	8630	0	10.159.254.89	80	0	1500
09:04:42	09:04:43	UDP	10.159.40.208	162	0	10.159.3.7	161	0	256
09:04:42	09:04:43	UDP	10.159.40.208	162	0	10.159.3.7	161	0	256
09:04:42	09:04:43	UDP	10.159.40.208	162	0	10.159.3.7	161	0	256

Table 2: Data traffic details
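The check applied to Table 2, SNMP traffic from a host that is not a management server, can be expressed as a simple rule. In the Python sketch below, the function name, the tuple layout, and the management server address 10.159.3.100 are hypothetical; the flows are the rows of Table 2, and UDP port 161 is the standard SNMP agent port.

```python
from collections import Counter
from typing import Dict, Iterable, Set, Tuple

SNMP_PORT = 161  # standard UDP port on which SNMP agents listen

# (protocol, source IP, destination IP, destination port)
FlowRow = Tuple[str, str, str, int]


def unauthorized_snmp_sources(flows: Iterable[FlowRow],
                              management_hosts: Set[str]) -> Dict[str, int]:
    """Count SNMP flows per source that is not an approved management host."""
    counts: Counter = Counter()
    for protocol, src_ip, _dst_ip, dst_port in flows:
        if (protocol == "UDP" and dst_port == SNMP_PORT
                and src_ip not in management_hosts):
            counts[src_ip] += 1
    return dict(counts)


table2 = [
    ("UDP", "10.159.40.208", "10.159.3.6", 161),
    ("UDP", "10.159.40.208", "10.159.3.6", 161),
    ("TCP", "10.159.40.208", "10.159.254.89", 80),
    ("UDP", "10.159.40.208", "10.159.3.7", 161),
    ("UDP", "10.159.40.208", "10.159.3.7", 161),
    ("UDP", "10.159.40.208", "10.159.3.7", 161),
]
# Hypothetical address of the company's real network management server.
print(unauthorized_snmp_sources(table2, management_hosts={"10.159.3.100"}))
```

The five UDP flows to port 161 are attributed to the source host, and the TCP flow to port 80 is ignored. Adding the source address to `management_hosts` would silence the report, which is how an authorized management station would be handled.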

A few minutes later the detection system raised an abnormal traffic alarm, indicating abnormal communication, possibly from a Trojan program, on computer 10.159.36.15 in the local area network. Table 3 shows that the host with IP 10.159.36.15 has a large amount of TCP communication with the remote host 202.204.23.45, and that this communication changes in a regular pattern: the local host attempts TCP connections from its port 4000 to different ports on the remote server 202.204.23.45. Port 4000 is a special port: it is the connection port of the Remote-Anything Trojan. The system therefore sent an alarm message, and host 10.159.36.15 was inspected.

Start time	End time	Protocol	Source IP	Source port	Source interface	Destination IP	Destination port	Destination interface	Size
09:04:41	09:04:42	TCP	10.159.36.15	4000	0	202.204.23.45	1829	0	1500
09:04:41	09:04:42	TCP	10.159.36.15	4000	0	202.204.23.45	1829	0	1020
09:04:41	09:04:42	TCP	10.159.36.15	4000	0	202.204.23.45	1829	0	1500
09:04:42	09:04:43	TCP	10.159.36.200	1821	0	10.159.254.15	13667	0	40
09:04:42	09:04:43	TCP	10.159.36.15	4000	0	202.204.23.45	1464	0	1500
09:04:42	09:04:43	TCP	10.159.36.15	4000	0	202.204.23.45	1465	0	1500

Table 3: Data flow details
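The pattern in Table 3, one local port talking to several ports on the same remote host, can be detected with a watch list of source ports. In the Python sketch below, the watch list contents, the function name, and the minimum of two distinct remote ports are assumptions for illustration; port 4000 and the flows come from the example above.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

# Local ports to watch; 4000 is the port named in the example above.
WATCHED_SOURCE_PORTS = {4000}

# (source IP, source port, destination IP, destination port)
FlowRow = Tuple[str, int, str, int]


def watched_port_alerts(flows: Iterable[FlowRow],
                        min_dst_ports: int = 2
                        ) -> Dict[Tuple[str, int, str], List[int]]:
    """Report hosts that use a watched local port to reach several different
    ports on the same remote host, with the sorted list of those ports."""
    seen: Dict[Tuple[str, int, str], Set[int]] = defaultdict(set)
    for src_ip, src_port, dst_ip, dst_port in flows:
        if src_port in WATCHED_SOURCE_PORTS:
            seen[(src_ip, src_port, dst_ip)].add(dst_port)
    return {key: sorted(ports) for key, ports in seen.items()
            if len(ports) >= min_dst_ports}


table3 = [
    ("10.159.36.15", 4000, "202.204.23.45", 1829),
    ("10.159.36.15", 4000, "202.204.23.45", 1829),
    ("10.159.36.15", 4000, "202.204.23.45", 1829),
    ("10.159.36.200", 1821, "10.159.254.15", 13667),
    ("10.159.36.15", 4000, "202.204.23.45", 1464),
    ("10.159.36.15", 4000, "202.204.23.45", 1465),
]
print(watched_port_alerts(table3))
```

A port match alone is weak evidence, because legitimate software may also use port 4000, so an alert of this kind is best treated as a prompt to inspect the host, as the example goes on to do.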

The task manager of the host was checked first, and a Slave.exe program was found (FIG. 5). This is not a process the host should be running, so it was suspected that a Trojan had been installed. Norton antivirus software was installed to scan the host, and it reported that the Remote-Anything Trojan had been found. A summary of Remote-Anything follows. Remote-Anything is a remote control program produced by TWD Industries that performs remote desktop management over TCP/IP. It includes files such as Master.exe (the client), Slave.exe (the server), and uninstall_slave (an uninstaller, for use if the server was run by mistake). As a remote control tool it is very powerful: it can control a remote PC over a local area network or the Internet and provides remote file transfer, real-time screen capture, registry editing, startup control, text chat, and other functions. The server is only 83 KB in size, and the default port is 4000.

The system designed according to this scheme can meet an enterprise's daily requirements for gene detection, can better handle network attack behaviors inside the enterprise such as abnormal traffic, scanning attacks, and DDoS attacks, and has high practical value for the early discovery of worm viruses and the like.

One problem was also found during testing. If the enterprise runs a Proxy service, many operations of the Proxy server resemble the attack types defined in the system. For example, many clients send small packets to the server to determine whether it is active, and the server sometimes sends test packets to many clients. The system also detects these as attack behaviors, so an exception for this type of service must be added to the database.
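The exception mechanism described above can be sketched as a filter applied to alerts before they are reported. In the Python sketch below, the in-memory set stands in for the database mentioned in the text, and the alert fields, the attack type labels, and the proxy address 10.159.1.10 are hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple

Alert = Dict[str, str]  # assumed fields: "host" and "attack_type"


def apply_exceptions(alerts: Iterable[Alert],
                     exceptions: Set[Tuple[str, str]]) -> List[Alert]:
    """Drop alerts whose (host, attack type) pair is a registered exception."""
    return [a for a in alerts
            if (a["host"], a["attack_type"]) not in exceptions]


# Hypothetical exception: the proxy server may fan out test packets.
exceptions = {("10.159.1.10", "scan")}
alerts = [
    {"host": "10.159.1.10", "attack_type": "scan"},  # proxy keep-alive checks
    {"host": "10.159.58.6", "attack_type": "scan"},  # the scan from Table 1
    {"host": "10.159.1.10", "attack_type": "trojan_port"},
]
for alert in apply_exceptions(alerts, exceptions):
    print(alert)
```

Keying the exception on both the host and the attack type keeps the proxy server visible to every other detection rule, so a genuinely compromised proxy would still raise alerts of other kinds.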

The technical solutions of the present invention or similar technical solutions designed by those skilled in the art based on the teachings of the technical solutions of the present invention are all within the scope of the present invention.

Claims

1. A network flow gene detection method is characterized by comprising the following steps:

Step one: network-based traffic data analysis;

Step two: detecting intrusion detection system information;

Step three: further mining of intrusion data;

Step four: common network attack modes and characteristic analysis;

Step five: network attack behaviors are restrained;

Step six: the task manager of the host is checked.

2. The method for gene detection based on network flow according to claim 1, wherein in step one, the network-based traffic analysis technology is mainly based on SNMP and NetFlow; basic characteristic data are extracted from the network traffic, such as traffic volume, packet length information, protocol information, port traffic information, and TCP flag bit information; these basic features describe the operating state of the network traffic in detail.

3. The method according to claim 1, wherein in step two, the detection technologies in the intrusion detection system mainly include misuse detection and anomaly detection, together with the statistical analysis process used in anomaly detection, expert-system-based technology, and statistical-analysis-based technology.

4. The method for gene detection based on network traffic as claimed in claim 1, wherein in step three, data mining technology is applied in the intrusion detection system, data mining being the process of using analysis tools to extract hidden, potentially useful information and knowledge from a large amount of data; it mainly comprises three processing modes: association analysis, sequence analysis, and cluster analysis; for a particular attack, a subset of the basic features involved in the attack is used as the combined features describing the attack. For example, for a SYN flood attack, the combined features may include pkts/s, average packet length, number of SYN packets, and the like. The characteristics of the attack behavior are learned and trained using data from the basic feature set, so that normal and abnormal models of the combined features of the attack behavior are obtained in real time. The models can be used to detect attack behavior on the network in real time.

5. The method for gene detection based on network traffic as claimed in claim 1, wherein in step four, the common network attack modes and the characteristics of the various attack techniques are mainly those of DDoS attacks, network scanning attacks, and worm attacks; learning from data sets of known attack types and behaviors can also optimize the manually selected combined attack features so that they better reflect the characteristics of the attack behavior.

6. The method as claimed in claim 1, wherein in step five, the network attack behavior is suppressed through an access control list (ACL); the data set is obtained by extracting network traffic in real time and truly reflects the real-time state of the network, so that sharing the data set provides a platform for cooperative operation and control among anomaly detection systems in different management domains of the network.

7. The method according to claim 1, wherein in step six, the task manager of the host is checked to find non-compliant traffic applications, and virus scanning and removal are performed.

8. The method as claimed in claim 1, wherein in step two, about 100 reserved entries are kept in the basic feature set for future expansion; the reserved entries and the extracted content together form a basic feature set of 256 entries.