CN114024748A - Efficient Ethereum traffic identification method combining an active node library and machine learning

Info

Publication number
CN114024748A
CN114024748A CN202111302612.2A CN202111302612A CN114024748A CN 114024748 A CN114024748 A CN 114024748A CN 202111302612 A CN202111302612 A CN 202111302612A CN 114024748 A CN114024748 A CN 114024748A
Authority
CN
China
Prior art keywords
flow
Ethereum
active node
node library
machine learning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111302612.2A
Other languages
Chinese (zh)
Other versions
CN114024748B (en)
Inventor
胡晓艳
舒卓卓
童钟奇
程光
吴桦
龚俭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southeast University
Original Assignee
Southeast University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southeast University
Priority to CN202111302612.2A
Publication of CN114024748A
Application granted
Publication of CN114024748B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H04L47/24 Traffic characterised by specific attributes, e.g. priority or QoS
    • H04L47/2441 Traffic characterised by specific attributes, e.g. priority or QoS relying on flow classification, e.g. using integrated services [IntServ]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H04L47/24 Traffic characterised by specific attributes, e.g. priority or QoS
    • H04L47/2483 Traffic characterised by specific attributes, e.g. priority or QoS involving identification of individual flows
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/02 Network architectures or network communication protocols for network security for separating internal from external traffic, e.g. firewalls
    • H04L63/0227 Filtering policies
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1433 Vulnerability analysis
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/16 Implementation or adaptation of Internet protocol [IP], of transmission control protocol [TCP] or of user datagram protocol [UDP]
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 Reducing energy consumption in communication networks
    • Y02D30/50 Reducing energy consumption in communication networks in wire-line communication networks, e.g. low power modes or reduced link rate

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Hardware Design (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Medical Informatics (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention provides an efficient Ethereum traffic identification method combining an active node library and machine learning. The method is divided into four parts. The first part is the construction of the active node library. The second part is the training of the identification model. The third part is a comparative analysis of different machine learning algorithms, in which the model trained with the algorithm best suited to the classification is selected as the identification model. The fourth part is Ethereum traffic identification: traffic is screened by the active node library, divided into TCP flows and UDP flows, and input into the identification model for identification, while the node information in the Ethereum active node library is updated according to the identification result. The invention can effectively identify the Ethereum traffic present in the current network, with a monitoring accuracy reaching 99%, and makes it convenient for network managers to monitor Ethereum network traffic.

Description

Efficient Ethereum traffic identification method combining an active node library and machine learning
Technical Field
The invention belongs to the technical field of cyberspace security, and relates to an efficient Ethereum traffic identification method combining an active node library and machine learning.
Background
A blockchain is a distributed ledger technology that is jointly maintained by multiple parties and uses cryptography to secure transmission and access. It provides consistent data storage, tamper resistance and non-repudiation for the data in the ledger. Blockchain technology offers a new solution to the trust, security and efficiency problems of the Internet, and also brings new opportunities and challenges to the development of industries such as finance.
Since Satoshi Nakamoto first proposed blockchain technology, blockchain industries led by cryptocurrencies such as Bitcoin and Ethereum have developed rapidly. According to statistics of the China Institute of Electronic Information Industry Development, the scale of the blockchain industry in China reached 48.5 billion yuan in 2020, a growth rate of 48.5% over the previous year. With the rapid growth of the industry, potential security supervision problems in blockchains have also been exposed. Firstly, blockchain digital currency provides a safe and stable laundering channel for crimes such as money laundering and ransomware, and has greatly promoted the development of the dark web and the underground economy. Secondly, blockchain digital currency makes cross-border fund transfers simpler, which affects the stability of the financial markets of various countries. Finally, because blockchains are decentralized and tamper-proof, they are often used to store and transmit sensitive information, which seriously harms the health of the network environment. The abuse of blockchains not only harms national security and social stability, but also poses great threats and challenges to network security supervision.
As a representative blockchain application, Bitcoin supports application development through a script engine. Bitcoin is therefore limited by the expressive power of its scripting language and can hardly support the development of complex contracts, which greatly restricts its capabilities. Ethereum, built on the Ethereum Virtual Machine (EVM), abstracts the blockchain system into a transaction-based state machine and supports recording arbitrary information and executing arbitrary functions using a smart contract programming language. In the 23rd Global Public Chain Technology Evaluation Index released by the China Institute of Electronic Information Industry Development, Ethereum ranked first in the applicability evaluation of 37 public chains. Compared with other blockchain implementations such as Bitcoin, Ethereum better supports the development of distributed blockchain applications and has higher research value and significance.
However, blockchain regulatory technology lags behind the rapid development of the blockchain industry. Most existing research on blockchain security addresses the blockchain technology itself, such as blockchain attack methods, blockchain design vulnerabilities and blockchain application directions, and lacks analysis of blockchain security at the level of network traffic supervision. Ethereum, as the most applicable blockchain platform, will continue to develop as blockchain technology matures. Measuring and analyzing Ethereum network traffic and exploring Ethereum security supervision schemes are therefore of great significance to the network security of Ethereum and of blockchains in general.
Therefore, the invention collects the Ethereum traffic in the network by constructing a library of active Ethereum nodes in the network. The traffic is then divided into TCP traffic and UDP traffic, each with its own identification features. Finally, a random forest algorithm is used to distinguish normal traffic from Ethereum traffic.
Disclosure of Invention
In order to supervise Ethereum effectively and identify Ethereum traffic, and to address the concealment of Ethereum traffic features, the invention provides an efficient Ethereum traffic identification method combining an active node library and machine learning. The method first initializes the active node library from the Ethereum core node library, relying on the inherent small-world property of Ethereum. An active node library containing the Ethereum nodes currently in an active state is then built on the basis of the core node library. Features are extracted separately for the UDP-based Ethereum node discovery process and the TCP-based Ethereum data transmission process, and the Ethereum traffic is identified by a machine learning method. Finally, using the selected features and the trained model, traffic is filtered through the active node library and then input into the identification model to complete the identification of Ethereum traffic. To achieve this purpose, the invention provides the following technical scheme:
An efficient Ethereum traffic identification method combining an active node library and machine learning comprises the following steps:
(1) Based on the assumption that the total number of Ethereum nodes in the supervised area tends to converge, the active node library stores the information of the Ethereum nodes in the current area. Ethereum core node information is collected to initialize the active node library, and Ethereum traffic is obtained.
(2) Flow features are selected separately for Ethereum UDP traffic and TCP traffic, and supplementary flow features are extracted according to the characteristics of the Ethereum node discovery protocol (NDP) and RLPx.
(3) Accurate identification of Ethereum traffic is realized by a machine learning method, and a data set is constructed to test and evaluate the resulting model.
(4) On the basis of the constructed active node library and the obtained identification model, the input traffic is identified as Ethereum traffic or not, and the active node library is updated accordingly.
Further, the step (1) specifically includes the following substeps:
(1.1) acquiring the information of all currently public Ethereum core nodes through a web crawler, and storing it in a core node library in the form of IP addresses;
(1.2) initializing the active node library from the information collected in the core node library;
(1.3) dynamically updating the information of known nodes through the nodes in the active node library, thereby obtaining the information of the Ethereum nodes in the whole supervised area;
(1.4) setting an expiration time and removing from the active node library the Ethereum nodes that have been inactive for a long time, so that the active node library stays current and traffic screening becomes more efficient;
(1.5) modifying NodeFinder, a tool for discovering Ethereum nodes, so that it communicates with the discovered Ethereum nodes;
(1.6) capturing the Ethereum traffic at the intermediate router.
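As an illustration of substeps (1.2) to (1.4), the following is a minimal sketch of an active node library kept as a map from IP address to last-seen time. The class name `ActiveNodeLibrary`, its methods, the `ttl_seconds` parameter and the example addresses are all hypothetical and are not taken from the patent; the patent does not specify the data structure or the expiration value.

```python
import time
from typing import Dict, Iterable, Optional


class ActiveNodeLibrary:
    """Hypothetical active node library: node IP -> time the node was last seen."""

    def __init__(self, core_node_ips: Iterable[str], ttl_seconds: float,
                 now: Optional[float] = None) -> None:
        self.ttl_seconds = ttl_seconds
        start = time.time() if now is None else now
        # Seed the active library from the core node library (substep 1.2).
        self._last_seen: Dict[str, float] = {ip: start for ip in core_node_ips}

    def touch(self, ip: str, now: Optional[float] = None) -> None:
        """Add a newly discovered node or refresh a known one (substep 1.3)."""
        self._last_seen[ip] = time.time() if now is None else now

    def expire(self, now: Optional[float] = None) -> int:
        """Drop nodes inactive for longer than the TTL (substep 1.4)."""
        current = time.time() if now is None else now
        stale = [ip for ip, seen in self._last_seen.items()
                 if current - seen > self.ttl_seconds]
        for ip in stale:
            del self._last_seen[ip]
        return len(stale)

    def __contains__(self, ip: str) -> bool:
        return ip in self._last_seen

    def __len__(self) -> int:
        return len(self._last_seen)


# Example with explicit timestamps so the behaviour is deterministic.
library = ActiveNodeLibrary(["192.0.2.1", "192.0.2.2"], ttl_seconds=3600, now=0.0)
library.touch("198.51.100.7", now=1800.0)   # node discovered later
removed = library.expire(now=4000.0)        # the two seed nodes are now stale
```

Passing `now` explicitly is only a convenience for testing; a deployment would presumably rely on the wall clock and choose the expiration time from observed node churn.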
Further, the step (2) specifically includes the following substeps:
(2.1) using mutual information as the measure of feature relevance, selecting the 10 features with the highest mutual information value for Ethereum UDP traffic and for Ethereum TCP traffic respectively;
(2.2) analyzing the packet structure of Ethereum UDP traffic to obtain UDP traffic features;
(2.3) analyzing the encryption handshake (EncHandshake) process of the RLPx protocol to obtain Ethereum TCP traffic features.
Further, the selected Ethereum TCP and UDP traffic features are shown in Table 1 and Table 2 below:
[Tables 1 and 2 appear as images in the original publication (BDA0003338926600000031, BDA0003338926600000041) and are not reproduced here.]
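The mutual-information ranking used for feature selection can be sketched as follows. This is only an illustration under stated assumptions: features are taken to be already discretized (continuous flow statistics would first need binning), the function names `mutual_information` and `top_k_features` are hypothetical, and the three toy feature columns are invented for the example rather than taken from the patent's tables.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence


def mutual_information(xs: Sequence[int], ys: Sequence[int]) -> float:
    """I(X;Y) in bits for two equal-length sequences of discrete values."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # p(x,y) * log2(p(x,y) / (p(x) p(y))), written with raw counts.
    return sum((c / n) * math.log2(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())


def top_k_features(columns: Dict[str, Sequence[int]], labels: Sequence[int],
                   k: int = 10) -> List[str]:
    """Names of the k features sharing the most information with the labels."""
    scores = {name: mutual_information(col, labels) for name, col in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy data: label 1 = Ethereum flow, 0 = background flow (invented values).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
columns = {
    "noise":       [0, 1, 0, 1, 0, 1, 0, 1],   # independent of the label
    "partial":     [1, 1, 1, 0, 0, 0, 0, 1],   # agrees with the label 6 times out of 8
    "informative": [1, 1, 1, 1, 0, 0, 0, 0],   # identical to the label
}
selected = top_k_features(columns, labels, k=2)
```

With the default `k=10` the same function would return the ten highest-scoring features, which is the selection size the text describes for each protocol.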
Further, the step (3) specifically includes the following substeps:
(3.1) combining the obtained Ethereum traffic with the various application traffic in the public data set VPN-nonVPN to form the data set ETI required by the experiment;
(3.2) dividing the data set into a training set and a test set at a ratio of 8:2, and evaluating four machine learning algorithms (support vector machine, random forest, logistic regression and K-nearest neighbors) on multiple metrics.
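The 8:2 split and the multi-metric evaluation in substep (3.2) might look like the sketch below. The text does not list which metrics are used, so accuracy, precision, recall and F1 are an assumption here; the helper names `split_8_2` and `evaluate` and the example labels are hypothetical. The four classifiers themselves are not implemented; the predictions of any of them can be passed to `evaluate`.

```python
import random
from typing import Dict, List, Sequence, Tuple


def split_8_2(samples: Sequence, seed: int = 0) -> Tuple[List, List]:
    """Shuffle the samples and split them 8:2 into training and test sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 8 // 10
    return shuffled[:cut], shuffled[cut:]


def evaluate(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Binary metrics with label 1 = Ethereum traffic, 0 = background traffic."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": (tp + tn) / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}


train, test = split_8_2(range(100))
metrics = evaluate([1, 1, 1, 0, 0, 0, 0, 1],    # invented ground truth
                   [1, 1, 0, 0, 0, 1, 0, 1])    # invented predictions
```

Running `evaluate` once per candidate algorithm on the same test set gives directly comparable rows for selecting the identification model.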
Further, the step (4) specifically includes the following substeps:
(4.1) screening unidentified traffic through the active node library, inputting it into the identification model, and outputting a judgment of whether the traffic is Ethereum traffic;
(4.2) updating the node information of the Ethereum active node library according to the identification result.
Compared with the prior art, the invention has the following advantages and beneficial effects:
(1) The invention can effectively identify the Ethereum traffic present in the current network, with a monitoring accuracy reaching 99%, and makes it convenient for network managers to monitor Ethereum network traffic.
(2) The invention separates TCP and UDP flows and analyzes the packet structure of each, so that the most suitable classification features are obtained; combined with mutual-information-based selection, this effectively improves monitoring accuracy.
(3) The method constructs an Ethereum active node library and screens out potential Ethereum traffic through it. Compared with a traffic detection method without active node library filtering, metrics such as accuracy and precision improve by 3% on average, and the time needed to detect the same amount of Ethereum traffic is less than 50% of that of the unfiltered method.
(4) Screening traffic through the Ethereum active node library effectively avoids a negative impact on identification performance.
Drawings
FIG. 1 is a schematic diagram of an experimental environment setup;
FIG. 2 is a schematic view of an identification framework;
FIG. 3 is a schematic representation of performance of different machine learning algorithms before and after screening using an active node library for UDP stream identification;
FIG. 4 is a schematic representation of performance of different machine learning algorithms before and after screening using an active node library for TCP flow identification;
FIG. 5 shows the identification time consumption, wherein (a) is the UDP traffic identification efficiency graph and (b) is the TCP traffic identification efficiency graph.
Detailed Description
The technical solutions provided by the present invention will be described in detail below with reference to specific examples, and it should be understood that the following specific embodiments are only illustrative of the present invention and are not intended to limit the scope of the present invention.
Example 1: the invention provides an efficient Ethereum traffic identification method combining an active node library and machine learning. As shown in FIG. 2, the identification framework is divided into four parts. The first part is the construction of the active node library: the information of the core nodes that keep Ethereum running stably is stored in a core node library, the active node library is initialized from it, and operations such as searching, adding and deleting active nodes are carried out to build the active node library. A traffic collection unit is then deployed through the active node library to collect Ethereum traffic. The second part is the training of the identification model: after the Ethereum traffic is divided into TCP and UDP flows, their packet structures are analyzed separately, the Ethereum traffic identification features best suited to classification are obtained through verification on real data, and a mutual information metric is used for feature screening. After feature selection is completed, the obtained Ethereum traffic and background traffic are used as the data set and divided into a training set and a test set. The third part is a comparative analysis of different machine learning algorithms, in which the model trained with the algorithm best suited to the classification is selected as the identification model. The fourth part is Ethereum traffic identification: traffic is screened by the active node library, divided into TCP and UDP flows, and then input into the identification model for identification.
Specifically, the identification method comprises the following steps:
(1) Construct an active node library, and build an experimental environment to collect the relevant Ethereum traffic.
The specific process of the step is as follows:
(1.1) acquiring the information of all currently public Ethereum core nodes using a web crawler, and storing it in a core node library in the form of IP addresses;
(1.2) initializing the active node library with the information in the core node library, continuously searching for the currently active Ethereum nodes, and dynamically updating the node information in the active node library;
(1.3) setting an expiration time for each active node and removing Ethereum nodes that have been inactive for a long time, so that the active node library stays current;
(1.4) modifying NodeFinder, a tool for discovering Ethereum nodes, so that it communicates with the discovered Ethereum nodes;
(1.5) capturing the Ethereum traffic on the intermediate router with Wireshark;
(1.6) using the various application traffic in the public data set VPN-nonVPN as background traffic.
(2) Divide the original Ethereum traffic into TCP and UDP flows, analyze the packet structure of each, extract features of the complete flow data that can be used for identification and classification, and screen them with mutual information, retaining the features useful for identification and classification.
The specific process in this step is as follows:
(2.1) Divide the original Ethereum traffic into TCP and UDP flows.
(2.2) Screen features using the mutual information index on the basis of the 80 common flow statistical features proposed by Draper et al., and select the ten features with the highest mutual information for TCP flows and for UDP flows respectively.
(2.3) Analyze the packet structure of Ethereum UDP traffic. The packet lengths in a UDP flow follow a strict order, and the lengths of each packet type have a distinct, stable distribution, so the lengths of the first eight packets in the UDP flow are extracted as features. The names and meanings of the 18 extracted Ethereum UDP traffic features are shown in Table 3.
[Table 3 appears as an image in the original publication (BDA0003338926600000071) and is not reproduced here.]
(2.4) Analyze the Ethereum TCP traffic interaction process. An Ethereum TCP flow is found to contain many packets with equal payload lengths, and the payloads of packets carrying a header are typically combinations of 32 B, 1 B and 12 B. Two features are therefore used: the average payload length of the two packets in the encryption handshake stage, and the proportion of packets with payload lengths of 32 B, 1 B and 12 B among all packets.
(2.5) The names and meanings of the 12 extracted Ethereum TCP traffic features are shown in Table 4.
[Table 4 appears as an image in the original publication (BDA0003338926600000072) and is not reproduced here.]
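The protocol-specific features described in this step can be illustrated with a short sketch. Several points are assumptions and not stated in the text: short UDP flows are zero-padded to eight values, the encryption handshake is taken to be the first two payload-carrying packets of the TCP flow, and the function names, dictionary keys and example lengths are hypothetical.

```python
from typing import Dict, List, Sequence

UDP_PREFIX_PACKETS = 8              # lengths of the first eight UDP packets
TCP_MARKER_LENGTHS = (32, 1, 12)    # payload lengths in bytes named in the text


def udp_length_features(packet_lengths: Sequence[int]) -> List[int]:
    """First eight packet lengths of a UDP flow, zero-padded (padding is an assumption)."""
    head = list(packet_lengths[:UDP_PREFIX_PACKETS])
    return head + [0] * (UDP_PREFIX_PACKETS - len(head))


def tcp_payload_features(payload_lengths: Sequence[int],
                         handshake_packets: int = 2) -> Dict[str, float]:
    """Mean handshake payload length and share of 32/1/12-byte payloads in a TCP flow."""
    if not payload_lengths:
        return {"handshake_mean_payload": 0.0, "marker_length_ratio": 0.0}
    # Assumption: the handshake is the first `handshake_packets` payloads.
    handshake = payload_lengths[:handshake_packets]
    markers = sum(1 for n in payload_lengths if n in TCP_MARKER_LENGTHS)
    return {
        "handshake_mean_payload": sum(handshake) / len(handshake),
        "marker_length_ratio": markers / len(payload_lengths),
    }


# Invented packet lengths, for illustration only.
udp = udp_length_features([98, 120, 60])
tcp = tcp_payload_features([307, 210, 32, 1, 12, 32, 500, 12])
```

These values would be appended to the ten mutual-information features selected for each protocol, which matches the totals of 18 UDP features and 12 TCP features given in the text.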
(3) After feature selection is completed, use the previously obtained Ethereum traffic and background traffic as the data set and divide it into a training set and a test set. Compare different machine learning algorithms, and select the model trained with the algorithm best suited to the classification as the identification model.
The specific process in this step is as follows:
(3.1) Construct an Ethereum traffic data set from the data collected in step (1), and divide it into a training set and a test set at a ratio of 8:2. By comparing metrics such as the accuracy of models built with algorithms such as random forest, K-nearest neighbors and naive Bayes, the random forest algorithm, which has the highest accuracy, is selected. At the same time, the identification results before and after screening the traffic with the active node library are compared. Identification accuracy after screening with the active node library improves by 3% on average; the detailed results are shown in FIGS. 3 and 4.
(3.2) Evaluate the time consumption of the identification method combining the active node library and machine learning against Ethereum traffic identification by a traditional method. Compared with the traditional detection method, the combined method reduces the time consumed by more than 50%; the detailed results are shown in FIG. 5.
(4) After screening the traffic through the active node library, divide it into TCP and UDP flows, input them into the identification model for identification, and update the information in the Ethereum active node library according to the identification result.
The method specifically comprises the following steps:
(4.1) Extract the source and destination IP addresses of the traffic to be examined, look them up in the active node library, and determine whether the active node library contains them.
(4.2) If a relevant IP address is contained, divide the traffic into TCP and UDP flows, extract the relevant features, and input the flows into the corresponding identification model obtained in step (3) for identification.
(4.3) If the traffic is identified as Ethereum traffic and its source or destination IP is not in the active node library, add the relevant IP address information to the active node library as a new active node.
(4.4) Set an active time for each active node, and delete from the active node library any node that has not responded within the active time.
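Steps (4.1) to (4.4) can be read as a small screening and update loop, sketched below. The `Flow` record, the `identify` and `prune` functions, and the threshold lambdas standing in for the trained TCP and UDP models are all hypothetical. Refreshing the timestamp of an already known node when it is seen again is an added assumption; the text only describes adding new nodes and deleting unresponsive ones.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Sequence

Classifier = Callable[[Sequence[float]], bool]   # True means "Ethereum traffic"


@dataclass
class Flow:
    src_ip: str
    dst_ip: str
    protocol: str                 # "TCP" or "UDP"
    features: Sequence[float]     # features already extracted for this flow


def identify(flow: Flow, active_nodes: Dict[str, float],
             models: Dict[str, Classifier], now: float) -> bool:
    # (4.1) Screen: skip flows whose endpoints are both unknown to the library.
    if flow.src_ip not in active_nodes and flow.dst_ip not in active_nodes:
        return False
    # (4.2) Route the flow to the model trained for its transport protocol.
    model = models.get(flow.protocol)
    if model is None or not model(flow.features):
        return False
    # (4.3) Record both endpoints; known nodes get a refreshed timestamp (assumption).
    for ip in (flow.src_ip, flow.dst_ip):
        active_nodes[ip] = now
    return True


def prune(active_nodes: Dict[str, float], now: float, active_time: float) -> None:
    # (4.4) Remove nodes that have not been seen within the active time.
    for ip in [ip for ip, seen in active_nodes.items() if now - seen > active_time]:
        del active_nodes[ip]


# Stand-in models: a real deployment would call the trained classifiers here.
models = {"TCP": lambda f: f[0] > 0.5, "UDP": lambda f: f[0] > 0.5}
active = {"192.0.2.1": 0.0}
hit = identify(Flow("192.0.2.1", "203.0.113.9", "UDP", [0.9]), active, models, now=10.0)
miss = identify(Flow("198.51.100.1", "198.51.100.2", "TCP", [0.9]), active, models, now=11.0)
known_after_hit = sorted(active)
prune(active, now=200.0, active_time=100.0)
```

Because the library lookup happens before feature routing and classification, flows that touch no known node never reach the model, which is where the screening saves work.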
The technical means disclosed in the invention scheme are not limited to the technical means disclosed in the above embodiments, but also include the technical scheme formed by any combination of the above technical features. It should be noted that those skilled in the art can make various improvements and modifications without departing from the principle of the present invention, and such improvements and modifications are also considered to be within the scope of the present invention.

Claims (6)

1. An efficient Ethereum traffic identification method combining an active node library and machine learning, characterized by comprising the following steps:
(1) based on the assumption that the total number of Ethereum nodes in the supervised area tends to converge, storing the information of the Ethereum nodes in the current area in the active node library, collecting Ethereum core node information to initialize the active node library, and obtaining Ethereum traffic;
(2) selecting flow features separately for Ethereum UDP traffic and TCP traffic, and extracting supplementary flow features according to the characteristics of the Ethereum node discovery protocol (NDP) and RLPx;
(3) realizing accurate identification of Ethereum traffic by a machine learning method, and constructing a data set to test and evaluate the resulting model;
(4) identifying whether input traffic is Ethereum traffic on the basis of the constructed active node library and the obtained identification model.
2. The efficient Ethereum traffic identification method combining an active node library and machine learning according to claim 1, characterized in that step (1), in which Ethereum core node information is collected to initialize the active node library and Ethereum traffic is obtained, specifically comprises the following substeps:
(1.1) acquiring the information of all currently public Ethereum core nodes through a web crawler, and storing it in a core node library in the form of IP addresses;
(1.2) initializing the active node library from the information collected in the core node library;
(1.3) dynamically updating the information of known nodes through the nodes in the active node library, thereby obtaining the information of the Ethereum nodes in the whole supervised area;
(1.4) setting an expiration time, and removing from the active node library the Ethereum nodes that have been inactive for a long time;
(1.5) modifying NodeFinder, a tool for discovering Ethereum nodes, so that it communicates with the discovered Ethereum nodes;
(1.6) capturing the Ethereum traffic at the intermediate router.
3. The efficient Ethereum traffic identification method combining an active node library and machine learning according to claim 1, wherein step (2) comprises the following substeps:
(2.1) analyzing the packet structure of Ethereum UDP traffic to obtain UDP traffic features;
(2.2) analyzing the encryption handshake (EncHandshake) process of the RLPx protocol to obtain Ethereum TCP traffic features;
(2.3) using mutual information as the measure of feature relevance, selecting the 10 features with the highest mutual information value for Ethereum UDP traffic and for Ethereum TCP traffic respectively, and adding the relevant features obtained in substeps (2.1) and (2.2) to form the finally selected features.
4. The efficient Ethereum traffic identification method combining an active node library and machine learning according to claim 3, wherein the features selected in substep (2.3) are shown in the following tables:
[The tables appear as images in the original publication (FDA0003338926590000021, FDA0003338926590000022) and are not reproduced here.]
5. The efficient Ethereum traffic identification method combining an active node library and machine learning according to claim 1, wherein step (3) comprises the following sub-steps:
(3.1) combining the captured Ethereum traffic with the traffic of various applications in the public data set VPN-nonVPN to form the data set ETI required by the experiment;
(3.2) dividing the data set into a training set and a test set at a ratio of 8:2, and evaluating the method on multiple metrics using four machine learning algorithms: support vector machine, random forest, logistic regression, and K-nearest neighbors.
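The 8:2 split and multi-metric evaluation in sub-step (3.2) can be sketched as below. To stay dependency-free, the sketch implements only a small K-nearest-neighbors classifier; the other three algorithms named in the claim would normally come from a machine learning library. All function names, the fixed seed, and the choice of accuracy, precision, recall, and F1 as the metrics are assumptions, since the claim only says "multiple metrics".

```python
import random
from collections import Counter
from typing import Dict, List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]  # (feature vector, label); 1 = Ethereum


def split_8_2(data: List[Sample], seed: int = 0) -> Tuple[List[Sample], List[Sample]]:
    """Shuffle and split the data set into 80% training and 20% test samples."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 8 // 10
    return shuffled[:cut], shuffled[cut:]


def knn_predict(train: List[Sample], x: Sequence[float], k: int = 3) -> int:
    """Majority label among the k training samples closest to x (squared Euclidean)."""
    nearest = sorted(
        train, key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], x)))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def evaluate(train: List[Sample], test: List[Sample], k: int = 3) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for the positive (Ethereum) class."""
    tp = fp = fn = tn = 0
    for x, y in test:
        pred = knn_predict(train, x, k)
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1:
            fp += 1
        elif y == 1:
            fn += 1
        else:
            tn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": (tp + tn) / len(test), "precision": precision,
            "recall": recall, "f1": f1}
```

Swapping in another classifier only requires replacing `knn_predict` with a function of the same shape, which makes it straightforward to compare the four algorithms on identical splits.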
6. The efficient Ethereum traffic identification method combining an active node library and machine learning according to claim 1, wherein step (4) comprises the following sub-steps:
(4.1) screening the traffic through the active node library, inputting the traffic that remains unidentified into the identification model, and judging from the model output whether the traffic is Ethereum traffic;
(4.2) updating the node information of the Ethereum active node library according to the identification result.
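The two-stage identification in sub-steps (4.1) and (4.2) can be sketched as a library lookup followed by a model call. The function `identify_flow`, the endpoint representation, and the policy of adding both endpoints of a positively classified flow to the library are assumptions for illustration; the claim only states that the library is updated according to the identification result.

```python
from typing import Callable, Sequence, Set, Tuple

Endpoint = Tuple[str, int]  # (IP address, port)


def identify_flow(src: Endpoint, dst: Endpoint, features: Sequence[float],
                  active_nodes: Set[Endpoint],
                  model: Callable[[Sequence[float]], bool]) -> bool:
    """Return True if the flow is judged to be Ethereum traffic."""
    # Stage 1 (cf. step 4.1): a flow touching a known active node is accepted
    # without invoking the model, which is the cheap path.
    if src in active_nodes or dst in active_nodes:
        return True
    # Stage 2: traffic the library could not identify goes to the trained model.
    is_ethereum = model(features)
    # Cf. step 4.2: feed the result back into the library (assumed policy:
    # remember both endpoints of a flow classified as Ethereum).
    if is_ethereum:
        active_nodes.update((src, dst))
    return is_ethereum
```

Here `model` is any callable that maps a feature vector to a boolean, so a trained classifier can be wrapped in a small function and passed in.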
CN202111302612.2A 2021-11-04 2021-11-04 Efficient Ethereum traffic identification method combining active node library and machine learning Active CN114024748B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111302612.2A CN114024748B (en) 2021-11-04 2021-11-04 Efficient Ethereum traffic identification method combining active node library and machine learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111302612.2A CN114024748B (en) 2021-11-04 2021-11-04 Efficient Ethereum traffic identification method combining active node library and machine learning

Publications (2)

Publication Number Publication Date
CN114024748A true CN114024748A (en) 2022-02-08
CN114024748B CN114024748B (en) 2024-04-30

Family

ID=80061397

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111302612.2A Active CN114024748B (en) 2021-11-04 2021-11-04 Efficient Ethereum traffic identification method combining active node library and machine learning

Country Status (1)

Country Link
CN (1) CN114024748B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115442291A (en) * 2022-08-19 2022-12-06 南京理工大学 Ethernet-oriented active network topology sensing method

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102315974A (en) * 2011-10-17 2012-01-11 北京邮电大学 Method and apparatus for online identification of TCP and UDP flows based on hierarchical feature analysis
CN111082995A (en) * 2019-12-25 2020-04-28 中国科学院信息工程研究所 Ethereum network behavior analysis method, corresponding storage medium and electronic device
US20200311583A1 (en) * 2019-04-01 2020-10-01 Hewlett Packard Enterprise Development Lp System and methods for fault tolerance in decentralized model building for machine learning using blockchain
CN111865823A (en) * 2020-06-24 2020-10-30 东南大学 Lightweight Ethereum encrypted traffic identification method
CN112910918A (en) * 2021-02-26 2021-06-04 南方电网科学研究院有限责任公司 Industrial control network DDoS attack traffic detection method and device based on random forest
CN113344562A (en) * 2021-08-09 2021-09-03 四川大学 Method and device for detecting Ethereum phishing accounts based on deep neural network
CN113469275A (en) * 2021-07-21 2021-10-01 东南大学 Fine-grained classification method for Ethereum behavior traffic

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102315974A (en) * 2011-10-17 2012-01-11 北京邮电大学 Method and apparatus for online identification of TCP and UDP flows based on hierarchical feature analysis
US20200311583A1 (en) * 2019-04-01 2020-10-01 Hewlett Packard Enterprise Development Lp System and methods for fault tolerance in decentralized model building for machine learning using blockchain
CN111082995A (en) * 2019-12-25 2020-04-28 中国科学院信息工程研究所 Ethereum network behavior analysis method, corresponding storage medium and electronic device
CN111865823A (en) * 2020-06-24 2020-10-30 东南大学 Lightweight Ethereum encrypted traffic identification method
CN112910918A (en) * 2021-02-26 2021-06-04 南方电网科学研究院有限责任公司 Industrial control network DDoS attack traffic detection method and device based on random forest
CN113469275A (en) * 2021-07-21 2021-10-01 东南大学 Fine-grained classification method for Ethereum behavior traffic
CN113344562A (en) * 2021-08-09 2021-09-03 四川大学 Method and device for detecting Ethereum phishing accounts based on deep neural network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
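Hu Xiaoyan (胡晓艳) et al.: "Ethereum encrypted traffic identification method based on an active node library" (基于活跃节点库的以太坊加密流量识别方法), 网络空间安全 (Cyberspace Security), vol. 11, no. 8, 25 August 2020 (2020-08-25), pages 34 - 39 *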

Also Published As

Publication number Publication date
CN114024748B (en) 2024-04-30

Similar Documents

Publication Publication Date Title
CN109450842B (en) Network malicious behavior recognition method based on neural network
Silveira et al. URCA: Pulling out anomalies by their root causes
CN109117634A (en) Malware detection method and system based on network flow multi-view integration
CN113344562B (en) Method and device for detecting Ethereum phishing accounts based on deep neural network
CN102420723A (en) Anomaly detection method for various kinds of intrusion
Sivamohan et al. An effective recurrent neural network (RNN) based intrusion detection via bi-directional long short-term memory
CN107483451B (en) Method and system for processing network security data based on serial-parallel structure and social network
CN112800424A (en) Botnet malicious traffic monitoring method based on random forest
Chen et al. An effective metaheuristic algorithm for intrusion detection system
Bian et al. Host in danger? detecting network intrusions from authentication logs
CN110519228B (en) Method and system for identifying malicious cloud robot in black-production scene
CN117454376A (en) Industrial Internet data security detection response and tracing method and device
CN114024748B (en) Efficient Ethereum traffic identification method combining active node library and machine learning
CN113518073B (en) Method for rapidly identifying Bitcoin mining botnet traffic
Praneeth et al. Security: intrusion prevention system using deep learning on the internet of vehicles
CN112235254B (en) Rapid identification method for Tor network bridge in high-speed backbone network
Hammerschmidt et al. Reliable machine learning for networking: Key issues and approaches
CN110458209A (en) A kind of escape attack method and device for integrated Tree Classifier
CN116304252A (en) Communication network fraud prevention method based on graph structure clustering
Kumar et al. Machine learning based traffic classification using low level features and statistical analysis
CN113468555A (en) Method, system and device for identifying client access behavior
Atmojo et al. A New Approach for ARP Poisoning Attack Detection Based on Network Traffic Analysis
Erokhin et al. The Dataset Features Selection for Detecting and Classifying Network Attacks
Lin et al. Behaviour classification of cyber attacks using convolutional neural networks
Difaizi et al. URL Based Malicious Activity Detection Using Machine Learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant