CN114024748A - Efficient Ethereum traffic identification method combining active node library and machine learning - Google Patents
Efficient Ethereum traffic identification method combining active node library and machine learning
- Publication number
- CN114024748A
- Authority
- CN
- China
- Prior art keywords
- flow
- ethernet
- active node
- node library
- machine learning
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1408—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/24—Traffic characterised by specific attributes, e.g. priority or QoS
- H04L47/2441—Traffic characterised by specific attributes, e.g. priority or QoS relying on flow classification, e.g. using integrated services [IntServ]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/24—Traffic characterised by specific attributes, e.g. priority or QoS
- H04L47/2483—Traffic characterised by specific attributes, e.g. priority or QoS involving identification of individual flows
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/02—Network architectures or network communication protocols for network security for separating internal from external traffic, e.g. firewalls
- H04L63/0227—Filtering policies
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1433—Vulnerability analysis
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L69/00—Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
- H04L69/16—Implementation or adaptation of Internet protocol [IP], of transmission control protocol [TCP] or of user datagram protocol [UDP]
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D30/00—Reducing energy consumption in communication networks
- Y02D30/50—Reducing energy consumption in communication networks in wire-line communication networks, e.g. low power modes or reduced link rate
Landscapes
- Engineering & Computer Science (AREA)
- Computer Security & Cryptography (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- General Engineering & Computer Science (AREA)
- Computing Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Computer Hardware Design (AREA)
- Evolutionary Computation (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Software Systems (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Medical Informatics (AREA)
- Mathematical Physics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
The invention provides an efficient Ethereum traffic identification method combining an active node library and machine learning. The method has four parts. The first part is the construction of the active node library. The second part is the training of the identification model. The third part is a comparative analysis of different machine learning algorithms, in which the model trained with the algorithm best suited to the classification task is selected as the identification model. The fourth part is Ethereum traffic identification: traffic is screened by the active node library, divided into TCP traffic and UDP traffic, and input into the identification model, and the node information in the Ethereum active node library is updated according to the identification result. The invention can effectively identify Ethereum traffic in the current network with an accuracy of 99%, which makes it convenient for network administrators to monitor Ethereum network traffic.
Description
Technical Field
The invention belongs to the technical field of cyberspace security and relates to an efficient Ethereum traffic identification method combining an active node library and machine learning.
Background
Blockchain is a distributed ledger technology that is jointly maintained by multiple parties and uses cryptography to secure transmission and access. It provides consistent data storage in the ledger, tamper resistance, and non-repudiation. Blockchain technology offers a new approach to the trust, security, and efficiency problems of the Internet, and brings new opportunities and challenges to industries such as finance.
After Satoshi Nakamoto first proposed the blockchain technique, blockchain industries led by cryptocurrencies such as Bitcoin and Ethereum developed rapidly. According to statistics from the China Center for Information Industry Development (CCID), the scale of China's blockchain industry reached 48.5 billion yuan in 2020, a growth rate of 48.5% over the previous year. With this rapid growth, potential security supervision problems in the blockchain have also been exposed. First, blockchain digital currency provides a safe and stable money laundering channel for crimes such as money laundering and ransomware, and greatly promotes the growth of the dark web and the underground economy. Second, blockchain digital currency makes cross-border fund transfers simpler, which affects the stability of financial markets in many countries. Finally, because blockchains are decentralized and tamper-resistant, they are often used to store and transmit sensitive information, which seriously harms the health of the network ecosystem. Abuse of the blockchain not only endangers national security and social stability, but also poses great threats and challenges to network security supervision.
As a representative blockchain application, Bitcoin implements blockchain application development through a script engine. Bitcoin is therefore limited by the expressive power of its scripting language and struggles to support complex contract development, which greatly restricts its capabilities. Ethereum, built on the Ethereum Virtual Machine (EVM), abstracts the blockchain system into a transaction-based state machine and supports recording arbitrary information and executing arbitrary functions through a smart-contract programming language. In the 23rd Global Public Chain Technology Evaluation Index released by the CCID, Ethereum ranked first in applicability among 37 public chains. Compared with other blockchain implementations such as Bitcoin, Ethereum better supports the development of distributed blockchain applications and has higher research value and significance.
However, blockchain supervision technology lags behind the rapid development of the blockchain industry. Most existing research on blockchain security explores the blockchain technology itself, such as attack methods, design vulnerabilities, and application directions, and lacks analysis at the level of network traffic supervision. Ethereum, as the most widely applicable blockchain platform, will continue to grow as blockchain technology matures. Measuring and analyzing Ethereum network traffic and exploring Ethereum security supervision schemes are therefore of great significance to the network security of Ethereum and of blockchains in general.
Therefore, the invention collects Ethereum traffic by deploying active Ethereum nodes in the network. The traffic is then divided into TCP traffic and UDP traffic, each with its own identification features. A random forest algorithm is used to distinguish Ethereum traffic from normal traffic.
Disclosure of Invention
To supervise Ethereum effectively and identify Ethereum traffic, and to address the concealment of Ethereum traffic features, the invention provides an efficient Ethereum traffic identification method combining an active node library and machine learning. Based on the inherent small-world characteristic of Ethereum, the method first uses the Ethereum core node library to initialize the active node library. An active node library containing the Ethereum nodes currently in an active state is then built on top of the core node library. Features are extracted separately for the UDP-based Ethereum node discovery process and the TCP-based Ethereum data transmission process, and Ethereum traffic is identified by a machine learning method. Finally, using the selected features and the trained model, traffic is filtered through the active node library and input into the identification model to complete the identification of Ethereum traffic. To achieve this purpose, the invention provides the following technical scheme:
An efficient Ethereum traffic identification method combining an active node library and machine learning comprises the following steps:
(1) Based on the assumption that the total number of Ethereum nodes in the supervised area tends to converge, the active node library stores the Ethereum node information of the current area. Ethereum core node information is collected to initialize the active node library and to obtain Ethereum traffic.
(2) Traffic features are selected separately for Ethereum UDP traffic and TCP traffic, and are supplemented with features extracted according to the characteristics of the Ethereum NDP protocol and RLPx.
(3) Accurate identification of Ethereum traffic is achieved by a machine learning method, and a data set is constructed to test and evaluate the resulting model.
(4) Based on the constructed active node library and the obtained identification model, the input traffic is identified as Ethereum traffic or not, and the active node library is updated accordingly.
Further, step (1) specifically includes the following substeps:
(1.1) acquiring all currently public Ethereum core node information through a web crawler, and storing it in a core node library in the form of IP addresses;
(1.2) initializing the active node library from the information collected in the core node library;
(1.3) dynamically updating the information of known nodes through the nodes in the active node library, thereby obtaining the Ethereum node information of the whole supervised area;
(1.4) setting an expiration time and removing Ethereum nodes in the active node library that have been inactive for a long time, which keeps the active node library up to date and improves the efficiency of traffic screening;
(1.5) modifying NodeFinder, a tool for detecting Ethereum nodes, so that it communicates with the detected Ethereum nodes;
(1.7) capturing the Ethereum traffic at the intermediate router.
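The bookkeeping in substeps (1.2) to (1.4) can be sketched as follows. This is only an illustrative sketch, not the patented implementation: the class name `ActiveNodeLibrary`, its methods, and the choice of a per-node "last seen" timestamp are assumptions made for the example, and the IP addresses are documentation placeholders.

```python
import time
from typing import Dict, Iterable, List, Optional


class ActiveNodeLibrary:
    """Hypothetical store mapping a node IP to the last time it was seen active."""

    def __init__(self, core_node_ips: Iterable[str], ttl_seconds: float,
                 now: Optional[float] = None) -> None:
        self.ttl_seconds = ttl_seconds
        start = time.time() if now is None else now
        # Substep (1.2): seed the active library from the core node library.
        self._last_seen: Dict[str, float] = {ip: start for ip in core_node_ips}

    def touch(self, ip: str, now: Optional[float] = None) -> None:
        """Substep (1.3): add a newly found node or refresh a known one."""
        self._last_seen[ip] = time.time() if now is None else now

    def evict_expired(self, now: Optional[float] = None) -> List[str]:
        """Substep (1.4): remove nodes not seen within the expiration time."""
        current = time.time() if now is None else now
        expired = [ip for ip, seen in self._last_seen.items()
                   if current - seen > self.ttl_seconds]
        for ip in expired:
            del self._last_seen[ip]
        return expired

    def __contains__(self, ip: str) -> bool:
        return ip in self._last_seen

    def __len__(self) -> int:
        return len(self._last_seen)


if __name__ == "__main__":
    library = ActiveNodeLibrary(["203.0.113.10", "203.0.113.11"],
                                ttl_seconds=3600, now=0.0)
    library.touch("198.51.100.7", now=1800.0)
    # Nodes seeded at t=0 are past the one-hour limit at t=4000.
    print(library.evict_expired(now=4000.0))
```

Passing `now` explicitly keeps the example deterministic; a deployment would presumably rely on the wall clock instead.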
Further, step (2) specifically includes the following substeps:
(2.1) using mutual information as a measure of feature relevance, and selecting the 10 features with the highest mutual information values for Ethereum UDP traffic and for TCP traffic respectively;
(2.2) analyzing the structure of Ethereum UDP traffic packets to obtain UDP traffic features;
(2.3) analyzing the encrypted handshake (EncHandshake) process of the RLPx protocol to obtain Ethereum TCP traffic features.
Further, the selected Ethereum TCP and UDP traffic features are shown in Table 1 and Table 2 below:
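The mutual information ranking in substep (2.1) can be illustrated with a small sketch. It assumes that each feature has already been discretized into integer bins (the source does not say how continuous features are handled), and the function names and feature names are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence


def mutual_information(xs: Sequence[int], ys: Sequence[int]) -> float:
    """Mutual information I(X;Y) in bits between two discrete sequences."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x,y) * log2(p(x,y) / (p(x) * p(y))), expressed with raw counts.
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


def top_k_features(columns: Dict[str, Sequence[int]],
                   labels: Sequence[int], k: int = 10) -> List[str]:
    """Return the k feature names with the highest mutual information."""
    scores = {name: mutual_information(values, labels)
              for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


if __name__ == "__main__":
    labels = [0, 0, 1, 1]                      # 1 = Ethereum, 0 = background
    columns = {
        "binned_packet_length": [0, 0, 1, 1],  # fully informative
        "binned_idle_time": [0, 1, 0, 1],      # independent of the label
    }
    print(top_k_features(columns, labels, k=1))
```

With `k=10` applied once to the UDP flows and once to the TCP flows, this corresponds to selecting the ten highest-scoring features per protocol.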
Further, step (3) specifically includes the following substeps:
(3.1) combining the obtained Ethereum traffic with the various application traffic in the public data set VPN-nonVPN to form the data set ETI required by the experiment;
(3.2) splitting the data set into a training set and a test set at a ratio of 8:2, and evaluating the method on multiple metrics using four machine learning algorithms: support vector machine, random forest, logistic regression, and K-nearest neighbors.
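The 8:2 split and the multi-metric comparison in substep (3.2) might look like the sketch below. The classifiers themselves are not implemented here: each candidate model is represented by a hypothetical `predict` callable, so any trained support vector machine, random forest, logistic regression, or K-nearest-neighbor model could be plugged in. The metric set (accuracy, precision, recall, F1) is an assumption, since the source only says "multiple metrics".

```python
import random
from typing import Callable, Dict, List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]       # (feature vector, label); 1 = Ethereum
Predictor = Callable[[Sequence[float]], int]


def split_8_2(samples: List[Sample], seed: int = 0) -> Tuple[List[Sample], List[Sample]]:
    """Shuffle and split the data set into 80% training and 20% test."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 8 // 10
    return shuffled[:cut], shuffled[cut:]


def evaluate(predict: Predictor, test_set: List[Sample]) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and F1 for the positive class."""
    tp = fp = tn = fn = 0
    for features, label in test_set:
        guess = predict(features)
        if guess == 1 and label == 1:
            tp += 1
        elif guess == 1 and label == 0:
            fp += 1
        elif guess == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    accuracy = (tp + tn) / len(test_set) if test_set else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def best_model(models: Dict[str, Predictor], test_set: List[Sample],
               metric: str = "accuracy") -> str:
    """Name of the candidate model with the highest score on `metric`."""
    return max(models, key=lambda name: evaluate(models[name], test_set)[metric])
```

A fixed `seed` makes the split reproducible, which helps when the same test set must be reused across all candidate algorithms.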
Further, step (4) specifically includes the following substeps:
(4.1) screening unidentified traffic through the active node library, inputting it into the identification model, and judging from the output whether the traffic is Ethereum traffic;
(4.2) updating the node information of the Ethereum active node library according to the identification result.
Compared with the prior art, the invention has the following advantages and beneficial effects:
(1) The invention can effectively identify Ethereum traffic in the current network with an accuracy of 99%, which makes it convenient for network administrators to monitor Ethereum network traffic.
(2) TCP and UDP flows are separated, and packet structure analysis and related work are carried out on each of them to obtain the most suitable classification features. Combined with mutual-information-based selection, this effectively improves monitoring accuracy.
(3) The method constructs an Ethereum active node library and uses it to screen out potential Ethereum traffic. Compared with a traffic detection method without active node library filtering, metrics such as accuracy and precision improve by 3% on average. The time needed to detect the same amount of Ethereum traffic is less than 50% of that of the method without active node library filtering.
(4) Screening traffic through the Ethereum active node library effectively avoids negative effects on identification performance.
Drawings
FIG. 1 is a schematic diagram of an experimental environment setup;
FIG. 2 is a schematic view of an identification framework;
FIG. 3 shows the performance of different machine learning algorithms on UDP flow identification before and after screening with the active node library;
FIG. 4 shows the performance of different machine learning algorithms on UDP flow identification before and after screening with the active node library;
FIG. 5 shows the identification time consumption, where (a) is the UDP traffic identification efficiency and (b) is the TCP traffic identification efficiency.
Detailed Description
The technical solutions provided by the present invention will be described in detail below with reference to specific examples, and it should be understood that the following specific embodiments are only illustrative of the present invention and are not intended to limit the scope of the present invention.
Example 1: The invention provides an efficient Ethereum traffic identification method combining an active node library and machine learning. As shown in FIG. 2, the identification framework is divided into four parts. The first part is the construction of the active node library: the core node information that keeps Ethereum running stably is stored in a core node library, the active node library is initialized from it, and operations such as searching for, adding, and deleting active nodes are carried out to build the active node library. A traffic collection unit is then deployed through the active node library to collect Ethereum traffic. The second part is the training of the identification model: after the Ethereum traffic is divided into TCP and UDP flows, their packet structures are analyzed separately, the Ethereum traffic identification features best suited to classification are obtained through verification on real data, and a mutual information measurement unit is used for feature screening. After feature selection is complete, the obtained Ethereum traffic and background traffic are used as the data set and divided into a training set and a test set. The third part is a comparative analysis of different machine learning algorithms, in which the model trained with the algorithm best suited to classification is selected as the identification model. The fourth part is Ethereum traffic identification: traffic is screened by the active node library, divided into TCP and UDP flows, and input into the identification model for identification.
Specifically, the method comprises the following steps:
(1) Construct the active node library and build an experimental environment to collect the relevant Ethereum traffic.
The specific process of this step is as follows:
(1.1) acquiring all currently public Ethereum core node information with a web crawler, and storing it in a core node library in the form of IP addresses;
(1.2) initializing the active node library from the information in the core node library, continuously searching for the currently active Ethereum nodes, and dynamically updating the node information of the active node library;
(1.3) setting an expiration time for each active node and removing Ethereum nodes that have been inactive for a long time, which keeps the active node library up to date;
(1.4) modifying NodeFinder, a tool for detecting Ethereum nodes, so that it communicates with the detected Ethereum nodes;
(1.5) capturing the Ethereum traffic on the intermediate router with Wireshark;
(1.6) using the various application traffic in the public data set VPN-nonVPN as background traffic.
(2) Divide the original Ethereum traffic into TCP and UDP flows, analyze the packet structures of both, and extract features that can be used to identify and classify complete flow data. Mutual information is then used for feature selection, and the features useful for identification and classification are retained.
The specific process of this step is as follows:
(2.1) dividing the original Ethereum traffic into TCP and UDP flows;
(2.2) screening features with the mutual information metric, starting from the 80 common flow statistical features proposed by Draper et al., and selecting the ten features with the highest mutual information for TCP flows and for UDP flows respectively;
(2.3) analyzing the structure of Ethereum UDP traffic packets. The packet lengths in a UDP flow follow a strict order, and the length of each packet type has a distinct and stable distribution. The lengths of the first eight packets in the UDP flow are therefore extracted as features;
(2.3) the names and meanings of the 18 extracted Ethereum UDP traffic features are shown in Table 3;
(2.4) analyzing the Ethereum TCP traffic interaction process, which shows that an Ethereum TCP flow contains many packets with equal payload lengths. The payloads of packets carrying the header are typically a combination of 32 B, 1 B, and 12 B. The average payload length of the two packets in the encrypted handshake stage and the proportion of packets with payload lengths of 32 B, 1 B, and 12 B among all packets are used as features;
(2.5) the names and meanings of the 12 extracted Ethereum TCP traffic features are shown in Table 4.
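The protocol-specific features described in the steps above (the first eight UDP packet lengths, the average payload length of the two handshake packets, and the share of 32 B, 1 B, and 12 B payloads) can be sketched as below. Several details are assumptions for illustration only: short UDP flows are zero-padded, the two handshake packets are taken to be the first two payload-carrying packets of the TCP flow, and the function and key names are hypothetical.

```python
from typing import Dict, List, Sequence

HANDSHAKE_PACKETS = 2          # assumed: first two payload-carrying TCP packets
SPECIAL_LENGTHS = {32, 1, 12}  # payload lengths in bytes named in the text


def udp_length_features(packet_lengths: Sequence[int], count: int = 8) -> List[int]:
    """Lengths of the first `count` packets of a UDP flow, zero-padded if short."""
    head = list(packet_lengths[:count])
    return head + [0] * (count - len(head))


def tcp_payload_features(payload_lengths: Sequence[int]) -> Dict[str, float]:
    """Two TCP features: handshake average payload and special-length share."""
    handshake = payload_lengths[:HANDSHAKE_PACKETS]
    average = sum(handshake) / len(handshake) if handshake else 0.0
    total = len(payload_lengths)
    special = sum(1 for length in payload_lengths if length in SPECIAL_LENGTHS)
    share = special / total if total else 0.0
    return {"handshake_avg_payload": average, "special_length_share": share}


if __name__ == "__main__":
    # The packet sizes below are made-up sample values, not measurements.
    print(udp_length_features([100, 200, 300]))
    print(tcp_payload_features([300, 200, 32, 1, 12, 500]))
```

Appending these values to the ten mutual-information features gives the 18 UDP features and 12 TCP features counted in the text.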
(3) After feature selection is complete, use the previously obtained Ethereum traffic and background traffic as the data set and divide it into a training set and a test set. Different machine learning algorithms are compared, and the model trained with the algorithm best suited to classification is selected as the identification model.
The specific process of this step is as follows:
(3.1) constructing an Ethereum traffic data set from the data collected in step (1), and dividing it into a training set and a test set at a ratio of 8:2. By comparing metrics such as the accuracy of models including random forest, K-nearest neighbors, and naive Bayes, the random forest algorithm, which has the highest accuracy, is selected. The identification results before and after screening the traffic with the active node library are also compared. After screening with the active node library, the identification accuracy improves by 3% on average; the detailed results are shown in FIGS. 3 and 4.
(3.2) comparing the time consumption of Ethereum traffic identification using the combined active node library and machine learning method with that of a traditional detection method. The combined method reduces time consumption by more than 50% compared with the traditional detection method; the detailed results are shown in FIG. 5.
(4) After the traffic is screened through the active node library, divide it into TCP and UDP flows, input them into the identification model, and update the Ethereum active node library according to the identification result.
The specific steps are as follows:
(4.1) extracting the source and destination IP addresses of the traffic to be detected, and checking whether the active node library contains either IP address;
(4.2) if a relevant IP address is contained, dividing the traffic into TCP and UDP flows, extracting the relevant features, and passing each flow to the identification model obtained in step (3) for identification;
(4.3) if the traffic is identified as Ethereum traffic and the source or destination IP is not in the active node library, adding the relevant IP address information to the active node library as a new active node;
(4.4) setting an active time for each active node, and deleting from the active node library any node that has not responded within the active time.
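Steps (4.1) to (4.4) can be tied together in a short sketch. It is written under stated assumptions rather than taken from the patent: the active node library is modeled as a plain dictionary from IP address to last-active time, each flow is a dictionary with hypothetical keys `src_ip`, `dst_ip`, `protocol`, and `features`, and the per-protocol identification models are arbitrary callables.

```python
from typing import Callable, Dict, Sequence

Classifier = Callable[[Sequence[float]], bool]  # True means "Ethereum traffic"


def identify_flow(flow: Dict, active_nodes: Dict[str, float],
                  models: Dict[str, Classifier], now: float) -> bool:
    """Screen a flow by IP, classify it, and update the active node library."""
    endpoints = (flow["src_ip"], flow["dst_ip"])
    # Step (4.1): only flows touching a known active node are examined.
    if not any(ip in active_nodes for ip in endpoints):
        return False
    # Step (4.2): route the flow to the TCP or UDP identification model.
    is_ethereum = models[flow["protocol"]](flow["features"])
    # Step (4.3): record both endpoints as active nodes on a positive result.
    if is_ethereum:
        for ip in endpoints:
            active_nodes[ip] = now
    return is_ethereum


def evict_inactive(active_nodes: Dict[str, float], now: float,
                   active_time: float) -> None:
    """Step (4.4): delete nodes that were not active within `active_time`."""
    stale = [ip for ip, seen in active_nodes.items() if now - seen > active_time]
    for ip in stale:
        del active_nodes[ip]
```

Checking the IP addresses first means the feature extraction and model inference run only on flows that pass the screen, which is the likely source of the time savings the text reports.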
The technical means disclosed in the invention scheme are not limited to the technical means disclosed in the above embodiments, but also include the technical scheme formed by any combination of the above technical features. It should be noted that those skilled in the art can make various improvements and modifications without departing from the principle of the present invention, and such improvements and modifications are also considered to be within the scope of the present invention.
Claims (6)
1. An efficient Ethereum traffic identification method combining an active node library and machine learning, characterized by comprising the following steps:
(1) based on the assumption that the total number of Ethereum nodes in the supervised area tends to converge, storing the Ethereum node information of the current area in the active node library, collecting Ethereum core node information, initializing the active node library, and acquiring Ethereum traffic;
(2) selecting traffic features separately for Ethereum UDP traffic and TCP traffic, and supplementing them with features extracted according to the characteristics of the Ethereum NDP protocol and RLPx;
(3) achieving accurate identification of Ethereum traffic by a machine learning method, and constructing a data set to test and evaluate the resulting model;
(4) identifying whether the input traffic is Ethereum traffic based on the constructed active node library and the obtained identification model.
2. The efficient Ethereum traffic identification method combining an active node library and machine learning according to claim 1, characterized in that step (1), collecting Ethereum core node information to initialize the active node library and acquire Ethereum traffic, specifically comprises the following substeps:
(1.1) acquiring all currently public Ethereum core node information through a web crawler, and storing it in a core node library in the form of IP addresses;
(1.2) initializing the active node library from the information collected in the core node library;
(1.3) dynamically updating the information of known nodes through the nodes in the active node library, thereby obtaining the Ethereum node information of the whole supervised area;
(1.4) setting an expiration time and removing Ethereum nodes in the active node library that have been inactive for a long time;
(1.5) modifying NodeFinder, a tool for detecting Ethereum nodes, so that it communicates with the detected Ethereum nodes;
(1.7) capturing the Ethereum traffic at the intermediate router.
3. The efficient Ethereum traffic identification method combining an active node library and machine learning according to claim 1, wherein step (2) comprises the following sub-steps:
(2.1) analyzing the structure of Ethereum UDP data packets to obtain UDP traffic features;
(2.2) analyzing the encrypted handshake (EncHandshake) process of the RLPx protocol to obtain Ethereum TCP traffic features;
(2.3) using mutual information as the measure of feature relevance, selecting for the Ethereum UDP flows and the TCP flows respectively the 10 features with the highest mutual information values, and adding the relevant features obtained in steps (2.1) and (2.2) to form the finally selected features.
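The mutual-information ranking in sub-step (2.3) can be sketched as follows. This is an illustrative sketch only: it assumes features that are already discrete (continuous features would first need binning), and the function names `mutual_information` and `top_k_features` are hypothetical. The claim does not state which estimator is used.

```python
import math
from collections import Counter
from typing import Dict, Hashable, List, Sequence


def mutual_information(xs: Sequence[Hashable], ys: Sequence[Hashable]) -> float:
    """Mutual information I(X;Y) in bits between two discrete sequences."""
    n = len(xs)
    px = Counter(xs)
    py = Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), count in pxy.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (count / n) * math.log2(count * n / (px[x] * py[y]))
    return mi


def top_k_features(columns: Dict[str, Sequence[Hashable]],
                   labels: Sequence[Hashable], k: int = 10) -> List[str]:
    """Names of the k features with the highest mutual information with the labels."""
    scores = {name: mutual_information(col, labels) for name, col in columns.items()}
    return sorted(scores, key=lambda name: scores[name], reverse=True)[:k]
```

Under this reading, the ranking would be run once on the UDP feature table and once on the TCP feature table, and the protocol-specific features from (2.1) and (2.2) would then be appended to each selection.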
5. The efficient Ethereum traffic identification method combining an active node library and machine learning according to claim 1, wherein step (3) comprises the following sub-steps:
(3.1) combining the obtained Ethereum traffic with the traffic of various applications in the public data set VPN-nonVPN to form the data set ETI required by the experiment;
(3.2) dividing the data set into a training set and a test set at a ratio of 8:2, and evaluating the method on multiple metrics using four machine learning algorithms: support vector machine, random forest, logistic regression, and K-nearest neighbors.
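The 8:2 split and the metric-based evaluation in sub-step (3.2) might look like the sketch below. The helper names `split_8_2` and `binary_metrics` are hypothetical, and the choice of accuracy, precision, recall, and F1 is an assumption, since the claim only speaks of multiple metrics. The four classifiers themselves would come from a machine learning library and are not shown.

```python
import random
from typing import Dict, Hashable, List, Sequence, Tuple


def split_8_2(samples: Sequence, seed: int = 0) -> Tuple[List, List]:
    """Shuffle with a fixed seed and split into 80% training / 20% test."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 8 // 10
    return shuffled[:cut], shuffled[cut:]


def binary_metrics(y_true: Sequence[Hashable], y_pred: Sequence[Hashable],
                   positive: Hashable = 1) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for the class labelled `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

Each of the four algorithms would be trained on the first part of the split and scored with `binary_metrics` on the second, treating "Ethereum" as the positive class.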
6. The efficient Ethereum traffic identification method combining an active node library and machine learning according to claim 1, wherein step (4) comprises the following sub-steps:
(4.1) screening unidentified traffic through the active node library, inputting the traffic into the identification model, and determining from the output whether the traffic is Ethereum traffic;
(4.2) updating the node information of the Ethereum active node library according to the identification result.
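One possible reading of sub-steps (4.1) and (4.2) is a two-stage pipeline, sketched below. Several points are assumptions rather than statements of the claimed method: that a flow with an endpoint already in the library is accepted without calling the model, that both endpoints of a model-identified flow are added to the library, and the names `identify_flows`, `src_ip`, `dst_ip`, and `model_predict`.

```python
from typing import Any, Callable, Dict, Iterable, List, Set, Tuple

Flow = Dict[str, Any]  # hypothetical flow record with "src_ip" and "dst_ip" keys


def identify_flows(flows: Iterable[Flow], active_nodes: Set[str],
                   model_predict: Callable[[Flow], bool]) -> List[Tuple[Flow, bool]]:
    """Label each flow as Ethereum or not, updating the library as it goes."""
    results: List[Tuple[Flow, bool]] = []
    for flow in flows:
        src, dst = flow["src_ip"], flow["dst_ip"]
        if src in active_nodes or dst in active_nodes:
            # (4.1) Assumed fast path: an endpoint is a known active node.
            is_ethereum = True
        else:
            # (4.1) Otherwise ask the trained identification model.
            is_ethereum = model_predict(flow)
            if is_ethereum:
                # (4.2) Assumed update rule: record both endpoints as nodes.
                active_nodes.update((src, dst))
        results.append((flow, is_ethereum))
    return results
```

With this layout the model only runs on flows the library cannot resolve, and every positive model decision enlarges the library for later flows.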
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111302612.2A CN114024748B (en) | 2021-11-04 | 2021-11-04 | Efficient Ethereum traffic identification method combining active node library and machine learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114024748A (en) | 2022-02-08 |
CN114024748B (en) | 2024-04-30 |
Family
ID=80061397
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111302612.2A Active CN114024748B (en) | Efficient Ethereum traffic identification method combining active node library and machine learning | 2021-11-04 | 2021-11-04 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114024748B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115442291A (en) * | 2022-08-19 | 2022-12-06 | 南京理工大学 | Ethereum-oriented active network topology sensing method |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102315974A (en) * | 2011-10-17 | 2012-01-11 | 北京邮电大学 | Stratification characteristic analysis-based method and apparatus thereof for on-line identification for TCP, UDP flows |
CN111082995A (en) * | 2019-12-25 | 2020-04-28 | 中国科学院信息工程研究所 | Ethereum network behavior analysis method, corresponding storage medium and electronic device |
US20200311583A1 (en) * | 2019-04-01 | 2020-10-01 | Hewlett Packard Enterprise Development Lp | System and methods for fault tolerance in decentralized model building for machine learning using blockchain |
CN111865823A (en) * | 2020-06-24 | 2020-10-30 | 东南大学 | Lightweight Ethereum encrypted traffic identification method |
CN112910918A (en) * | 2021-02-26 | 2021-06-04 | 南方电网科学研究院有限责任公司 | Industrial control network DDoS attack traffic detection method and device based on random forest |
CN113344562A (en) * | 2021-08-09 | 2021-09-03 | 四川大学 | Method and device for detecting Ethereum phishing accounts based on deep neural network |
CN113469275A (en) * | 2021-07-21 | 2021-10-01 | 东南大学 | Fine-grained classification method for Ethereum behavior traffic |
Non-Patent Citations (1)
Title |
---|
Hu Xiaoyan et al.: "Ethereum encrypted traffic identification method based on an active node library", 网络空间安全 (Cyberspace Security), vol. 11, no. 8, 25 August 2020 (2020-08-25), pages 34 - 39 * |
Also Published As
Publication number | Publication date |
---|---|
CN114024748B (en) | 2024-04-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109450842B (en) | Network malicious behavior recognition method based on neural network | |
Silveira et al. | URCA: Pulling out anomalies by their root causes | |
CN109117634A (en) | Malware detection method and system based on network flow multi-view integration | |
CN113344562B (en) | Method and device for detecting Ethereum phishing accounts based on deep neural network | |
CN102420723A (en) | Anomaly detection method for various kinds of intrusion | |
Sivamohan et al. | An effective recurrent neural network (RNN) based intrusion detection via bi-directional long short-term memory | |
CN107483451B (en) | Method and system for processing network security data based on serial-parallel structure and social network | |
CN112800424A (en) | Botnet malicious traffic monitoring method based on random forest | |
Chen et al. | An effective metaheuristic algorithm for intrusion detection system | |
Bian et al. | Host in danger? detecting network intrusions from authentication logs | |
CN110519228B (en) | Method and system for identifying malicious cloud robots in underground-economy scenarios | |
CN117454376A (en) | Industrial Internet data security detection response and tracing method and device | |
CN114024748B (en) | Efficient Ethereum traffic identification method combining active node library and machine learning | |
CN113518073B (en) | Method for rapidly identifying Bitcoin mining botnet traffic | |
Praneeth et al. | Security: intrusion prevention system using deep learning on the internet of vehicles | |
CN112235254B (en) | Rapid identification method for Tor network bridge in high-speed backbone network | |
Hammerschmidt et al. | Reliable machine learning for networking: Key issues and approaches | |
CN110458209A (en) | A kind of escape attack method and device for integrated Tree Classifier | |
CN116304252A (en) | Communication network fraud prevention method based on graph structure clustering | |
Kumar et al. | Machine learning based traffic classification using low level features and statistical analysis | |
CN113468555A (en) | Method, system and device for identifying client access behavior | |
Atmojo et al. | A New Approach for ARP Poisoning Attack Detection Based on Network Traffic Analysis | |
Erokhin et al. | The Dataset Features Selection for Detecting and Classifying Network Attacks | |
Lin et al. | Behaviour classification of cyber attacks using convolutional neural networks | |
Difaizi et al. | URL Based Malicious Activity Detection Using Machine Learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||