CN112448947B - Network anomaly determination method, equipment and storage medium - Google Patents

Publication number
CN112448947B
Authority
CN
China
Prior art keywords
network
data
group
information entropy
risk index
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011247194.7A
Other languages
Chinese (zh)
Other versions
CN112448947A (en
Inventor
白岩
李拓
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qianxin Technology Group Co Ltd
Secworld Information Technology Beijing Co Ltd
Original Assignee
Qianxin Technology Group Co Ltd
Secworld Information Technology Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qianxin Technology Group Co Ltd, Secworld Information Technology Beijing Co Ltd filed Critical Qianxin Technology Group Co Ltd
Priority to CN202011247194.7A priority Critical patent/CN112448947B/en
Publication of CN112448947A publication Critical patent/CN112448947A/en
Application granted granted Critical
Publication of CN112448947B publication Critical patent/CN112448947B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425 Traffic logging, e.g. anomaly detection
    • H04L63/1416 Event detection, e.g. attack signature detection
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/12 Network monitoring probes

Abstract

The present disclosure provides a method, a device and a storage medium for determining a network anomaly. The method comprises the following steps: acquiring network traffic data; dividing the network traffic data into a plurality of groups of sub-data according to the time period corresponding to the network traffic data; extracting features of multiple dimensions from each group of sub-data; for each group of sub-data, calculating a group of information entropies for the feature of each dimension; for each group of sub-data, comparing each information entropy in the group of information entropies of each feature with the data range of a pre-established information entropy baseline corresponding to that feature; for a target information entropy that is not within the data range, calculating the difference value between the target information entropy and the information entropy in the same group whose value is closest to it; calculating a risk index for each feature according to the difference value corresponding to that feature; and determining whether the network is abnormal according to the risk indexes corresponding to the features. The method can improve the efficiency of identifying network anomalies.

Description

Network anomaly determination method, equipment and storage medium
Technical Field
The present invention relates to the field of network security technologies, and in particular, to a method, a device, and a storage medium for determining a network anomaly.
Background
Currently, some network anomalies caused by communication protocols are difficult to detect, for example those caused by the Modbus protocol. The Modbus protocol is a widely used industrial control protocol. However, when a specific industrial control system is implemented, developers may lack security knowledge or be unaware of security problems, so systems using the Modbus protocol may contain various security holes. For example, during communication, a node under malicious control may send out illegal data. The function code is an important part of the Modbus protocol, and abuse of function codes is a main cause of Modbus network anomalies; illegal message lengths, short-cycle useless commands, and incorrect message lengths may also cause system anomalies. Currently, white list rules are generally used to detect Modbus protocol anomalies. For example, a white list rule is established for attributes such as source IP, destination IP, function code, and operation address, and an alarm is generated for any message that does not match the white list rule. This detection method performs only microscopic detection on a single Modbus message and does not consider macroscopic characteristics of an industrial control system, such as its time and frequency characteristics. For example, some packets are normal when they appear once, but many occurrences within a short time are likely to be an attack. White list rules cannot detect such attacks. A method for detecting network anomalies based on such macroscopic characteristics is therefore still needed.
Disclosure of Invention
In view of this, one or more embodiments of the present disclosure provide a method, a device, and a storage medium for determining a network anomaly, so as to solve the problem in the related art that it is not possible to detect whether a network is anomalous based on a macro feature of network traffic data.
One or more embodiments of the present disclosure provide a network anomaly determination method, including: acquiring network traffic data; dividing the network traffic data into a plurality of groups of sub-data according to the time period corresponding to the network traffic data; extracting features of multiple dimensions from each group of sub-data; for each group of sub-data, calculating a group of information entropies for the feature of each dimension; for each group of sub-data, comparing each information entropy in the group of information entropies of each feature with the data range of a pre-established information entropy baseline corresponding to that feature; for a target information entropy that is not within the data range, calculating the difference value between the target information entropy and the information entropy in the same group whose value is closest to it; calculating a risk index for each feature according to the difference value corresponding to that feature; and determining whether the network is abnormal according to the risk indexes corresponding to the features.
Optionally, the features include at least any three of: operating system fingerprint, equipment identification field in the inquiry message, function code field in the inquiry message, data packet length in the inquiry message, equipment identification field in the response message, function code field in the response message and data packet length of the response message.
Optionally, the risk index corresponding to each feature is calculated by the following formula:
Figure GDA0003817755020000021
where score(x) represents the risk index corresponding to the feature, x represents the difference value, δ is a preset threshold, n is a preset real number greater than 1, and k is a weight coefficient that is an integer greater than 1.
Optionally, determining whether the network is abnormal according to the risk index corresponding to each feature includes: calculating a comprehensive risk index according to the risk index corresponding to each characteristic; determining whether the network is abnormal or not according to the comprehensive risk index; wherein the comprehensive risk index is calculated by the following formula:
Figure GDA0003817755020000022
where $score_i$ represents the risk index corresponding to feature i, and m represents the number of features participating in the calculation.
Optionally, determining whether the network is abnormal according to the risk index corresponding to each feature includes: if the calculated risk index corresponding to any one feature is larger than a first preset value, determining that the network is abnormal; or if the calculated comprehensive risk index is larger than a second preset value, determining that the network is abnormal, wherein the first preset value is larger than the second preset value.
Optionally, extracting features of multiple dimensions of each group of sub-data includes: for each group of subdata, identifying a Modbus protocol according to the characteristics of the Modbus protocol; pairing the query message and the response message of the Modbus protocol; and analyzing and extracting key fields and characteristics in the query message and the response message of the Modbus protocol.
Optionally, the extracting the features of the multiple dimensions of each group of sub-data further includes: identifying the type of an operating system according to TCP packet header information in a Transmission Control Protocol (TCP) handshaking process; identifying the manufacturer information of the network card according to the MAC address; identifying equipment manufacturer information according to data characteristics of the Modbus application layer; and carrying out hash operation on the operating system type, the network card manufacturer information and the equipment manufacturer information to obtain a hash value.
Optionally, the method further includes: acquiring network flow sample data; dividing the network traffic sample data into a plurality of groups of sub-sample data according to a time period corresponding to the network traffic sample data; and respectively establishing an information entropy baseline corresponding to the feature of each dimension in the features of the multiple dimensions for each group of subsample data.
One or more embodiments of the present disclosure also provide an electronic device, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, and when the processor executes the program, the processor implements any one of the above network anomaly determination methods.
One or more embodiments of the present disclosure also provide a non-transitory computer-readable storage medium storing computer instructions for causing the computer to perform any one of the above network anomaly determination methods.
The network anomaly determination method provided by one or more embodiments of the present disclosure groups network traffic data according to the time of network traffic data generation to obtain multiple groups of sub-data, respectively calculates the information entropy corresponding to each of the multiple dimensions of the sub-data in each group, compares the calculated information entropy with the pre-established value range of the information entropy baseline, calculates the difference between the target information entropy outside the value range of the information entropy baseline and the information entropy closest to the value in the information entropy in the group, determines the risk index corresponding to each of the features according to the difference, and determines whether the network is anomalous according to the risk index, thereby achieving the purpose of identifying the network anomaly according to the characteristics of the network traffic data and improving the identification efficiency of the network anomaly.
Drawings
Fig. 1 is a flow diagram illustrating a network anomaly determination method in accordance with one or more embodiments of the present disclosure;
fig. 2 is a block diagram of an electronic device shown in accordance with one or more embodiments of the present disclosure.
Detailed Description
For the purpose of promoting a better understanding of the objects, aspects and advantages of the present disclosure, reference is made to the following detailed description taken in conjunction with the accompanying drawings.
It should be noted that all expressions using "first" and "second" in the embodiments of the present disclosure are used to distinguish two different entities or parameters that share the same name. "First" and "second" are used only for convenience of expression and should not be construed as limiting the embodiments of the present disclosure; the following embodiments do not repeat this explanation. The word "comprising" or "comprises", and the like, means that the element or item listed before the word covers the element or item listed after the word and its equivalents, but does not exclude other elements or items. The terms "connected" or "coupled" and the like are not restricted to physical or mechanical connections, but may include electrical connections, whether direct or indirect. "Upper", "lower", "left", "right", and the like are used merely to indicate relative positional relationships; when the absolute position of the object being described changes, the relative positional relationships may change accordingly.
Fig. 1 is a flowchart illustrating a network anomaly determination method according to one or more embodiments of the present disclosure, as shown in fig. 1, the method including:
step 101: acquiring network flow data;
For example, network traffic data may be acquired by connecting a bypass network packet sniffer to a mirror port of the switch. After sniffing, the data packets can be parsed at higher layers, and information such as the connection state and the application-layer protocol is recorded. A session table may be maintained, with connection state information recorded in session table entries. The session table is a hash table in which each entry represents one data stream; an entry contains the information that uniquely identifies the stream, such as the MAC (Media Access Control) address, IP (Internet Protocol) address, TTL (Time To Live), source and destination operating system types, and application-layer protocol type.
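The session table described above can be sketched as follows. This is a minimal illustration only: the class names `SessionEntry` and `SessionTable`, the choice of a 4-tuple as the flow key, and the `packets` counter are assumptions made for the example and are not specified in the source.

```python
from dataclasses import dataclass


@dataclass
class SessionEntry:
    """Hypothetical per-flow record; the field names are illustrative."""
    src_mac: str
    dst_mac: str
    ttl: int
    app_protocol: str = "unknown"
    packets: int = 0


class SessionTable:
    """Hash table in which each entry represents one data stream."""

    def __init__(self):
        # A Python dict is a hash table: flow key -> SessionEntry.
        self._entries = {}

    def update(self, src_ip, src_port, dst_ip, dst_port, **info):
        """Create the entry for a flow on first sight, then count the packet."""
        key = (src_ip, src_port, dst_ip, dst_port)  # assumed flow key
        entry = self._entries.get(key)
        if entry is None:
            entry = self._entries[key] = SessionEntry(**info)
        entry.packets += 1
        return entry

    def __len__(self):
        return len(self._entries)
```

A Python `dict` already provides the hash-table lookup, so the sketch only adds the per-flow record on top of it.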
Step 102: dividing the network traffic data into a plurality of groups of subdata according to a time period corresponding to the network traffic data;
Taking network traffic data of the Modbus protocol as an example, the protocol is an industrial control protocol, and its data characteristics are strongly influenced by the production cycle of the factory. Time slices can be divided reasonably according to the production cycle of the factory, and each time slice is called a time period. For example, 8% in factory, 00-12. Due to the significant difference between the characteristics of the industrial control protocol at start-up time and during the midday break, the working day of 8. Within a time period, a time window frames the time series according to a specified unit length so that a statistical indicator can be calculated within the frame. For example, a slider of a designated length may be slid along a scale, and the data inside the slider is fed back each time it slides by one unit.
For example, the time period corresponding to the network traffic data is 8.
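The grouping in step 102 can be sketched as follows. This is a minimal sketch, assuming each captured record carries a timestamp; the period boundaries in `PERIOD_STARTS` are hypothetical values chosen for illustration and are not taken from the source.

```python
from bisect import bisect_right
from collections import defaultdict
from datetime import datetime

# Hypothetical period boundaries (hours of the day). A real deployment would
# derive them from the production cycle of the factory.
PERIOD_STARTS = [0, 8, 12, 14, 18]


def period_of(ts: datetime) -> int:
    """Return the index of the time period that contains ts."""
    return bisect_right(PERIOD_STARTS, ts.hour) - 1


def group_by_period(records):
    """Group (timestamp, record) pairs into sub-data keyed by (date, period)."""
    groups = defaultdict(list)
    for ts, record in records:
        groups[(ts.date(), period_of(ts))].append(record)
    return dict(groups)
```

Each resulting group corresponds to one group of sub-data; the same grouping would be applied to sample data during baseline learning so that baselines and detection results refer to the same time periods.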
Step 103: extracting the characteristics of multiple dimensions of each group of subdata;
step 104: for each group of subdata, respectively calculating a group of information entropies corresponding to the features of each dimension in the features of the multiple dimensions;
The following takes the device_id field of the Modbus protocol as an example to explain the calculation of the information entropy:
Establish a hash table for fast lookup of device_id values.
Monitor Modbus traffic over a period of time and extract the device_id value of each Modbus message. Look up the device_id value in the hash table: if the device_id has appeared before, increase the count of the corresponding node by 1; otherwise, create a new hash node and set its count to 1. The purpose of this step is to count the occurrences of each distinct device_id value over the period.
Traverse the hash table and calculate the occurrence probability of each device_id value. The probability of a device_id value is its count as a proportion of the total count of all device_id values.
Calculate the final information entropy value using the following formula:
$$H = -\sum_{j} P_j \log P_j$$
where $P_j$ represents the probability of a given device_id value; all $P_j \log P_j$ terms are summed and the result is negated to obtain the overall information entropy value.
It should be noted that each group of sub-data includes multiple feature values of a given dimension collected within a time period, so in step 104, for each group of sub-data, the information entropy calculated for the feature of each dimension is a plurality of information entropies.
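The counting and entropy steps above can be sketched as follows. The function name `information_entropy` is illustrative, and the logarithm base (2 by default) is an assumption, since the source does not state which base is used.

```python
import math
from collections import Counter


def information_entropy(values, base=2.0):
    """Entropy of the empirical distribution of values, e.g. device_id values."""
    counts = Counter(values)          # hash table: value -> occurrence count
    total = sum(counts.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for count in counts.values():
        p = count / total             # occurrence probability of this value
        entropy -= p * math.log(p, base)
    return entropy
```

A window in which a single device_id accounts for all messages yields an entropy of zero, while a window in which many device_id values appear equally often yields a high entropy.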
Step 105: for each group of subdata, comparing each information entropy in a group of information entropies of each characteristic with a pre-established data range of an information entropy base line corresponding to the characteristic;
for example, the information entropy baselines corresponding to the features may be constructed in advance based on sample data of the network traffic data.
In step 105, each information entropy in the set of information entropies of each feature may be sequentially compared to a pre-established value range of the information entropy baseline corresponding to each feature to determine whether each information entropy is within the data range.
Step 106: calculating a difference value between a target information entropy which is not in the data range and an information entropy which is the smallest difference value between the target information entropy and the information entropy in the group of information entropies;
For example, for a feature a in a group of sub-data (the time period corresponding to the sub-data is 9:00-10:00 a.m.), the calculated group of information entropies is $h_1, h_2, \ldots, h_i$. Suppose that $h_3$ is not within the data range of the information entropy baseline corresponding to feature a for 9:00-10:00 a.m., and that the information entropy in the group whose value is closest to $h_3$ is $h_1$; then the difference between $h_3$ and $h_1$ is the above difference value.
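Steps 105 and 106 can be sketched together as follows. This is a minimal sketch under stated assumptions: the baseline is represented as a list of (low, high) ranges, and the closest value is searched among all other information entropies in the same group; the function name `deviation_from_group` is hypothetical.

```python
def deviation_from_group(entropies, baseline_ranges):
    """Map the index of each out-of-range entropy to its distance from the
    closest other entropy in the same group."""

    def in_baseline(h):
        return any(low <= h <= high for low, high in baseline_ranges)

    deviations = {}
    for idx, target in enumerate(entropies):
        if in_baseline(target):
            continue                      # within the baseline: no deviation
        others = [h for j, h in enumerate(entropies) if j != idx]
        if others:
            deviations[idx] = min(abs(target - h) for h in others)
    return deviations
```

Each value in the returned mapping plays the role of the difference value x that is passed to the risk index calculation.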
Step 107: calculating a risk index corresponding to each feature according to the difference value corresponding to each feature;
for example, the difference corresponding to each feature may measure the fluctuation of each feature over a period of time, and the larger the fluctuation is, the larger the risk that the network abnormality may be caused is, so that the risk index corresponding to each feature and the difference corresponding to each feature form a positive correlation relationship, and the risk index corresponding to each feature is calculated according to the difference corresponding to each feature by using a preset coefficient.
Step 108: and determining whether the network is abnormal or not according to the risk indexes corresponding to the characteristics.
For example, whether the network is abnormal or not may be determined according to whether the risk index corresponding to each feature exceeds a preset numerical value or not, or alternatively, a comprehensive risk index may be calculated according to the risk index corresponding to each feature through a preset algorithm, and whether the network is abnormal or not may be determined according to whether the comprehensive risk index exceeds another preset numerical value or not.
The network anomaly determination method provided by one or more embodiments of the present disclosure groups network traffic data according to the time of network traffic data generation to obtain multiple groups of sub-data, respectively calculates the information entropy corresponding to each of the multiple dimensions of the sub-data in each group, compares the calculated information entropy with the pre-established value range of the information entropy baseline, calculates the difference between the target information entropy outside the value range of the information entropy baseline and the information entropy closest to the value in the information entropy in the group, determines the risk index corresponding to each of the features according to the difference, and determines whether the network is anomalous according to the risk index, thereby achieving the purpose of identifying the network anomaly according to the characteristics of the network traffic data and improving the identification efficiency of the network anomaly.
In one or more embodiments of the disclosure, the features may include at least any three of: operating system fingerprint, equipment identification field in the inquiry message, function code field in the inquiry message, data packet length in the inquiry message, equipment identification field in the response message, function code field in the response message and data packet length of the response message.
Still taking the Modbus protocol as an example, the features of multiple dimensions may include, for example: the device_id field, function code field, and packet length in a Query message of the Modbus protocol; the device_id field, function code field, and packet length in a Response message; and the operating system fingerprint.
In one or more embodiments of the present disclosure, the risk index corresponding to each feature may be calculated by the following formula:
Figure GDA0003817755020000071
where score(x) represents the risk index corresponding to the feature, x represents the difference value, δ is a preset threshold, n is a preset real number greater than 1, and k is a weight coefficient that is an integer greater than 1. If the degree of deviation of the information entropy is smaller than the threshold δ, the deviation is not considered an attack. This parameter can be configured by the system user: the smaller δ is, the lower the false negative rate and the higher the false positive rate. n may typically be built in at the factory according to the operating context of the software. To increase computation speed, n can be chosen as the base of the natural logarithm, e, or as 2. k can be preset by the equipment manufacturer.
For example, in the network anomaly detection stage, the network traffic data is divided into a plurality of groups of sub-data according to equal time periods T. For each group t, the information entropy is calculated for each dimension (denoted as i), and each calculated information entropy group (called a group because the dimension has multiple feature values) is denoted as $h_1, h_2, \ldots, h_i$. For example, for $h_{i1}$ in information entropy group $h_i$, the information entropy baseline database is searched according to the time period and feature of $h_{i1}$ to find the value range of the corresponding information entropy baseline, and it is determined whether $h_{i1}$ is within the reasonable range. If $h_{i1}$ is not within the reasonable range of the information entropy, the difference between $h_{i1}$ and the information entropy in $h_i$ that is numerically closest to it is calculated and denoted as x.
In one or more embodiments of the present disclosure, determining whether an abnormality exists in the network according to the risk index corresponding to each feature may include:
calculating a comprehensive risk index according to the risk index corresponding to each characteristic;
determining whether the network is abnormal or not according to the comprehensive risk index;
wherein the comprehensive risk index is calculated by the following formula:
Figure GDA0003817755020000081
where $score_i$ represents the risk index corresponding to feature i, and m represents the number of features participating in the calculation. It can be seen that the network anomaly determination method provided in one or more embodiments of the present disclosure has low computational complexity. Because it computes on the above multidimensional features, it has a high detection rate for function code injection attacks, denial-of-service attacks, and function code abuse attacks against the Modbus protocol, and some detection capability for buffer overflow attacks.
In one or more embodiments of the present disclosure, determining whether an abnormality exists in the network according to the risk index corresponding to each of the features may include:
if the calculated risk index corresponding to any one of the characteristics is larger than a first preset value, determining that the network is abnormal;
or if the calculated comprehensive risk index is larger than a second preset value, determining that the network is abnormal, wherein the first preset value is larger than the second preset value. For example, for the features of the plurality of dimensions, the risk index corresponding to each feature is calculated, and the obtained result of the risk index is represented as a set S.
Define thresholds $\eta_1$ (an example of the first preset value described above) and $\eta_2$ (an example of the second preset value described above), with $\eta_1 > \eta_2$.
If there is at least one $h_i$ such that $h_i \in S$ and $h_i > \eta_1$, the network is considered abnormal and an alarm is generated. Or, if
Figure GDA0003817755020000091
the network is considered abnormal and an alarm is generated.
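The two-threshold decision can be sketched as follows. The comprehensive risk index is taken as an input because its formula is not reproduced in this text; the function name `network_is_abnormal` and the parameter names are illustrative assumptions.

```python
def network_is_abnormal(risk_indices, comprehensive_index, eta1, eta2):
    """Return True if any single risk index exceeds eta1, or if the
    comprehensive risk index exceeds eta2 (with eta1 > eta2)."""
    if not eta1 > eta2:
        raise ValueError("eta1 must be greater than eta2")
    if any(score > eta1 for score in risk_indices):
        return True
    return comprehensive_index > eta2
```

Requiring the first threshold to exceed the second means a single strongly deviating feature can trigger an alarm on its own, while several moderately deviating features can still trigger one through the comprehensive index.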
In one or more embodiments of the present disclosure, extracting the features of the multiple dimensions of the groups of sub-data may include:
identifying the Modbus protocol for each group of subdata according to the characteristics of the Modbus protocol;
matching the query message and the response message of the Modbus protocol;
and parsing and extracting key fields and features in the query message and the response message of the Modbus protocol. For example, the device_id field, the function code field, and the packet length of the Modbus query message may be parsed and extracted, and likewise the device_id field, the function code field, and the packet length of the Modbus response message. The parsed data can be recorded in the session table entry corresponding to the connection.
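The parsing and pairing steps can be sketched for Modbus/TCP as follows. The sketch relies on the standard MBAP header layout (transaction identifier, protocol identifier, length, unit identifier) followed by the function code. Treating the unit identifier as the device_id, using the frame length as the packet length, and pairing by transaction identifier are assumptions made for illustration; the source does not state how these are derived.

```python
import struct


def parse_modbus_tcp(frame: bytes):
    """Parse the MBAP header and function code of one Modbus/TCP frame."""
    if len(frame) < 8:
        return None
    transaction_id, protocol_id, _length, unit_id, function_code = struct.unpack(
        ">HHHBB", frame[:8]
    )
    if protocol_id != 0:              # Modbus/TCP uses protocol identifier 0
        return None
    return {
        "transaction_id": transaction_id,
        "device_id": unit_id,         # assumption: device_id = unit identifier
        "function_code": function_code,
        "packet_length": len(frame),  # assumption: length of the whole frame
    }


def pair_messages(queries, responses):
    """Pair parsed queries and responses that share a transaction identifier."""
    pending = {q["transaction_id"]: q for q in queries}
    return [
        (pending[r["transaction_id"]], r)
        for r in responses
        if r["transaction_id"] in pending
    ]
```

The extracted fields of each paired query and response are the per-message values from which the information entropies of the corresponding dimensions are computed.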
In one or more embodiments of the present disclosure, extracting the features of the multiple dimensions of the each group of sub-data may further include:
identifying the type of an operating system according to TCP packet header information in a TCP handshake process; for example, the operating system type may be identified according to TCP header information in a TCP three-way handshake process;
identifying the manufacturer information of the network card according to the MAC address;
identifying equipment manufacturer information according to data characteristics of the Modbus application layer;
and carrying out hash operation on the operating system type, the network card manufacturer information and the equipment manufacturer information to obtain a hash value. For example, an operating system fingerprint may also be recorded in the session table.
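The final hashing step can be sketched as follows. The source only says that a hash operation is performed; the use of SHA-256, the separator character, and the function name `os_fingerprint` are assumptions made for this example.

```python
import hashlib


def os_fingerprint(os_type: str, nic_vendor: str, device_vendor: str) -> str:
    """Hash the three identification results into one fingerprint value."""
    # A separator keeps ("ab", "c", ...) distinct from ("a", "bc", ...).
    joined = "\x1f".join((os_type, nic_vendor, device_vendor))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()
```

Because the fingerprint is a single value per device, its information entropy over a time period can be computed in the same way as for the other feature dimensions.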
In one or more embodiments of the present disclosure, the network anomaly determination method may further include:
acquiring network flow sample data; for example, historical network traffic data may be obtained as network traffic sample data, which may be different from the network traffic data obtained in step 101 above.
Dividing the network traffic sample data into a plurality of groups of sub-sample data according to a time period corresponding to the network traffic sample data; in this step, the dividing manner of the sub-sample data is consistent with the dividing manner of the sub-data in the foregoing, and details are not repeated here.
And respectively establishing an information entropy baseline corresponding to the feature of each dimension in the features of the multiple dimensions for each group of subsample data.
The baseline establishing process is described below, taking the information entropy learning process for the device_id field of the Modbus protocol as an example.
Divide the network traffic data in the learning period into a plurality of groups according to equal time periods T, denoted as $t_1, t_2, \ldots, t_n$, and calculate the information entropy of the device_id field in each time group, denoted as $h_1, h_2, \ldots, h_n$.
Traverse all h, calculate $\Delta_i = h_i - h_{i-1}$, and record the maximum $\Delta_i$ as $\Delta_{max}$.
Traverse all h and merge the information entropies whose difference is smaller than the threshold $\delta_1$ into one group. After merging, the information entropies form a plurality of groups, such as $(h_a), (h_b, h_c), (h_d, h_e, \ldots, h_x)$;
Traverse each group, and take the maximum value and the minimum value in the group as the bounds of a reasonable range of the information entropy.
For example, given the following information entropy values:
0.71, 0.92, 0.93, 0.94
the reasonable ranges of the information entropy obtained after merging with a threshold of 0.1 are 0.71 and 0.92-0.94.
After learning is finished, the learning result, namely the corresponding relation between the time period and the reasonable range of the information entropy (namely an example of the numerical range) is recorded in the information entropy baseline database.
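The merging step of the baseline learning process can be sketched as follows. Sorting the entropy values before merging is an assumption of this sketch (the source traverses the values without stating an order), and the function name `build_baseline` is hypothetical.

```python
def build_baseline(entropies, merge_threshold):
    """Merge entropy values whose gap is below merge_threshold and return
    one (min, max) reasonable range per merged group."""
    ranges = []
    for h in sorted(entropies):       # assumption: merge in ascending order
        if ranges and h - ranges[-1][1] < merge_threshold:
            ranges[-1] = (ranges[-1][0], h)   # extend the current group
        else:
            ranges.append((h, h))             # start a new group
    return ranges
```

With the values 0.71, 0.92, 0.93, 0.94 and a threshold of 0.1, the sketch yields two ranges, one containing only 0.71 and one spanning 0.92 to 0.94, which matches the example in the text.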
In the baseline learning process, seven dimensions are selected, and an information entropy baseline is established for each of them: the operating system fingerprint; the device_id field, the function code field, and the data packet length in the Query message; and the device_id field, the function code field, and the data packet length in the Response message.
Because industrial production processes are uniform and repetitive, the information entropy values exhibited by an industrial control protocol are relatively stable, so identifying network anomalies based on information entropy is well suited to industrial control systems. The seven selected dimensions cover the main field information of the Modbus protocol, so the entropy values reflect the actual traffic more comprehensively and the likelihood of false positives is reduced.
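The per-period, per-dimension entropy calculation can be sketched in Python as follows. This is only an illustration under stated assumptions: the record layout, the key names in DIMENSIONS, and the function names are hypothetical, and the sketch uses Shannon entropy with a base-2 logarithm, which may differ from the base used elsewhere in this disclosure.

```python
import math
from collections import Counter, defaultdict

# Hypothetical key names for the seven dimensions; the text names the
# dimensions but does not define a concrete record layout.
DIMENSIONS = (
    "os_fingerprint",
    "query_device_id", "query_function_code", "query_length",
    "response_device_id", "response_function_code", "response_length",
)


def entropy(values):
    """Shannon entropy (base 2, an assumption) of the observed values."""
    counts = Counter(values)
    total = len(values)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def entropy_per_period(records, period_seconds):
    """Return {period index: {dimension: entropy}}.

    `records` is an iterable of dicts with a numeric "timestamp" key and
    one key per entry in DIMENSIONS.
    """
    buckets = defaultdict(lambda: defaultdict(list))
    for rec in records:
        idx = int(rec["timestamp"] // period_seconds)  # equal time periods T
        for dim in DIMENSIONS:
            buckets[idx][dim].append(rec[dim])
    return {
        idx: {dim: entropy(vals) for dim, vals in dims.items()}
        for idx, dims in buckets.items()
    }
```

Each inner dictionary holds one entropy value per dimension for one time period; collecting these values over the learning period gives the h sequences from which the baselines are built.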
To facilitate understanding of the network anomaly determination method of one or more embodiments of the present disclosure, the entire flow of the method is briefly described below by way of an example. In this example, the method includes the following processes:
installing a network packet sniffer on a mirror port of a switch in the industrial control network, so that the sniffer can capture all traffic data in the network;
starting a learning mode, learning the information entropy fluctuation ranges in each time period for seven dimensions (the operating system fingerprint; the device_id field, the function code field, and the data packet length in the Query message; and the device_id field, the function code field, and the data packet length in the Response message), and establishing the baselines;
starting a detection mode, and detecting whether the information entropies of the seven dimensions (the operating system fingerprint; the device_id field, the function code field, and the data packet length in the Query message; and the device_id field, the function code field, and the data packet length in the Response message) in the current time period deviate from the baselines. If any information entropy deviates, calculating the risk index of each dimension and the comprehensive risk index, and determining whether the network is abnormal according to the risk index of each dimension and/or the comprehensive risk index.
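The deviation check in the detection mode can be sketched in Python as follows. The sketch assumes each baseline is stored as a list of (low, high) reasonable ranges per dimension, and measures the deviation as the distance to the nearest range boundary; the function and key names are hypothetical. It stops at the deviation and does not attempt the risk index calculation.

```python
def deviation_from_ranges(h, ranges):
    """Return 0.0 if `h` lies inside any (low, high) range, otherwise the
    distance from `h` to the nearest range boundary.

    Raises ValueError if `ranges` is empty.
    """
    gaps = []
    for low, high in ranges:
        if low <= h <= high:
            return 0.0
        gaps.append(low - h if h < low else h - high)
    return min(gaps)


def deviating_dimensions(current, baseline):
    """Return {dimension: deviation} for dimensions outside their baseline.

    `current` maps a dimension name to the entropy of the current period;
    `baseline` maps the same names to lists of (low, high) ranges.
    """
    result = {}
    for dim, h in current.items():
        d = deviation_from_ranges(h, baseline[dim])
        if d > 0:
            result[dim] = d
    return result
```

For instance, with a baseline of 0.71 and 0.92-0.94 for one dimension, a current entropy of 0.93 gives no deviation, while 0.80 gives a deviation of about 0.09 (its distance to 0.71).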
One or more embodiments of the present disclosure also provide an electronic device, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, and when the processor executes the program, the processor implements any one of the above network anomaly determination methods.
One or more embodiments of the present disclosure also provide a non-transitory computer-readable storage medium storing computer instructions for causing the computer to perform any one of the above network anomaly determination methods.
It should be noted that the method of the embodiment of the present disclosure may be executed by a single device, such as a computer or a server. The method of the embodiment can also be applied to a distributed scene and completed by the mutual cooperation of a plurality of devices. In such a distributed scenario, one of the multiple devices may only perform one or more steps of the method of the embodiments of the present disclosure, and the multiple devices interact with each other to complete the method.
The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
Fig. 2 is a schematic diagram illustrating a more specific hardware structure of an electronic device according to this embodiment, where the electronic device may include: a processor 1010, a memory 1020, an input/output interface 1030, a communication interface 1040, and a bus 1050. Wherein the processor 1010, memory 1020, input/output interface 1030, and communication interface 1040 are communicatively coupled to each other within the device via bus 1050.
The processor 1010 may be implemented by a general-purpose CPU (Central Processing Unit), a microprocessor, an Application Specific Integrated Circuit (ASIC), or one or more Integrated circuits, and is configured to execute related programs to implement the technical solutions provided in the embodiments of the present disclosure.
The Memory 1020 may be implemented in the form of a ROM (Read Only Memory), a RAM (Random Access Memory), a static Memory device, a dynamic Memory device, or the like. The memory 1020 may store an operating system and other application programs, and when the technical solutions provided by the embodiments of the present specification are implemented by software or firmware, the relevant program codes are stored in the memory 1020 and called by the processor 1010 for execution.
The input/output interface 1030 is used for connecting an input/output module to input and output information. The input/output module may be configured as a component in the device (not shown) or may be external to the device to provide a corresponding function. The input devices may include a keyboard, a mouse, a touch screen, a microphone, various sensors, etc., and the output devices may include a display, a speaker, a vibrator, an indicator light, etc.
The communication interface 1040 is used for connecting a communication module (not shown in the drawings) to implement communication between the present device and other devices. The communication module may communicate in a wired manner (for example, USB or network cable) or in a wireless manner (for example, mobile network, Wi-Fi, or Bluetooth).
Bus 1050 includes a path that transfers information between various components of the device, such as processor 1010, memory 1020, input/output interface 1030, and communication interface 1040.
It should be noted that although the above-mentioned device only shows the processor 1010, the memory 1020, the input/output interface 1030, the communication interface 1040 and the bus 1050, in a specific implementation, the device may also include other components necessary for normal operation. In addition, those skilled in the art will appreciate that the above-described apparatus may also include only the components necessary to implement the embodiments of the present disclosure, and need not include all of the components shown in the figures.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device.
Those of ordinary skill in the art will understand that the discussion of any embodiment above is exemplary only and is not intended to imply that the scope of the disclosure, including the claims, is limited to these examples; within the idea of the present disclosure, features in the above embodiments or in different embodiments may also be combined, steps may be implemented in any order, and there are many other variations of the different aspects of the present disclosure as described above, which are not provided in detail for the sake of brevity.
In addition, well known power/ground connections to Integrated Circuit (IC) chips and other components may or may not be shown in the provided figures for simplicity of illustration and discussion, and so as not to obscure the disclosure. Furthermore, devices may be shown in block diagram form in order to avoid obscuring the disclosure, and also in view of the fact that specifics with respect to implementation of such block diagram devices are highly dependent upon the platform within which the present disclosure is to be implemented (i.e., specifics should be well within purview of one skilled in the art). Where specific details (e.g., circuits) are set forth in order to describe example embodiments of the disclosure, it should be apparent to one skilled in the art that the disclosure can be practiced without, or with variation of, these specific details. Accordingly, the description is to be regarded as illustrative instead of restrictive.
While the present disclosure has been described in conjunction with specific embodiments thereof, many alternatives, modifications, and variations of these embodiments will be apparent to those of ordinary skill in the art in light of the foregoing description. For example, other memory architectures, such as Dynamic RAM (DRAM), may use the discussed embodiments.
The embodiments of the present disclosure are intended to embrace all such alternatives, modifications, and variations that fall within the broad scope of the appended claims. Therefore, any omissions, modifications, equivalents, improvements, and the like that may be made within the spirit and principles of the disclosure are intended to be included within the scope of the disclosure.

Claims (10)

1. A method for determining network anomalies, comprising:
acquiring network traffic data;
dividing the network traffic data into a plurality of groups of subdata according to a time period corresponding to the network traffic data;
extracting the characteristics of multiple dimensions of each group of subdata;
for each group of subdata, respectively calculating a group of information entropies corresponding to the characteristics of each dimension in the characteristics of the multiple dimensions;
for each group of subdata, comparing each information entropy in a group of information entropies of each characteristic with a pre-established data range of an information entropy base line corresponding to the characteristic;
calculating a difference between a target information entropy that is not within the data range and the information entropy, among the group of information entropies, that has the smallest difference from the target information entropy;
calculating a risk index corresponding to each feature according to the difference value corresponding to each feature;
and determining whether the network is abnormal or not according to the risk indexes corresponding to the characteristics.
2. The method of claim 1, wherein the features include at least any three of:
operating system fingerprint, equipment identification field in the query message, function code field in the query message, data packet length in the query message, equipment identification field in the response message, function code field in the response message, and data packet length of the response message.
3. The method of claim 1, wherein the risk index for each feature is calculated by the formula:
[Formula image FDA0003817755010000011]
wherein Score(x) represents the risk index corresponding to the feature, x represents the difference, δ is a preset threshold, n is a preset real number greater than 1, and k is a weight coefficient that is an integer greater than 1.
4. The method of claim 3, wherein determining whether the network is abnormal according to the risk index corresponding to each feature comprises:
calculating a comprehensive risk index according to the risk index corresponding to each characteristic;
determining whether the network is abnormal or not according to the comprehensive risk index;
wherein the comprehensive risk index is calculated by the following formula:
[Formula image]
wherein Score_i represents the risk index corresponding to the i-th feature, and m represents the number of features participating in the calculation.
5. The method of claim 4, wherein determining whether the network is abnormal according to the risk index corresponding to each feature comprises:
if the calculated risk index corresponding to any one feature is larger than a first preset value, determining that the network is abnormal;
or if the calculated comprehensive risk index is larger than a second preset value, determining that the network is abnormal, wherein the first preset value is larger than the second preset value.
6. The method of claim 1, wherein extracting features of the sets of sub-data in multiple dimensions comprises:
identifying the Modbus protocol for each group of subdata according to the characteristics of the Modbus protocol;
matching the query message and the response message of the Modbus protocol;
and analyzing and extracting key fields and characteristics in the query message and the response message of the Modbus protocol.
7. The method of claim 6, wherein extracting features of the sets of sub-data in multiple dimensions further comprises:
identifying the type of an operating system according to TCP packet header information in a Transmission Control Protocol (TCP) handshaking process;
identifying network card manufacturer information according to the MAC address;
identifying equipment manufacturer information according to data characteristics of the Modbus application layer;
and carrying out hash operation on the operating system type, the network card manufacturer information and the equipment manufacturer information to obtain a hash value.
8. The method of claim 1, further comprising:
acquiring network flow sample data;
dividing the network traffic sample data into a plurality of groups of sub-sample data according to a time period corresponding to the network traffic sample data;
and respectively establishing an information entropy baseline corresponding to the feature of each dimension in the features of the multiple dimensions for each group of sub-sample data.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the network anomaly determination method according to any one of claims 1 to 8 when executing the program.
10. A non-transitory computer-readable storage medium storing computer instructions for causing a computer to execute the network anomaly determination method according to any one of claims 1 to 8.
CN202011247194.7A 2020-11-10 2020-11-10 Network anomaly determination method, equipment and storage medium Active CN112448947B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011247194.7A CN112448947B (en) 2020-11-10 2020-11-10 Network anomaly determination method, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112448947A CN112448947A (en) 2021-03-05
CN112448947B true CN112448947B (en) 2022-10-28

Family

ID=74736201

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011247194.7A Active CN112448947B (en) 2020-11-10 2020-11-10 Network anomaly determination method, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112448947B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101645884A (en) * 2009-08-26 2010-02-10 西安理工大学 Multi-measure network abnormity detection method based on relative entropy theory
CN103281293A (en) * 2013-03-22 2013-09-04 南京江宁台湾农民创业园发展有限公司 Network flow rate abnormity detection method based on multi-dimension layering relative entropy
CN106357434A (en) * 2016-08-30 2017-01-25 国家电网公司 Detection method, based on entropy analysis, of traffic abnormity of smart grid communication network
CN109347823A (en) * 2018-10-17 2019-02-15 湖南汽车工程职业学院 A kind of CAN bus method for detecting abnormality based on comentropy
CN109951420A (en) * 2017-12-20 2019-06-28 广东电网有限责任公司电力调度控制中心 A kind of multistage flow method for detecting abnormality based on entropy and dynamic linear relationship

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10171491B2 (en) * 2014-12-09 2019-01-01 Fortinet, Inc. Near real-time detection of denial-of-service attacks

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
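DNS tunnel Trojan detection method based on communication behavior analysis; Luo Youqiang et al.; Journal of Zhejiang University (Engineering Science); 2017-09-15; full text *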

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: Room 332, 3 / F, Building 102, 28 xinjiekouwei street, Xicheng District, Beijing 100088

Applicant after: Qianxin Technology Group Co.,Ltd.

Applicant after: Qianxin Wangshen information technology (Beijing) Co.,Ltd.

Address before: Room 332, 3 / F, Building 102, 28 xinjiekouwei street, Xicheng District, Beijing 100088

Applicant before: Qianxin Technology Group Co.,Ltd.

Applicant before: LEGENDSEC INFORMATION TECHNOLOGY (BEIJING) Inc.

GR01 Patent grant