CN115580486B

CN115580486B - Network security sensing method and device based on big data

Info

Publication number: CN115580486B
Application number: CN202211449597.9A
Authority: CN
Inventors: 项翔翔; 蒋行健
Original assignee: Ningbo Zhenhai Big Data Investment Development Co ltd
Current assignee: Ningbo Zhenhai Big Data Investment Development Co ltd
Priority date: 2022-11-18
Filing date: 2022-11-18
Publication date: 2023-04-07
Anticipated expiration: 2042-11-18
Also published as: CN115580486A

Abstract

The invention relates to the technical field of information security detection, and in particular to a big data-based network security sensing method and device. The method comprises: constructing a traffic matrix for a perception client and a main service end; constructing a traffic covariance matrix from the traffic matrix; solving the eigenvalue set of the traffic covariance matrix and selecting a sub-feature set from it; calculating the significance value of the sub-feature set with respect to the eigenvalue set; judging that the main service end poses a network intrusion risk to the perception client if the significance value is greater than a specified significance threshold; extracting a time traffic index set for a served end and the perception client; and inputting the time traffic index set into a pre-trained network security perception model to perform risk prediction, obtaining a judgment of the network intrusion risk that the served end poses to the perception client. The method addresses the low accuracy of network security threat prediction caused by the current practice of using a machine learning or deep learning model in a fixed manner.

Description

Network security sensing method and device based on big data

Technical Field

The invention relates to the technical field of information security, in particular to a network security sensing method and device based on big data.

Background

Network security (cyber security) means that the hardware, software and data of a network system are protected from damage, alteration and leakage due to accidental or malicious causes, so that the system operates continuously, reliably and normally and network service is not interrupted.

Different network security perception methods have different emphases. A popular approach is to predict in advance whether a client faces a security risk by monitoring its traffic interactions. At present, mainstream traffic monitoring methods collect traffic interaction index data between a client and a server, and then use machine learning or deep learning to judge from that data whether the server poses a network threat to the client.

Although such methods achieve network security perception, they do not consider whether the client is actively or passively connected to the server, and they perform risk judgment with a fixed machine learning or deep learning model, so the accuracy of network security perception is low.

Disclosure of Invention

The invention provides a big data-based network security sensing method and device, whose main aim is to solve the problem of low accuracy of network security threat prediction caused by the current practice of using a machine learning or deep learning model in a fixed manner.

In order to achieve the above object, the present invention provides a big data-based network security sensing method, which includes:

receiving a network security perception instruction, and determining a perception client to be detected according to the network security perception instruction;

according to a TCP connection rule, extracting a server which is actively connected with the sensing client at the current moment to obtain a main service end and a server which is passively connected to obtain a served end;

establishing flow matrixes of the perception client and the main service terminal, establishing a flow covariance matrix based on the flow matrixes and solving a characteristic value set of the flow covariance matrix;

selecting sub-features with the importance greater than a specified importance threshold value from the feature value set to obtain a sub-feature set, calculating a significant value of the sub-feature set to the feature value set, and if the significant value is greater than the specified significant threshold value, judging that the main service side has a network invasion risk to the perception client side;

extracting a flow interaction index set of the served terminal and the sensing client terminal, wherein the flow interaction index set comprises flow interaction index values and flow interaction time, and sequencing the flow interaction index set based on the flow interaction time to obtain a time flow index set;

inputting the time flow index set into a pre-trained network security perception model to execute risk prediction, and obtaining a network invasion risk judgment result of a served side to a perception client side, wherein the network security perception model is constructed by a deep learning network and comprises seven layers of structures according to the network connection sequence, the first layer of structure is 128 LSTM units, the second layer of structure is 1 dropout layer, the third layer of structure is 64 improved LSTM units, the fourth layer of structure is 1 dropout layer, the fifth layer of structure is 32 improved LSTM units, the sixth layer of structure is 1 dropout layer, and the seventh layer of structure is a classification layer.

Optionally, the constructing a traffic matrix of the aware client and the primary service end includes:

acquiring IP addresses of the perception client and the main service terminal;

establishing a flow link by taking the IP address of the sensing client as a starting point and the IP address of the main service end as an end point;

setting an acquisition period for acquiring the flow link, and acquiring a flow value of the flow link according to the acquisition period;

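arranging the flow values by acquisition cycle to obtain the flow matrix, wherein the flow matrix is as follows:

$$X = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}, \qquad x_i = \begin{bmatrix} x_{i1} & x_{i2} & \cdots & x_{im} \end{bmatrix}$$

wherein $X$ is the flow matrix, $x_i$ represents the unit matrix of the traffic in the $i$-th acquisition cycle, $x_{ij}$ denotes the flow value obtained the $j$-th time flow collection is performed on the traffic link in the $i$-th acquisition cycle, $n$ is the number of acquisition cycles, and $m$ is the number of traffic transmissions between the perception client and the main service end in each acquisition cycle.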

Optionally, the selecting the sub-features with importance greater than a specified importance threshold from the feature value set to obtain a sub-feature set includes:

constructing different feature sets to be selected according to the feature value sets;

calculating the importance score of each candidate feature set according to an importance calculation formula (the formula is not reproduced in this text), whose terms are: the importance score of a given candidate feature set; the number of features in that candidate feature set; the index of each feature; and the number of features in the eigenvalue set;

and extracting the feature set to be selected with the importance score larger than the specified importance threshold, repeatedly extracting each feature from the feature set to be selected with the importance score larger than the specified importance threshold, and combining to obtain the sub-feature set.

Optionally, the constructing a traffic covariance matrix based on the traffic matrix and solving an eigenvalue set of the traffic covariance matrix includes:

solving the transpose of the flow matrix, and constructing the flow covariance matrix from the flow matrix and its transpose, wherein the flow covariance matrix is as follows:

$$C = \frac{1}{m} X X^{T}$$

wherein $C$ represents the flow covariance matrix of the flow matrix $X$, $X^{T}$ is the transposed matrix, and $m$ is the number of traffic transmissions between the perception client and the main service end in each acquisition cycle when the flow matrix is constructed;

constructing the characteristic equation of the flow covariance matrix, and solving the characteristic equation to obtain the eigenvalue set, wherein the characteristic equation is as follows:

$$(C - \lambda E)\,v = 0$$

wherein $\lambda$ is the eigenvalue set, $E$ is the unit diagonal matrix, and $v$ is the eigenvector of the flow covariance matrix.

Optionally, the constructing different feature sets to be selected according to the feature value set includes:

receiving a preset set characteristic minimum value and a set characteristic maximum value;

and selecting features from the feature value set without repetition, such that the number of features selected is greater than or equal to the set-feature minimum and less than or equal to the set-feature maximum, to obtain the different feature sets to be selected.

Optionally, the traffic interaction index set includes a TCP session establishment success number, a TCP session establishment failure number, an uplink data packet number, a downlink data packet number, an average packet sending length, an average packet receiving length, a port access number of a served terminal, a connection number owned by a server IP, a RST packet receiving and sending number of a served terminal, a RST packet receiving and sending number of a sensing client, and a SYN packet receiving and sending number of a served terminal.

Optionally, the modified LSTM unit comprises:

the original expression of the forget gate of the LSTM unit is replaced by an improved expression (the formula is not reproduced in this text), whose terms are: the improved forget-gate value at a given moment; the activation function of the forget gate; the weight matrix of the forget gate; the bias vector of the forget gate; the output value of the previous LSTM output gate; the time traffic index at that moment; the difference between the two groups of time traffic indexes at two moments; a preset difference bias; the total number of index types in the time traffic index set; and the weight of each index.

Optionally, the extracting, according to the TCP connection rule, a server that the sensing client is actively connected to at the current time to obtain a main server and a server that the sensing client is passively connected to obtain a served server includes:

inquiring TCP messages of a sensing client and all service terminals at the current moment, and judging whether each TCP message in the sensing client is a request connection type or a confirmation connection type;

when the TCP message is in a request connection type, confirming that the corresponding service end is a main service end according to a request destination address of the TCP message;

and when the TCP message is of the confirmed connection type, confirming that the corresponding server is the served terminal according to the confirmed destination address of the TCP message.

Optionally, the significance value is calculated by a formula (not reproduced in this text) whose terms are: the significance value of the sub-feature set with respect to the eigenvalue set; the number of features in the sub-feature set; the number of features in the eigenvalue set; and a T-test or chi-square test.

In order to achieve the above object, the present invention further provides a big data-based network security sensing apparatus, including:

the server classification module is used for receiving a network security perception instruction, determining a perception client to be detected according to the network security perception instruction, and extracting a server actively connected with the perception client at the current moment to obtain a main server and a server passively connected with the perception client to obtain a served server according to a TCP (transmission control protocol) connection rule;

the eigenvalue solving module is used for constructing flow matrixes of the sensing client and the main service end, constructing a flow covariance matrix based on the flow matrixes and solving an eigenvalue set of the flow covariance matrix;

the main service end risk judgment module is used for selecting the sub-features with the importance greater than a specified important threshold value from the feature value set to obtain a sub-feature set, calculating the significant value of the sub-feature set to the feature value set, and if the significant value is greater than the specified significant threshold value, judging that the main service end has network invasion risk to the perception client;

the risk judgment module of the served terminal is used for extracting a flow interaction index set of the served terminal and the sensing client, wherein the flow interaction index set comprises flow interaction index values and flow interaction time, the flow interaction index set is sorted based on the flow interaction time to obtain a time flow index set, the time flow index set is input into a network security perception model which is trained in advance to execute risk prediction, and a network invasion risk judgment result of the served terminal to the sensing client is obtained, wherein the network security perception model is constructed by a deep learning network and comprises seven layers according to a network sequence connection sequence, the first layer structure comprises 128 LSTM units, the second layer structure comprises 1 dropout layer, the third layer structure comprises 64 improved LSTM units, the fourth layer structure comprises 1 dropout layer, the fifth layer structure comprises 32 improved LSTM units, the sixth layer structure comprises 1 dropout layer, and the seventh layer structure comprises a classification layer.

In order to solve the above problem, the present invention also provides an electronic device, including:

a memory storing at least one instruction; and a processor that executes the instructions stored in the memory to implement the big data-based network security perception method described above.

In order to solve the above problem, the present invention further provides a computer-readable storage medium, where at least one instruction is stored, and the at least one instruction is executed by a processor in an electronic device to implement the big data based network security awareness method described above.

To solve the problems described in the background, the embodiment of the invention receives a network security perception instruction and determines from it the perception client to be detected. According to a TCP connection rule, it then extracts the server to which the perception client is actively connected at the current moment to obtain the main service end, and the server to which the perception client is passively connected to obtain the served end. The probability that the perception client suffers a network threat differs between the two cases, and is generally smaller for an actively connected server than for a passively connected one, so the two kinds of server are assessed by different methods. Therefore, the big data-based network security sensing method and device can solve the problem of low accuracy of network security threat prediction caused by using a machine learning or deep learning model in a fixed manner.

Drawings

Fig. 1 is a schematic flowchart of a big data-based network security awareness method according to an embodiment of the present invention;

Fig. 2 is a functional block diagram of a big data-based network security awareness apparatus according to an embodiment of the present invention;

Fig. 3 is a schematic structural diagram of an electronic device implementing the big data-based network security awareness method according to an embodiment of the present invention.

The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.

Detailed Description

It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

The embodiment of the application provides a network security perception method based on big data. The execution subject of the big data based network security awareness method includes, but is not limited to, at least one of electronic devices such as a server and a terminal that can be configured to execute the method provided by the embodiments of the present application. In other words, the big data based network security awareness method may be performed by software or hardware installed in a terminal device or a server device, and the software may be a block chain platform. The server includes but is not limited to: a single server, a server cluster, a cloud server or a cloud server cluster, and the like.

Fig. 1 is a schematic flow chart of a big data-based network security awareness method according to an embodiment of the present invention. In this embodiment, the method for sensing network security based on big data includes:

S1, receiving a network security perception instruction, and determining a perception client to be detected according to the network security perception instruction;

In the embodiment of the invention, the network security perception instruction can be sent by a network administrator or by the user of the perception client. For example, Zhang San uses a laptop to develop software, and the laptop contains important commercially confidential programs. To prevent important information from being lost or stolen through hacker intrusion or virus infection, Zhang San generates a network security perception instruction when starting the laptop by clicking a network security perception button pre-installed in the laptop interface. Understandably, the laptop is then the perception client to be detected.

S2, extracting a server which is actively connected with the sensing client at the current moment to obtain a main service end and a server which is passively connected to obtain a served end according to a TCP (Transmission control protocol) connection rule;

It should be explained that the embodiment of the present invention considers that the perception client's active connections to other servers and its passive connections carry different levels of network risk. An active connection is generally initiated in response to a user need, for example when the user visits a web page or clicks a button in a graphical interface. If the web page or button is legitimate, it poses no threat to the perception client. If, however, the user mistakenly clicks through to an illegal web page, that page forcibly establishes traffic transmission with the perception client; such forced transmission generally lasts a long time and is abnormally active within certain time periods. The embodiment therefore provides a rapid identification method based on the traffic characteristics of the server hosting the illegal web page.

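Further, the extracting, according to the TCP connection rule, a server that the sensing client is actively connected to at the current time to obtain a primary server and a server that the sensing client is passively connected to obtain a served server includes:

querying the TCP messages between the perception client and all service ends at the current moment, and judging whether each TCP message in the perception client is of the request-connection type or the confirm-connection type;

when a TCP message is of the request-connection type, confirming that the corresponding service end is a main service end according to the request destination address of the TCP message;

when a TCP message is of the confirm-connection type, confirming that the corresponding service end is a served end according to the confirmation destination address of the TCP message.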

Illustratively, suppose that traversing the perception client at the current moment finds 5 TCP messages in total, of which 2 are of the request-connection type and 3 are of the confirm-connection type. Then 2 main service ends and 3 served ends are obtained, and the subsequent task of the embodiment of the present invention is to identify whether those 2 main service ends and 3 served ends pose a network security threat to the perception client.
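The classification step can be sketched as follows. This is a minimal illustration, not the patented implementation: the `TcpMessage` record and the function name are hypothetical, and it assumes that a "request connection" message is a segment with only the SYN flag set and that a "confirm connection" message is a SYN+ACK segment, both sent by the perception client.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TcpMessage:
    """Hypothetical record of one TCP segment sent by the perception client."""
    dst_ip: str  # destination address of the segment
    syn: bool    # SYN flag set
    ack: bool    # ACK flag set


def classify_servers(messages):
    """Split destination servers into main service ends and served ends."""
    main_ends, served_ends = [], []
    for msg in messages:
        if msg.syn and not msg.ack:
            # Assumed "request connection" type: the client opens the handshake.
            main_ends.append(msg.dst_ip)
        elif msg.syn and msg.ack:
            # Assumed "confirm connection" type: the client answers a
            # handshake that the server opened.
            served_ends.append(msg.dst_ip)
    return main_ends, served_ends


# Five messages, mirroring the 2 + 3 split in the example above.
messages = [
    TcpMessage("10.0.0.1", syn=True, ack=False),
    TcpMessage("10.0.0.2", syn=True, ack=False),
    TcpMessage("10.0.0.3", syn=True, ack=True),
    TcpMessage("10.0.0.4", syn=True, ack=True),
    TcpMessage("10.0.0.5", syn=True, ack=True),
]
main_ends, served_ends = classify_servers(messages)
```

Segments that carry neither pattern (for example plain data or ACK-only segments) are ignored here, since they do not establish a connection.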

S3, constructing flow matrixes of the perception client and the main service terminal, constructing a flow covariance matrix based on the flow matrixes, and solving a characteristic value set of the flow covariance matrix;

in detail, the constructing the traffic matrices of the aware client and the primary service end includes:

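acquiring IP addresses of the perception client and the main service end;

establishing a traffic link with the IP address of the perception client as the starting point and the IP address of the main service end as the end point;

setting an acquisition cycle for the traffic link, and collecting the flow values of the traffic link according to the acquisition cycle;

arranging the flow values by acquisition cycle to obtain the flow matrix, wherein the flow matrix is as follows:

$$X = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}, \qquad x_i = \begin{bmatrix} x_{i1} & x_{i2} & \cdots & x_{im} \end{bmatrix}$$

wherein $X$ is the flow matrix, $x_i$ represents the unit matrix of the traffic in the $i$-th acquisition cycle, $x_{ij}$ denotes the flow value obtained the $j$-th time flow collection is performed on the traffic link in the $i$-th acquisition cycle, $n$ is the number of acquisition cycles, and $m$ is the number of traffic transmissions between the perception client and the main service end in each acquisition cycle when the flow matrix is constructed.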

Illustratively, the laptop used by Zhang San has 2 main service ends, so a traffic link between the laptop and each main service end is established in turn and an acquisition cycle is set. It should be emphasized that the acquisition cycle in the embodiment of the present invention is 24 hours, that is, each day is one acquisition cycle. The more collections per day the better; the number of flow values collected on the traffic link within 24 hours may be set to at least 10000.

Further, each flow value may be positive or negative: a positive value indicates that the perception client pushes traffic to the main service end, and a negative value indicates that the perception client receives traffic pushed by the main service end. The flow values of an acquisition cycle may therefore be, for example, [12, 0.1, 1.2, -67, -79, -0.3, 19, …, 11, 17.2].
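As a rough sketch of the arrangement described above, the signed flow values can be stored with one row per acquisition cycle. The function names and sample values are illustrative assumptions; the only properties taken from the text are the row-per-cycle layout and the sign convention (positive for traffic pushed by the client, negative for traffic received).

```python
def build_traffic_matrix(samples_per_cycle):
    """Arrange signed flow values into one row per acquisition cycle."""
    if not samples_per_cycle:
        raise ValueError("at least one acquisition cycle is required")
    m = len(samples_per_cycle[0])
    for row in samples_per_cycle:
        if len(row) != m:
            raise ValueError("every cycle must contain the same number of samples")
    return [[float(v) for v in row] for row in samples_per_cycle]


def split_direction(row):
    """Return (volume pushed by the client, volume received by the client)."""
    pushed = sum(v for v in row if v > 0)      # positive values: client -> server
    received = -sum(v for v in row if v < 0)   # negative values: server -> client
    return pushed, received


# Two short illustrative cycles; a real cycle would hold many more samples.
cycles = [
    [12, 0.1, 1.2, -67],
    [-79, -0.3, 19, 11],
]
X = build_traffic_matrix(cycles)
pushed, received = split_direction(X[0])
```

Requiring every cycle to have the same number of samples keeps the result rectangular, which the later covariance step needs.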

In detail, the constructing a traffic covariance matrix based on the traffic matrix and solving an eigenvalue set of the traffic covariance matrix includes:

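solving the transpose of the flow matrix, and constructing the flow covariance matrix from the flow matrix and its transpose, wherein the flow covariance matrix is as follows:

$$C = \frac{1}{m} X X^{T}$$

wherein $C$ represents the flow covariance matrix of the flow matrix $X$, $X^{T}$ is the transposed matrix, and $m$ is the number of traffic transmissions between the perception client and the main service end in each acquisition cycle when the flow matrix is constructed;

constructing the characteristic equation of the flow covariance matrix, and solving the characteristic equation to obtain the eigenvalue set, wherein the characteristic equation is as follows:

$$(C - \lambda E)\,v = 0$$

wherein $\lambda$ is the eigenvalue set, $E$ is the unit diagonal matrix, and $v$ is the eigenvector of the flow covariance matrix.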

In the embodiment of the present invention, solving for the eigenvalues from the characteristic equation is a well-known technique and is not described again here.
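A minimal NumPy sketch of this step is shown below. It assumes the covariance matrix is formed as the product of the flow matrix with its transpose divided by the number of samples per cycle, with no mean-centering; that reading, the function name and the sample values are assumptions made for illustration.

```python
import numpy as np


def traffic_eigenvalues(X):
    """Return the assumed flow covariance matrix and its eigenvalues."""
    X = np.asarray(X, dtype=float)
    m = X.shape[1]                 # samples per acquisition cycle
    C = X @ X.T / m                # assumed form: (1/m) * X * X^T
    # C is symmetric, so eigvalsh applies; it returns real eigenvalues
    # in ascending order, reversed here to list the largest first.
    eigenvalues = np.linalg.eigvalsh(C)[::-1]
    return C, eigenvalues


X = [
    [12.0, 0.1, 1.2, -67.0],
    [-79.0, -0.3, 19.0, 11.0],
    [3.0, 4.0, -5.0, 6.0],
]
C, eigenvalues = traffic_eigenvalues(X)
```

With one row per acquisition cycle, the result has one eigenvalue per cycle, and their sum equals the trace of the covariance matrix.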

And S4, selecting the sub-features with the importance greater than a specified importance threshold value from the feature value set to obtain a sub-feature set, calculating the significance value of the sub-feature set to the feature value set, and if the significance value is greater than the specified significance threshold value, judging that the main service terminal has network invasion risk to the perception client terminal.

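In detail, selecting the sub-features whose importance is greater than the specified importance threshold from the eigenvalue set to obtain the sub-feature set includes constructing different candidate feature sets from the eigenvalue set and calculating the importance score of each candidate feature set with the importance calculation formula (the formula is not reproduced in this text). The terms of the formula are: the importance score of a given candidate feature set; the number of features in that candidate feature set; the index of each feature; and the number of features in the eigenvalue set.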

Further, constructing different candidate feature sets from the eigenvalue set includes receiving a preset set-feature minimum and set-feature maximum, and selecting features from the eigenvalue set without repetition such that the number of features selected lies between the minimum and the maximum inclusive.

Illustratively, suppose the eigenvalue set contains 10 features and the set-feature minimum and maximum are 3 and 30 respectively. Different candidate feature sets are then obtained by drawing from the 10 features without repetition, by permutation and combination. Because the eigenvalues in each candidate feature set may differ, the importance score of each candidate feature set is calculated in turn with the importance calculation formula, and the features of the candidate feature sets whose importance score is greater than the specified importance threshold are extracted to construct the sub-feature set.
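The enumeration of candidate feature sets can be sketched with `itertools.combinations`. The importance formula itself is not available in this text, so the `eigen_share` score below (the share of the total eigenvalue sum held by a candidate set) is a stand-in assumption, not the patent's formula; taking the union of the features of all qualifying sets is likewise an assumed reading of the combining step. All names are hypothetical.

```python
from itertools import combinations


def candidate_feature_sets(num_features, min_size, max_size):
    """Yield every non-repeating index combination within the size bounds."""
    upper = min(max_size, num_features)  # cannot pick more features than exist
    for size in range(min_size, upper + 1):
        yield from combinations(range(num_features), size)


def eigen_share(eigenvalues, indices):
    """ASSUMED score: share of the eigenvalue sum; not the patent's formula."""
    return sum(eigenvalues[i] for i in indices) / sum(eigenvalues)


def select_sub_features(eigenvalues, min_size, max_size, threshold,
                        score=eigen_share):
    """Union of the features of all candidate sets scoring above threshold."""
    selected = set()
    for indices in candidate_feature_sets(len(eigenvalues), min_size, max_size):
        if score(eigenvalues, indices) > threshold:
            selected.update(indices)
    return sorted(selected)


eigenvalues = [6.0, 3.0, 1.0]
candidates = list(candidate_feature_sets(len(eigenvalues), 1, 2))
sub_features = select_sub_features(eigenvalues, 1, 2, threshold=0.85)
```

The scoring function is passed as a parameter so that the actual importance formula can be substituted without changing the enumeration logic. Note that the number of combinations grows quickly with the number of features.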

Further, the significance value is calculated by a formula (not reproduced in this text) whose terms are: the significance value of the sub-feature set with respect to the eigenvalue set; the number of features in the sub-feature set; the number of features in the eigenvalue set; and a T-test or chi-square test.

The T-test, also called Student's t-test, uses t-distribution theory to infer the probability that a difference occurs, and thereby judges whether the difference between two groups of data is significant. The chi-square test measures the degree of deviation between the actual observed values of a sample and the theoretically inferred values. This deviation determines the size of the chi-square value: the larger the chi-square value, the greater the deviation between the two, and conversely the smaller the deviation. The embodiment of the invention may use either the T-test or the chi-square test to measure the significance between the sub-feature set and the eigenvalue set. Generally, 0.95 is the specified significance threshold: a significance greater than 0.95 indicates that the main service end poses a network intrusion risk to the perception client.
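To make the chi-square description concrete, the sketch below computes the standard Pearson chi-square statistic, the sum of squared deviations between observed and expected values divided by the expected values. It does not reproduce the patent's significance formula, which is not available in this text; the function name and sample numbers are illustrative.

```python
def chi_square_statistic(observed, expected):
    """Pearson chi-square statistic: sum of (O - E)^2 / E over all cells."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if any(e <= 0 for e in expected):
        raise ValueError("expected values must be positive")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Identical observed and expected values give no deviation at all.
no_deviation = chi_square_statistic([10, 20, 30], [10, 20, 30])
# A larger gap between observed and expected values gives a larger statistic.
some_deviation = chi_square_statistic([12, 18], [10, 20])
```

Turning the statistic into a significance level additionally requires the chi-square distribution with the appropriate degrees of freedom, which is left out of this sketch.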

It should be explained why high significance between the sub-feature set and the eigenvalue set indicates that the main service end poses a network intrusion risk to the perception client. The eigenvalue set of the traffic matrix characterizes the traffic interaction between the main service end and the perception client. Under normal conditions the perception client connects to a main service end actively, generally to seek its help, and the resulting traffic is disordered and irregular. For example, if the main service end hosts a program-sharing web page, Zhang San visits it when some program bugs need to be solved during software development; or if the main service end hosts a video resource download page, Zhang San downloads the corresponding development algorithm from it when he needs to learn that algorithm during software development. An eigenvalue represents the degree to which the traffic matrix changes, specifically a constant rate of change of the traffic matrix in the direction indicated by the corresponding eigenvector. High significance between the sub-feature set and the eigenvalue set therefore indicates that the traffic matrix is developing in a direction indicated by an eigenvector, for example the traffic increasing by a fixed amount in each acquisition cycle. In the context of an active connection, the traffic interaction between the main service end and the perception client should be disordered; if it instead shows regularity, the main service end may pose a security risk to the perception client.

And S5, extracting a flow interaction index set of the served terminal and the perception client terminal, wherein the flow interaction index set comprises flow interaction index values and flow interaction time, and sequencing the flow interaction index set based on the flow interaction time to obtain a time flow index set.

It should be explained that the served end actively seeks to establish a traffic connection with the sensing client, so the risk coefficient is generally greater than that of the primary serving end, and thus another risk sensing method is adopted in the embodiment of the present invention. The traffic interaction index set is a series of indexes of a data transmission process between a served terminal and a sensing client terminal, and includes but is not limited to a TCP session establishment success number, a TCP session establishment failure number, an uplink data packet number, a downlink data packet number, an average packet sending length, an average packet receiving length, a served terminal port access number, a served terminal IP owned connection number, a served terminal RST packet receiving and sending number, a sensing client terminal RST packet receiving and sending number, a served terminal SYN packet receiving and sending number and the like.

It is understood that each index value corresponds to a time of occurrence, i.e. the traffic interaction time. For example, the average packet sending length of the served end was 20 bytes at 8 p.m. on August 10, 2022.

In the embodiment of the invention, each group of traffic interaction indexes is sorted in chronological order of traffic interaction time, giving a time traffic index set that includes the time. For the average packet sending length of the served end, for example: 20 bytes (8:00 p.m. on August 10, 2022), 25 bytes (8:10 p.m. on August 10, 2022), 500 bytes (8:20 p.m. on August 10, 2022), and so on.
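The ordering step amounts to a sort on the interaction time. In this minimal sketch the record layout `(timestamp, index name, value)`, the index name string and the timestamps are illustrative assumptions.

```python
from datetime import datetime


def to_time_traffic_index_set(records):
    """Sort (timestamp, index name, value) records by traffic interaction time."""
    return sorted(records, key=lambda record: record[0])


# Unordered illustrative records for one index of the served end.
records = [
    (datetime(2022, 8, 10, 20, 20), "avg_send_len_bytes", 500),
    (datetime(2022, 8, 10, 20, 0), "avg_send_len_bytes", 20),
    (datetime(2022, 8, 10, 20, 10), "avg_send_len_bytes", 25),
]
ordered = to_time_traffic_index_set(records)
values = [value for _, _, value in ordered]
```

Because `sorted` is stable, records that share a timestamp keep their original relative order.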

S6, inputting the time traffic index set into the pre-trained network security perception model to perform risk prediction, and obtaining a judgment of the network intrusion risk that the served end poses to the perception client.

It should be explained that the network security perception model is constructed by a deep learning network, and comprises seven layers of structures according to the network connection sequence, wherein the first layer of structure comprises 128 LSTM units, the second layer of structure comprises 1 dropout layer, the third layer of structure comprises 64 improved LSTM units, the fourth layer of structure comprises 1 dropout layer, the fifth layer of structure comprises 32 improved LSTM units, the sixth layer of structure comprises 1 dropout layer, and the seventh layer of structure comprises a classification layer.

It should be explained that LSTM (Long Short-Term Memory) is a recurrent neural network that extracts features from sequential data and makes predictions efficiently. In the embodiment of the present invention, the first layer of the network security perception model therefore consists of 128 LSTM units connected end to end. To prevent overfitting, the second layer is a dropout layer, which removes part of the model parameters so that the synergistic effect between time traffic index sets is weakened.

It should be noted that, according to the data characteristics of the time traffic indicator set, the embodiment of the present invention improves the LSTM unit to obtain an improved LSTM unit, and places the improved LSTM unit on the third layer and the fifth layer of the network security awareness model.

In detail, the improved LSTM unit includes:

the expression of the forget gate of the LSTM unit is replaced by an improved expression (the formula is not reproduced in this text), whose terms are: the improved forget-gate value at a given moment; the activation function of the forget gate; the weight matrix of the forget gate; the bias vector of the forget gate; the output value of the previous LSTM output gate; the time traffic index at that moment; the difference between the two groups of time traffic indexes at two moments; a preset difference bias; the total number of index types in the time traffic index set; and the weight of each index.

The embodiment of the invention refines the bias vector. In the original forget gate expression, the bias vector is a fixed value that can only be obtained through training, and it does not take into account the influence of differences between data sets. Because the flow indicators of the served end and the sensing client change frequently, introducing the difference between the two groups of time flow indicators at time $t$ and time $t-1$, together with the weight values, to adjust the bias vector can improve the prediction accuracy of the network security perception model.
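The improved formula itself is not reproduced in this text, so the following Python sketch only illustrates the idea under an assumed form: a standard forget gate $\sigma(W[h_{t-1}, x_t] + b)$ whose bias receives the additional term $\gamma \cdot \frac{1}{S}\sum_{j=1}^{S}\omega_j (x_{t,j} - x_{t-1,j})$. That additive form, the logistic activation and the function names are assumptions and should not be read as the formula of the embodiment.

```python
import math


def sigmoid(z):
    """Logistic activation, assumed here for the forget gate."""
    return 1.0 / (1.0 + math.exp(-z))


def improved_forget_gate(weights, bias, h_prev, x_t, x_prev,
                         index_weights, gamma):
    """Forget gate whose bias is shifted by the weighted indicator change.

    weights       : one row per unit, each of length len(h_prev) + len(x_t)
    bias          : one trained bias value per unit
    h_prev        : previous output h_{t-1}
    x_t, x_prev   : time flow indicators at time t and time t-1
    index_weights : weight value omega_j for each of the S index types
    gamma         : preset difference offset value
    """
    s = len(x_t)  # S, the total number of index types
    deltas = [a - b for a, b in zip(x_t, x_prev)]
    # ASSUMED adjustment: gamma times the weighted mean indicator change.
    adjustment = gamma * sum(w * d for w, d in zip(index_weights, deltas)) / s
    concat = list(h_prev) + list(x_t)
    return [
        sigmoid(sum(w * v for w, v in zip(row, concat)) + b + adjustment)
        for row, b in zip(weights, bias)
    ]
```

With identical indicators at both times the adjustment vanishes and the gate reduces to the standard forget gate, so the extra term only acts when the traffic indicators actually change.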

To solve the problems described in the background art, the invention receives a network security perception instruction, determines the perception client to be detected according to the network security perception instruction, and, according to a TCP connection rule, extracts the server actively connected with the perception client at the current moment to obtain a main service end and the server passively connected with the perception client to obtain a served end. The probability that the perception client suffers a network threat differs between an actively connected server and a passively connected server, and is generally smaller for an actively connected server than for a passively connected one. Therefore, the network security sensing method and device based on big data can solve the problem of low accuracy of network security threat prediction caused by applying a single fixed machine learning or deep learning model.
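The division into a main service end and a served end can be sketched in Python as follows. The sketch assumes that a message sent by the perception client with only the SYN flag set is a connection request (the client connects actively), and that a message with both SYN and ACK set is a connection confirmation (the client was connected to passively). This mapping, the record type `TcpMessage` and the function `classify_servers` are illustrative assumptions, not details taken from the description.

```python
from typing import NamedTuple


class TcpMessage(NamedTuple):
    """A TCP message sent by the perception client (hypothetical record)."""
    peer_ip: str  # address of the server on the other side
    syn: bool     # SYN flag of the message
    ack: bool     # ACK flag of the message


def classify_servers(client_messages):
    """Split peers into (main service end, served end) sets."""
    main_service_end, served_end = set(), set()
    for msg in client_messages:
        if msg.syn and not msg.ack:
            # Connection request: the client initiated the connection.
            main_service_end.add(msg.peer_ip)
        elif msg.syn and msg.ack:
            # Connection confirmation: the peer initiated the connection.
            served_end.add(msg.peer_ip)
        # Other messages carry no handshake information and are skipped.
    return main_service_end, served_end
```

For example, `classify_servers([TcpMessage("10.0.0.5", True, False), TcpMessage("10.0.0.9", True, True)])` places `10.0.0.5` in the main service end and `10.0.0.9` in the served end.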

Fig. 2 is a functional block diagram of a big data-based network security awareness apparatus according to an embodiment of the present invention.

The big data-based network security awareness apparatus 100 according to the present invention may be installed in an electronic device. According to the functions realized, the big data-based network security awareness apparatus 100 may include a server classification module 101, an eigenvalue solving module 102, a main service end risk judgment module 103, and a served end risk judgment module 104. A module of the present invention, which may also be referred to as a unit, refers to a series of computer program segments that are stored in a memory of the electronic device, can be executed by a processor of the electronic device, and perform a fixed function.

The server classification module 101 is configured to receive a network security sensing instruction, determine a sensing client to be detected according to the network security sensing instruction, and, according to a TCP connection rule, extract a server actively connected with the sensing client at the current moment to obtain a main service end and a server passively connected with the sensing client to obtain a served end;

the eigenvalue solving module 102 is configured to construct traffic matrices of the sensing client and the main service end, construct a traffic covariance matrix based on the traffic matrices, and solve an eigenvalue set of the traffic covariance matrix;

the main service end risk judgment module 103 is configured to select a sub-feature with importance greater than a specified importance threshold from the feature value set to obtain a sub-feature set, calculate a significant value of the sub-feature set to the feature value set, and judge that the main service end has a network invasion risk to the sensing client if the significant value is greater than the specified significant threshold; the calculation method of the significant value comprises the following steps:

wherein $T_a$ represents the significant value of the sub-feature set to the feature value set, $a$ is the number of features of the sub-feature set, $m$ is the number of features of the feature value set, and $F_t$ is a t-test or chi-square test;

the served end risk judgment module 104 is configured to extract a traffic interaction index set of the served end and the sensing client, where the traffic interaction index set includes traffic interaction index values and traffic interaction time, sort the traffic interaction index set based on the traffic interaction time to obtain a time traffic index set, input the time traffic index set to a pre-trained network security perception model to perform risk prediction, and obtain a network invasion risk judgment result of the served end to the sensing client, where the network security perception model is constructed by a deep learning network and includes seven layers according to a network sequence connection order, a first layer includes 128 LSTM units, a second layer includes 1 dropout layer, a third layer includes 64 improved LSTM units, a fourth layer includes 1 dropout layer, a fifth layer includes 32 improved LSTM units, a sixth layer includes 1 dropout layer, and the seventh layer includes a classification layer.
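The work of the eigenvalue solving module 102 can be sketched in plain Python. The covariance formula is not reproduced in this text, so the sketch assumes a traffic matrix with one row per acquisition and one column per acquisition period, centres each column, and divides by the number of acquisitions $n$. The eigenvalues are obtained with the classical Jacobi rotation method, which is a choice made here for a dependency-free example; the function names are hypothetical.

```python
import math


def covariance_matrix(x):
    """Covariance of the columns of x (rows: acquisitions, columns: periods).

    ASSUMPTION: columns are mean-centred and the sum is divided by n.
    """
    n, p = len(x), len(x[0])
    means = [sum(row[j] for row in x) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in x]
    return [[sum(r[i] * r[j] for r in centered) / n for j in range(p)]
            for i in range(p)]


def jacobi_eigenvalues(a, tol=1e-12, max_rotations=100):
    """Eigenvalues of a symmetric matrix, largest first (Jacobi method)."""
    n = len(a)
    a = [row[:] for row in a]  # work on a copy
    for _ in range(max_rotations):
        # Find the largest off-diagonal entry.
        p, q, largest = 0, 0, 0.0
        for i in range(n):
            for j in range(i + 1, n):
                if abs(a[i][j]) > largest:
                    p, q, largest = i, j, abs(a[i][j])
        if largest < tol:
            break
        # Rotation angle that zeroes a[p][q].
        theta = 0.5 * math.atan2(2.0 * a[p][q], a[q][q] - a[p][p])
        c, s = math.cos(theta), math.sin(theta)
        for k in range(n):  # update columns p and q
            akp, akq = a[k][p], a[k][q]
            a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
        for k in range(n):  # update rows p and q
            apk, aqk = a[p][k], a[q][k]
            a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


# Usage: three acquisitions in each of two acquisition periods.
traffic = [[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]]
eigenvalue_set = jacobi_eigenvalues(covariance_matrix(traffic))
```

In this small example the second column is exactly twice the first, so one eigenvalue is close to zero and the other carries all of the variance, which is the kind of concentration the sub-feature selection then exploits.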

In detail, when the modules in the big data-based network security awareness apparatus 100 in the embodiment of the present invention are used, the same technical means as the big data-based network security awareness method described in fig. 1 above are adopted, and the same technical effects can be produced, which is not described herein again.

Fig. 3 is a schematic structural diagram of an electronic device for implementing a big data-based network security awareness method according to an embodiment of the present invention.

The electronic device 1 may include a processor 10, a memory 11 and a bus 12, and may further include a computer program, such as a big data-based network security awareness method program, stored in the memory 11 and executable on the processor 10.

The memory 11 includes at least one type of readable storage medium, which includes flash memory, removable hard disk, multimedia card, card-type memory (e.g., SD or DX memory, etc.), magnetic memory, magnetic disk, optical disk, etc. The memory 11 may in some embodiments be an internal storage unit of the electronic device 1, such as a removable hard disk of the electronic device 1. The memory 11 may also be an external storage device of the electronic device 1 in other embodiments, such as a plug-in mobile hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like, which are provided on the electronic device 1. Further, the memory 11 may also include both an internal storage unit and an external storage device of the electronic device 1. The memory 11 may be used not only to store application software installed in the electronic device 1 and various types of data, such as codes of network security awareness method programs based on big data, but also to temporarily store data that has been output or will be output.

The processor 10 may be composed of an integrated circuit in some embodiments, for example, a single packaged integrated circuit, or may be composed of a plurality of integrated circuits packaged with the same or different functions, including one or more Central Processing Units (CPUs), microprocessors, digital Processing chips, graphics processors, and combinations of various control chips. The processor 10 is a Control Unit (Control Unit) of the electronic device, connects various components of the electronic device by using various interfaces and lines, and executes various functions and processes data of the electronic device 1 by running or executing programs or modules (e.g., network security aware method programs based on big data, etc.) stored in the memory 11 and calling data stored in the memory 11.

The bus 12 may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus 12 may be divided into an address bus, a data bus, a control bus, etc. The bus 12 is arranged to enable connection communication between the memory 11 and at least one processor 10 or the like.

Fig. 3 shows only an electronic device with components, and it will be understood by those skilled in the art that the structure shown in fig. 3 does not constitute a limitation of the electronic device 1, and may comprise fewer or more components than those shown, or some components may be combined, or a different arrangement of components.

For example, although not shown, the electronic device 1 may further include a power supply (such as a battery) for supplying power to each component, and preferably, the power supply may be logically connected to the at least one processor 10 through a power management device, so as to implement functions of charge management, discharge management, power consumption management, and the like through the power management device. The power supply may also include any component of one or more dc or ac power sources, recharging devices, power failure detection circuitry, power converters or inverters, power status indicators, and the like. The electronic device 1 may further include various sensors, a bluetooth module, a Wi-Fi module, and the like, which are not described herein again.

Further, the electronic device 1 may further include a network interface, and optionally, the network interface may include a wired interface and/or a wireless interface (such as a WI-FI interface, a bluetooth interface, etc.), which are generally used to establish a communication connection between the electronic device 1 and another electronic device.

Optionally, the electronic device 1 may further comprise a user interface, which may be a Display (Display), an input unit (such as a Keyboard), and optionally a standard wired interface, a wireless interface. Alternatively, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display, which may also be referred to as a display screen or display unit, is suitable for displaying information processed in the electronic device 1 and for displaying a visualized user interface, among other things.

It is to be understood that the embodiments described are illustrative only and are not to be construed as limiting the scope of the claims.

The big data-based network security awareness method program stored in the memory 11 of the electronic device 1 is a combination of a plurality of instructions, and when running in the processor 10, the method can realize:

selecting sub-features with the importance greater than a specified importance threshold value from the feature value set to obtain a sub-feature set, calculating a significant value of the sub-feature set to the feature value set, and if the significant value is greater than the specified significant threshold value, judging that the main service side has a network invasion risk to the perception client side; the calculation method of the significant value comprises the following steps:

wherein $T_a$ represents the significant value of the sub-feature set to the feature value set, $a$ is the number of features of the sub-feature set, $m$ is the number of features of the feature value set, and $F_t$ is a t-test or chi-square test;

extracting a flow interaction index set of the served terminal and the perception client terminal, wherein the flow interaction index set comprises flow interaction index values and flow interaction time, and sequencing the flow interaction index set based on the flow interaction time to obtain a time flow index set;

Specifically, the specific implementation method of the processor 10 for the instruction may refer to the description of the relevant steps in the embodiments corresponding to fig. 1 to fig. 3, which is not repeated herein.
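The step of sorting the flow interaction index set by flow interaction time can be sketched in Python as follows. The record type `Interaction`, its field names and the use of a numeric timestamp are assumptions made for illustration; the description only states that each entry has index values and an interaction time.

```python
from typing import NamedTuple, Tuple


class Interaction(NamedTuple):
    """One flow interaction record (hypothetical layout)."""
    timestamp: float            # flow interaction time
    values: Tuple[float, ...]   # flow interaction index values


def to_time_flow_index_set(interactions):
    """Order the index values by interaction time, earliest first."""
    ordered = sorted(interactions, key=lambda rec: rec.timestamp)
    return [rec.values for rec in ordered]


# Usage: records arrive out of order and are arranged chronologically.
records = [
    Interaction(30.0, (5.0, 1.0)),
    Interaction(10.0, (2.0, 0.0)),
    Interaction(20.0, (4.0, 3.0)),
]
time_flow_index_set = to_time_flow_index_set(records)
```

The resulting sequence is what a recurrent model such as the LSTM-based perception model consumes one time step at a time, so a stable chronological order matters more than the original arrival order.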

Further, the integrated modules/units of the electronic device 1, if implemented in the form of software functional units and sold or used as separate products, may be stored in a computer readable storage medium. The computer readable storage medium may be volatile or non-volatile. For example, the computer-readable medium may include: any entity or device capable of carrying said computer program code, a recording medium, a usb-disk, a removable hard disk, a magnetic diskette, an optical disk, a computer Memory, a Read-Only Memory (ROM).

The present invention also provides a computer-readable storage medium, storing a computer program which, when executed by a processor of an electronic device, may implement:

wherein $T_a$ represents the significant value of the sub-feature set to the feature value set, $a$ is the number of features of the sub-feature set, $m$ is the number of features of the feature value set, and $F_t$ is a t-test or chi-square test;

In the embodiments provided in the present invention, it should be understood that the disclosed apparatus, device and method can be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is only one logical functional division, and other divisions may be realized in practice.

The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one position, or may be distributed on multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.

In addition, functional modules in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional module.

It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof.

The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.

Furthermore, it is obvious that the word "comprising" does not exclude other elements or steps, and the singular does not exclude the plural. A plurality of units or means recited in the system claims may also be implemented by one unit or means through software or hardware. Terms such as first and second are used to denote names and do not indicate any particular order.

Finally, it should be noted that the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting, and although the present invention is described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention.

Claims

1. A big data-based network security awareness method is characterized by comprising the following steps:

according to a TCP connection rule, extracting a server actively connected with the sensing client at the current moment to obtain a main service end and a server passively connected with the sensing client to obtain a served end;

constructing flow matrixes of the perception client and the main service end, constructing a flow covariance matrix based on the flow matrixes and solving an eigenvalue set of the flow covariance matrix, wherein the constructing the flow covariance matrix based on the flow matrixes and solving the eigenvalue set of the flow covariance matrix comprises the following steps:

wherein the result denotes the traffic covariance matrix of the traffic matrix $X$, $X^{T}$ is the transposed matrix of $X$, and $n$ is the number of traffic acquisitions, in each acquisition period, of the traffic transmitted between the sensing client and the main service end when the traffic matrix is constructed;

wherein $\lambda$ is the eigenvalue set, $E$ is the identity matrix, and $y$ is an eigenvector of the traffic covariance matrix;

selecting sub-features with the importance greater than a specified importance threshold value from the feature value set to obtain a sub-feature set, calculating the significant value of the sub-feature set to the feature value set, and if the significant value is greater than the specified significant threshold value, judging that the main service end has a network invasion risk to the perception client; the calculation method of the significant value comprises the following steps:

wherein $T_a$ represents the significant value of the sub-feature set to the feature value set, $a$ is the number of features of the sub-feature set, $m$ is the number of features of the feature value set, and $F_t$ is a t-test or chi-square test;

inputting the time flow index set into a pre-trained network security perception model to execute risk prediction, and obtaining a network invasion risk judgment result of a served terminal to a perception client, wherein the network security perception model is constructed by a deep learning network and comprises seven layers of structures according to the network connection sequence, the first layer of structure is 128 LSTM units, the second layer of structure is 1 dropout layer, the third layer of structure is 64 improved LSTM units, the fourth layer of structure is 1 dropout layer, the fifth layer of structure is 32 improved LSTM units, the sixth layer of structure is 1 dropout layer, and the seventh layer of structure is a classification layer, wherein the improved LSTM units comprise:

wherein $f_t$ is the formula of the forget gate at time $t$, $\sigma_a$ is the activation function of the forget gate, $e_f$ is the weight matrix of the forget gate, $d_f$ is the bias vector of the forget gate, $h_{t-1}$ is the output value of the previous LSTM output gate, $x_t$ is the time flow indicator at time $t$, the difference term is the difference between the two groups of time flow indicators at time $t$ and time $t-1$, $\gamma$ is a preset difference offset value, $S$ is the total number of index types of the time flow index set, and $\omega_j$ is the weight value of the $j$-th index.

2. The big data-based network security awareness method according to claim 1, wherein the constructing of the flow matrixes of the perception client and the main service end comprises:

acquiring IP addresses of the perception client and the main service terminal;

wherein $X$ is the traffic matrix, $X_p$ represents the unit matrix holding the traffic of the $p$-th acquisition period, and $x_{np}$ represents the traffic value of the $n$-th traffic acquisition performed on the traffic link in the $p$-th acquisition period.

3. The big-data-based network security awareness method according to claim 2, wherein the selecting the sub-features with importance greater than a specified importance threshold from the feature value set to obtain a sub-feature set comprises:

wherein $\eta_b^s$ represents the importance score of the $s$-th feature set to be selected, $b$ is the number of features of the $s$-th feature set to be selected, $i$ is the index of each feature, $m$ is the number of features of the feature value set, and $\lambda_i$ represents the $i$-th feature in the feature value set;

4. The big data-based network security awareness method according to claim 3, wherein the constructing different feature sets to be selected according to the feature value sets comprises:

5. The big data-based network security awareness method according to claim 4, wherein the constructing different feature sets to be selected according to the feature value sets comprises:

6. The big-data-based network security awareness method according to claim 1, wherein the traffic interaction index set includes a TCP session establishment success number, a TCP session establishment failure number, an uplink data packet number, a downlink data packet number, an average packet sending length, an average packet receiving length, a port access number of a served terminal, a connection number owned by a server IP, a RST packet receiving and sending number of served terminals, a RST packet receiving and sending number of sensing clients, and a SYN packet receiving and sending number of served terminals.

7. The big data-based network security awareness method according to claim 1, wherein the extracting, according to the TCP connection rule, of the server actively connected with the sensing client at the current moment to obtain the main service end and of the server passively connected with the sensing client to obtain the served end comprises:

querying TCP messages of a sensing client and all service terminals at the current moment, and judging whether each TCP message in the sensing client requests a connection type or confirms the connection type;

8. The big data-based network security awareness method as claimed in claim 2, wherein each collection period is 24 hours.

9. An apparatus for sensing network security based on big data, the apparatus comprising:

the eigenvalue solving module is used for constructing a flow matrix of the sensing client and the main service end, constructing a flow covariance matrix based on the flow matrix and solving an eigenvalue set of the flow covariance matrix, wherein the constructing the flow covariance matrix based on the flow matrix and solving the eigenvalue set of the flow covariance matrix comprises the following steps:

wherein the result denotes the traffic covariance matrix of the traffic matrix $X$, $X^{T}$ is the transposed matrix of $X$, and $n$ is the number of traffic acquisitions, in each acquisition period, of the traffic transmitted between the sensing client and the main service end when the traffic matrix is constructed;

the main service end risk judgment module is used for selecting the sub-features with the importance greater than a specified importance threshold value from the feature value set to obtain a sub-feature set, calculating the significant value of the sub-feature set to the feature value set, and judging that the main service end has a network invasion risk to the perception client if the significant value is greater than the specified significant threshold value; the calculation method of the significant value comprises the following steps:

a served end risk judgment module, configured to extract a traffic interaction index set of the served end and a sensing client, where the traffic interaction index set includes traffic interaction index values and traffic interaction time, sort the traffic interaction index set based on the traffic interaction time to obtain a time traffic index set, input the time traffic index set to a pre-trained network security awareness model to perform risk prediction, and obtain a network invasion risk judgment result of the served end to the sensing client, where the network security awareness model is constructed by a deep learning network and includes seven layers according to a network connection order, a first layer includes 128 LSTM units, a second layer includes 1 dropout layer, a third layer includes 64 improved LSTM units, a fourth layer includes 1 dropout layer, a fifth layer includes 32 improved LSTM units, a sixth layer includes 1 dropout layer, and a seventh layer includes a classification layer, where the improved LSTM units include:

the original expression of the forgetting gate of the LSTM unit is replaced by the following improved formula:

wherein $f_t$ is the formula of the forget gate at time $t$, $\sigma_a$ is the activation function of the forget gate, $e_f$ is the weight matrix of the forget gate, $d_f$ is the bias vector of the forget gate, $h_{t-1}$ is the output value of the previous LSTM output gate, $x_t$ is the time flow indicator at time $t$, the difference term is the difference between the two groups of time flow indicators at time $t$ and time $t-1$, $\gamma$ is a preset difference offset value, $S$ is the total number of index types of the time flow index set, and $\omega_j$ is the weight value of the $j$-th index.