CN115580486B - Network security sensing method and device based on big data - Google Patents

Publication number
CN115580486B
Authority
CN
China
Prior art keywords
flow
feature
value
client
matrix
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211449597.9A
Other languages
Chinese (zh)
Other versions
CN115580486A (en)
Inventor
项翔翔
蒋行健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ningbo Zhenhai Big Data Investment Development Co ltd
Original Assignee
Ningbo Zhenhai Big Data Investment Development Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ningbo Zhenhai Big Data Investment Development Co ltd filed Critical Ningbo Zhenhai Big Data Investment Development Co ltd
Priority to CN202211449597.9A priority Critical patent/CN115580486B/en
Publication of CN115580486A publication Critical patent/CN115580486A/en
Application granted granted Critical
Publication of CN115580486B publication Critical patent/CN115580486B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00: Network architectures or network communication protocols for network security
    • H04L63/14: Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408: Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00: Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/16: Implementation or adaptation of Internet protocol [IP], of transmission control protocol [TCP] or of user datagram protocol [UDP]
    • H04L69/163: In-band adaptation of TCP data exchange; In-band control procedures
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00: Reducing energy consumption in communication networks
    • Y02D30/50: Reducing energy consumption in communication networks in wire-line communication networks, e.g. low power modes or reduced link rate

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention relates to the technical field of information security detection, and in particular to a network security sensing method and device based on big data. The method comprises: constructing a flow matrix of a perception client and a main service end; constructing a flow covariance matrix based on the flow matrix; solving a feature value set and a sub-feature set of the flow covariance matrix; calculating a significant value of the sub-feature set relative to the feature value set and, if the significant value is larger than a specified significance threshold, judging that the main service end poses a network invasion risk to the perception client; extracting a time flow index set of a served end and the perception client; and inputting the time flow index set into a pre-trained network security perception model to execute risk prediction, obtaining a network invasion risk judgment result of the served end with respect to the perception client. The method can solve the problem of low accuracy of network security threat prediction caused by the current practice of applying a machine learning or deep learning model in a fixed (solidified) manner.

Description

Network security sensing method and device based on big data
Technical Field
The invention relates to the technical field of information security, in particular to a network security sensing method and device based on big data.
Background
Network security (cyber security) means that the hardware, software, and data of a network system are protected from damage, alteration, and leakage due to accidental or malicious causes, so that the system operates continuously, reliably, and normally and the network service is not interrupted.
Different network security perception methods have different emphases. A popular approach is to predict in advance whether a client faces a security risk by monitoring its traffic interactions. At present, mainstream traffic interaction monitoring methods collect traffic interaction index data between a client and a server, and then apply machine learning or deep learning to that data to judge whether the server poses a network threat to the client.
Although such methods can realize network security perception, they do not consider whether the client is actively or passively connected to the server, and they execute risk judgment with a fixed (solidified) machine learning or deep learning model, so the accuracy of network security perception is low.
Disclosure of Invention
The invention provides a big data-based network security sensing method and device, and mainly aims to solve the problem of low accuracy of network security threat prediction caused by the current practice of applying a machine learning or deep learning model in a fixed (solidified) manner.
In order to achieve the above object, the present invention provides a big data-based network security sensing method, which includes:
receiving a network security perception instruction, and determining a perception client to be detected according to the network security perception instruction;
according to a TCP connection rule, extracting the servers to which the sensing client is actively connected at the current moment to obtain main service ends, and the servers to which it is passively connected to obtain served ends;
establishing flow matrixes of the perception client and the main service terminal, establishing a flow covariance matrix based on the flow matrixes and solving a characteristic value set of the flow covariance matrix;
selecting sub-features with the importance greater than a specified importance threshold value from the feature value set to obtain a sub-feature set, calculating a significant value of the sub-feature set to the feature value set, and if the significant value is greater than the specified significant threshold value, judging that the main service side has a network invasion risk to the perception client side;
extracting a flow interaction index set of the served terminal and the sensing client terminal, wherein the flow interaction index set comprises flow interaction index values and flow interaction time, and sequencing the flow interaction index set based on the flow interaction time to obtain a time flow index set;
inputting the time flow index set into a pre-trained network security perception model to execute risk prediction, and obtaining a network invasion risk judgment result of a served side to a perception client side, wherein the network security perception model is constructed by a deep learning network and comprises seven layers of structures according to the network connection sequence, the first layer of structure is 128 LSTM units, the second layer of structure is 1 dropout layer, the third layer of structure is 64 improved LSTM units, the fourth layer of structure is 1 dropout layer, the fifth layer of structure is 32 improved LSTM units, the sixth layer of structure is 1 dropout layer, and the seventh layer of structure is a classification layer.
Optionally, the constructing a traffic matrix of the aware client and the primary service end includes:
acquiring IP addresses of the perception client and the main service terminal;
establishing a flow link by taking the IP address of the sensing client as a starting point and the IP address of the main service end as an end point;
setting an acquisition period for acquiring the flow link, and acquiring a flow value of the flow link according to the acquisition period;
correspondingly arranging each flow value according to an acquisition cycle to obtain the flow matrix, wherein the flow matrix is as follows:
$$X = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}, \qquad x_i = \begin{bmatrix} x_{i1} & x_{i2} & \cdots & x_{im} \end{bmatrix}$$
wherein $X$ is the flow matrix, $x_i$ denotes the flow unit matrix of the $i$-th acquisition cycle (with $n$ acquisition cycles in total), and $x_{im}$ denotes the flow value of the $m$-th flow collection performed on the flow link in the $i$-th acquisition cycle.
Optionally, the selecting the sub-features with importance greater than a specified importance threshold from the feature value set to obtain a sub-feature set includes:
constructing different feature sets to be selected according to the feature value sets;
calculating the importance score of each group of feature sets to be selected according to an importance calculation formula, wherein the importance calculation formula is as follows:
[Formula image: importance calculation formula; not reproduced in the extracted text]
wherein the symbols of the formula denote, respectively: the importance score of each feature set to be selected (indexed by its set number); the number of features in that feature set to be selected; the number (index) of each feature; and the number of features in the feature value set;
and extracting the feature set to be selected with the importance score larger than the specified importance threshold, repeatedly extracting each feature from the feature set to be selected with the importance score larger than the specified importance threshold, and combining to obtain the sub-feature set.
Optionally, the constructing a traffic covariance matrix based on the traffic matrix and solving an eigenvalue set of the traffic covariance matrix includes:
solving a transposed matrix of the flow matrix, and constructing a flow covariance matrix based on the flow matrix and the transposed matrix, wherein the flow covariance matrix is as follows:
$$C = \frac{1}{m} X X^{T}$$
wherein $C$ represents the flow covariance matrix of the flow matrix $X$, $X^{T}$ is the transposed matrix, and $m$ is the number of flow transmissions between the perception client and the main service end in each acquisition period when the flow matrix is constructed;
constructing a characteristic equation of the flow covariance matrix, and solving the characteristic equation to obtain a characteristic value set, wherein the characteristic equation is as follows:
$$(C - \lambda E)\,v = 0$$
wherein $\lambda$ is the characteristic value set, $E$ is the unit diagonal matrix, and $v$ is the eigenvector of the flow covariance matrix.
Optionally, the constructing different feature sets to be selected according to the feature value set includes:
receiving a preset set feature minimum value and a preset set feature maximum value;
and selecting features from the feature value set without repetition, such that the total number of selected features is greater than or equal to the set feature minimum value and less than or equal to the set feature maximum value, to obtain different feature sets to be selected.
Optionally, the traffic interaction index set includes a TCP session establishment success number, a TCP session establishment failure number, an uplink data packet number, a downlink data packet number, an average packet sending length, an average packet receiving length, a port access number of a served terminal, a connection number owned by a server IP, a RST packet receiving and sending number of a served terminal, a RST packet receiving and sending number of a sensing client, and a SYN packet receiving and sending number of a served terminal.
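As a minimal sketch of how a flow interaction index set could be ordered into a time flow index set, the example below sorts records by their flow interaction time. The `FlowRecord` layout, the function name, and the index keys are hypothetical names chosen for illustration; the source only states that the set holds index values and interaction times and is sorted by time.

```python
from dataclasses import dataclass


@dataclass
class FlowRecord:
    """One flow interaction sample (hypothetical record layout)."""
    time: float    # flow interaction time, e.g. seconds since some origin
    indices: dict  # index name -> flow interaction index value


def to_time_flow_index_set(records):
    """Return the records ordered by flow interaction time, oldest first."""
    return sorted(records, key=lambda r: r.time)


# Hypothetical samples using two of the listed index types.
records = [
    FlowRecord(time=30.0, indices={"tcp_session_ok": 4, "tcp_session_failed": 1}),
    FlowRecord(time=10.0, indices={"tcp_session_ok": 2, "tcp_session_failed": 0}),
    FlowRecord(time=20.0, indices={"tcp_session_ok": 3, "tcp_session_failed": 0}),
]
time_flow_index_set = to_time_flow_index_set(records)
```

Sorting with a key function leaves the input list untouched, so the unsorted index set remains available if it is needed elsewhere.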
Optionally, the modified LSTM unit comprises:
the original expression of the forgetting gate of the LSTM unit is replaced by the following improved expression:
[Formula image: improved forgetting gate expression; not reproduced in the extracted text]
wherein the symbols of the formula denote, respectively: the improved forgetting gate value at a given time; the activation function of the forgetting gate; the weight matrix of the forgetting gate; the bias vector of the forgetting gate; the output value of the previous LSTM output gate; the time flow index at that time; the difference between the two groups of time flow indexes at that time and at the comparison time; a preset difference bias; the total number of index types in the time flow index set; and the weight value of each index.
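For orientation, the sketch below computes the standard (unmodified) LSTM forgetting gate, f_t = sigmoid(W_f · [h_{t-1}, x_t] + b_f), which is the expression the improved unit replaces. It is not the patent's improved expression: the weighted index-difference term described above is not recoverable from the extracted text and is deliberately left out. The function names and toy values are illustrative assumptions.

```python
import math


def sigmoid(z):
    """Logistic activation, the usual choice for LSTM gates."""
    return 1.0 / (1.0 + math.exp(-z))


def forget_gate(W_f, b_f, h_prev, x_t):
    """Standard LSTM forgetting gate: sigmoid(W_f . [h_prev, x_t] + b_f).

    W_f is a list of rows, one per hidden unit; each row has
    len(h_prev) + len(x_t) weights. The patent's improved gate adds a
    term built from differences of time flow indexes, omitted here.
    """
    concat = list(h_prev) + list(x_t)
    return [
        sigmoid(sum(w * v for w, v in zip(row, concat)) + b)
        for row, b in zip(W_f, b_f)
    ]


# Toy example: 2 hidden units, 1 input feature, all-zero parameters.
h_prev = [0.0, 0.0]
x_t = [1.0]
W_f = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
b_f = [0.0, 0.0]
f_t = forget_gate(W_f, b_f, h_prev, x_t)
```

With zero weights and biases every gate value is sigmoid(0) = 0.5, meaning half of the previous cell state would be retained.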
Optionally, the extracting, according to the TCP connection rule, a server that the sensing client is actively connected to at the current time to obtain a main server and a server that the sensing client is passively connected to obtain a served server includes:
inquiring TCP messages of a sensing client and all service terminals at the current moment, and judging whether each TCP message in the sensing client is a request connection type or a confirmation connection type;
when the TCP message is in a request connection type, confirming that the corresponding service end is a main service end according to a request destination address of the TCP message;
and when the TCP message is of the confirmed connection type, confirming that the corresponding server is the served terminal according to the confirmed destination address of the TCP message.
Optionally, the calculation method of the significant value is:
[Formula image: significant value calculation formula; not reproduced in the extracted text]
wherein the symbols of the formula denote, respectively: the significant value of the sub-feature set relative to the feature value set; the number of features in the sub-feature set; the number of features in the feature value set; and the test function, which is a T-test or a chi-square test.
In order to achieve the above object, the present invention further provides a big data based network security sensing apparatus, including:
the server classification module is used for receiving a network security perception instruction, determining a perception client to be detected according to the network security perception instruction, and extracting a server actively connected with the perception client at the current moment to obtain a main server and a server passively connected with the perception client to obtain a served server according to a TCP (transmission control protocol) connection rule;
the eigenvalue solving module is used for constructing flow matrixes of the sensing client and the main service end, constructing a flow covariance matrix based on the flow matrixes and solving an eigenvalue set of the flow covariance matrix;
the main service end risk judgment module is used for selecting the sub-features with the importance greater than a specified important threshold value from the feature value set to obtain a sub-feature set, calculating the significant value of the sub-feature set to the feature value set, and if the significant value is greater than the specified significant threshold value, judging that the main service end has network invasion risk to the perception client;
the risk judgment module of the served terminal is used for extracting a flow interaction index set of the served terminal and the sensing client, wherein the flow interaction index set comprises flow interaction index values and flow interaction time, the flow interaction index set is sorted based on the flow interaction time to obtain a time flow index set, the time flow index set is input into a network security perception model which is trained in advance to execute risk prediction, and a network invasion risk judgment result of the served terminal to the sensing client is obtained, wherein the network security perception model is constructed by a deep learning network and comprises seven layers according to a network sequence connection sequence, the first layer structure comprises 128 LSTM units, the second layer structure comprises 1 dropout layer, the third layer structure comprises 64 improved LSTM units, the fourth layer structure comprises 1 dropout layer, the fifth layer structure comprises 32 improved LSTM units, the sixth layer structure comprises 1 dropout layer, and the seventh layer structure comprises a classification layer.
In order to solve the above problem, the present invention also provides an electronic device, including:
a memory storing at least one instruction; and the processor executes the instructions stored in the memory to realize the big data-based network security perception method.
In order to solve the above problem, the present invention further provides a computer-readable storage medium, where at least one instruction is stored, and the at least one instruction is executed by a processor in an electronic device to implement the big data based network security awareness method described above.
To solve the problems in the background art, a network security perception instruction is received and a perception client to be detected is determined according to it; then, according to a TCP connection rule, the servers to which the perception client is actively connected at the current moment are extracted as main service ends, and the servers to which it is passively connected are extracted as served ends. The probability that the perception client suffers a network threat differs between actively connected and passively connected servers, and is generally smaller for actively connected servers than for passively connected ones. Therefore, the network security sensing method and device based on big data can solve the problem of low accuracy of network security threat prediction caused by using a machine learning or deep learning model in a fixed (solidified) manner.
Drawings
Fig. 1 is a schematic flowchart of a big data-based network security awareness method according to an embodiment of the present invention;
Fig. 2 is a functional block diagram of a big data-based network security awareness apparatus according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of an electronic device implementing the big data-based network security awareness method according to an embodiment of the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The embodiment of the application provides a network security perception method based on big data. The execution subject of the big data based network security awareness method includes, but is not limited to, at least one of electronic devices such as a server and a terminal that can be configured to execute the method provided by the embodiments of the present application. In other words, the big data based network security awareness method may be performed by software or hardware installed in a terminal device or a server device, and the software may be a block chain platform. The server includes but is not limited to: a single server, a server cluster, a cloud server or a cloud server cluster, and the like.
Fig. 1 is a schematic flow chart of a big data-based network security awareness method according to an embodiment of the present invention. In this embodiment, the method for sensing network security based on big data includes:
S1, receiving a network security perception instruction, and determining a perception client to be detected according to the network security perception instruction;
In the embodiment of the invention, the network security perception instruction can be sent by a network administrator or by a user of the perception client. For example, Zhang San uses a laptop to develop software, and the laptop contains important commercially confidential programs. To prevent important information from being lost or stolen through hacker intrusion or virus infection, Zhang San generates a network security perception instruction at startup by clicking a network security perception button pre-installed in the laptop interface. Understandably, the laptop is the perception client to be detected.
S2, according to a TCP (Transmission Control Protocol) connection rule, extracting the servers to which the sensing client is actively connected at the current moment to obtain main service ends, and the servers to which it is passively connected to obtain served ends;
It should be explained that the embodiment of the present invention considers that actively connecting to other servers and being passively connected to other servers carry different levels of network risk for the sensing client. Generally, an active connection of the sensing client arises from a user requirement, for example a user accessing a certain webpage or clicking a certain graphical interface button; a normal webpage or button poses no threat to the sensing client. However, if the user mistakenly clicks and accesses an illegal webpage, the illegal webpage forcibly establishes traffic transmission with the sensing client, and such forcibly established traffic transmission generally lasts for a long time and is abnormally active within a certain time period. Therefore, the embodiment of the present invention provides a rapid identification method based on the traffic characteristics of the server where the illegal webpage is located.
Further, the extracting, according to the TCP connection rule, a server that the sensing client is actively connected to at the current time to obtain a primary server and a server that the sensing client is passively connected to obtain a served server includes:
inquiring TCP messages of a sensing client and all service terminals at the current moment, and judging whether each TCP message in the sensing client is a request connection type or a confirmation connection type;
when the TCP message is in a request connection type, confirming that the corresponding service end is a main service end according to a request destination address of the TCP message;
and when the TCP message is of the confirmed connection type, confirming that the corresponding server is the served terminal according to the confirmed destination address of the TCP message.
Illustratively, traversing the sensing client at the current moment yields 5 TCP messages in total, of which 2 are of the request connection type and 3 are of the confirmation connection type, so that 2 main service ends and 3 served ends are obtained. The subsequent task of the embodiment of the present invention is to identify whether the 2 main service ends and the 3 served ends pose a network security threat to the sensing client.
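The classification rule above can be sketched as follows. This is a simplified illustration that assumes each TCP message has already been labelled as a connection request or a connection confirmation; how those labels map onto TCP flags (for example SYN versus SYN-ACK) is not stated in the source. The `TcpMessage` type, the `classify_servers` function, and the addresses are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TcpMessage:
    kind: str         # "request" (request connection) or "confirm" (confirm connection)
    destination: str  # destination address carried by the message


def classify_servers(messages):
    """Split destination addresses into main service ends and served ends."""
    main_service_ends, served_ends = [], []
    for msg in messages:
        if msg.kind == "request":
            # The client initiated the connection: actively connected server.
            main_service_ends.append(msg.destination)
        elif msg.kind == "confirm":
            # The client confirmed a connection: passively connected server.
            served_ends.append(msg.destination)
    return main_service_ends, served_ends


# Mirrors the example in the text: 2 request messages and 3 confirm messages.
messages = [
    TcpMessage("request", "192.0.2.10"),
    TcpMessage("confirm", "198.51.100.7"),
    TcpMessage("request", "192.0.2.11"),
    TcpMessage("confirm", "198.51.100.8"),
    TcpMessage("confirm", "198.51.100.9"),
]
main_ends, served = classify_servers(messages)
```

Messages of any other kind are ignored rather than guessed at, which keeps the two result lists limited to the cases the rule actually covers.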
S3, constructing flow matrixes of the perception client and the main service terminal, constructing a flow covariance matrix based on the flow matrixes, and solving a characteristic value set of the flow covariance matrix;
In detail, constructing the flow matrices of the perception client and the main service end includes:
acquiring IP addresses of the perception client and the main service terminal;
establishing a flow link by taking the IP address of the sensing client as a starting point and the IP address of the main service end as an end point;
setting an acquisition period for acquiring the flow link, and acquiring a flow value of the flow link according to the acquisition period;
correspondingly arranging each flow value according to an acquisition cycle to obtain the flow matrix, wherein the flow matrix is as follows:
$$X = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}, \qquad x_i = \begin{bmatrix} x_{i1} & x_{i2} & \cdots & x_{im} \end{bmatrix}$$
wherein $X$ is the flow matrix, $x_i$ denotes the flow unit matrix of the $i$-th acquisition cycle (with $n$ acquisition cycles in total), $x_{im}$ denotes the flow value of the $m$-th flow collection performed on the flow link in the $i$-th acquisition cycle, and $m$ is the number of flow transmissions between the perception client and the main service end in each acquisition period when the flow matrix is constructed.
Illustratively, the laptop used by Zhang San has 2 main service ends, so a flow link between the laptop and each main service end is established in turn, and an acquisition cycle is set. It should be emphasized that the acquisition cycle set by the embodiment of the present invention is 24 hours, that is, each day is one acquisition cycle. The more collections per day the better; the number of flow value collections on the flow link within 24 hours can be set to at least 10000.
Further, a flow value may be positive or negative: a positive flow value indicates that the sensing client pushes traffic to the main service end, and a negative flow value indicates that the sensing client receives traffic pushed by the main service end. For example, the flow unit matrix of one acquisition cycle may have the value [12, 0.1, 1.2, -67, -79, -0.3, 19, …, 11, 17.2].
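A minimal sketch of assembling such a flow matrix from collected samples is shown below, with one row per acquisition cycle and one column per collection within the cycle. The `(cycle, collection, value)` sample layout and the function name are assumptions made for illustration; the sign convention follows the text (positive for traffic pushed by the client, negative for traffic received).

```python
def build_flow_matrix(samples, n_cycles, m_collections):
    """Arrange signed flow values into an n_cycles x m_collections matrix.

    samples: iterable of (cycle_index, collection_index, flow_value),
    with 0-based indices (hypothetical input layout). Slots with no
    sample stay at 0.0.
    """
    matrix = [[0.0] * m_collections for _ in range(n_cycles)]
    for cycle, collection, value in samples:
        matrix[cycle][collection] = value
    return matrix


# Two acquisition cycles with three collections each (toy sizes; the
# text suggests at least 10000 collections per 24-hour cycle).
samples = [
    (0, 0, 12.0), (0, 1, 0.1), (0, 2, -67.0),
    (1, 0, 1.2), (1, 1, -79.0), (1, 2, 19.0),
]
flow_matrix = build_flow_matrix(samples, n_cycles=2, m_collections=3)
```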
In detail, the constructing a traffic covariance matrix based on the traffic matrix and solving an eigenvalue set of the traffic covariance matrix includes:
solving a transposed matrix of the flow matrix, and constructing a flow covariance matrix based on the flow matrix and the transposed matrix, wherein the flow covariance matrix is as follows:
$$C = \frac{1}{m} X X^{T}$$
wherein $C$ represents the flow covariance matrix of the flow matrix $X$, $X^{T}$ is the transposed matrix, and $m$ is the number of flow transmissions between the perception client and the main service end in each acquisition period when the flow matrix is constructed;
and constructing a characteristic equation of the flow covariance matrix, and solving the characteristic equation to obtain a characteristic value set, wherein the characteristic equation is as follows:
$$(C - \lambda E)\,v = 0$$
wherein $\lambda$ is the characteristic value set, $E$ is the unit diagonal matrix, and $v$ is the eigenvector of the flow covariance matrix.
In the embodiment of the present invention, solving eigenvalues from a characteristic equation is a well-known technique and is not described herein again.
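One possible way to carry out this step numerically is sketched below with NumPy. It assumes the covariance matrix takes the form C = X Xᵀ / m, where m is the number of collections per acquisition cycle; that form is inferred from the symbol definitions in the text (the formula image itself is not available), and no mean-centering is applied because the source does not mention it. The function name is hypothetical.

```python
import numpy as np


def flow_eigenvalues(flow_matrix):
    """Return (C, eigenvalues) with C = X @ X.T / m and eigenvalues descending.

    The form of C is an assumption inferred from the surrounding
    definitions, not a confirmed reproduction of the patent formula.
    """
    X = np.asarray(flow_matrix, dtype=float)
    m = X.shape[1]                       # collections per acquisition cycle
    C = X @ X.T / m                      # symmetric n x n matrix
    eigenvalues = np.linalg.eigvalsh(C)  # ascending order for symmetric input
    return C, eigenvalues[::-1]


# Toy 2 x 2 flow matrix: C = diag(1, 4) / 2 = diag(0.5, 2.0).
C, eigenvalues = flow_eigenvalues([[1.0, 0.0], [0.0, 2.0]])
```

`eigvalsh` is used rather than a general eigenvalue solver because C is symmetric by construction, which guarantees real eigenvalues.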
And S4, selecting the sub-features with the importance greater than a specified importance threshold value from the feature value set to obtain a sub-feature set, calculating the significance value of the sub-feature set to the feature value set, and if the significance value is greater than the specified significance threshold value, judging that the main service terminal has network invasion risk to the perception client terminal.
In detail, the selecting the sub-features with importance greater than a specified importance threshold from the feature value set to obtain a sub-feature set includes:
constructing different feature sets to be selected according to the feature value sets;
calculating the importance score of each group of feature sets to be selected according to an importance calculation formula, wherein the importance calculation formula is as follows:
[Formula image: importance calculation formula; not reproduced in the extracted text]
wherein the symbols of the formula denote, respectively: the importance score of each feature set to be selected (indexed by its set number); the number of features in that feature set to be selected; the number (index) of each feature; and the number of features in the feature value set;
and extracting the feature set to be selected with the importance score larger than the specified importance threshold, repeatedly extracting each feature from the feature set to be selected with the importance score larger than the specified importance threshold, and combining to obtain the sub-feature set.
Further, constructing different candidate feature sets from the feature value set includes:
receiving a preset set feature minimum value and set feature maximum value;
and selecting features from the feature value set without repetition, such that the total number of selected features is greater than or equal to the set feature minimum value and less than or equal to the set feature maximum value, to obtain the different candidate feature sets.
Illustratively, suppose the feature value set contains 10 groups of features and the set feature minimum and maximum values are 3 and 30, respectively. Different candidate feature sets are then obtained by extracting from the 10 groups of features without repetition, by permutation and combination. Because the feature values in each candidate feature set may differ, the importance score of each candidate feature set is calculated in turn according to the importance calculation formula, and the features of the sets whose importance score is greater than the specified importance threshold are extracted to construct the sub-feature set.
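The enumeration described above can be sketched as follows. This is a minimal illustration under stated assumptions: candidate sets are treated as unordered combinations, the upper bound is capped at the number of available features, and the importance score is passed in as a caller-supplied function `score_fn`, because the patent's importance formula is available only as an image. All function names here are hypothetical.

```python
from itertools import combinations


def candidate_feature_sets(features, min_size, max_size):
    """Yield every non-repeating selection whose size lies in [min_size, max_size]."""
    upper = min(max_size, len(features))  # cannot pick more features than exist
    for size in range(min_size, upper + 1):
        yield from combinations(features, size)


def build_sub_feature_set(features, min_size, max_size, score_fn, threshold):
    """Merge, without duplicates, the features of all candidates scoring above threshold."""
    selected = []
    for candidate in candidate_feature_sets(features, min_size, max_size):
        if score_fn(candidate) > threshold:
            for feature in candidate:
                if feature not in selected:  # de-duplicate while keeping order
                    selected.append(feature)
    return selected


# With 10 features, a minimum of 3 and a maximum of 30, sizes 3..10 are enumerated.
number_of_candidates = sum(1 for _ in candidate_feature_sets(range(10), 3, 30))
```

Because the number of candidates grows combinatorially with the number of features, a generator is used so that the candidate sets are never all held in memory at once.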
Further, the significance value is calculated as follows:
Figure 884965DEST_PATH_IMAGE035
wherein $T_a$ represents the significance value of the sub-feature set with respect to the feature value set, $a$ is the number of features of the sub-feature set, $m$ is the number of features of the feature value set, and $F_t$ is a T-test or chi-square test.
The T-test, also called Student's t-test, uses t-distribution theory to infer the probability that a difference occurs, and thereby determines whether the difference between two groups of data is significant. The chi-square test measures the degree of deviation between the actual observed values of a sample and the theoretically inferred values; this deviation determines the size of the chi-square value: the larger the chi-square value, the greater the deviation between the two, and the smaller the chi-square value, the smaller the deviation. The embodiment of the invention can use either the T-test or the chi-square test to measure the significance between the sub-feature set and the feature value set. Generally, 0.95 is used as the specified significance threshold: when the significance between the sub-feature set and the feature value set is greater than 0.95, the main service end is judged to pose a network invasion risk to the sensing client.
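For illustration, the chi-square statistic described above (the accumulated deviation between observed and theoretical values) can be computed as in the sketch below. It shows only the textbook Pearson statistic; how the patent converts the test result into the significance value compared against 0.95 is defined by its formula image and is not reproduced here. The function name and sample numbers are assumptions.

```python
def chi_square_statistic(observed, expected):
    """Pearson chi-square statistic: sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Larger deviation between observed and expected values gives a larger statistic.
small_deviation = chi_square_statistic([19, 20, 21], [20, 20, 20])
large_deviation = chi_square_statistic([10, 20, 30], [20, 20, 20])
```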
It should be explained why high significance between the sub-feature set and the feature value set indicates that the main service end poses a network invasion risk to the sensing client. The feature value set of the traffic matrix represents the traffic interaction process between the main service end and the sensing client. Under normal conditions, the sensing client actively connects to the main service end, generally to seek help from it, and such access is unordered and irregular. For example, if the main service end is a program-sharing web page, Zhang San accesses it when he has program bugs to solve during software development; or, if the main service end is a video-resource download page, Zhang San downloads the corresponding material from it when he needs to learn a development algorithm during software development. An eigenvalue represents the degree of the change frequency of the traffic matrix; specifically, the traffic matrix undergoes a constant rate of transformation in the direction indicated by the corresponding eigenvector. Therefore, high significance between the sub-feature set and the feature value set indicates that the traffic matrix is developing in a direction indicated by an eigenvector, for example the traffic changing by a fixed increment in each fixed acquisition period. In the context of active connection, the traffic interaction between the main service end and the sensing client should be unordered; when it instead shows regularity, the main service end may pose a security risk to the sensing client.
S5: Extract a flow interaction index set of the served end and the sensing client, where the flow interaction index set includes flow interaction index values and flow interaction times, and sort the flow interaction index set by flow interaction time to obtain a time flow index set.
It should be explained that the served end actively seeks to establish a traffic connection with the sensing client, so its risk coefficient is generally greater than that of the main service end; the embodiment of the invention therefore uses a different risk sensing method for it. The flow interaction index set is a series of indexes describing the data transmission process between the served end and the sensing client, including but not limited to: the number of successful TCP session establishments, the number of failed TCP session establishments, the number of uplink data packets, the number of downlink data packets, the average packet-sending length, the average packet-receiving length, the number of served-end port accesses, the number of connections held by the served-end IP, the number of RST packets sent and received by the served end, the number of RST packets sent and received by the sensing client, and the number of SYN packets sent and received by the served end.
It should be understood that each index has a corresponding time of occurrence, namely the flow interaction time. For example, the average packet-sending length of the served end was 20 Bytes at 8:00 p.m. on August 10, 2022.
In the embodiment of the invention, each group of flow interaction indexes is sorted in chronological order of its flow interaction times, yielding a time flow index set that includes the times, for example the average packet-sending length of the served end: 20 Bytes (8:00 p.m. on August 10, 2022), 25 Bytes (8:10 p.m. on August 10, 2022), 500 Bytes (8:20 p.m. on August 10, 2022), and so on.
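The sorting step can be sketched as follows: records are grouped by index name and each group is ordered by its flow interaction time. The record layout `(index_name, timestamp, value)`, the function name, and the sample timestamps are illustrative assumptions, not a data format specified by the patent.

```python
from collections import defaultdict
from datetime import datetime


def build_time_flow_index_set(records):
    """Group (index_name, timestamp, value) records and sort each group by time."""
    series = defaultdict(list)
    for name, timestamp, value in records:
        series[name].append((timestamp, value))
    return {name: sorted(points, key=lambda point: point[0])
            for name, points in series.items()}


# Unordered sample records for one index (values in bytes, times illustrative).
records = [
    ("avg_send_length", datetime(2022, 8, 10, 20, 20), 500),
    ("avg_send_length", datetime(2022, 8, 10, 20, 0), 20),
    ("avg_send_length", datetime(2022, 8, 10, 20, 10), 25),
]
time_flow_index_set = build_time_flow_index_set(records)
```

Keeping one time-ordered series per index means each series can later be aligned into the sequence input expected by a recurrent model.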
S6: Input the time flow index set into a pre-trained network security perception model to perform risk prediction, obtaining a judgment of the network invasion risk that the served end poses to the sensing client.
It should be explained that the network security perception model is constructed by a deep learning network, and comprises seven layers of structures according to the network connection sequence, wherein the first layer of structure comprises 128 LSTM units, the second layer of structure comprises 1 dropout layer, the third layer of structure comprises 64 improved LSTM units, the fourth layer of structure comprises 1 dropout layer, the fifth layer of structure comprises 32 improved LSTM units, the sixth layer of structure comprises 1 dropout layer, and the seventh layer of structure comprises a classification layer.
It should be explained that LSTM (Long Short-Term Memory) is a long short-term memory artificial neural network, a type of recurrent neural network that extracts features from sequential data and makes predictions efficiently. In the embodiment of the invention, the first layer of the network security perception model therefore consists of 128 LSTM units connected end to end. To prevent overfitting, the second layer is a dropout layer, which drops out part of the model parameters so that the co-adaptation among the time flow index sets is weakened.
It should be noted that, according to the data characteristics of the time traffic indicator set, the embodiment of the present invention improves the LSTM unit to obtain an improved LSTM unit, and places the improved LSTM unit on the third layer and the fifth layer of the network security awareness model.
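The seven-layer arrangement can be written down as a plain layer specification, as in the sketch below. It assumes that "128 LSTM units" denotes a recurrent layer with an output width of 128 (and likewise 64 and 32); the `Layer` type, the `output_widths` helper, and the input width and class count in the example are hypothetical and serve only to trace how the feature width changes through the stack.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    kind: str       # "lstm", "improved_lstm", "dropout" or "classifier"
    units: int = 0  # output width for recurrent layers; unused otherwise


SEVEN_LAYER_STACK = (
    Layer("lstm", 128),
    Layer("dropout"),
    Layer("improved_lstm", 64),
    Layer("dropout"),
    Layer("improved_lstm", 32),
    Layer("dropout"),
    Layer("classifier"),
)


def output_widths(stack, input_width, num_classes):
    """Trace the feature width after each layer of the stack."""
    widths, width = [], input_width
    for layer in stack:
        if layer.kind in ("lstm", "improved_lstm"):
            width = layer.units
        elif layer.kind == "classifier":
            width = num_classes
        # A dropout layer leaves the width unchanged.
        widths.append(width)
    return widths


# Example: 11 index types per time step, binary risk / no-risk output (assumed).
widths = output_widths(SEVEN_LAYER_STACK, input_width=11, num_classes=2)
```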
In detail, the improved LSTM unit includes:
the expression of the forget gate of the LSTM unit is replaced with the following improved formula:
Figure 275680DEST_PATH_IMAGE020
wherein $f_t$ is the improved forget gate expression at time $t$, $\sigma_a$ is the activation function of the forget gate, $e_f$ is the weight matrix of the forget gate, $d_f$ is the bias vector of the forget gate, $h_{t-1}$ is the output value of the previous LSTM output gate, $x_t$ is the time flow index at time $t$, the difference term (Figure 750874DEST_PATH_IMAGE029) is the difference between the two groups of time flow indexes at time $t$ and time $t-1$, $\gamma$ is a preset difference offset value, $S$ is the total number of index types of the time flow index set, and $\omega_j$ is the weight value of the $j$-th index.
The expression of the forgetting gate of the LSTM unit is replaced with the following modified formula:
Figure 750951DEST_PATH_IMAGE040
The embodiment of the invention refines the bias vector. In the original forget gate expression, the bias vector is a fixed value obtained only through training, which does not account for the influence of differences between data sets on the bias vector. Because the flow indexes between the served end and the sensing client change frequently, introducing the difference between the two groups of time flow indexes at time $t$ and time $t-1$, together with the weight values, to adjust the bias vector can improve the prediction accuracy of the network security perception model.
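For reference, the conventional LSTM forget gate, whose fixed bias the improvement targets, is commonly written as shown below. The notation $W_f$ and $b_f$ follows general LSTM usage and is not taken from the patent; in the patent's symbols, $W_f$ would correspond to $e_f$, $b_f$ to $d_f$, and $\sigma$ to $\sigma_a$. The patent's improved expression is available only as an image and is not reconstructed here.

```latex
f_t = \sigma\left( W_f \cdot [h_{t-1},\, x_t] + b_f \right)
```

In this conventional form $b_f$ is learned once during training and stays constant at inference time, which is the limitation the text above describes.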
To solve the problems described in the background art, the invention receives a network security perception instruction, determines the sensing client to be detected according to the instruction, and, according to a TCP connection rule, extracts the servers actively connected with the sensing client at the current moment to obtain the main service end and the servers passively connected with it to obtain the served end. The probability that the sensing client suffers a network threat differs between actively connected and passively connected servers, and is generally smaller for an actively connected server than for a passively connected one. Therefore, the network security sensing method and device based on big data can solve the problem of low network security threat prediction accuracy caused by applying a single, fixed machine learning or deep learning model.
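The classification of servers can be sketched as below. The sketch assumes that the "TCP connection rule" amounts to checking which side initiated the connection (sent the first SYN): servers the sensing client connected to become the main service end, and hosts that initiated a connection toward the sensing client become the served end. This reading follows the description's statements that the sensing client actively connects to the main service end and that the served end actively seeks the connection; the record type and function name are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TcpConnection:
    initiator_ip: str  # host that sent the first SYN
    responder_ip: str  # host that answered the SYN


def classify_servers(connections, client_ip):
    """Split the client's peers into (main_service_ends, served_ends)."""
    main_service_ends, served_ends = set(), set()
    for conn in connections:
        if conn.initiator_ip == client_ip:
            main_service_ends.add(conn.responder_ip)  # client connected out
        elif conn.responder_ip == client_ip:
            served_ends.add(conn.initiator_ip)        # peer connected in
    return main_service_ends, served_ends


# Illustrative addresses from the documentation ranges.
connections = [
    TcpConnection("192.0.2.10", "198.51.100.5"),
    TcpConnection("203.0.113.7", "192.0.2.10"),
    TcpConnection("198.51.100.9", "198.51.100.5"),  # unrelated to the client
]
main_ends, served_ends = classify_servers(connections, "192.0.2.10")
```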
Fig. 2 is a functional block diagram of a big data-based network security awareness apparatus according to an embodiment of the present invention.
The big data-based network security awareness apparatus 100 according to the present invention may be installed in an electronic device. According to the functions implemented, the big data-based network security awareness apparatus 100 may include a server classification module 101, an eigenvalue solving module 102, a main service end risk judgment module 103, and a served end risk judgment module 104. A module of the present invention, which may also be referred to as a unit, is a series of computer program segments that are stored in a memory of the electronic device, can be executed by a processor of the electronic device, and perform a fixed function.
The server classification module 101 is configured to receive a network security sensing instruction, determine a sensing client to be detected according to the network security sensing instruction, and extract a server actively connected to the sensing client at a current time to obtain a main server and a server passively connected to the sensing client to obtain a served server according to a TCP connection rule;
the eigenvalue solving module 102 is configured to construct traffic matrices of the sensing client and the main service end, construct a traffic covariance matrix based on the traffic matrices, and solve an eigenvalue set of the traffic covariance matrix;
the main service end risk judgment module 103 is configured to select a sub-feature with importance greater than a specified importance threshold from the feature value set to obtain a sub-feature set, calculate a significant value of the sub-feature set to the feature value set, and judge that the main service end has a network invasion risk to the sensing client if the significant value is greater than the specified significant threshold; the calculation method of the significant value comprises the following steps:
Figure 505783DEST_PATH_IMAGE035
wherein $T_a$ represents the significance value of the sub-feature set with respect to the feature value set, $a$ is the number of features of the sub-feature set, $m$ is the number of features of the feature value set, and $F_t$ is a T-test or chi-square test;
the served end risk judgment module 104 is configured to extract a traffic interaction index set of the served end and the sensing client, where the traffic interaction index set includes traffic interaction index values and traffic interaction time, sort the traffic interaction index set based on the traffic interaction time to obtain a time traffic index set, input the time traffic index set to a pre-trained network security perception model to perform risk prediction, and obtain a network invasion risk judgment result of the served end to the sensing client, where the network security perception model is constructed by a deep learning network and includes seven layers according to a network sequence connection order, a first layer includes 128 LSTM units, a second layer includes 1 dropout layer, a third layer includes 64 improved LSTM units, a fourth layer includes 1 dropout layer, a fifth layer includes 32 improved LSTM units, a sixth layer includes 1 dropout layer, and the seventh layer includes a classification layer.
In detail, when in use, the modules of the big data-based network security awareness apparatus 100 in the embodiment of the present invention adopt the same technical means as the big data-based network security awareness method described in Fig. 1 above and can produce the same technical effects, which are not described again here.
Fig. 3 is a schematic structural diagram of an electronic device for implementing a big data-based network security awareness method according to an embodiment of the present invention.
The electronic device 1 may include a processor 10, a memory 11 and a bus 12, and may further include a computer program, such as a big data-based network security awareness method program, stored in the memory 11 and executable on the processor 10.
The memory 11 includes at least one type of readable storage medium, which includes flash memory, removable hard disk, multimedia card, card-type memory (e.g., SD or DX memory, etc.), magnetic memory, magnetic disk, optical disk, etc. The memory 11 may in some embodiments be an internal storage unit of the electronic device 1, such as a removable hard disk of the electronic device 1. The memory 11 may also be an external storage device of the electronic device 1 in other embodiments, such as a plug-in mobile hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like, which are provided on the electronic device 1. Further, the memory 11 may also include both an internal storage unit and an external storage device of the electronic device 1. The memory 11 may be used not only to store application software installed in the electronic device 1 and various types of data, such as codes of network security awareness method programs based on big data, but also to temporarily store data that has been output or will be output.
The processor 10 may be composed of an integrated circuit in some embodiments, for example, a single packaged integrated circuit, or may be composed of a plurality of integrated circuits packaged with the same or different functions, including one or more Central Processing Units (CPUs), microprocessors, digital Processing chips, graphics processors, and combinations of various control chips. The processor 10 is a Control Unit (Control Unit) of the electronic device, connects various components of the electronic device by using various interfaces and lines, and executes various functions and processes data of the electronic device 1 by running or executing programs or modules (e.g., network security aware method programs based on big data, etc.) stored in the memory 11 and calling data stored in the memory 11.
The bus 12 may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus 12 may be divided into an address bus, a data bus, a control bus, etc. The bus 12 is arranged to enable connection communication between the memory 11 and at least one processor 10 or the like.
Fig. 3 shows only an electronic device with components, and it will be understood by those skilled in the art that the structure shown in fig. 3 does not constitute a limitation of the electronic device 1, and may comprise fewer or more components than those shown, or some components may be combined, or a different arrangement of components.
For example, although not shown, the electronic device 1 may further include a power supply (such as a battery) for supplying power to each component, and preferably, the power supply may be logically connected to the at least one processor 10 through a power management device, so as to implement functions of charge management, discharge management, power consumption management, and the like through the power management device. The power supply may also include any component of one or more dc or ac power sources, recharging devices, power failure detection circuitry, power converters or inverters, power status indicators, and the like. The electronic device 1 may further include various sensors, a bluetooth module, a Wi-Fi module, and the like, which are not described herein again.
Further, the electronic device 1 may further include a network interface, and optionally, the network interface may include a wired interface and/or a wireless interface (such as a WI-FI interface, a bluetooth interface, etc.), which are generally used to establish a communication connection between the electronic device 1 and another electronic device.
Optionally, the electronic device 1 may further comprise a user interface, which may be a Display (Display), an input unit (such as a Keyboard), and optionally a standard wired interface, a wireless interface. Alternatively, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display, which may also be referred to as a display screen or display unit, is suitable for displaying information processed in the electronic device 1 and for displaying a visualized user interface, among other things.
It is to be understood that the embodiments described are illustrative only and are not to be construed as limiting the scope of the claims.
The big data-based network security awareness method program stored in the memory 11 of the electronic device 1 is a combination of a plurality of instructions, and when running in the processor 10, the method can realize:
receiving a network security perception instruction, and determining a perception client to be detected according to the network security perception instruction;
according to a TCP connection rule, extracting a server which is actively connected with the sensing client at the current moment to obtain a main service end and a server which is passively connected to obtain a served end;
establishing flow matrixes of the perception client and the main service terminal, establishing a flow covariance matrix based on the flow matrixes and solving a characteristic value set of the flow covariance matrix;
selecting sub-features with the importance greater than a specified importance threshold value from the feature value set to obtain a sub-feature set, calculating a significant value of the sub-feature set to the feature value set, and if the significant value is greater than the specified significant threshold value, judging that the main service side has a network invasion risk to the perception client side; the calculation method of the significant value comprises the following steps:
Figure 178259DEST_PATH_IMAGE035
wherein $T_a$ represents the significance value of the sub-feature set with respect to the feature value set, $a$ is the number of features of the sub-feature set, $m$ is the number of features of the feature value set, and $F_t$ is a T-test or chi-square test;
extracting a flow interaction index set of the served terminal and the perception client terminal, wherein the flow interaction index set comprises flow interaction index values and flow interaction time, and sequencing the flow interaction index set based on the flow interaction time to obtain a time flow index set;
inputting the time flow index set into a pre-trained network security perception model to execute risk prediction, and obtaining a network invasion risk judgment result of a served side to a perception client side, wherein the network security perception model is constructed by a deep learning network and comprises seven layers of structures according to the network connection sequence, the first layer of structure is 128 LSTM units, the second layer of structure is 1 dropout layer, the third layer of structure is 64 improved LSTM units, the fourth layer of structure is 1 dropout layer, the fifth layer of structure is 32 improved LSTM units, the sixth layer of structure is 1 dropout layer, and the seventh layer of structure is a classification layer.
Specifically, the specific implementation method of the processor 10 for the instruction may refer to the description of the relevant steps in the embodiments corresponding to fig. 1 to fig. 3, which is not repeated herein.
Further, the integrated modules/units of the electronic device 1, if implemented in the form of software functional units and sold or used as separate products, may be stored in a computer readable storage medium. The computer readable storage medium may be volatile or non-volatile. For example, the computer-readable medium may include: any entity or device capable of carrying said computer program code, a recording medium, a usb-disk, a removable hard disk, a magnetic diskette, an optical disk, a computer Memory, a Read-Only Memory (ROM).
The present invention also provides a computer-readable storage medium, storing a computer program which, when executed by a processor of an electronic device, may implement:
receiving a network security perception instruction, and determining a perception client to be detected according to the network security perception instruction;
according to a TCP connection rule, extracting a server which is actively connected with the sensing client at the current moment to obtain a main service end and a server which is passively connected to obtain a served end;
establishing flow matrixes of the perception client and the main service terminal, establishing a flow covariance matrix based on the flow matrixes and solving a characteristic value set of the flow covariance matrix;
selecting sub-features with the importance greater than a specified importance threshold value from the feature value set to obtain a sub-feature set, calculating a significant value of the sub-feature set to the feature value set, and if the significant value is greater than the specified significant threshold value, judging that the main service side has a network invasion risk to the perception client side; the calculation method of the significant value comprises the following steps:
Figure 67551DEST_PATH_IMAGE035
wherein $T_a$ represents the significance value of the sub-feature set with respect to the feature value set, $a$ is the number of features of the sub-feature set, $m$ is the number of features of the feature value set, and $F_t$ is a T-test or chi-square test;
extracting a flow interaction index set of the served terminal and the perception client terminal, wherein the flow interaction index set comprises flow interaction index values and flow interaction time, and sequencing the flow interaction index set based on the flow interaction time to obtain a time flow index set;
inputting the time flow index set into a pre-trained network security perception model to execute risk prediction, and obtaining a network invasion risk judgment result of a served side to a perception client side, wherein the network security perception model is constructed by a deep learning network and comprises seven layers of structures according to the network connection sequence, the first layer of structure is 128 LSTM units, the second layer of structure is 1 dropout layer, the third layer of structure is 64 improved LSTM units, the fourth layer of structure is 1 dropout layer, the fifth layer of structure is 32 improved LSTM units, the sixth layer of structure is 1 dropout layer, and the seventh layer of structure is a classification layer.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus, device and method can be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is only one logical functional division, and other divisions may be realized in practice.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one position, or may be distributed on multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
In addition, functional modules in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional module.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof.
The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.
Furthermore, it is obvious that the word "comprising" does not exclude other elements or steps, and the singular does not exclude the plural. A plurality of units or means recited in the system claims may also be implemented by one unit or means in software or hardware. The terms first, second, etc. are used to denote names, but not any particular order.
Finally, it should be noted that the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting, and although the present invention is described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention.

Claims (9)

1. A big data-based network security awareness method is characterized by comprising the following steps:
receiving a network security perception instruction, and determining a perception client to be detected according to the network security perception instruction;
according to a TCP connection rule, extracting a server actively connected with the sensing client at the current moment to obtain a main service end and a server passively connected with the sensing client to obtain a served end;
constructing flow matrixes of the perception client and the main service end, constructing a flow covariance matrix based on the flow matrixes and solving an eigenvalue set of the flow covariance matrix, wherein the constructing the flow covariance matrix based on the flow matrixes and solving the eigenvalue set of the flow covariance matrix comprises the following steps:
solving a transposed matrix of the flow matrix, and constructing a flow covariance matrix based on the flow matrix and the transposed matrix, wherein the flow covariance matrix is as follows:
Figure FDA0004060113920000011
wherein the symbol shown in Figure FDA0004060113920000012 represents the flow covariance matrix of the flow matrix $X$, $X^T$ is the transposed matrix, and $n$ is the number of flow acquisitions of flow transmission between the sensing client and the main service end in each acquisition period when the flow matrix is constructed;
and constructing a characteristic equation of the flow covariance matrix, and solving the characteristic equation to obtain a characteristic value set, wherein the characteristic equation is as follows:
Figure FDA0004060113920000013
wherein, λ is a characteristic value set, E is a unit diagonal matrix, and y is a characteristic vector of a flow covariance matrix;
selecting sub-features with the importance greater than a specified importance threshold value from the feature value set to obtain a sub-feature set, calculating the significance value of the sub-feature set to the feature value set, and if the significance value is greater than the specified significance threshold value, judging that the main service side has a network invasion risk to the perception client side; the calculation method of the significance value comprises the following steps:
Figure FDA0004060113920000014
wherein T_a denotes the significance value of the sub-feature set with respect to the feature value set, a is the number of features in the sub-feature set, m is the number of features in the feature value set, and F_t is a t-test or a chi-square test;
extracting a flow interaction index set of the served end and the perception client, wherein the flow interaction index set comprises flow interaction index values and flow interaction times, and sorting the flow interaction index set by flow interaction time to obtain a time flow index set;
inputting the time flow index set into a pre-trained network security perception model to perform risk prediction and obtain a determination of the network intrusion risk posed by the served end to the perception client, wherein the network security perception model is built from a deep learning network and comprises seven layers in network connection order: the first layer is 128 LSTM units, the second layer is one dropout layer, the third layer is 64 improved LSTM units, the fourth layer is one dropout layer, the fifth layer is 32 improved LSTM units, the sixth layer is one dropout layer, and the seventh layer is a classification layer, wherein the improved LSTM unit comprises:
the original expression of the forget gate of the LSTM unit is replaced by the following improved expression:
Figure FDA0004060113920000021
wherein f_t is the forget gate expression at time t, σ_a is the activation function of the forget gate, e_f is the weight matrix of the forget gate, d_f is the bias vector of the forget gate, h_{t-1} is the output value of the previous LSTM output gate, x_l is the time flow index at time t,
Figure FDA0004060113920000022
is the difference between the two groups of time flow indexes at time t and time t-1, γ is a preset difference offset value, S is the total number of index types in the time flow index set, and ω_j is the weight of the j-th index.
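The covariance and eigenvalue steps of claim 1 can be sketched as follows. This is a minimal sketch and not the patented implementation: the covariance formula appears in this text only as an image reference, so the form C = (1/n) X X^T is an assumption, as are the function name `flow_eigenvalues` and the sample flow values.

```python
import numpy as np


def flow_eigenvalues(X: np.ndarray, n: int) -> np.ndarray:
    """Return the eigenvalues of the flow covariance matrix, largest first.

    Assumption: C = (1/n) * X @ X.T, since the source gives the formula
    only as an image.
    """
    C = (X @ X.T) / n
    # C is symmetric, so eigvalsh applies; it returns the values of lambda
    # for which (C - lambda * E) y = 0 has a non-zero solution y.
    eigvals = np.linalg.eigvalsh(C)  # ascending order
    return eigvals[::-1]


# Hypothetical flow matrix: 3 acquisitions (rows) in each of
# 2 acquisition periods (columns).
X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
lambdas = flow_eigenvalues(X, n=X.shape[0])
```

Because the assumed covariance matrix is symmetric and positive semi-definite, the eigenvalues are real and non-negative up to rounding, which is what makes a share-of-total importance score meaningful in the later selection step.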
2. The big data based network security awareness method according to claim 1, wherein the constructing of the flow matrix of the perception client and the main service end comprises:
acquiring the IP addresses of the perception client and the main service end;
establishing a flow link with the IP address of the perception client as the starting point and the IP address of the main service end as the end point;
setting an acquisition period for acquiring the flow link, and acquiring a flow value of the flow link according to the acquisition period;
correspondingly arranging each flow value according to an acquisition cycle to obtain the flow matrix, wherein the flow matrix is as follows:
Figure FDA0004060113920000023
wherein X is the flow matrix, X_p is the identity matrix representing the flow in the p-th acquisition period, and x_np denotes the flow value obtained by the n-th flow acquisition performed on the flow link in the p-th acquisition period.
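The arrangement step of claim 2 can be illustrated with a short sketch. The layout (rows are acquisitions, columns are acquisition periods) is an assumption, because the matrix itself is given only as an image reference; the function name `build_flow_matrix` and the sample values are hypothetical.

```python
import numpy as np


def build_flow_matrix(samples, n, p):
    """Arrange flow values into an n x p matrix.

    samples: iterable of (acquisition_index, period_index, flow_value),
    1-based, mirroring x_np = value of the n-th acquisition in period p.
    Assumed layout: rows are acquisitions, columns are acquisition periods.
    """
    X = np.zeros((n, p))
    for acquisition, period, value in samples:
        X[acquisition - 1, period - 1] = value
    return X


# Hypothetical flow values for 2 acquisitions in each of 2 periods.
samples = [(1, 1, 120.0), (2, 1, 80.0), (1, 2, 95.0), (2, 2, 110.0)]
X = build_flow_matrix(samples, n=2, p=2)
```

Under this layout, column p of `X` holds all flow values collected in the p-th acquisition period.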
3. The big-data-based network security awareness method according to claim 2, wherein the selecting the sub-features with importance greater than a specified importance threshold from the feature value set to obtain a sub-feature set comprises:
constructing different feature sets to be selected according to the feature value sets;
calculating the importance score of each group of feature sets to be selected according to an importance calculation formula, wherein the importance calculation formula is as follows:
Figure FDA0004060113920000031
wherein η_b^s denotes the importance score of the s-th feature set to be selected, b is the number of features in the s-th feature set to be selected, i is the index of each feature, m is the number of features in the feature value set, and λ_i denotes the i-th feature in the feature value set;
and extracting the feature set to be selected with the importance score larger than the specified importance threshold, repeatedly extracting each feature from the feature set to be selected with the importance score larger than the specified importance threshold, and combining to obtain the sub-feature set.
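The selection in claim 3 can be sketched as below. The importance formula is available here only as an image reference, so the score used (the sum of the features in a set divided by the sum of all features in the feature value set) is an assumption suggested by the symbols b, m and λ_i. The function names and the numbers are hypothetical, and the final combining step is read as a de-duplicated merge.

```python
def importance_score(candidate, eigenvalues):
    """Assumed score: share of the total eigenvalue sum carried by the set."""
    return sum(candidate) / sum(eigenvalues)


def select_sub_features(candidates, eigenvalues, threshold):
    """Merge the features of every set whose score exceeds the threshold."""
    selected = set()
    for candidate in candidates:
        if importance_score(candidate, eigenvalues) > threshold:
            selected.update(candidate)  # de-duplicates shared features
    return sorted(selected, reverse=True)


eigenvalues = [6.0, 3.0, 1.0]
candidates = [(6.0, 3.0), (6.0, 1.0), (3.0, 1.0)]
sub_features = select_sub_features(candidates, eigenvalues, threshold=0.6)
```

Using a Python set for the merge treats numerically equal eigenvalues as one feature; an index-based representation would avoid that if repeated eigenvalues matter.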
4. The big data-based network security awareness method according to claim 3, wherein the constructing different feature sets to be selected according to the feature value sets comprises:
receiving a preset set characteristic minimum value and a set characteristic maximum value;
and selecting non-repeating features from the feature value set, the total number of selected features being greater than or equal to the set feature minimum value and less than or equal to the set feature maximum value, to obtain the different feature sets to be selected.
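One way to read this construction is as an enumeration of all non-repeating combinations whose size lies between the two bounds. The sketch below assumes exhaustive enumeration, which the claim does not require; the function name `candidate_feature_sets` and the values are hypothetical.

```python
from itertools import combinations


def candidate_feature_sets(eigenvalues, min_size, max_size):
    """Enumerate feature sets with min_size <= size <= max_size.

    combinations() never repeats an element within a set, which matches
    the requirement that the selected features are not repeated.
    """
    sets = []
    for size in range(min_size, max_size + 1):
        sets.extend(combinations(eigenvalues, size))
    return sets


candidates = candidate_feature_sets([6.0, 3.0, 1.0], min_size=2, max_size=3)
```

The number of sets grows combinatorially with the size of the feature value set, so a practical system would likely cap the bounds or sample the combinations.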
5. The big data-based network security awareness method according to claim 4, wherein the constructing different feature sets to be selected according to the feature value sets comprises:
receiving a preset set characteristic minimum value and a set characteristic maximum value;
and selecting non-repeating features from the feature value set, the total number of selected features being greater than or equal to the set feature minimum value and less than or equal to the set feature maximum value, to obtain the different feature sets to be selected.
6. The big-data-based network security awareness method according to claim 1, wherein the flow interaction index set includes the number of successful TCP session establishments, the number of failed TCP session establishments, the number of uplink data packets, the number of downlink data packets, the average sent-packet length, the average received-packet length, the number of port accesses of the served end, the number of connections held by the server IP, the number of RST packets sent and received by the served end, the number of RST packets sent and received by the perception client, and the number of SYN packets sent and received by the served end.
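The ordering step that turns the flow interaction index set into the time flow index set can be sketched as a sort on interaction time. The record type, field names, indicator keys and values below are hypothetical; only two of the listed indicators are shown.

```python
from dataclasses import dataclass


@dataclass
class FlowInteractionRecord:
    interaction_time: float  # hypothetical unit: seconds since some epoch
    indicators: dict         # indicator name -> value


def to_time_flow_index_set(records):
    """Sort records by flow interaction time, earliest first."""
    return sorted(records, key=lambda r: r.interaction_time)


records = [
    FlowInteractionRecord(30.0, {"tcp_session_success": 4, "uplink_packets": 120}),
    FlowInteractionRecord(10.0, {"tcp_session_success": 1, "uplink_packets": 35}),
    FlowInteractionRecord(20.0, {"tcp_session_success": 2, "uplink_packets": 60}),
]
time_flow_index_set = to_time_flow_index_set(records)
```

A time-ordered sequence is what a recurrent model such as an LSTM expects as input, one record per time step.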
7. The big data based network security awareness method according to claim 1, wherein the extracting, according to the TCP connection rule, of the server to which the perception client is actively connected at the current moment to obtain the main service end and of the server to which the perception client is passively connected to obtain the served end comprises:
querying the TCP messages exchanged between the perception client and all service ends at the current moment, and determining whether each TCP message of the perception client is of a request connection type or a confirm connection type;
when the TCP message is of the request connection type, confirming that the corresponding service end is a main service end according to the request destination address of the TCP message;
and when the TCP message is of the confirm connection type, confirming that the corresponding service end is a served end according to the confirm destination address of the TCP message.
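The classification in claim 7 can be sketched as follows. Mapping the request connection type to a SYN segment without ACK and the confirm connection type to a SYN+ACK segment is an assumption based on the TCP three-way handshake; the claim does not name the flags. The `TcpMessage` type, `classify_servers` and the addresses are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TcpMessage:
    src: str   # source IP address
    dst: str   # destination IP address
    syn: bool  # SYN flag
    ack: bool  # ACK flag


def classify_servers(client_ip, messages):
    """Split peers of the client into main service ends and served ends."""
    main_service_ends, served_ends = set(), set()
    for msg in messages:
        # Only handshake segments sent by the perception client are used.
        if msg.src != client_ip or not msg.syn:
            continue
        if msg.ack:
            # Assumed confirm type: the client answers a peer's request.
            served_ends.add(msg.dst)
        else:
            # Assumed request type: the client initiates the connection.
            main_service_ends.add(msg.dst)
    return main_service_ends, served_ends


messages = [
    TcpMessage("10.0.0.5", "10.0.0.9", syn=True, ack=False),
    TcpMessage("10.0.0.5", "10.0.0.7", syn=True, ack=True),
    TcpMessage("10.0.0.7", "10.0.0.5", syn=True, ack=False),
]
main_ends, served = classify_servers("10.0.0.5", messages)
```

In this example the client initiates a connection to 10.0.0.9 and confirms one requested by 10.0.0.7, so the first is treated as a main service end and the second as a served end.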
8. The big data based network security awareness method as claimed in claim 2, wherein one acquisition period is set to 24 hours.
9. An apparatus for sensing network security based on big data, the apparatus comprising:
the server classification module is used for receiving a network security perception instruction, determining a perception client to be detected according to the network security perception instruction, and extracting, according to a TCP (Transmission Control Protocol) connection rule, a server to which the perception client is actively connected at the current moment to obtain a main service end and a server to which the perception client is passively connected to obtain a served end;
the eigenvalue solving module is used for constructing a flow matrix of the sensing client and the main service end, constructing a flow covariance matrix based on the flow matrix and solving an eigenvalue set of the flow covariance matrix, wherein the constructing the flow covariance matrix based on the flow matrix and solving the eigenvalue set of the flow covariance matrix comprises the following steps:
solving a transposed matrix of the flow matrix, and constructing a flow covariance matrix based on the flow matrix and the transposed matrix, wherein the flow covariance matrix is as follows:
Figure FDA0004060113920000041
wherein Figure FDA0004060113920000042 denotes the flow covariance matrix of the flow matrix X, X^T is the transposed matrix of X, and n is the number of flow acquisitions performed on the flow transmission between the perception client and the main service end in each acquisition period when the flow matrix is constructed;
and constructing a characteristic equation of the flow covariance matrix, and solving the characteristic equation to obtain the eigenvalue set, wherein the characteristic equation is as follows:
Figure FDA0004060113920000043
wherein λ is the eigenvalue set, referred to below as the feature value set, E is the identity matrix, and y is an eigenvector of the flow covariance matrix;
the main service end risk judgment module is used for selecting sub-features whose importance is greater than a specified importance threshold from the feature value set to obtain a sub-feature set, calculating the significance value of the sub-feature set with respect to the feature value set, and determining that the main service end poses a network intrusion risk to the perception client if the significance value is greater than a specified significance threshold; the significance value is calculated as follows:
Figure FDA0004060113920000051
wherein T_a denotes the significance value of the sub-feature set with respect to the feature value set, a is the number of features in the sub-feature set, m is the number of features in the feature value set, and F_t is a t-test or a chi-square test;
a served end risk judgment module, configured to extract a flow interaction index set of the served end and the perception client, where the flow interaction index set includes flow interaction index values and flow interaction times, sort the flow interaction index set by flow interaction time to obtain a time flow index set, input the time flow index set into a pre-trained network security perception model to perform risk prediction, and obtain a determination of the network intrusion risk posed by the served end to the perception client, where the network security perception model is built from a deep learning network and includes seven layers in network connection order: the first layer includes 128 LSTM units, the second layer is one dropout layer, the third layer includes 64 improved LSTM units, the fourth layer is one dropout layer, the fifth layer includes 32 improved LSTM units, the sixth layer is one dropout layer, and the seventh layer is a classification layer, where the improved LSTM unit comprises:
the original expression of the forget gate of the LSTM unit is replaced by the following improved expression:
Figure FDA0004060113920000052
wherein f_t is the forget gate expression at time t, σ_a is the activation function of the forget gate, e_f is the weight matrix of the forget gate, d_f is the bias vector of the forget gate, h_{t-1} is the output value of the previous LSTM output gate, x_l is the time flow index at time t,
Figure FDA0004060113920000053
is the difference between the two groups of time flow indexes at time t and time t-1, γ is a preset difference offset value, S is the total number of index types in the time flow index set, and ω_j is the weight of the j-th index.
CN202211449597.9A 2022-11-18 2022-11-18 Network security sensing method and device based on big data Active CN115580486B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211449597.9A CN115580486B (en) 2022-11-18 2022-11-18 Network security sensing method and device based on big data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211449597.9A CN115580486B (en) 2022-11-18 2022-11-18 Network security sensing method and device based on big data

Publications (2)

Publication Number Publication Date
CN115580486A CN115580486A (en) 2023-01-06
CN115580486B true CN115580486B (en) 2023-04-07

Family

ID=84589952

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211449597.9A Active CN115580486B (en) 2022-11-18 2022-11-18 Network security sensing method and device based on big data

Country Status (1)

Country Link
CN (1) CN115580486B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111092900A (en) * 2019-12-24 2020-05-01 北京北信源软件股份有限公司 Method and device for monitoring abnormal connection and scanning behavior of server
CN114969333A (en) * 2022-05-20 2022-08-30 遥相科技发展(北京)有限公司 Network information security management method and device based on data mining

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107786369B (en) * 2017-09-26 2020-02-04 广东电网有限责任公司电力调度控制中心 Power communication network security situation perception and prediction method based on IRT (intelligent resilient test) hierarchical analysis and LSTM (local Scale TM)
CN109040130B (en) * 2018-09-21 2020-12-22 成都力鸣信息技术有限公司 Method for measuring host network behavior pattern based on attribute relation graph
CN109359698B (en) * 2018-10-30 2020-07-21 清华大学 Leakage identification method based on long-time memory neural network model
CN111970309B (en) * 2020-10-20 2021-02-02 南京理工大学 Spark Internet of vehicles based combined deep learning intrusion detection method and system

Also Published As

Publication number Publication date
CN115580486A (en) 2023-01-06

Similar Documents

Publication Publication Date Title
Wang et al. A Real‐Time Pothole Detection Approach for Intelligent Transportation System
CN111612168B (en) Management method and related device for machine learning task
US10740411B2 (en) Determining repeat website users via browser uniqueness tracking
US9490987B2 (en) Accurately classifying a computer program interacting with a computer system using questioning and fingerprinting
CN110177108A (en) A kind of anomaly detection method, device and verifying system
CN110417778B (en) Access request processing method and device
CN107888616A (en) The detection method of construction method and Webshell the attack website of disaggregated model based on URI
CN106716958A (en) Lateral movement detection
EP3231199B1 (en) Notifications on mobile devices
CN102077201A (en) System and method for dynamic and real-time categorization of webpages
WO2017003593A1 (en) Customized network traffic models to detect application anomalies
CN109862003A (en) Local generation method, device, system and the storage medium for threatening information bank
CN106571933B (en) Service processing method and device
CN113949527A (en) Abnormal access detection method and device, electronic equipment and readable storage medium
CN113111359A (en) Big data resource sharing method and resource sharing system based on information security
CN114553523A (en) Attack detection method and device based on attack detection model, medium and equipment
CN113221163B (en) Model training method and system
CN110912874A (en) Method and system for effectively identifying machine access behaviors
CN110730164A (en) Safety early warning method, related equipment and computer readable storage medium
CN110572302A (en) Diskless local area network scene identification method and device and terminal
CN115580486B (en) Network security sensing method and device based on big data
CN115119197B (en) Wireless network risk analysis method, device, equipment and medium based on big data
Wang et al. Application research of file fingerprint identification detection based on a network security protection system
Yu et al. Whether the sensitive information statement of the IoT privacy policy is consistent with the actual behavior
Niu et al. Implementation of network information security monitoring system based on adaptive deep detection

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant