CN116340540A - Method for generating network security emergency response knowledge graph based on text - Google Patents
- Publication number
- CN116340540A (application number CN202310316305.2A)
- Authority
- CN
- China
- Prior art keywords
- node
- edges
- text
- generating
- nodes
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/332—Query formulation
- G06F16/3329—Natural language query formulation or dialogue systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/243—Classification techniques relating to the number of classes
- G06F18/2431—Multiple classes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
- G06N3/0442—Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/02—Knowledge representation; Symbolic representation
- G06N5/022—Knowledge engineering; Knowledge acquisition
Abstract
The invention discloses a method for generating a network security emergency response knowledge graph from text. The method comprises the following steps: first, text nodes and node features are generated through a pre-trained encoder-decoder language model; then, query-node features are output from the input text and a set of learnable node queries, and node text is generated through an LSTM; next, the text-node features and query-node features are fused to obtain the final node features; edges are then generated and fused in two ways, generation and classification; finally, the network is trained through an aggregated loss function and a sparse adjacency matrix. The network security knowledge graph generated by the method has extremely high applicability and accuracy.
Description
Technical Field
The invention belongs to the field of network security knowledge graphs.
Background
Network security emergency response refers to using internally stored security knowledge to prepare for possible threats and to take corresponding measures after a threat occurs. As emerging threats grow in complexity, traditional passive network security defenses have struggled to keep up. The field is therefore innovating, and the standards demanded of emergency command capability and efficiency in different situations keep rising. To address this problem, the research community has proposed applying knowledge graphs to network security: the knowledge graph is a new way of analyzing and processing network security data, from which a network security emergency response knowledge graph is generated. With such a graph, security emergency personnel can quickly identify and analyze security events and learn the required emergency response workflows, tools and techniques, improving the efficiency of security emergency response. The network security emergency response knowledge graph is a data-driven, highly computable tool. Personnel engaged in network security work can intuitively see the relationships between network security entities through the graph, such as the exploitation relationship between malware and vulnerabilities, the adversarial relationship between attackers and security protection equipment, and the relationship between systems and vulnerabilities, and can thus handle network security problems better. Because the quality of the graph plays a decisive role in every subsequent application built on it, including tracing network attacks through the graph, how to generate an accurate network security knowledge graph has become a popular research topic.
Disclosure of Invention
The invention aims to construct a high-performance knowledge graph from text. Traditional knowledge-graph construction methods are time-consuming and labor-intensive, because even ordinary knowledge-graph nodes must be generated and checked at high labor cost; a knowledge graph contains hundreds of such nodes, and the repeated manual operations make construction efficiency very low. Rapidly constructing the network security knowledge graph has therefore become essential. The invention constructs the network security knowledge graph by a novel method that overcomes the various shortcomings of traditional methods, and the graph so constructed has extremely high applicability and accuracy. The invention introduces a novel method for generating a network security emergency response knowledge graph from text: first, text nodes and node features are generated through a pre-trained encoder-decoder language model; then, query-node features are output from the input text and a set of learnable node queries, and node text is generated through an LSTM; next, the text-node features and query-node features are fused to obtain the final node features, and edges are generated and fused in two ways, generation and classification; finally, the network is trained through an aggregated loss function and a sparse adjacency matrix. The network security knowledge graph generated by the method has extremely high applicability and accuracy.
Technical means
The invention aims to construct a high-performance knowledge graph from text; traditional knowledge-graph construction methods are time-consuming and labor-intensive. Even ordinary knowledge-graph nodes require high labor cost for generation and verification, such nodes usually number in the millions in large-scale knowledge graphs, and in conventional methods a great deal of labor is spent on repeated operations, so the efficiency of building the knowledge graph is very low; rapidly constructing the network security knowledge graph has therefore become important. The network security knowledge graph is constructed by the novel method, overcoming the various shortcomings of traditional methods, and the graph constructed by the method has extremely high applicability and accuracy. The method comprises the following steps:
S1, generating text nodes and node features through a pre-trained encoder-decoder language model.
S2, outputting query-node features from the input text and a set of learnable node queries, and then generating node text through an LSTM.
S3, fusing the text-node features and the query-node features to obtain the final node features.
S4, generating and fusing edges in two ways, generation and classification.
S5, training the network through the aggregated loss function and the sparse adjacency matrix.
As a preferred mode of the present invention, the step S1 includes the steps of:
S101, node generation is formulated as a sequence-to-sequence problem using a pre-trained encoder-decoder language model: the system is fine-tuned to convert the text input into a sequence of nodes separated by special tags, <PAD> NODE1 <NODE_SEP> NODE2 … </s>, where NODEi represents one or more words;
S102, this module generates the nodes and provides node features for the edge-generation task. Each node may be associated with several node features; node boundaries are delimited by the separator tag <NODE_SEP>, the string is produced by greedy decoding, and the hidden states of the decoder's last layer are mean-pooled. We pre-fix the number of generated nodes and fill missing slots with the special <NO_NODE> token.
As a preferred mode of the present invention, the step S2 includes the steps of:
S201, the decoder receives as input a set of learnable node queries and represents them as an embedded feature matrix F_n ∈ R^(N×d); because no causal mask is used, the network processes all queries simultaneously and the decoder output can be read directly. Here N is the number of nodes and d is the node feature dimension. The features are passed to the prediction-head LSTM and decoded into node logits L_n ∈ R^(N×S×V), where S is the length of the generated node sequence and V is the vocabulary size;
S202, in order to prevent the network from memorizing a specific target-node order, the logits and features are permuted as
L′_n(s) = L_n(s)P,  F′_n = F_n P,
where s = 1, …, S and P ∈ {0,1}^(N×N) is a permutation matrix obtained by bipartite matching between the target nodes and the greedily decoded nodes, using cross-entropy loss as the matching cost function. The node features F′_n processed by the permutation matrix are now target-aligned.
As a preferred mode of the present invention, the step S3 includes the steps of:
S301, in order to make full use of the text-node and query-node features, a node fusion module is designed: the features obtained in the previous two steps are concatenated, fused through a residual block, and the important information in the features is then extracted by a self-attention module;
S302, features are enhanced through an atrous spatial pyramid pooling module and a convolutional block attention module, compressed through a 5×3 3D convolution, and the compressed important feature information is attended to through a channel attention module;
S303, in order to remove redundant information from the generated node features, the redundancy is predicted with a simple encoder-decoder structure and then subtracted from the original information;
S304, the similarity between node features is measured by dot product and mapped into the interval (0, 1) by a softmax function; if the similarity is greater than Y, one node of the pair is removed at random to control the number of redundant nodes. Experiments show that the best effect is obtained at Y = 0.7; after the redundant nodes are deleted, the final node features are obtained.
As a preferred mode of the present invention, the step S4 includes the steps of:
S401, the node features of the previous step are then used in this module to generate edges: given a pair of node features, a prediction head decides whether an edge exists between the corresponding nodes; edges are generated in two ways, and the edges produced by the two ways are then fused;
S402, first, edges are generated as a token sequence using an LSTM. The advantage of generation is that any edge sequence can be constructed, including edge sequences unseen during training, but there is a risk of not exactly matching the target edge-token sequence;
S403, a classification head is then used to predict edges. If the set of possible relations is fixed and known, the classification head is more efficient and accurate, but if training coverage of all possible edges is limited, the system may misclassify during inference;
S404, the edges are concatenated pairwise and their features fused through a dense layer. A trained scoring network evaluates the confidence of each fused edge, and a fused edge is retained if its confidence is greater than 0.5.
As a preferred mode of the present invention, the step S5 includes:
S501, because the presence of an edge must be checked between every pair of nodes, up to N^2 edges are generated and predicted, where N is the number of nodes. Some computation is saved by omitting self-loops and omitting nodes attached to the special <NO_NODE> tag. When there is no edge between two nodes, it is represented by the special token <NO_EDGE>;
S502, a novel focal loss, denoted by the symbol F, is proposed; its main idea is to reduce the cross-entropy loss of well-classified <NO_EDGE> samples and to increase the loss for misclassifications, as follows:
F = -(1 - p_t)^γ · log(p_t),
where γ is a weighting factor; when γ = 0 the loss equals the cross entropy. p is the probability of a single edge, t is the target class, and p_t denotes the probability of the target class;
S503, a modification of the training setup is proposed: most <NO_EDGE> edges are removed by sparsifying the adjacency matrix, keeping all actual edges but only a randomly selected subset of <NO_EDGE> edges. This modification improves accuracy by 10–20%, and training with sparse edges shortens training time by 10%;
S504, after the sparse-adjacency-matrix operation of the previous step, some actual edges are removed and some edges are randomly replaced to strengthen the robustness of the model; this modification improves accuracy by 5–10%.
Drawings
FIG. 1 is an overall flow chart of an embodiment of the present invention.
Fig. 2 is a diagram of text-node generation in accordance with an embodiment of the present invention.
Fig. 3 is a diagram of query-node generation in accordance with an embodiment of the present invention.
Detailed Description
In order to describe the technical content, structural features, objects and effects of the technical solution in detail, the following description is given with reference to specific embodiments and the accompanying drawings.
Referring to fig. 1, as shown in the drawing, the present embodiment provides a method for generating a network security emergency response knowledge graph based on text, which includes the following steps:
S1, generating text nodes and node features through a pre-trained encoder-decoder language model. See fig. 2.
S2, outputting query-node features from the input text and a set of learnable node queries, and then generating node text through an LSTM. See fig. 3.
S3, fusing the text-node features and the query-node features to obtain the final node features.
S4, generating and fusing edges in two ways, generation and classification.
S5, training the network through the aggregated loss function and the sparse adjacency matrix.
In the above embodiment, S1 further includes the steps of:
S101, node generation is formulated as a sequence-to-sequence problem using a pre-trained encoder-decoder language model: the system is fine-tuned to convert the text input into a sequence of nodes separated by special tags, <PAD> NODE1 <NODE_SEP> NODE2 … </s>, where NODEi represents one or more words;
S102, this module generates the nodes and provides node features for the edge-generation task. Each node may be associated with several node features; node boundaries are delimited by the separator tag <NODE_SEP>, the string is produced by greedy decoding, and the hidden states of the decoder's last layer are mean-pooled. We pre-fix the number of generated nodes and fill missing slots with the special <NO_NODE> token.
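The node-sequence post-processing of steps S101 and S102 can be sketched as follows. The special tokens (<NODE_SEP>, <NO_NODE>, </s>) are from the patent; the example node texts, token-span boundaries, and toy hidden-state values are illustrative assumptions, since the patent does not publish its model weights or tokenization:

```python
import numpy as np

NODE_SEP, NO_NODE, EOS = "<NODE_SEP>", "<NO_NODE>", "</s>"

def parse_node_sequence(decoded: str, max_nodes: int):
    """Split a greedily decoded string into node texts on <NODE_SEP>,
    padding up to a pre-fixed node count with <NO_NODE> tokens (S102)."""
    body = decoded.replace(EOS, "").strip()
    nodes = [n.strip() for n in body.split(NODE_SEP) if n.strip()]
    nodes = nodes[:max_nodes]
    nodes += [NO_NODE] * (max_nodes - len(nodes))
    return nodes

def mean_pool_node_features(hidden: np.ndarray, spans):
    """Mean-pool the decoder's last-layer hidden states (T, d) over each
    node's token span to obtain one feature vector per node."""
    return np.stack([hidden[s:e].mean(axis=0) for s, e in spans])

decoded = "malware <NODE_SEP> CVE-2021-44228 <NODE_SEP> patch server </s>"
nodes = parse_node_sequence(decoded, max_nodes=4)
# toy hidden states for an 8-token decode with feature dimension 4
hidden = np.arange(32, dtype=float).reshape(8, 4)
feats = mean_pool_node_features(hidden, [(0, 2), (2, 5), (5, 8)])
```

The pre-fixed node count makes every training example the same width, which is what lets the decoder output be read as a fixed-size feature matrix in step S201.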
In the above embodiment, S2 further includes the steps of:
S201, the decoder receives as input a set of learnable node queries and represents them as an embedded feature matrix F_n ∈ R^(N×d); because no causal mask is used, the network processes all queries simultaneously and the decoder output can be read directly. Here N is the number of nodes and d is the node feature dimension. The features are passed to the prediction-head LSTM and decoded into node logits L_n ∈ R^(N×S×V), where S is the length of the generated node sequence and V is the vocabulary size;
S202, in order to prevent the network from memorizing a specific target-node order, the logits and features are permuted as
L′_n(s) = L_n(s)P,  F′_n = F_n P,
where s = 1, …, S and P ∈ {0,1}^(N×N) is a permutation matrix obtained by bipartite matching between the target nodes and the greedily decoded nodes, using cross-entropy loss as the matching cost function. The node features F′_n processed by the permutation matrix are now target-aligned.
In the above embodiment, S3 further includes the steps of:
S301, in order to make full use of the text-node and query-node features, a node fusion module is designed: the features obtained in the previous two steps are concatenated. The features are first fused through a residual block, and the important information in them is then extracted by a self-attention module;
S302, features are enhanced through an atrous spatial pyramid pooling module and a convolutional block attention module, compressed through a 5×3 3D convolution, and the compressed important feature information is attended to through a channel attention module;
S303, in order to remove redundant information from the generated node features, the redundancy is predicted with a simple encoder-decoder structure and then subtracted from the original information;
S304, the similarity between node features is measured by dot product and mapped into the interval (0, 1) by a softmax function. If the similarity is greater than Y, one node of the pair is removed at random to control the number of redundant nodes. Experiments show that the best effect is obtained at Y = 0.7; after the redundant nodes are deleted, the final node features are obtained.
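The redundant-node pruning of S304 can be sketched as below. The threshold Y = 0.7 comes from the patent's experiments; the two-way softmax normalization (scoring each pair against a zero baseline) and the toy feature values are assumptions, since the patent does not specify how pairwise similarities are normalized:

```python
import numpy as np

def softmax(x: np.ndarray) -> np.ndarray:
    e = np.exp(x - x.max())
    return e / e.sum()

def drop_redundant_nodes(feats: np.ndarray, threshold: float = 0.7, seed: int = 0):
    """Measure pairwise similarity by dot product, map it into (0, 1) with a
    softmax (here against a zero baseline - an assumed formulation), and
    randomly drop one node of any pair whose score exceeds the threshold."""
    rng = np.random.default_rng(seed)
    n = feats.shape[0]
    keep = np.ones(n, dtype=bool)
    for i in range(n):
        for j in range(i + 1, n):
            if not (keep[i] and keep[j]):
                continue
            score = softmax(np.array([feats[i] @ feats[j], 0.0]))[0]
            if score > threshold:
                keep[rng.choice([i, j])] = False   # remove one node at random
    return feats[keep], keep

# nodes 0 and 1 are near-duplicates; node 2 is distinct
feats_in = np.array([[2.0, 0.0], [2.0, 0.0], [0.0, 1.0]])
kept, mask = drop_redundant_nodes(feats_in, threshold=0.7, seed=0)
```

One of the two duplicate nodes is removed, while the dissimilar node survives, which is exactly the redundancy-control behavior S304 describes.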
In the above embodiment, S4 further includes the steps of:
S401, edges are generated in this module using the node features of the previous step: given a pair of node features, a prediction head determines whether an edge exists between the corresponding nodes; edges are generated in two ways, and the edges so generated are then fused;
S402, first, edges are generated as a token sequence using an LSTM; the advantage of generation is that any edge sequence can be constructed, including edge sequences unseen during training, but there is a risk of not exactly matching the target edge-token sequence;
S403, a classification head is then used to predict edges; if the set of possible relations is fixed and known, the classification head is more efficient and accurate, but if training coverage of all possible edges is limited, the system may misclassify during inference;
S404, the edges are concatenated pairwise, their features are fused through a dense layer, and a trained scoring network evaluates the confidence of each fused edge; a fused edge is retained if its confidence is greater than 0.5.
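The pairwise edge scoring of S401 and S404 can be sketched as follows. The dense-layer weights W and bias b stand in for the trained scoring network, whose architecture the patent does not specify; the 0.5 retention threshold is from step S404, and the self-loop skip is from step S501:

```python
import numpy as np

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + np.exp(-x))

def fuse_and_score_edges(node_feats: np.ndarray, W: np.ndarray, b: float):
    """Concatenate node features pairwise, fuse them through a dense layer
    (W, b are stand-ins for the trained scoring network), and keep a
    candidate edge only when its confidence exceeds 0.5."""
    n, _ = node_feats.shape
    edges = []
    for i in range(n):
        for j in range(n):
            if i == j:          # self-loops are omitted, as in S501
                continue
            pair = np.concatenate([node_feats[i], node_feats[j]])
            conf = sigmoid(pair @ W + b)
            if conf > 0.5:
                edges.append((i, j, float(conf)))
    return edges

node_feats = np.eye(2)                      # two toy node feature vectors
W = np.array([1.0, 0.0, 0.0, 1.0])          # assumed dense-layer weights
edges = fuse_and_score_edges(node_feats, W, b=0.0)
```

Because the pair (i, j) is concatenated in order, the score is direction-sensitive, which matches a knowledge graph whose relations (e.g. "malware exploits vulnerability") are directed.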
In the above embodiment, S5 further includes the steps of:
S501, because the presence of an edge must be checked between every pair of nodes, up to N^2 edges are generated and predicted, where N is the number of nodes. Some computation is saved by omitting self-loops and omitting nodes attached to the special <NO_NODE> tag. When there is no edge between two nodes, it is represented by the special token <NO_EDGE>;
S502, a novel focal loss, denoted by the symbol F, is proposed; its main idea is to reduce the cross-entropy loss of well-classified <NO_EDGE> samples and to increase the loss for misclassifications, as follows:
F = -(1 - p_t)^γ · log(p_t),
where γ is a weighting factor; when γ = 0 the loss equals the cross entropy. p is the probability of a single edge, t is the target class, and p_t denotes the probability of the target class;
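The loss in S502 matches the standard focal-loss form; a minimal sketch for a single sample, with the patent's γ = 0 special case reducing to plain cross entropy:

```python
import math

def focal_loss(p_t: float, gamma: float = 2.0) -> float:
    """Focal loss F = -(1 - p_t)^gamma * log(p_t): the (1 - p_t)^gamma factor
    down-weights the abundant, well-classified <NO_EDGE> samples while
    keeping pressure on misclassified edges. With gamma = 0 it reduces to
    the ordinary cross-entropy loss. The default gamma is an assumption;
    the patent does not state its chosen value."""
    return -((1.0 - p_t) ** gamma) * math.log(p_t)
```

For a well-classified sample (p_t close to 1) the modulating factor is tiny, so the loss is heavily suppressed relative to cross entropy; a misclassified sample with small p_t keeps nearly its full loss.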
S503, a modification of the training setup is proposed: most <NO_EDGE> edges are removed by sparsifying the adjacency matrix, keeping all actual edges but only a randomly selected subset of <NO_EDGE> edges. This modification improves accuracy by 10–20%, and training with sparse edges shortens training time by 10%;
S504, after the sparse-adjacency-matrix operation of the previous step, some actual edges are removed and some edges are randomly replaced to strengthen the robustness of the model; this modification improves accuracy by 5–10%.
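The adjacency-matrix sparsification of S503 can be sketched as follows. The keep_ratio value, the edge-triple representation, and the toy relation names are assumptions; the patent states only that all actual edges are kept while just a random subset of <NO_EDGE> pairs survives:

```python
import random

NO_EDGE = "<NO_EDGE>"

def sparsify_edges(edge_labels, keep_ratio=0.1, seed=0):
    """Keep every real edge but only a small random fraction of the
    <NO_EDGE> node pairs, sparsifying the training adjacency (S503)."""
    rng = random.Random(seed)
    real = [e for e in edge_labels if e[2] != NO_EDGE]
    empty = [e for e in edge_labels if e[2] == NO_EDGE]
    n_keep = max(1, int(len(empty) * keep_ratio)) if empty else 0
    return real + rng.sample(empty, n_keep)

# toy labelled pairs (i, j, relation): 2 real edges among 18 empty pairs
pairs = [(0, 1, "exploits"), (1, 2, "targets")]
pairs += [(i, j, NO_EDGE) for i in range(5) for j in range(5)
          if i != j and (i, j) not in [(0, 1), (1, 2)]]
sparse = sparsify_edges(pairs, keep_ratio=0.1, seed=0)
```

Since <NO_EDGE> pairs dominate the N^2 candidates, this rebalances the class distribution the focal loss of S502 also addresses, and it is what yields the reported 10% training-time reduction.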
To demonstrate the effectiveness of the invention, different datasets were used for verification, specifically the Comprehensive, Multi-Source Cyber-Security Events dataset and the ADFA intrusion detection dataset. The Comprehensive, Multi-Source Cyber-Security Events dataset is collected from various websites and vulnerability databases on the network, and includes network security and vulnerability information together with network text data. The ADFA dataset contains data on various intrusions; WebSEC2020 (a network security knowledge dataset) is a dataset of network security emergency responses, consisting of multiple sets of abnormal events and corresponding labels; MAWILab (a network traffic anomaly dataset) is a network traffic anomaly detection dataset consisting of multiple sets of traffic-anomaly labels. Extensive experiments show that the method outperforms the most advanced methods: performance is 20% higher on the ADFA dataset than the BT5 method, and 25% higher on the Comprehensive, Multi-Source Cyber-Security Events dataset than the ReGen method. The experimental results are as follows:
TABLE 1 feature semantic similarity matching results for different datasets
Experimental results show that the network security knowledge graph generated by the method has extremely high applicability and accuracy.
Claims (6)
1. A method for generating a network security emergency response knowledge graph based on text, characterized by comprising the following steps:
S1, generating text nodes and node features through a pre-trained encoder-decoder language model;
S2, outputting query-node features from the input text and a set of learnable node queries, and then generating node text through an LSTM;
S3, fusing the text-node features and the query-node features to obtain the final node features;
S4, generating and fusing edges in two ways, generation and classification;
S5, training the network through the aggregated loss function and the sparse adjacency matrix.
2. The method for generating a network security emergency response knowledge graph based on text according to claim 1, wherein S1 further comprises the following steps:
S101, node generation is formulated as a sequence-to-sequence problem using a pre-trained encoder-decoder language model: the system is fine-tuned to convert the text input into a sequence of nodes separated by special tags, <PAD> NODE1 <NODE_SEP> NODE2 … </s>, where NODEi represents one or more words;
S102, this module generates the nodes and provides node features for the edge-generation task. Each node may be associated with several node features; node boundaries are delimited by the separator tag <NODE_SEP>, the string is produced by greedy decoding, and the hidden states of the decoder's last layer are mean-pooled. We pre-fix the number of generated nodes and fill missing slots with the special <NO_NODE> token.
3. The method for generating a network security emergency response knowledge graph based on text according to claim 1, wherein S2 further comprises the following steps:
S201, the decoder receives as input a set of learnable node queries and represents them as an embedded feature matrix F_n ∈ R^(N×d); because no causal mask is used, the network processes all queries simultaneously and the decoder output can be read directly. Here N is the number of nodes and d is the node feature dimension. The features are passed to the prediction-head LSTM and decoded into node logits L_n ∈ R^(N×S×V), where S is the length of the generated node sequence and V is the vocabulary size;
S202, in order to prevent the network from memorizing a specific target-node order, the logits and features are permuted as L′_n(s) = L_n(s)P, F′_n = F_n P.
4. The method for generating a network security emergency response knowledge graph based on text according to claim 1, wherein S3 further comprises the following steps:
S301, in order to make full use of the text-node and query-node features, a node fusion module is designed: the features obtained in the previous two steps are concatenated, fused through a residual block, and the important information in the features is then extracted by a self-attention module;
S302, features are enhanced through an atrous spatial pyramid pooling module and a convolutional block attention module, compressed through a 5×3 3D convolution, and the compressed important feature information is attended to through a channel attention module;
S303, in order to remove redundant information from the generated node features, the redundancy is predicted with a simple encoder-decoder structure and then subtracted from the original information;
S304, the similarity between node features is measured by dot product and mapped into the interval (0, 1) by a softmax function; if the similarity is greater than Y, one node of the pair is removed at random to control the number of redundant nodes. Experiments show that the best effect is obtained at Y = 0.7; after the redundant nodes are deleted, the final node features are obtained.
5. The method for generating a network security emergency response knowledge graph based on text, characterized in that S4 further comprises the following steps:
S401, edges are generated in this module from the node features of the previous step: given a pair of node features, a prediction head determines whether an edge exists between the corresponding nodes; the edges are generated in two ways, and the edges produced by the two ways are then fused;
S402, first, the edges are generated as a token sequence using an LSTM; this generation has the advantage of being able to construct arbitrary edge sequences, including edge sequences not seen during training, but runs the risk of not exactly matching the target edge token sequence;
S403, second, the edges are predicted using a classification head, which is more efficient and accurate when the set of possible relations is fixed and known, but which may misclassify during inference if the training data has limited coverage of all possible edges;
S404, the paired edge features are concatenated and fused through a dense layer, and a trained scoring network evaluates the confidence of each fused edge; a fused edge is kept if its confidence is greater than 0.5.
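A toy sketch of the fusion and scoring of S404 (all weights are illustrative placeholders, not the patent's trained parameters): the two heads' features for the same candidate edge are concatenated, passed through a dense layer, scored with a sigmoid, and kept when the confidence exceeds 0.5:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def fuse_edge_candidates(gen_feats, cls_feats, w_dense, w_score, thresh=0.5):
    """Fuse per-edge features from the LSTM generation head and the
    classification head, then keep fused edges whose scoring-network
    confidence exceeds `thresh` (sketch of S404).

    gen_feats, cls_feats: (E, d) features for the same E candidate edges
    w_dense: (2d, d) dense-layer weights; w_score: (d,) scoring weights
    Returns (kept fused features, confidence for every candidate).
    """
    fused = np.concatenate([gen_feats, cls_feats], axis=1) @ w_dense
    fused = np.maximum(fused, 0.0)        # ReLU after the dense layer
    conf = sigmoid(fused @ w_score)       # scoring network output in (0, 1)
    return fused[conf > thresh], conf
```

In a real system `w_dense` and `w_score` would be trained jointly with the rest of the model; here they only fix the shapes of the computation.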
6. The method for generating a network security emergency response knowledge graph based on text, characterized in that S5 further comprises the following steps:
S501, up to N² edges are generated and predicted, where N is the number of nodes; self-loops and nodes connected to the special label <no_node> are omitted to reduce computation cost; when there is no edge between two nodes, it is represented by a special token <no_edge>;
S502, a novel focal loss, denoted F, is proposed; its main idea is to reduce the cross-entropy loss of well-classified <no_edge> samples and to increase the cross-entropy loss of misclassified samples, as follows:

F = -(1 - p_t)^γ log(p_t),

where γ is a weighting factor; when γ = 0 the focal loss is equal to the ordinary cross-entropy loss; p is the predicted probability of a single edge, t is the target class, and p_t denotes the probability of the target class;
S503, a modification of the training setup is proposed: most <no_edge> edges are removed by sparsifying the adjacency matrix, keeping all actual edges but only a randomly selected subset of the <no_edge> edges; this modification improves accuracy by 10%-20%, and training with the sparse edges shortens training time by 10%;
S504, after the adjacency-matrix sparsification of the previous step, some actual edges are removed and some edges are randomly replaced to enhance the robustness of the model; this modification improves accuracy by a further 5%-10%.
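The focal loss of S502 follows directly from the formula F = -(1 - p_t)^γ log(p_t); a minimal NumPy version (array shapes and names are illustrative):

```python
import numpy as np

def focal_loss(p, target, gamma=2.0):
    """Focal loss F = -(1 - p_t)^gamma * log(p_t) as in S502.

    The (1 - p_t)^gamma factor shrinks the loss of well-classified
    samples (dominated by <no_edge>) and keeps the weight on hard,
    misclassified edges; gamma = 0 recovers plain cross-entropy.

    p: (N, C) predicted class probabilities; target: (N,) class indices.
    Returns the per-sample loss.
    """
    p_t = p[np.arange(len(target)), target]   # probability of the target class
    return -((1.0 - p_t) ** gamma) * np.log(p_t)
```

For a confidently correct prediction with p_t = 0.9 and γ = 2, the cross-entropy term is scaled by (1 - 0.9)² = 0.01, so easy <no_edge> samples contribute almost nothing to the total loss.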
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310316305.2A CN116340540A (en) | 2023-03-24 | 2023-03-24 | Method for generating network security emergency response knowledge graph based on text |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116340540A true CN116340540A (en) | 2023-06-27 |
Family
ID=86883647
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310316305.2A Pending CN116340540A (en) | 2023-03-24 | 2023-03-24 | Method for generating network security emergency response knowledge graph based on text |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116340540A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118036732A (en) * | 2024-04-11 | 2024-05-14 | 神思电子技术股份有限公司 | Social event pattern relation completion method and system based on critical countermeasure learning |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111709241B (en) | Named entity identification method oriented to network security field | |
CN109450845B (en) | Detection method for generating malicious domain name based on deep neural network algorithm | |
CN109308494B (en) | LSTM model and network attack identification method and system based on LSTM model | |
Xiao et al. | Mcapsnet: Capsule network for text with multi-task learning | |
WO2022134071A1 (en) | Text extraction method and apparatus, computer readable storage medium, and electronic device | |
EP3614645B1 (en) | Embedded dga representations for botnet analysis | |
CN110933105B (en) | Web attack detection method, system, medium and equipment | |
CN115380284A (en) | Unstructured text classification | |
Yin et al. | Deep learning-aided OCR techniques for Chinese uppercase characters in the application of Internet of Things | |
CN115080756B (en) | Attack and defense behavior and space-time information extraction method oriented to threat information map | |
CN114781609A (en) | Traffic flow prediction method based on multi-mode dynamic residual image convolution network | |
CN116340540A (en) | Method for generating network security emergency response knowledge graph based on text | |
Nakagawa et al. | Character-level convolutional neural network for predicting severity of software vulnerability from vulnerability description | |
Wang et al. | An unknown protocol syntax analysis method based on convolutional neural network | |
Xu et al. | Adversarial attacks on text classification models using layer‐wise relevance propagation | |
Ren et al. | CLIO: Role-interactive multi-event head attention network for document-level event extraction | |
CN112887323B (en) | Network protocol association and identification method for industrial internet boundary security | |
CN117318980A (en) | Small sample scene-oriented self-supervision learning malicious traffic detection method | |
CN115759081A (en) | Attack mode extraction method based on phrase similarity | |
CN114301671A (en) | Network intrusion detection method, system, device and storage medium | |
CN115631502A (en) | Character recognition method, character recognition device, model training method, electronic device and medium | |
CN113055890B (en) | Multi-device combination optimized real-time detection system for mobile malicious webpage | |
CN115473734A (en) | Remote code execution attack detection method based on single classification and federal learning | |
WO2022141855A1 (en) | Text regularization method and apparatus, and electronic device and storage medium | |
Cao et al. | Adversarial DGA domain examples generation and detection |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||