CN114490995A - Multistage self-attention network security cooperative disposal battle room semantic abstraction method - Google Patents

Multistage self-attention network security cooperative disposal battle room semantic abstraction method Download PDF

Info

Publication number
CN114490995A
CN114490995A CN202210329999.9A CN202210329999A CN114490995A CN 114490995 A CN114490995 A CN 114490995A CN 202210329999 A CN202210329999 A CN 202210329999A CN 114490995 A CN114490995 A CN 114490995A
Authority
CN
China
Prior art keywords
vector
sentence
semantic
text
vectors
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210329999.9A
Other languages
Chinese (zh)
Inventor
孙捷
车洵
胡牧
孙翰墨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Zhongzhiwei Information Technology Co ltd
Original Assignee
Nanjing Zhongzhiwei Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Zhongzhiwei Information Technology Co ltd filed Critical Nanjing Zhongzhiwei Information Technology Co ltd
Priority to CN202210329999.9A priority Critical patent/CN114490995A/en
Publication of CN114490995A publication Critical patent/CN114490995A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/332Query formulation
    • G06F16/3329Natural language query formulation or dialogue systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/335Filtering based on additional data, e.g. user or group profiles
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/10Text processing
    • G06F40/12Use of codes for handling textual entities
    • G06F40/126Character encoding
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2216/00Indexing scheme relating to additional aspects of information retrieval not explicitly covered by G06F16/00 and subgroups
    • G06F2216/03Data mining

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Human Computer Interaction (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a multistage self-attention network security cooperative disposal battle room semantic abstract method, which gives a section of network security cooperative disposal battle room chat record and comprises the following steps: recording the operation process of the network security cooperative disposal studio to form a corpus; converting text contents in the corpus into a mark vector through a word vector matrix, and splicing the mark vector, the sentence segmentation vector and the position vector to form a complete word vector as the input of a network structure; sending the output result to an encoder of the transformer; inputting the result into a module of the transformer containing an attention mechanism to eliminate repeated redundant contents in the text; generating a semantic abstract through a decoder in a transformer model; outputting semantic abstracts corresponding to the texts according to the k key words with larger weights to form task instructions; and operation records are simplified, and a concise and smooth abstract task is generated.

Description

Multistage self-attention network security cooperative treatment combat room semantic abstraction method
Technical Field
The invention relates to the technical field of network security, in particular to a multistage self-attention network security cooperative treatment battle room semantic abstraction method.
Background
Under the background of the rapid development of the information age, the network information security becomes a topic which is greatly regarded by people at present. Network security technology protects the software, hardware of network systems and data in their systems from malicious influences. In the face of a network attack and defense countermeasure environment which is increasingly white and persistent, Security Operations (SecOps) are oriented to integration and fusion of people, technologies and processes, the global performance and the cooperativity of Security defense resources are improved, and the SecOps become the most direct and most critical links for falling on the ground of Security capability, playing the effectiveness of a defense system and resisting high-level threats. The network security cooperative treatment battle room is used as an important ring in a security operation system, and defense strategies are formulated and executed on network security events by combining operation personnel and an automatic operation system to form an efficient, accurate and timely network security defense system.
In the network safety cooperative disposition battle room, the artificial intelligence cooperative operation personnel complete the network safety emergency response task. The artificial intelligence obtains the task instruction by understanding the chat record content of the operator in the battle room. Firstly, identifying and understanding the content in the chat records of the battle room, extracting the key information in the chat records, forming a script responding to the network security events according to the semantic abstract content of the chat records of the network security cooperative disposition battle room, and completing the corresponding tasks.
The method for analyzing the text data of the chat records in the network security cooperative disposal combat room can use an algorithm for processing natural language, in the recent algorithm for processing natural language, the introduction of an attention mechanism greatly improves the accuracy of semantic abstract extraction, and the semantic abstract extraction of the chat records in the network security cooperative disposal combat room is completed by combining a transformer model.
The existing analysis method for the chat records of the network security cooperative treatment battle room is characterized in that a modeling sequence marking task is processed, each sentence of the battle room is processed and learned, a simple keyword combination meeting requirements is screened out, tasks needing to be executed are identified through a classifier, and the understanding of text semantics is lacked. The processing of the semantic abstract of the network security cooperative treatment battle room has the following three defects:
(1) the text semantic abstract extraction task under supervised learning needs a large number of data training models, but the data set resources in the network security professional field are not abundant, and the cost of manually labeled supervision data is too high;
(2) different semantic roles such as combat participation, general command, executive staff and the like exist in chat records of a network security cooperative disposal combat room, the traditional processing method has the same attention for extracting the characteristics of text sequences input by different roles, so that the understanding of key information in a section of semantics depends on the physical position of a keyword instead of the meaning of a word, and the redundancy of memory information is caused;
(3) before a transformer model is provided, a recurrent neural network is generally used for processing a text, and the limitation is that the recurrent neural network is effective in processing a short text and has a poor effect in processing a long text. However, the studio chat history we need to process is complex long text, and the hierarchical structure of the text needs to be understood, requiring the model to be able to process unstructured multi-role text.
Based on the above considerations, it is urgently needed to provide a multi-level self-attention network security cooperative treatment studio semantic abstraction method to solve the above problems.
Disclosure of Invention
In order to achieve the above object, the inventor provides a multistage self-attention network security cooperative treatment studio semantic abstraction method, and a section of network security cooperative treatment studio chat record is given, and the semantic abstraction extraction of the section of chat record comprises the following steps:
s1: recording the operation process of the network security cooperative disposal combat room to form a corpus, and performing table building, text content serialization, role marking of operators, content of analysis problems and coping tasks according to analysis arrangement on dialogs and chatting record texts in the corpus to form the network security cooperative disposal combat room chatting record corpus;
s2: converting text contents in a corpus into mark vectors through a word vector matrix, distinguishing different sentences by assigning values of 0 and 1 to different sentences to obtain sentence segmentation vectors, distributing position vectors for each mark vector by adopting a transformer, and splicing the mark vectors, the sentence segmentation vectors and the position vectors to form complete word vectors as input of a network structure;
s3: sending the output result in the step S2 to an encoder of the transformer;
s4: inputting the output result of the encoder of the transformer in the S3 into a module of the transformer, which contains an attention mechanism, compressing the output result into a vector representation, and extracting key information by combining a context vector to eliminate repeated redundant contents in the text;
s5: after the text content in the corpus is compressed into a vector, generating a semantic abstract through a decoder in a transformer model;
s6: processing the output result of the step S5 by a softmax layer, and outputting semantic abstracts corresponding to the text according to k key words with larger weight to form a task instruction;
s7: and the semantic abstract of the chat record in the battle room is cooperatively treated by network security to form a task instruction, so that the operation record is simplified, and a concise and smooth abstract task is generated.
As a preferred mode of the present invention, the S2 includes the steps of: for inputted text
Figure 128521DEST_PATH_IMAGE002
Is formed by
Figure 634589DEST_PATH_IMAGE004
The composition of each sentence is as follows,
Figure 195015DEST_PATH_IMAGE005
wherein
Figure 546361DEST_PATH_IMAGE007
To represent the second in the text
Figure 633266DEST_PATH_IMAGE009
Sentence words, the text is preprocessed in sequence, word segmentation is carried out by using an LTP word segmentation device, then noise words and stop words are removed, training linguistic data are generated in a standardized way, and each sentence is assigned with a label
Figure 434869DEST_PATH_IMAGE010
Where 0 means no sentence is recognized and 1 means a sentence is recognized.
As a preferred embodiment of the present invention, the S2 further includes the steps of: changing the character symbols of the processed text into digital sequence word vectors through a word vector layer, and marking [ CLS ] on the head mark]And tail mark (SEP)]And generating a sentence segmentation vector for distinguishing sentences and a position vector representing an absolute position of each word, wherein the vector dimensions of the token vector, the sentence segmentation vector and the position vector are all z, and then splicing the corresponding token vector, sentence segmentation vector and position vector of the input sequence for use
Figure 607224DEST_PATH_IMAGE012
Represents:
Figure 496683DEST_PATH_IMAGE013
wherein the content of the first and second substances,
Figure 576110DEST_PATH_IMAGE015
a token vector representing each word in the sentence,
Figure 423980DEST_PATH_IMAGE017
a statement segmentation vector is represented which is,
Figure 83632DEST_PATH_IMAGE019
the corresponding parities divide the sentence into AB blocks,
Figure 42360DEST_PATH_IMAGE021
a position vector is represented by a vector of positions,
Figure 697333DEST_PATH_IMAGE023
corresponding to the maximum length of the sentence,
Figure 716104DEST_PATH_IMAGE025
representing input text
Figure 863052DEST_PATH_IMAGE002
Is obtained by splicing three vectors, and the dimensions of the rows and the columns in the vector space are all
Figure 500838DEST_PATH_IMAGE026
As a preferable aspect of the present invention, the S3 includes: for inputted text
Figure 416841DEST_PATH_IMAGE002
Is formed by inputting a sequence
Figure 75355DEST_PATH_IMAGE012
Firstly, a multi-head attention block composed of a plurality of attention modules is transmitted, the number of heads of the multi-head attention block is super-parameter, t heads are customized, and the output is t heads
Figure 834233DEST_PATH_IMAGE026
Then using the initialization matrix
Figure 134764DEST_PATH_IMAGE028
And input sequence
Figure 170853DEST_PATH_IMAGE012
Multiply to obtain
Figure 265848DEST_PATH_IMAGE030
The query vector matrix, the key value vector matrix and the value vector matrix are respectively corresponded.
As a preferred embodiment of the present invention, the S3 further includes the steps of: since there are t heads of attention, i.e. division into
Figure 997175DEST_PATH_IMAGE031
For the attention weight at the current moment, firstly, the association degree of the current word and other words is calculated, and the similarity is calculated by using the query vector and the key value vectors of other words
Figure 101397DEST_PATH_IMAGE033
Figure 726414DEST_PATH_IMAGE034
For calculating similarity by product of query vector and key value vector
Figure 382523DEST_PATH_IMAGE033
Performing a reduction by dividing by the same factor
Figure 725780DEST_PATH_IMAGE036
Then, carrying out normalization processing by using a softmax function, wherein the obtained value is the value representation of the current word and the current word, and the expression is as follows:
Figure 633693DEST_PATH_IMAGE037
weighted value obtained by current word
Figure 378795DEST_PATH_IMAGE039
To update the attention weight of the current word
Figure 690959DEST_PATH_IMAGE040
Figure 521511DEST_PATH_IMAGE041
And circulating the same steps for other input sequences to obtain all outputs.
As a preferred embodiment of the present invention, the S3 further includes the steps of: updating the attention weight by the following formula:
Figure 967536DEST_PATH_IMAGE042
then, a plurality of attention weights are output and spliced together by a multi-head attention module and then are mixed with the input sequence
Figure 691779DEST_PATH_IMAGE012
Residual error jump output is carried out, and then the residual error jump output is input into a specification layer LN to output a new value, wherein the attention weight of an input sequence is shown as the following expression:
Figure 565057DEST_PATH_IMAGE043
calculated, output vector as input for fully connected layer, also through residual jump and normative layer LN, and encapsulated with superimposed linear layer
Figure 882906DEST_PATH_IMAGE045
Function activation, the expression is:
Figure 473899DEST_PATH_IMAGE046
vector of output for full connection layer
Figure 193594DEST_PATH_IMAGE047
As the input of the coding part of the next layer of transformer, the expression is:
Figure 237773DEST_PATH_IMAGE048
as a preferred mode of the present invention, the S4 includes the steps of: the calculation process is repeated to process the features by using 12 layers of transformer coding parts, and the vectors are output after passing through the training layers of the stacked bidirectional transformer coding parts
Figure 42918DEST_PATH_IMAGE050
Figure 955379DEST_PATH_IMAGE050
Is the beginning of each sentence at the time of input [ CLS]The symbolic tag vector is also an information vector containing the whole sentence, and the expression is as follows:
Figure 264001DEST_PATH_IMAGE051
as a preferred mode of the present invention, the S5 includes the steps of: to obtain
Figure 479082DEST_PATH_IMAGE050
And then, introducing the input of a decoding part of the multilayer transformer for decoding, and splicing the output of each layer, wherein the expression is as follows:
Figure 381310DEST_PATH_IMAGE052
wherein
Figure 972828DEST_PATH_IMAGE054
Obtained by the weight summation and the average of the information vectors of the multilayer transformer and additionally input into a sigmod function
Figure 401535DEST_PATH_IMAGE056
To predict the semantic extraction score of each sentence, the expression is:
Figure 787517DEST_PATH_IMAGE057
Figure 691888DEST_PATH_IMAGE059
is shown as
Figure DEST_PATH_IMAGE060
The result of each sentence.
As a preferred mode of the present invention, the S6 includes the steps of: ranking according to the score of each sentence which is output by sequential training, selecting a label corresponding to the sentence with the highest score by adding a softmax layer, and mapping a corresponding semantic abstract expression from an operation set S according to the corresponding label as follows:
Figure 555939DEST_PATH_IMAGE061
different from the prior art, the technical scheme has the following beneficial effects:
according to the scheme, the chat record content in the safe operation process in the network safe cooperative disposal warroom is fully mined by the network safe cooperative disposal warroom semantic abstract method based on multi-level attention, key content in a chat record text is concerned to obtain effective information, the efficiency of the warroom is improved by the semantic abstract, the communication content of operators is quickly converted into a language which can be processed by a computer, and a high-efficiency, high-precision and low-cost safe operation automation system is formed.
Drawings
FIG. 1 is a flow chart of a method according to an embodiment.
FIG. 2 is a block diagram of a method according to an embodiment.
FIG. 3 is a diagram illustrating the method of the present invention in greater detail.
Detailed Description
To explain technical contents, structural features, and objects and effects of the technical solutions in detail, the following detailed description is given with reference to the accompanying drawings in conjunction with the embodiments.
The method aims to solve the problem that in the process of cooperation of operators and an automatic operation system, the operators can efficiently understand operation instructions, the operation instructions are mostly spoken unstructured texts such as conversation records and chat records, tasks are executed according to the conversation records and the chat records, the automatic operation system needs to extract key information in the texts, semantic abstracts are generated through neural network processing, and then task instructions are formed according to the abstracts. Most of the existing text processing methods focus on structural texts, and the semantic recognition effect on non-structural texts is not ideal. In the process of carrying out safe operation work in a network safe cooperative disposition operation room, tasks such as supervision and treatment, safe preparation, operation and maintenance, protection and defense, safe analysis, collection and operation, safe investigation and the like are composed of different professional fields and work roles, each role carries out information exchange and data analysis in a conversation mode, artificial intelligence is introduced to assist operators to complete network safe operation tasks, and the meaning of text input of the operators is understood firstly. Therefore, the technology takes the transformer model as a basic structure of the abstract language model, can carry out semantic recognition on the text according to key information and context content, utilizes the multistage attention mechanism to carry out deeper reading on the text according to semantic roles and task content in conversation and chatting records, removes redundancy in the conversation text, generates an abstract with complete and effective semantics, simplifies the chatting records in the network safety cooperative disposal studio, and provides efficient work help.
Specifically, the embodiment provides a multistage self-attention network security cooperative disposition battle room semantic abstraction method, and a section of network security cooperative disposition battle room chat record is given, and the semantic abstraction extraction of the section of chat record comprises the following steps:
s1: recording the operation process of the network security cooperative disposal combat room to form a corpus, and performing table building, text content serialization, role marking of operators, content of analysis problems and coping tasks according to analysis arrangement on dialogs and chatting record texts in the corpus to form the network security cooperative disposal combat room chatting record corpus;
s2: converting text contents in a corpus into mark vectors through a word vector matrix, distinguishing different sentences by assigning values of 0 and 1 to different sentences to obtain sentence segmentation vectors, distributing position vectors for each mark vector by adopting a transformer, and splicing the mark vectors, the sentence segmentation vectors and the position vectors to form complete word vectors as input of a network structure;
s3: sending the output result in the step S2 to an encoder of the transformer;
s4: inputting the output result of the encoder of the transformer in the S3 into a module of the transformer, which contains an attention mechanism, compressing the output result into a vector representation, and extracting key information by combining a context vector to eliminate repeated redundant contents in the text;
s5: after the text content in the corpus is compressed into a vector, generating a semantic abstract through a decoder in a transformer model;
s6: processing the output result of the step S5 by a softmax layer, and outputting semantic abstracts corresponding to the text according to k key words with larger weight to form a task instruction;
s7: and the semantic abstract of the chat record in the battle room is cooperatively treated by network security to form a task instruction, so that the operation record is simplified, and a concise and smooth abstract task is generated.
In the specific implementation process of the above embodiment, as shown in fig. 1 to fig. 3, the method mainly includes a processing procedure of recording an operation process of the network security studio, specifically an analysis procedure of an operator for a certain network security event; and (3) performing table building on the dialogs and the chat record texts in the corpus, serializing the text contents, marking the roles of operators, analyzing the contents of problems and responding tasks according to analysis arrangement to form a network security cooperative treatment studio chat record corpus.
For inputted text
Figure 839153DEST_PATH_IMAGE002
Is formed by
Figure 536981DEST_PATH_IMAGE062
The composition of each sentence is as follows,
Figure 538436DEST_PATH_IMAGE005
wherein
Figure 471756DEST_PATH_IMAGE007
To represent the second in the text
Figure 609477DEST_PATH_IMAGE009
Sentence words, the text is preprocessed in sequence, word segmentation is carried out by using an LTP word segmentation device, then noise words and stop words are removed, training linguistic data are generated in a standardized way, and each sentence is assigned with a label
Figure 461895DEST_PATH_IMAGE010
Where 0 means no sentence is recognized and 1 means a sentence is recognized.
Changing the character symbols of the processed text into digital sequence word vectors through a word vector layer, and marking [ CLS ] on the head mark]And tail mark (SEP)]And generating a sentence segmentation vector for distinguishing sentences and a position vector representing an absolute position of each word, wherein the vector dimensions of the token vector, the sentence segmentation vector and the position vector are all z, and then splicing the corresponding token vector, sentence segmentation vector and position vector of the input sequence for use
Figure 950645DEST_PATH_IMAGE012
Represents:
Figure 422078DEST_PATH_IMAGE013
wherein the content of the first and second substances,
Figure DEST_PATH_IMAGE063
a token vector representing each word in the sentence,
Figure 289671DEST_PATH_IMAGE017
a statement segmentation vector is represented which is,
Figure 453936DEST_PATH_IMAGE019
the corresponding parities divide the sentence into AB blocks,
Figure 164403DEST_PATH_IMAGE021
a position vector is represented by a vector of positions,
Figure 564160DEST_PATH_IMAGE023
corresponding to the maximum length of the sentence,
Figure 942052DEST_PATH_IMAGE025
representing input text
Figure 11639DEST_PATH_IMAGE002
Is obtained by splicing three vectors, and the dimensions of the rows and the columns in the vector space are all
Figure 474982DEST_PATH_IMAGE026
For inputted text
Figure 172372DEST_PATH_IMAGE002
Is formed by inputting a sequence
Figure 404770DEST_PATH_IMAGE012
Firstly, a multi-head attention block composed of a plurality of attention modules is transmitted, the number of heads of the multi-head attention block is super-parameter, t heads are customized, and the output is t heads
Figure 645259DEST_PATH_IMAGE026
Then using three initialization matrices
Figure 454952DEST_PATH_IMAGE028
And input sequence
Figure 71878DEST_PATH_IMAGE012
Multiply to obtain
Figure 158783DEST_PATH_IMAGE064
The query vector matrix, the key value vector matrix and the value vector matrix are respectively corresponded.
Since there are t attention heads, so divide into
Figure 835752DEST_PATH_IMAGE031
For the attention weight at the current moment, firstly, the association degree of the current word and other words is calculated, and the key value direction of the query vector and other words is usedQuantity calculation similarity
Figure 883473DEST_PATH_IMAGE033
Figure 38511DEST_PATH_IMAGE034
For calculating similarity by product of query vector and key value vector
Figure 245501DEST_PATH_IMAGE033
Performing a reduction by dividing by the same factor
Figure 827793DEST_PATH_IMAGE036
Then, by using a softmax function and normalization processing of one, the obtained value is the value representation of the current word and the current word, and the expression is as follows:
Figure 612078DEST_PATH_IMAGE037
finally, the weighted value obtained by the current word
Figure 570807DEST_PATH_IMAGE039
To update the attention weight of the current word
Figure 632303DEST_PATH_IMAGE040
Figure 651075DEST_PATH_IMAGE041
All outputs can be obtained by also cycling through the same steps for other input sequences.
The attention weight is updated by the following formula:
Figure 673389DEST_PATH_IMAGE042
then using a multi-head attention moduleOutputting and splicing a plurality of attention weights together and then adding the attention weights to an input sequence
Figure 170229DEST_PATH_IMAGE012
Residual error jump output is carried out, and then the residual error jump output is input into a specification layer LN to output a new value, wherein the attention weight of an input sequence is shown as the following expression:
Figure 86233DEST_PATH_IMAGE043
calculated, output vector as input for fully connected layer, also through residual jump and normative layer LN, and encapsulated with superimposed linear layer
Figure 275905DEST_PATH_IMAGE045
Function activation, the expression is:
Figure 34783DEST_PATH_IMAGE046
vector of output for full connection layer
Figure 335314DEST_PATH_IMAGE047
As the input of the coding part of the next layer of transformer, the expression is:
Figure 105824DEST_PATH_IMAGE048
the calculation process is repeated to process the features by using 12 layers of transformer coding parts, and the vectors are output after passing through the training layers of the stacked bidirectional transformer coding parts
Figure 466398DEST_PATH_IMAGE050
Figure 463304DEST_PATH_IMAGE050
Is the beginning of each sentence at the time of input [ CLS]The symbolic tag vector is also an information vector containing the whole sentence, and the expression is as follows:
Figure 301947DEST_PATH_IMAGE051
to obtain
Figure 192543DEST_PATH_IMAGE050
And then, introducing the input of a decoding part of the multilayer transformer for decoding, and splicing the output of each layer, wherein the expression is as follows:
Figure 724018DEST_PATH_IMAGE052
wherein
Figure 191909DEST_PATH_IMAGE054
Obtained by the weight summation and the average of the information vectors of the multilayer transformer and additionally input into a sigmod function
Figure 99822DEST_PATH_IMAGE056
To predict the semantic extraction score of each sentence, the expression is:
Figure 844924DEST_PATH_IMAGE057
Figure 547301DEST_PATH_IMAGE059
is shown as
Figure 984711DEST_PATH_IMAGE060
The result of each sentence.
Ranking according to the score of each sentence which is output by sequential training, selecting a label corresponding to the sentence with the highest score by a softmax layer, and mapping corresponding semantic abstract expressions from an operation set S according to the corresponding labels as follows:
Figure 430736DEST_PATH_IMAGE061
in some embodiments, the whole process framework shown in fig. 2 needs to be trained in advance, the training stage adopts a few-sample learning framework to learn meta knowledge, and the testing stage uses a transformer based on multi-level attention to complete a semantic abstraction extraction task, which is described in detail as follows:
pre-training with public dialogue data sets: the pre-training task uses a semantic abstract model based on a multilevel attention transformer, when a text is input, sentences in the text are processed in a segmented mode to generate a subsequence with limited length, words in the subsequence are converted into digital sequence word vectors, and semantic abstract is generated through transformer processing.
After the pre-training is completed, the network model is fine-tuned 12000 times with the open source dataset CMCSE.
The invention adopts the parameter initialization network model of Chinese pre-trained Bert-base-case (text data set based on Bert transformer-supporting case) issued by Google, uses the cross entropy loss function for training, adopts the adamW optimizer, and sets the dynamic setting by default
Figure 30344DEST_PATH_IMAGE066
Model with
Figure 28256DEST_PATH_IMAGE068
Training to 10000 times, then descending, training 100k, L2 attenuation parameter 0.01, replacing the RELU by the GELU for the activation function, then fine-tuning by fixing the parameter, the hidden layer vector dimension, namely embedding size, is 768, the maximum length of the input sequence is 256, the trained batch is 16, and the learning rate is set to
Figure DEST_PATH_IMAGE070
And does not participate in training. The number of model layers is set to be 12, the number of attention heads is set to be 8, the dimension of an input layer is 256, the training speed is influenced by too long, and the difference between the training in the fine-tuning stage and the pre-training stage is small.
Based on the scheme of the embodiment, an open Source data set CMCSE (Comprehensive, Multi-Source Cyber-Security Events) is used, because most of chat records of a network Security cooperative treatment studio are confidential, a semantic abstract is trained and extracted by using a model of less sample learning, a text data set with enough open data volume is used for training a model during training, model parameters capable of extracting the semantic abstract are obtained, and a small amount of safe operation chat record texts are input during testing, so that the effectiveness of the model is tested. The model measures the performance of the semantic abstract method of the network security cooperative disposition battle room based on multi-level attention from three evaluation criteria of accuracy, recall and F-measure (comprehensive evaluation index), as shown in the following table. On the same data set, the semantic abstract method of the network security cooperative treatment battle room based on multi-level attention is higher than other model methods in the aspect of semantic abstract extraction effect, in the transverse comparison, different models are used for comparison on a CMCSE data set, compared with the basic framework of a recurrent neural network, such as LSTM (long and short term memory network), BilTM (bidirectional long and short term memory network), GRU (gated memory network) and other models, the transformer-based multi-stage attention mechanism is added to extract the text semantic abstract information of a battle room, the precision rate of abstract extraction is better than the best effect of redundancy removal, the evaluation standard precision rate, the recall rate and the F-measure value are respectively improved by 9.82%, 7.23% and 3.70%, and the requirement for tag data is reduced by using a less-sample learning framework.
Figure DEST_PATH_IMAGE071
It should be noted that, although the above embodiments have been described herein, the scope of the present invention is not limited thereby. Therefore, based on the innovative concepts of the present invention, the technical solutions of the present invention can be directly or indirectly applied to other related technical fields by making changes and modifications to the embodiments described herein, or by using equivalent structures or equivalent processes performed in the content of the present specification and the attached drawings, which are included in the scope of the present invention.

Claims (9)

1. A multistage self-attention network security cooperative disposal battle room semantic abstract method is characterized in that given a section of network security cooperative disposal battle room chat records, semantic abstract extraction of the section of chat records comprises the following steps:
s1: recording the operation process of the network security cooperative disposal combat room to form a corpus, and performing table building, text content serialization, role marking of operators, content of analysis problems and coping tasks according to analysis arrangement on dialogs and chatting record texts in the corpus to form the network security cooperative disposal combat room chatting record corpus;
s2: converting text contents in a corpus into mark vectors through a word vector matrix, distinguishing different sentences by assigning values of 0 and 1 to different sentences to obtain sentence segmentation vectors, distributing position vectors for each mark vector by adopting a transformer, and splicing the mark vectors, the sentence segmentation vectors and the position vectors to form complete word vectors as input of a network structure;
s3: sending the output result in the step S2 to an encoder of the transformer;
s4: inputting the output result of the encoder of the transformer in the S3 into a module of the transformer, which contains an attention mechanism, compressing the output result into a vector representation, and extracting key information by combining a context vector to eliminate repeated redundant contents in the text;
s5: after the text content in the corpus is compressed into a vector, generating a semantic abstract through a decoder in a transformer model;
s6: processing the output result of the step S5 by a softmax layer, and outputting semantic abstracts corresponding to the text according to k key words with larger weight to form a task instruction;
s7: and the semantic abstract of the chat record in the battle room is cooperatively treated by network security to form a task instruction, so that the operation record is simplified, and a concise and smooth abstract task is generated.
2. The multi-level self-attentive cyber-security-co-disposition studio semantic summarization method of claim 1, wherein the S2 comprises the steps of:
for inputted text
Figure DEST_PATH_IMAGE001
Is composed of
Figure 124203DEST_PATH_IMAGE002
The number of the sentences is composed of,
Figure DEST_PATH_IMAGE003
wherein
Figure 235420DEST_PATH_IMAGE004
To represent the second in the text
Figure 483999DEST_PATH_IMAGE005
Sentence words, the text is preprocessed in sequence, word segmentation is carried out by using an LTP word segmentation device, then noise words and stop words are removed, training linguistic data are generated in a standardized mode, and a label is distributed to each sentence
Figure DEST_PATH_IMAGE006
Where 0 means no sentence is recognized and 1 means a sentence is recognized.
3. The multi-level self-attentive cyber-security-co-disposition studio semantic summarization method of claim 2, wherein the S2 further comprises the steps of: changing the character symbols of the processed text into digital sequence word vectors through a word vector layer, and marking [ CLS ] on the head mark]And tail mark (SEP)]And generating a sentence segmentation vector for distinguishing sentences and a position vector representing an absolute position of each word, wherein the vector dimensions of the token vector, the sentence segmentation vector and the position vector are all z, and then splicing the corresponding token vector, sentence segmentation vector and position vector of the input sequence for use
Figure 613629DEST_PATH_IMAGE007
Represents:
Figure DEST_PATH_IMAGE008
wherein the content of the first and second substances,
Figure 100105DEST_PATH_IMAGE009
a token vector representing each word in the sentence,
Figure 188146DEST_PATH_IMAGE010
a statement segmentation vector is represented which is,
Figure DEST_PATH_IMAGE011
the corresponding parities divide the sentence into AB blocks,
Figure 127284DEST_PATH_IMAGE012
a position vector is represented by a vector of positions,
Figure 293561DEST_PATH_IMAGE013
corresponding to the maximum length of the sentence,
Figure DEST_PATH_IMAGE014
representing input text
Figure 696860DEST_PATH_IMAGE001
Is obtained by splicing three vectors, and the dimensions of the rows and the columns in the vector space are all
Figure 159065DEST_PATH_IMAGE015
4. The multi-level self-attentive cyber-security-co-disposition studio semantic summarization method of claim 3 wherein the S3 comprises the steps of: for inputted text
Figure 116657DEST_PATH_IMAGE001
Is formed by inputting a sequence
Figure 322510DEST_PATH_IMAGE007
Firstly, a multi-head attention block composed of a plurality of attention modules is introduced, the head number of the multi-head is super parameter, t heads are defined by user, and the output is t heads
Figure 111475DEST_PATH_IMAGE015
Then using the initialization matrix
Figure DEST_PATH_IMAGE016
And input sequence
Figure 479002DEST_PATH_IMAGE007
Multiply to obtain
Figure 923890DEST_PATH_IMAGE017
The query vector matrix, the key value vector matrix and the value vector matrix are respectively corresponded.
5. The multi-level self-attentive cyber-security-co-disposition studio semantic summarization method of claim 4, wherein the S3 further comprises the steps of: since there are t heads of attention, i.e. division into
Figure DEST_PATH_IMAGE018
For the attention weight at the current moment, firstly, the association degree of the current word and other words is calculated, and the similarity is calculated by using the query vector and the key value vectors of other words
Figure 431970DEST_PATH_IMAGE019
Figure DEST_PATH_IMAGE020
By query vectors and key-value vectors for calculating similarityProduct of
Figure 747544DEST_PATH_IMAGE019
Performing a reduction by dividing by the same factor
Figure 551552DEST_PATH_IMAGE021
Then, carrying out normalization processing by using a softmax function, wherein the obtained value is the value representation of the current word and the current word, and the expression is as follows:
Figure DEST_PATH_IMAGE022
weighted value obtained by current word
Figure 952578DEST_PATH_IMAGE023
To update the attention weight of the current word
Figure DEST_PATH_IMAGE024
Figure 202031DEST_PATH_IMAGE025
And circulating the same steps for other input sequences to obtain all outputs.
6. The multi-level self-attentive cyber-security-co-disposition studio semantic summarization method of claim 5, wherein the S3 further comprises the steps of: the attention weight is updated by the following formula:
Figure 965588DEST_PATH_IMAGE026
then, a plurality of attention weights are output and spliced together by a multi-head attention module and then are mixed with the input sequence
Figure 206077DEST_PATH_IMAGE007
Residual error jump output is carried out, and then the residual error jump output is input into a specification layer LN to output a new value, wherein the attention weight of an input sequence is shown as the following expression:
Figure DEST_PATH_IMAGE027
calculated, output vector as input for fully connected layer, also through residual jump and normative layer LN, and encapsulated with superimposed linear layer
Figure 828819DEST_PATH_IMAGE028
Function activation, the expression is:
Figure DEST_PATH_IMAGE029
vector of output for full connection layer
Figure 445745DEST_PATH_IMAGE030
As the input of the coding part of the next layer of transformer, the expression is:
Figure 1491DEST_PATH_IMAGE031
7. the multi-level self-attentive cyber-security-co-disposition studio semantic summarization method of claim 3 wherein the S4 comprises the steps of:
then repeating the calculation process, processing the characteristics by using 12 layers of transformer coding parts, outputting vectors after passing through the training layers of the stacked bidirectional transformer coding parts
Figure DEST_PATH_IMAGE032
Figure 678460DEST_PATH_IMAGE032
Is [ CLS ] of each sentence head in input]The symbolic tag vector is also an information vector containing the whole sentence, and the expression is as follows:
Figure 585236DEST_PATH_IMAGE033
8. the multi-level self-attentive cyber-security-co-disposition studio semantic summarization method of claim 7, wherein the S5 comprises the steps of:
to obtain
Figure 973230DEST_PATH_IMAGE032
And then, introducing the input of a decoding part of the multilayer transformer for decoding, and splicing the output of each layer, wherein the expression is as follows:
Figure 649062DEST_PATH_IMAGE034
wherein
Figure DEST_PATH_IMAGE035
Obtained by the weight summation and the average of the information vectors of the multilayer transformer and additionally input into a sigmod function
Figure 496932DEST_PATH_IMAGE036
To predict the semantic extraction score of each sentence, the expression is:
Figure 891005DEST_PATH_IMAGE037
Figure DEST_PATH_IMAGE038
is shown as
Figure 787416DEST_PATH_IMAGE039
The result of each sentence.
9. The multi-level self-attentive cyber-security-co-disposition studio semantic summarization method of claim 8, wherein the S6 comprises the steps of:
ranking according to the score of each sentence which is output by sequential training, selecting a label corresponding to the sentence with the highest score by adding a softmax layer, and mapping a corresponding semantic abstract expression from an operation set S according to the corresponding label as follows:
Figure 317755DEST_PATH_IMAGE040
CN202210329999.9A 2022-03-31 2022-03-31 Multistage self-attention network security cooperative disposal battle room semantic abstraction method Pending CN114490995A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210329999.9A CN114490995A (en) 2022-03-31 2022-03-31 Multistage self-attention network security cooperative disposal battle room semantic abstraction method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210329999.9A CN114490995A (en) 2022-03-31 2022-03-31 Multistage self-attention network security cooperative disposal battle room semantic abstraction method

Publications (1)

Publication Number Publication Date
CN114490995A true CN114490995A (en) 2022-05-13

Family

ID=81489079

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210329999.9A Pending CN114490995A (en) 2022-03-31 2022-03-31 Multistage self-attention network security cooperative disposal battle room semantic abstraction method

Country Status (1)

Country Link
CN (1) CN114490995A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114818721A (en) * 2022-06-30 2022-07-29 湖南工商大学 Event joint extraction model and method combined with sequence labeling

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114254655A (en) * 2022-02-28 2022-03-29 南京众智维信息科技有限公司 Network security traceability semantic identification method based on prompt self-supervision learning

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114254655A (en) * 2022-02-28 2022-03-29 南京众智维信息科技有限公司 Network security traceability semantic identification method based on prompt self-supervision learning

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114818721A (en) * 2022-06-30 2022-07-29 湖南工商大学 Event joint extraction model and method combined with sequence labeling
CN114818721B (en) * 2022-06-30 2022-11-01 湖南工商大学 Event joint extraction model and method combined with sequence labeling

Similar Documents

Publication Publication Date Title
Zhang et al. SG-Net: Syntax guided transformer for language representation
CN114254655B (en) Network security tracing semantic identification method based on prompt self-supervision learning
CN107562792A (en) A kind of question and answer matching process based on deep learning
CN109635109A (en) Sentence classification method based on LSTM and combination part of speech and more attention mechanism
CN113065358B (en) Text-to-semantic matching method based on multi-granularity alignment for bank consultation service
CN110188348B (en) Chinese language processing model and method based on deep neural network
Zhao et al. Enhancing Chinese character representation with lattice-aligned attention
Scholak et al. DuoRAT: towards simpler text-to-SQL models
Zhang et al. n-BiLSTM: BiLSTM with n-gram Features for Text Classification
CN114492460B (en) Event causal relationship extraction method based on derivative prompt learning
CN114254102B (en) Natural language-based collaborative emergency response SOAR script recommendation method
CN114155477B (en) Semi-supervised video paragraph positioning method based on average teacher model
CN115687609A (en) Zero sample relation extraction method based on Prompt multi-template fusion
CN114490995A (en) Multistage self-attention network security cooperative disposal battle room semantic abstraction method
Xiong et al. A multi-gate encoder for joint entity and relation extraction
CN112559741B (en) Nuclear power equipment defect record text classification method, system, medium and electronic equipment
CN116521857A (en) Method and device for abstracting multi-text answer abstract of question driven abstraction based on graphic enhancement
Li Analysis of semantic comprehension algorithms of natural language based on robot’s questions and answers
CN114330352A (en) Named entity identification method and system
Niu et al. Video captioning by learning from global sentence and looking ahead
Wang et al. Evolutionary Relationship Extraction of Emergencies Based on Two-way GRU and Multi-channel Self-attention Mechanism
CN114611487B (en) Unsupervised Thai dependency syntax analysis method based on dynamic word embedding alignment
CN117744657B (en) Medicine adverse event detection method and system based on neural network model
Wang et al. Chinese Text Implication Recognition Method based on ERNIE-Gram and CNN
CN114818644B (en) Text template generation method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination