CN117271701A - Method and system for extracting system operation abnormal event relation based on TGGAT and CNN - Google Patents
- Publication number
- CN117271701A CN117271701A CN202311178825.8A CN202311178825A CN117271701A CN 117271701 A CN117271701 A CN 117271701A CN 202311178825 A CN202311178825 A CN 202311178825A CN 117271701 A CN117271701 A CN 117271701A
- Authority
- CN
- China
- Prior art keywords
- vector matrix
- matrix
- vector
- extracting
- sentence
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/211—Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention discloses a method and system for extracting system operation abnormal event relations based on TGGAT and CNN. Text data is acquired and preprocessed; the preprocessed text data is processed with a BERT model to generate a feature vector matrix; the feature vector matrix is input into a multi-scale convolutional neural network (CNN) to extract the local features of sentences, and into a type-guided graph attention network (TGGAT) to extract the global features of sentences; the local features and global features are spliced, and a characterization vector matrix weighted by attention values is obtained through self-attention; the weighted characterization vector matrix is classified into event relations using a softmax classifier. By considering the different contributions of different types of dependency relations, the type-guided graph attention network captures long-range syntactic dependencies together with type information, so that global event knowledge is captured accurately.
Description
Technical Field
The invention relates to text data identification, in particular to a method and a system for extracting system operation abnormal event relations based on TGGAT and CNN.
Background
Event relation extraction is an important step in building event knowledge graphs and benefits natural language processing applications such as information extraction, investment strategy, and problem solving. By capturing associations between events, related events complement each other and their value can be fully exploited.
Currently, most research in the event relation extraction task focuses on extracting causal and temporal relations, and machine learning is widely used for this task. Kruengkrai et al. employ a multi-column convolutional neural network to process multiple sources of background knowledge and extract causal relations that lack explicit cues indicating the existence of an event causal relation. Ning et al. extract temporal relations by combining integer linear programming with common-sense knowledge. Hu et al. adaptively cluster temporal relation features using a pre-trained language model. Jiang et al. propose a model that incorporates syntactic tree structures into the classification. To improve classification performance, Fan et al. propose a relation classification model that uses a bilinear neural network to extract syntactic information. Various ideas have also been proposed for document-level event causality; Trong et al. designed a reinforcement learning mechanism to select key contexts from documents.
Since syntactic dependencies contain rich linguistic knowledge, many methods based on them have been proposed. Aldawsari et al. use the ancestor of an event that depends on another event in the tree, while fusing discourse and narrative features to obtain a richer event representation. Meng et al. use a sequence encoder to encode the shortest dependency path between events to identify the relation between them. Since sequence encoders do not learn structural information well, some graph-based approaches have been proposed for this task. Wang et al. propose different constraints to extract temporal and sub-event relations. Zhang et al. propose a graph transformer to capture temporal knowledge in the syntax graph. Mathur et al. and Tran et al. use syntactic dependencies and introduce other knowledge, such as rhetorical, discourse, and semantic information, to enrich the representation of events through interactions between nodes in different graphs.
Some of these relation extraction models are general-purpose methods and some target the financial domain; none are designed for the database operation and maintenance domain. In addition, most existing dependency-based approaches treat different types of dependencies equally when modeling the semantic context between events, which degrades event relation extraction performance.
Disclosure of Invention
The invention aims to: in view of the above drawbacks, the present invention provides a method for extracting system operation abnormal event relations based on TGGAT and CNN, comprising the following steps:
(1) Acquiring text data and preprocessing;
(2) Processing the preprocessed text data by using a BERT model to generate a feature vector matrix;
(3) Inputting the feature vector matrix into a multi-scale convolutional neural network CNN, and extracting the local features of sentences; inputting the feature vector matrix into a type-guided graph attention network TGGAT, and extracting the global features of sentences;
(4) Splicing the local features and the global features, and obtaining a characterization vector matrix weighted by the attention value through self-attention;
(5) Classifying the event relations of the weighted characterization vector matrix using a softmax classifier.
Further, in step (1), database system alarm log text data is obtained and preprocessed: tens of thousands of alarm log text records are acquired from databases on a plurality of business systems.
Further, the expression of the feature vector matrix V generated in step (2) is as follows:

V = {v_cls, v_1, v_2, ..., v_t, v_seq} ∈ R^((t+2)×768)

where t represents the sequence length, 768 represents the dimension of the word vector, cls represents the initial tag of the sentence, seq represents the ending tag of the sentence, and v represents the word vector matrix.
Further, in step (3), the feature vector matrix is input into the multi-scale convolutional neural network CNN, and extracting the local features of sentences specifically comprises:

(3.11) performing one-dimensional convolution on the input feature vector matrix to form a local feature vector f_i:

f_i = ReLU(k_i · v_{t:t+j-1} + b)

where v represents the input word vector matrix, j represents the size of the convolution kernel k_i, and b represents the bias value;

(3.12) performing n convolution operations on the input feature vector matrix with the n convolution kernels to form the high-dimensional vector F_t of the context feature set of the word vector v_t:

F_t = {f_1, f_2, ..., f_n}

(3.13) performing a dimension-reducing max pooling operation on the high-dimensional vector F_t to form the local context feature M_t of the word vector v_t:

M_t = max(F_t)

(3.14) for a feature vector matrix of input length t, scanning the entire text using the convolution kernel set K to form the local feature set S of the entire text:

S = {M_1, M_2, ..., M_t}
Further, in step (3), the syntactic dependency tree of the sentence in the input feature vector matrix is obtained with the Stanford CoreNLP tool; by constructing the syntactic structure of the input sentence in the syntactic dependency tree, the sentence is converted into a graph representation in which words are represented as nodes and the dependency relations between words constructed by the syntactic dependency tree are represented as edges;

inputting the feature vector matrix into the type-guided graph attention network TGGAT, extracting the global features of sentences specifically comprises:

(3.21) modeling the syntactic dependency tree of the sentence through type guidance using TGGAT to obtain the syntactic knowledge related to each word in the sentence, and then transferring and aggregating word information in the sentence along dependency paths; the attention value between each node and all its neighboring words is:

d_{i,r,j} = a(W h_i, W h_j, r_{i,j})

where W represents a learnable weight matrix, a represents a single-layer feed-forward network, h_i represents the semantic vector of node i, h_j represents the semantic vector of node j, and r_{i,j} represents the dependency path of node i and node j;

(3.22) normalizing the attention values using a softmax function to obtain the attention coefficients:

α_{i,r,j} = softmax(d_{i,r,j})

(3.23) obtaining the new vector feature h′_i of node i by aggregation, with the attention coefficients used for weighted summation and the initial information of the original node i added:

h′_i = σ( Σ_{r∈R} Σ_{j∈N_i} α_{i,r,j} W_r h_j + W_0 h_i )

where σ represents an activation function, W_r and W_0 represent learnable weights, N_i is the set of all neighbor nodes j of node i in the syntactic dependency graph, and R is the set of all edge types of node i in the syntactic dependency graph;

(3.24) the global features of the sentence are as follows:

H = {h′_1, h′_2, ..., h′_t}
Further, in step (4), the local features and the global features are spliced, and the characterization vector matrix weighted by attention values is then obtained through self-attention; the specific formulas are as follows:

HS = concat(H, S)
Q = W_q · HS
K = W_k · HS
V = W_v · HS
A′ = softmax(Kᵀ · Q)
O = V · A′

where concat represents the splicing function, Q represents the query vector and W_q the query matrix, K represents the key vector and W_k the key matrix, V represents the value vector and W_v the value matrix, A′ represents the weights obtained by computing the inner product of the query and key vectors and scaling and normalizing it, and O represents the characterization vector matrix.
Further, step (5) predicts the relations of event pairs in the text using a softmax classifier and outputs y to obtain the event relation classification:

y = ReLU(O · w + b)

where w and b are the parameters and bias term of the fully connected layer.
The invention also provides a system operation abnormal event relation extraction system based on TGGAT and CNN, comprising:
the acquisition module is used for acquiring text data and preprocessing the text data;
the processing module is used for processing the preprocessed text data by using the BERT model to generate a feature vector matrix; inputting the feature vector matrix into a multi-scale convolutional neural network CNN, and extracting the local features of sentences; inputting the feature vector matrix into a type guidance graph annotation network TGGAT, and extracting the global features of sentences;
the splicing module is used for splicing the local features and the global features, and then obtaining a characterization vector matrix weighted by the attention value through self-attention;
and the classification module is used for classifying the event relationship of the weighted characterization vector matrix by using a softmax classifier.
The beneficial effects are that: compared with the prior art, the invention has the notable advantage that the type-guided graph attention network can capture long-range syntactic dependencies together with type information, which is essential for capturing the global context semantic information of events. Most sentences in the event relation extraction task are long-range, difficult sentences in which two related words may be far apart and the syntactic structure is complex, and different types of dependency relations contribute differently, so global event knowledge is hard to capture accurately from the surface information of sentences alone. Syntactic information is therefore introduced to organize the structure of sentences, and the type-dependency knowledge in the syntax is modeled to further capture global features.
Drawings
FIG. 1 is a schematic diagram of a database anomaly event relationship extraction process according to the present invention;
FIG. 2 is a block diagram illustrating the structure of an event relationship extraction system according to the present invention;
FIG. 3 is a schematic diagram of a syntax dependency tree of an example sentence of the present invention;
FIG. 4 is a schematic diagram of the attention mechanism of TGGAT in the present invention.
Detailed Description
Example 1
As shown in fig. 1, in this embodiment, a method for extracting a system operation abnormal event relationship based on TGGAT and CNN includes the following steps:
(1) Tens of thousands of alarm log text records are acquired from Oracle databases on a plurality of business systems.
(2) Data cleaning and preprocessing are performed on the Oracle database log text data obtained in step (1). Data cleaning mainly comprises removing unnecessary fields and text with inconsistent formats. The cleaned database system log text data is then preprocessed: the daily alarm logs are split into sentences, and trigger words are extracted with a BERT+CRF model. For example, in "queue resources [trapped] in deadlock due to transaction [waiting] for resources", "waiting" and "trapped" are the trigger words. The event pairs are then labeled by type into five categories (causal, accompanying, disposal, carrying, and sub-event), denoted by the numerals 0, 1, 2, 3 and 4 respectively, forming a dataset. Finally, the dataset is divided into training, validation, and test sets in an 8:1:1 ratio, and the training set is used to train the Oracle database event relation extraction model.
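As an illustration of the labeling and splitting just described, the following Python sketch maps the five relation categories to the numerals 0-4 and performs the 8:1:1 division; the English label names and the record format are assumptions made for this example rather than values fixed by the patent.

```python
import random

# Assumed English names for the five relation categories labeled 0-4 above.
LABELS = {"causal": 0, "accompanying": 1, "disposal": 2, "carrying": 3, "sub-event": 4}

def split_dataset(records, seed=42):
    """Shuffle and divide (sentence, trigger_pair, label_id) records 8:1:1."""
    random.Random(seed).shuffle(records)
    n_train, n_val = int(0.8 * len(records)), int(0.1 * len(records))
    return (records[:n_train],
            records[n_train:n_train + n_val],
            records[n_train + n_val:])

# One hypothetical labeled event pair from an alarm log sentence.
example = ("queue resources trapped in deadlock due to transaction waiting for resources",
           ("waiting", "trapped"), LABELS["causal"])
train_set, val_set, test_set = split_dataset([example] * 20)
```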
(3) The preprocessed data is input to the encoder, and the preprocessed text data is processed using the BERT model: the input alarm text sentence {w_1, w_2, ..., w_t} is encoded into a feature vector matrix V with the following expression:

V = {v_cls, v_1, v_2, ..., v_t, v_seq} ∈ R^((t+2)×768)

where t is the sequence length, 768 is the dimension of the word vector, cls is the initial tag of the sentence, seq is the end tag of the sentence, and v represents the word vector matrix; each input sentence can be converted into a feature vector matrix by the BERT model.
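A minimal sketch of this encoding step using the HuggingFace transformers library; the checkpoint name and the example sentence are illustrative assumptions, and any BERT model producing 768-dimensional token vectors matches the description above.

```python
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")  # assumed checkpoint
model = BertModel.from_pretrained("bert-base-chinese")

sentence = "queue resources trapped in deadlock due to transaction waiting for resources"
inputs = tokenizer(sentence, return_tensors="pt")   # adds the [CLS] and [SEP] tags
with torch.no_grad():
    V = model(**inputs).last_hidden_state           # (1, t+2, 768) feature vector matrix
print(V.shape)
```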
(4) The feature vector matrix V generated in step (3) is input into the multi-scale convolutional neural network CNN layer to extract the local features of sentences. Convolution is computed on the feature vector matrix V with the convolution kernel set K to obtain the local features F_t, a pooling layer reduces the dimension of the word vectors, and the syntactic features of the word vectors are finally obtained. In the event relation extraction task, K = {k_1, k_2, ..., k_n}, where n is the number of convolution kernels. The steps are as follows:
(4.11) One-dimensional convolution is performed on the input feature vector matrix, and the target word vector forms a local feature vector f_i. The specific formula is as follows:

f_i = ReLU(k_i · v_{t:t+j-1} + b)

where v represents the input word vector matrix, j represents the size of the convolution kernel k_i, and b represents the bias value.

(4.12) The entire convolution kernel set K acts on the window center word vector and forms the high-dimensional vector F_t of the context feature set of the word vector v_t. The n convolution operations are represented as follows:

F_t = {f_1, f_2, ..., f_n}

where F_t is the set of context features formed from the target word v_t after n convolution operations.

(4.13) Since F_t is a multi-feature high-dimensional vector, its dimension is reduced using a pooling operation. A max pooling operation preserves the causal semantic role, as a salient feature, expressed as:

M_t = max(F_t)

(4.14) The features m retained by each local feature vector f_i after the max pooling operation are fully connected to fix their output dimension, finally forming the local context feature of the center word vector v_t, with the following expression:

V_t = M_t = {m_1, m_2, ..., m_n}

(4.15) For a feature vector matrix of input length t, the entire text is scanned using the convolution kernel set K to form the local feature set S of the entire text, with the following expression:

S = {V_1, V_2, ..., V_t} = {M_1, M_2, ..., M_t}
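The multi-scale convolution and max pooling of steps (4.11)-(4.15) can be sketched in PyTorch as follows; the kernel sizes and channel count are assumptions, since the text fixes only the structure (n parallel one-dimensional convolutions per token position followed by max pooling over their outputs).

```python
import torch
import torch.nn as nn

class MultiScaleCNN(nn.Module):
    def __init__(self, dim=768, kernel_sizes=(2, 3, 4), out_channels=256):
        super().__init__()
        # n = len(kernel_sizes) parallel 1-D convolutions; padding="same" keeps
        # length t so the local feature set S has one row per token.
        self.convs = nn.ModuleList(
            nn.Conv1d(dim, out_channels, k, padding="same") for k in kernel_sizes)

    def forward(self, V):                      # V: (batch, t, 768)
        x = V.transpose(1, 2)                  # (batch, 768, t) for Conv1d
        feats = [torch.relu(conv(x)) for conv in self.convs]   # f_i per scale
        F_t = torch.stack(feats, dim=-1)       # (batch, C, t, n): the set F_t
        M_t = F_t.max(dim=-1).values           # max pooling over the n conv outputs
        return M_t.transpose(1, 2)             # S: (batch, t, C)

S = MultiScaleCNN()(torch.randn(1, 12, 768))   # local feature set for a 12-token text
```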
(5) The feature vector matrix V generated in step (3) is input into the TGGAT model, type dependencies are added to the graph attention calculation, and the key information of the text is aggregated. The syntactic dependency tree of the input sentence is obtained using the Stanford CoreNLP tool. For example, the syntactic dependency tree of "queue resources fall into deadlock due to transaction waiting for resources" is shown in FIG. 3. Syntactic information is introduced to organize the structure of sentences, and the type-dependency knowledge in the syntax is modeled to further capture global features. After the BERT coding layer, TGGAT is used to encode the syntax tree, so that not only the semantic features of words but also the syntactic dependency features are extracted, which enhances the model's understanding of the sentence and makes relation extraction more accurate.

By constructing the syntactic structure of the input sentence in the syntactic dependency tree, the sentence can be converted into a graph representation, with words represented as nodes and the dependency relations between words constructed by the syntactic dependency tree represented as edges. That is, the type-dependency information in the text is converted by the syntactic dependency tree into the corresponding adjacency matrix A. Each element a_{i,j} of the matrix indicates whether there is a dependency relation between the i-th word and the j-th word: if there is a dependency between the two words, a_{i,j} = 1; otherwise a_{i,j} = 0, as shown in FIG. 4.
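A small sketch of building the adjacency matrix A (and an edge-type matrix for the type guidance) from dependency triples; the triples and type ids below are hypothetical stand-ins for what the Stanford CoreNLP parse of an alarm sentence would return.

```python
import torch

def build_adjacency(num_tokens, edges, type_ids):
    """edges: (head, dependent, relation) triples from a dependency parse."""
    A = torch.zeros(num_tokens, num_tokens, dtype=torch.long)
    T = torch.zeros(num_tokens, num_tokens, dtype=torch.long)
    for h, d, rel in edges:
        A[h, d] = A[d, h] = 1              # a_ij = 1 iff words i and j are linked
        T[h, d] = T[d, h] = type_ids[rel]  # dependency type of the edge, for TGGAT
    return A, T

# Hypothetical parse fragment: "resources(0) trapped(1) transaction(2) waiting(3)"
type_ids = {"nsubj": 1, "advcl": 2}
A, T = build_adjacency(4, [(1, 0, "nsubj"), (1, 3, "advcl"), (3, 2, "nsubj")], type_ids)
```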
Inputting the feature vector matrix into the type-guided graph attention network TGGAT, extracting the global features of sentences specifically comprises the following steps:
(5.11) To make full use of the syntactic information of the sentence, the syntactic dependency tree is modeled through type guidance using TGGAT to obtain the syntactic knowledge related to each word in the sentence; the dependency paths are then used to transfer and aggregate word information in the sentence, enhancing the feature representation of each word. The attention value between each node and all its neighboring words is:

d_{i,r,j} = a(W h_i, W h_j, r_{i,j})

where W represents a learnable weight matrix, a represents a single-layer feed-forward network, h_i represents the semantic vector of node i, h_j represents the semantic vector of node j, and r_{i,j} represents the dependency path of node i and node j.

(5.12) The attention values of the central node and all neighboring entities are normalized using a softmax function, and the normalized attention weights are the final attention coefficients:

α_{i,r,j} = softmax(d_{i,r,j})

(5.13) The new vector representation of a node is the weighted summation of the computed attention coefficients plus the initial information of the original node. The new vector feature h′_i of node i is obtained by aggregation:

h′_i = σ( Σ_{r∈R} Σ_{j∈N_i} α_{i,r,j} W_r h_j + W_0 h_i )

where σ represents an activation function, W_r and W_0 represent learnable weights, N_i is the set of all neighbor nodes j of node i in the syntactic dependency graph, and R is the set of all edge types of node i in the syntactic dependency graph.

(5.14) To further distinguish the importance of different context features, this scheme adds a multi-head attention mechanism, which meets the need to distinguish different contexts and helps the model focus on the important information in a sentence while ignoring irrelevant context information. With M attention heads, the outputs of the heads are concatenated:

h′_i = ∥_{m=1}^{M} σ( Σ_{r∈R} Σ_{j∈N_i} α^m_{i,r,j} W^m_r h_j + W^m_0 h_i )

(5.15) The global feature matrix of the input text is:

H = {h′_1, h′_2, ..., h′_t}
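A minimal single-head PyTorch sketch of one TGGAT layer following (5.11)-(5.13): the attention score of each edge depends on both endpoint vectors and a learned embedding of the dependency type on that edge, the scores are normalized over a node's syntactic neighbors, and the aggregate is added to the transformed node itself. The dimensions and the type-embedding scheme are assumptions; the multi-head form of (5.14) would run several such heads in parallel and concatenate their outputs.

```python
import torch
import torch.nn as nn

class TGGATLayer(nn.Module):
    def __init__(self, dim=768, num_types=50):
        super().__init__()
        self.W = nn.Linear(dim, dim, bias=False)       # shared weight W in d_{i,r,j}
        self.W_r = nn.Linear(dim, dim, bias=False)     # neighbor weight W_r
        self.W_0 = nn.Linear(dim, dim, bias=False)     # self weight W_0
        self.type_emb = nn.Embedding(num_types, dim)   # dependency-type vectors r_{i,j}
        self.a = nn.Linear(3 * dim, 1)                 # single-layer feed-forward a(.)

    def forward(self, H, A, T):
        # H: (t, dim) node vectors; A: (t, t) adjacency; T: (t, t) dependency type ids
        t = H.size(0)
        h = self.W(H)
        pair = torch.cat([h.unsqueeze(1).expand(t, t, -1),     # node i
                          h.unsqueeze(0).expand(t, t, -1),     # node j
                          self.type_emb(T)], dim=-1)           # edge type r_{i,j}
        d = self.a(pair).squeeze(-1)                           # raw scores d_{i,r,j}
        d = d.masked_fill(A == 0, float("-inf"))               # syntactic neighbors only
        alpha = torch.softmax(d, dim=-1)                       # attention coefficients
        return torch.relu(alpha @ self.W_r(H) + self.W_0(H))   # new node vectors h'_i

H = TGGATLayer(dim=768)(torch.randn(4, 768), A, T)  # A, T from the sketch above
```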
(6) The local feature vector obtained in step (4) and the global feature vector obtained in step (5) are spliced, and self-attention is used to compute the attention value between each word and the other words in the text, correlating the words so that each word carries a different importance and obtaining the weighted characterization vector matrix. The specific formulas are as follows:

HS = concat(H, S)
Q = W_q · HS
K = W_k · HS
V = W_v · HS
A′ = softmax(Kᵀ · Q)
O = V · A′

where concat represents the splicing function, Q represents the query vector and W_q the query matrix, K represents the key vector and W_k the key matrix, V represents the value vector and W_v the value matrix, A′ represents the weights obtained by computing the inner product of the query and key vectors and scaling and normalizing it, and O represents the characterization vector matrix.
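Transcribed into row-vector PyTorch form, the fusion above can be sketched as follows; the division by the square root of the dimension corresponds to "scaling and normalizing" in the description of A′.

```python
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    """HS = concat(H, S); A' = softmax(K^T·Q); O = V·A', in row-vector form."""
    def __init__(self, dim):                        # dim = dim(H) + dim(S)
        super().__init__()
        self.W_q = nn.Linear(dim, dim, bias=False)  # query matrix W_q
        self.W_k = nn.Linear(dim, dim, bias=False)  # key matrix W_k
        self.W_v = nn.Linear(dim, dim, bias=False)  # value matrix W_v

    def forward(self, H, S):                        # H: (t, d_h), S: (t, d_s)
        HS = torch.cat([H, S], dim=-1)              # splice global and local features
        Q, K, V = self.W_q(HS), self.W_k(HS), self.W_v(HS)
        A = torch.softmax(Q @ K.transpose(0, 1) / K.size(-1) ** 0.5, dim=-1)
        return A @ V                                # O: weighted characterization matrix
```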
(7) A softmax classifier is used to classify the event relations of the weighted characterization vector matrix, outputting y to obtain the event relation classification:

y = ReLU(O · w + b)

where w and b are the parameters and bias term of the fully connected layer.
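A short sketch of this final step; pooling O to a single sentence vector before the fully connected layer is an assumption made to keep the sketch self-contained, and the sizes are illustrative.

```python
import torch
import torch.nn as nn

FUSION_DIM, NUM_RELATIONS = 1024, 5     # illustrative sizes, not fixed by the text

fc = nn.Sequential(nn.Linear(FUSION_DIM, NUM_RELATIONS), nn.ReLU())  # y = relu(O·w + b)

def classify(O):                        # O: (t, FUSION_DIM) characterization matrix
    sent = O.mean(dim=0)                # pool token vectors to one sentence vector
    return torch.softmax(fc(sent), dim=-1).argmax().item()   # label in {0, ..., 4}

print(classify(torch.randn(12, FUSION_DIM)))
```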
Example 2
As shown in FIG. 2, this embodiment provides a system operation abnormal event relation extraction system based on the type-guided graph attention network (TGGAT) and CNN, which comprises: an acquisition module for acquiring text data and preprocessing it; a processing module for processing the preprocessed text data with the BERT model to generate a feature vector matrix, inputting the feature vector matrix into the multi-scale convolutional neural network CNN to extract the local features of sentences, and inputting the feature vector matrix into the type-guided graph attention network TGGAT to extract the global features of sentences; a splicing module for splicing the local features and the global features and then obtaining the characterization vector matrix weighted by attention values through self-attention; and a classification module for classifying the event relations of the weighted characterization vector matrix using a softmax classifier.
The input text is encoded using the BERT model to obtain a semantic feature vector matrix. Then, to extract global information, the type-guided graph attention network encodes the syntactic dependency tree of the text with type-dependency knowledge as the guide signal. Meanwhile, CNN is used to extract local information to enhance the representativeness of the text.
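For orientation, the following hedged sketch wires the modules of this embodiment into a single forward pass, reusing the MultiScaleCNN, TGGATLayer, and AttentionFusion sketches from Example 1; all dimensions are illustrative assumptions.

```python
import torch
import torch.nn as nn

class EventRelationExtractor(nn.Module):
    def __init__(self, dim=768, cnn_channels=256, num_relations=5):
        super().__init__()
        self.cnn = MultiScaleCNN(dim=dim, out_channels=cnn_channels)  # local features S
        self.tggat = TGGATLayer(dim=dim)                              # global features H
        self.fusion = AttentionFusion(dim + cnn_channels)             # splicing module
        self.fc = nn.Linear(dim + cnn_channels, num_relations)        # classification

    def forward(self, V, A, T):        # V: (t, 768) BERT output; A, T: (t, t)
        S = self.cnn(V.unsqueeze(0)).squeeze(0)
        H = self.tggat(V, A, T)
        O = self.fusion(H, S)
        y = torch.relu(self.fc(O.mean(dim=0)))   # y = relu(O·w + b)
        return torch.softmax(y, dim=-1)
```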
Through type guidance, the graph attention network can capture not only long-range syntactic dependencies but also type information, which is critical to capturing the global context semantic information of events. Most sentences in the event relation extraction task are long-range, difficult sentences in which two related words may be far apart and the syntactic structure is complex, so global event knowledge is hard to capture accurately from the surface information of sentences alone. Syntactic information is therefore introduced to organize the structure of sentences, and the type-dependency knowledge in the syntax is modeled to further capture global features.
Claims (10)
1. A system operation abnormal event relation extraction method based on TGGAT and CNN, characterized by comprising the following steps:
(1) Acquiring text data and preprocessing;
(2) Processing the preprocessed text data by using a BERT model to generate a feature vector matrix;
(3) Inputting the feature vector matrix into a multi-scale convolutional neural network CNN, and extracting the local features of sentences; inputting the feature vector matrix into a type-guided graph attention network TGGAT, and extracting the global features of sentences;
(4) Splicing the local features and the global features, and obtaining a characterization vector matrix weighted by the attention value through self-attention;
(5) Classifying the event relations of the weighted characterization vector matrix using a softmax classifier.
2. The system operation abnormal event relation extraction method according to claim 1, wherein in step (1), database system alarm log text data is obtained and preprocessed: tens of thousands of alarm log text records are acquired from databases on a plurality of business systems.
3. The system operation abnormal event relation extraction method according to claim 1, wherein the expression of the feature vector matrix V generated in step (2) is as follows:

V = {v_cls, v_1, v_2, ..., v_t, v_seq} ∈ R^((t+2)×768)

where t represents the sequence length, 768 represents the dimension of the word vector, cls represents the initial tag of the sentence, seq represents the ending tag of the sentence, and v represents the word vector matrix.
4. The system operation abnormal event relation extraction method according to claim 3, wherein in step (3), the feature vector matrix is input into the multi-scale convolutional neural network CNN, and extracting the local features of sentences specifically comprises:

(3.11) performing one-dimensional convolution on the input feature vector matrix to form a local feature vector f_i:

f_i = ReLU(k_i · v_{t:t+j-1} + b)

where v represents the input word vector matrix, j represents the size of the convolution kernel k_i, and b represents the bias value;

(3.12) performing n convolution operations on the input feature vector matrix with the n convolution kernels to form the high-dimensional vector F_t of the context feature set of the word vector v_t:

F_t = {f_1, f_2, ..., f_n}

(3.13) performing a dimension-reducing max pooling operation on the high-dimensional vector F_t to form the local context feature M_t of the word vector v_t:

M_t = max(F_t)

(3.14) for a feature vector matrix of input length t, scanning the entire text using the convolution kernel set K to form the local feature set S of the entire text:

S = {M_1, M_2, ..., M_t}.
5. The system operation abnormal event relation extraction method according to claim 4, wherein in step (3), the syntactic dependency tree of the sentence in the input feature vector matrix is obtained with the Stanford CoreNLP tool; by constructing the syntactic structure of the input sentence in the syntactic dependency tree, the sentence is converted into a graph representation in which words are represented as nodes and the dependency relations between words constructed by the syntactic dependency tree are represented as edges;

inputting the feature vector matrix into the type-guided graph attention network TGGAT, extracting the global features of sentences specifically comprises:

(3.21) modeling the syntactic dependency tree of the sentence through type guidance using TGGAT to obtain the syntactic knowledge related to each word in the sentence, and then transferring and aggregating word information in the sentence along dependency paths, so that the attention value between each node and all its neighboring words is:

d_{i,r,j} = a(W h_i, W h_j, r_{i,j})

where W represents a learnable weight matrix, a represents a single-layer feed-forward network, h_i represents the semantic vector of node i, h_j represents the semantic vector of node j, and r_{i,j} represents the dependency path of node i and node j;

(3.22) normalizing the attention values using a softmax function to obtain the attention coefficients:

α_{i,r,j} = softmax(d_{i,r,j})

(3.23) obtaining the new vector feature h′_i of node i by aggregation, with the attention coefficients used for weighted summation and the initial information of the original node i added:

h′_i = σ( Σ_{r∈R} Σ_{j∈N_i} α_{i,r,j} W_r h_j + W_0 h_i )

where σ represents an activation function, W_r and W_0 represent learnable weights, N_i is the set of all neighbor nodes j of node i in the syntactic dependency graph, and R is the set of all edge types of node i in the syntactic dependency graph;

(3.24) the global features of the sentence are as follows:

H = {h′_1, h′_2, ..., h′_t}.
6. The system operation abnormal event relation extraction method according to claim 5, wherein in step (4), the local features and the global features are spliced, and the characterization vector matrix weighted by attention values is then obtained through self-attention; the specific formulas are as follows:

HS = concat(H, S)
Q = W_q · HS
K = W_k · HS
V = W_v · HS
A′ = softmax(Kᵀ · Q)
O = V · A′

where concat represents the splicing function, Q represents the query vector and W_q the query matrix, K represents the key vector and W_k the key matrix, V represents the value vector and W_v the value matrix, A′ represents the weights obtained by computing the inner product of the query and key vectors and scaling and normalizing it, and O represents the characterization vector matrix.
7. The system operation abnormal event relation extraction method according to claim 6, wherein step (5) predicts the relations of event pairs in the text using a softmax classifier and outputs y to obtain the event relation classification:

y = ReLU(O · w + b)

where w and b are the parameters and bias term of the fully connected layer.
8. A system operation abnormal event relation extraction system based on TGGAT and CNN, comprising:
the acquisition module is used for acquiring text data and preprocessing the text data;
the processing module is used for processing the preprocessed text data by using the BERT model to generate a feature vector matrix; inputting the feature vector matrix into a multi-scale convolutional neural network CNN, and extracting the local features of sentences; inputting the feature vector matrix into a type guide graph annotation network TGGATT, and extracting the global features of sentences;
the splicing module is used for splicing the local features and the global features, and then obtaining a characterization vector matrix weighted by the attention value through self-attention;
and the classification module is used for classifying the event relationship of the weighted characterization vector matrix by using a softmax classifier.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any one of claims 1 to 7 when executing the computer program.
10. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311178825.8A CN117271701A (en) | 2023-09-13 | 2023-09-13 | Method and system for extracting system operation abnormal event relation based on TGGAT and CNN |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117271701A true CN117271701A (en) | 2023-12-22 |
Family
ID=89207307
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311178825.8A Pending CN117271701A (en) | 2023-09-13 | 2023-09-13 | Method and system for extracting system operation abnormal event relation based on TGGAT and CNN |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117271701A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117725468A (en) * | 2024-02-06 | 2024-03-19 | 四川鸿霖科技有限公司 | Intelligent medical electric guarantee method and system |
CN117725468B (en) * | 2024-02-06 | 2024-04-26 | 四川鸿霖科技有限公司 | Intelligent medical electric guarantee method and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |