CN112199496A - Power grid equipment defect text classification method based on multi-head attention mechanism and RCNN (Recurrent Convolutional Neural Network) - Google Patents

Power grid equipment defect text classification method based on multi-head attention mechanism and RCNN (Recurrent Convolutional Neural Network)

Info

Publication number
CN112199496A
CN112199496A
Authority
CN
China
Prior art keywords
text
word
matrix
attention
rcnn
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010778393.4A
Other languages
Chinese (zh)
Inventor
祝云
陆世豪
周振茂
苏琪
姚梦婷
何鹏辉
徐泽天
伍文侠
封之聪
潘柯良
兰慧颖
冯帅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangxi University
Laibin Power Supply Bureau of Guangxi Power Grid Co Ltd
Tianshengqiao Bureau of Extra High Voltage Power Transmission Co
Original Assignee
Guangxi University
Laibin Power Supply Bureau of Guangxi Power Grid Co Ltd
Tianshengqiao Bureau of Extra High Voltage Power Transmission Co
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangxi University, Laibin Power Supply Bureau of Guangxi Power Grid Co Ltd, Tianshengqiao Bureau of Extra High Voltage Power Transmission Co filed Critical Guangxi University
Priority to CN202010778393.4A priority Critical patent/CN112199496A/en
Publication of CN112199496A publication Critical patent/CN112199496A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/088Non-supervised learning, e.g. competitive learning


Abstract

The invention discloses a method for classifying power grid equipment defect texts based on a multi-head attention mechanism and an RCNN (Recurrent Convolutional Neural Network), which comprises the following steps: step one, preprocessing the power grid defect text by word segmentation and stop-word removal; step two, performing word vector embedding on the segmented text to obtain a text matrix; step three, inputting the text matrix into a multi-head attention model to obtain a text matrix containing attention, and fusing the attention text matrix with the original text matrix; step four, extracting features of the fused text matrix with an RCNN network model and outputting the final classification result; and step five, testing and tuning the multi-head attention model and the RCNN model with power grid primary equipment defect texts. The method applies the multi-head attention mechanism and the RCNN to the classification of power grid equipment defect texts and realizes automatic classification of equipment defect texts.

Description

Power grid equipment defect text classification method based on multi-head attention mechanism and RCNN (Recurrent Convolutional Neural Network)
Technical Field
The invention belongs to the technical field of electric power systems, and particularly relates to a power grid equipment defect text classification method based on a multi-head attention mechanism and an RCNN (Recurrent Convolutional Neural Network).
Background
Power grid companies classify equipment defects into three grades, normal, important and urgent, according to severity. Operators discover equipment defects during inspection, operation and maintenance, report the equipment fault, defect and defect-grade information in Chinese text, and corresponding teams are dispatched to eliminate a defect only after it has been verified and rated. This classification is usually carried out manually; the workload is large, it consumes time and labor, and classification correctness is difficult to guarantee because of subjective factors and differences in knowledge and experience. A large number of classified and graded equipment defect texts are stored in the defect information management systems of power grid companies, and making reasonable use of these texts creates the conditions for classifying equipment defect texts quickly. Research on equipment defect text classification methods based on natural language processing technology is therefore both important and urgent.
At present, many text classification methods are applied to the classification of power grid equipment defect texts, mainly traditional machine learning algorithms and deep learning algorithms. Traditional machine learning algorithms include support vector machines, decision trees, naive Bayes and the like; their feature extraction usually relies on shallow methods such as LDA and TF-IDF, followed by classifiers such as support vector machines and decision trees, so the classification effect of these shallow learning methods is mediocre and the semantic information of the text cannot be learned in depth. Text classification methods based on conventional deep learning (such as TextCNN and TextRNN) are more accurate, but they perform poorly on texts with long-distance dependencies and semantic transitions, so their applicability is limited.
Power grid equipment defect texts are highly specialized, their wording varies from person to person, text lengths differ, large numbers of figures and units are mixed into the text, and downgraded defect texts may contain semantic-transition expressions; all of these characteristics affect the accuracy of equipment defect text classification.
Therefore, an urgent need exists in the art for a method for quickly and accurately classifying the defect texts of the power grid equipment based on the multi-head attention mechanism and the RCNN network.
Disclosure of Invention
In view of the above, the invention provides a power grid device defect text classification method based on a multi-head attention mechanism and an RCNN network, which applies the multi-head attention mechanism and the RCNN network to the classification of power grid device defect text to realize the automatic classification of device defect text.
In order to achieve the purpose, the invention adopts the following technical scheme:
a power grid equipment defect text classification method based on a multi-head attention mechanism and an RCNN network comprises the following steps:
firstly, preprocessing a power grid defect text by word segmentation and stop-word removal;
step two, embedding word vectors into the text after word segmentation to obtain a text matrix;
inputting the text matrix into a multi-head attention model to obtain a text matrix containing attention, and fusing the attention text matrix with the original text matrix;
step four, using an RCNN network model to extract the characteristics of the fused text matrix, and outputting the final classification result;
and fifthly, testing and optimizing the multi-head attention model and the RCNN model by using the power grid primary equipment defect text.
Preferably, in the first step, the text preprocessing process is as follows:
(1) acquiring a data set file for text classification in advance, wherein the data set file comprises a power equipment defect text and a corresponding labeled defect grade class label;
(2) and establishing a proper-noun lexicon and a stop-word lexicon for the text content, segmenting the text with a Python Chinese word segmentation component, and converting each text into a word sequence.
Preferably, the method for embedding word vectors into the text after word segmentation to obtain the text matrix is as follows:
(1) performing unsupervised training on the word sequences obtained by word segmentation with the CBOW algorithm of the word2vec component in the gensim library to obtain the word vector corresponding to each word;
(2) and performing word embedding on the trained word vectors by using an embedding layer to obtain a text matrix.
Preferably, the CBOW algorithm is used to predict the probability p (w | context (w)) generated by w from the context (w) of the word w, and to train the word vector by maximizing the objective function T:
T=∑log p(w|Context(w))。
preferably, the method for inputting the text matrix into the multi-head attention model to obtain the text matrix containing attention and fusing the attention text matrix and the original text matrix comprises the following steps:
(1) the word-vector dimension is dk and the sentence length is L; the l-th word vector in the sentence is denoted e_l (1 ≤ l ≤ L), giving the L×dk text matrix E = [e_1 … e_l … e_L];
(2) Inputting the text matrix into the multi-head attention model gives the attention-optimized text matrix representation Head ∈ R^(L×dk):
Attention(Q, K, V) = softmax(QK^T/√dk)V
head_i = Attention(EW_i^Q, EW_i^K, EW_i^V)
Head = MultiHead(EW^Q, EW^K, EW^V) = Concat(head_1, …, head_h)W^O
where Q, K, V are the input matrices, √dk is the scaling factor, Attention is the scaled dot-product attention operation, MultiHead is the multi-head attention function, Concat is the concatenation function, head_i is the result of the i-th self-attention head, and W^Q, W^K, W^V, W^O are linear transformation matrices;
(3) fusing the attention text matrix with the original text matrix and outputting the matrix E_1 ∈ R^(L×dk):
E_1′ = Residual_Connect(E, Head)
E_1 = LayerNorm(E_1′)
where E_1′ is the matrix after the residual connection, Residual_Connect is the residual connection operation, and LayerNorm is layer normalization.
Preferably, the method for extracting the features of the fused text matrix by using the RCNN network model and outputting the final classification result is as follows:
(1) the recurrent neural network part of the RCNN adopts a bidirectional GRU network composed of a forward-input GRU and a backward-input GRU; the two GRUs learn, for the current word w_i, the left context representation cl(w_i) and the right context representation cr(w_i), which are then concatenated with the attention word vector e(w_i) ∈ E_1 of the current word to form the input x_i of the subsequent convolutional layer:
cl(w_i) = f(W^(l) cl(w_{i-1}) + W^(sl) e(w_{i-1}))
cr(w_i) = f(W^(r) cr(w_{i+1}) + W^(sr) e(w_{i+1}))
x_i = [cl(w_i); e(w_i); cr(w_i)]
where W^(l), W^(r) are matrices transforming one hidden layer into the next, W^(sl), W^(sr) are matrices combining the semantics of the current word with the left or right context of the neighboring word, and f is a non-linear activation function;
(2) the convolutional layer uses convolution kernels W^(2) whose number of columns equals the dimension of x_i and whose number of rows is 1, with tanh as the activation function; convolving the output of the Bi-GRU network gives the convolution result y_i^(2):
y_i^(2) = tanh(W^(2) x_i + b)
where b is a bias;
(3) the pooling layer applies Global Average Pooling (GAP) to sample features from the convolution output, and y^(3) ∈ R^(3m) is the extracted feature vector:
y^(3) = (1/L) ∑_i y_i^(2)
(4) inputting the feature vector into a softmax function and outputting the final classification result:
P_i = exp(y_i)/∑_{j=1}^{n} exp(y_j)
where P_i is the probability that the text belongs to class i, n is the number of classes, and the class with the highest probability is the classification result of the text.
Preferably, the method for testing and tuning the classification model by the power grid primary equipment defect text is as follows:
and the model is tested on the data set with five-fold cross validation, the macro-averaged F1 score (Macro-F1) is adopted as the evaluation index, and the Adam gradient-descent method is used during model training to update the model weight parameters, thereby realizing test and tuning.
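Macro-F1 is the unweighted mean of the per-class F1 scores, so each defect grade counts equally regardless of class frequency. A minimal sketch of the metric (the labels below are made-up toy data, not the patent's data set):

```python
# Macro-F1: compute F1 per class, then average with equal class weight.
def macro_f1(y_true, y_pred, classes):
    f1s = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * precision * recall / (precision + recall) if precision + recall else 0.0)
    return sum(f1s) / len(f1s)  # unweighted mean over classes -> Macro-F1

# Toy ground truth and predictions over the three defect grades.
y_true = ["normal", "important", "urgent", "normal", "urgent", "normal"]
y_pred = ["normal", "important", "normal", "normal", "urgent", "important"]
print(round(macro_f1(y_true, y_pred, ["normal", "important", "urgent"]), 4))  # → 0.6667
```

In practice a library implementation (e.g. scikit-learn's `f1_score` with `average="macro"`) would be used; the hand-rolled version above only illustrates the definition.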
The invention has the beneficial effects that:
According to the invention, by mining power grid equipment defect text data, the classification problem of power grid equipment defect texts is effectively solved and the shortcomings of the conventional RCNN are overcome. Firstly, so that the RCNN gives higher weight to information important for classification and selectively ignores irrelevant features, a multi-head attention mechanism is introduced to assign more attention weight to classification-relevant information. Secondly, because word vectors generated by word2vec cannot be dynamically optimized for a specific task, the multi-head attention mechanism learns the word dependency relationships in the text, captures the internal structure of the text, and turns the word vectors from static into dynamic. Finally, the generated attention text matrix is fused with the original text matrix through a residual connection, which prevents network degradation.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.
FIG. 1 is a schematic flow chart of the present invention.
FIG. 2 is a diagram of a multi-head attention model according to the present invention.
FIG. 3 is a schematic diagram of a Bi-GRU of the present invention.
FIG. 4 is a graph showing experimental results of different comparative models of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, the present invention provides a method for classifying a defect text of a power grid device based on a multi-head attention mechanism and an RCNN network, the method includes the following steps:
firstly, preprocessing a power grid defect text, such as word segmentation and stop-word removal;
step two, embedding word vectors into the text after word segmentation to obtain a text matrix;
inputting the text matrix into a multi-head attention model to obtain a text matrix containing attention, and fusing the attention text matrix with the original text matrix;
step four, using an RCNN network model to extract the characteristics of the fused text matrix, and outputting the final classification result;
and fifthly, testing and optimizing the multi-head attention model and the RCNN model by using the power grid primary equipment defect text.
The process of the power grid defect text preprocessing is as follows:
(1) A data set file for text classification is obtained in advance, comprising power equipment defect texts and the corresponding labeled defect-grade class labels; the data set consists of the 2016–2019 primary equipment defect records of a certain power supply bureau, as shown in Table 1:
(2) Table 1. Data set information
Name | Number of classes | Training set | Test set
2016–2019 primary equipment defects of a certain power supply bureau | 3 | 1548 | 387
(3) A proper-noun lexicon and a stop-word lexicon are established for the text content, the text is segmented with a Python Chinese word segmentation component, and each equipment defect text is converted into a word sequence. For example, the equipment defect text "the knife switch cannot be electrically operated on site" is converted into the word sequence: 'knife switch', 'cannot', 'on site', 'electrically', 'operate'.
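The preprocessing of step one can be sketched as follows. The tokens and the stop-word set are hypothetical English stand-ins for the segmented Chinese text; the patent itself uses a Python Chinese word segmentation component together with domain lexicons:

```python
# Toy preprocessing sketch: assume segmentation already produced a word list;
# filter against a stop-word set to get the final word sequence.
# The stop-word set below is made up for illustration.
stop_words = {"the", "on", "can", "not", "be"}

def preprocess(words):
    return [w for w in words if w not in stop_words]

# Hypothetical English stand-in for the segmented defect text.
tokens = ["the", "knife", "switch", "can", "not", "be", "electrically", "operated", "on", "site"]
print(preprocess(tokens))  # → ['knife', 'switch', 'electrically', 'operated', 'site']
```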
Embedding word vectors into the text after word segmentation, and obtaining a text matrix in the following specific process:
(1) The word sequences obtained by word segmentation are trained unsupervised with the CBOW algorithm of the word2vec component in the gensim library to obtain the word vector corresponding to each word. The principle of the CBOW algorithm is to predict the probability p(w|Context(w)) of generating w from the context Context(w) of the word w, and the word vectors are trained by maximizing the objective function T:
T=∑log p(w|Context(w))
and (3) carrying out word sequence: the 'knife', 'fail', 'in place', 'power up', 'operation' training results in a 128-dimensional word vector as shown in table 2:
(2) TABLE 2 word-corresponding word vectors
Figure BDA0002619315390000071
Figure BDA0002619315390000081
(3) And performing word embedding on the trained word vectors by using an embedding layer to obtain a text matrix.
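The CBOW objective T = ∑ log p(w|Context(w)) can be illustrated numerically with a toy model. Everything below is a made-up stand-in (tiny 2-dimensional vectors, a five-word vocabulary), not gensim's implementation: p(w|Context(w)) is modeled as a softmax over the vocabulary of the dot product between the averaged context vectors and each output word vector.

```python
import math

# Toy CBOW step with made-up 2-d vectors (not trained embeddings).
vocab = ["knife", "cannot", "onsite", "electric", "operate"]
in_vec = {w: [0.1 * i, 0.2 * i] for i, w in enumerate(vocab, 1)}   # input vectors
out_vec = {w: [0.2 * i, 0.1 * i] for i, w in enumerate(vocab, 1)}  # output vectors

def log_p(word, context):
    # Average the context's input vectors, score every vocabulary word,
    # and return the log-softmax probability of the target word.
    avg = [sum(in_vec[c][d] for c in context) / len(context) for d in range(2)]
    scores = {w: sum(avg[d] * out_vec[w][d] for d in range(2)) for w in vocab}
    z = sum(math.exp(s) for s in scores.values())
    return scores[word] - math.log(z)

# The objective T sums log p(w|Context(w)) over all (word, context) pairs in
# the corpus; training maximizes T by adjusting the vectors (gradient step omitted).
T = log_p("onsite", ["cannot", "electric"])
print(T < 0)  # a log-probability is always negative → True
```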
Inputting the text matrix into a multi-head attention model to obtain a text matrix containing attention, and fusing the attention text matrix with the original text matrix in the following specific process:
(1) The word-vector dimension is dk = 128 and the sentence length is L; the l-th word vector in the sentence is denoted e_l (1 ≤ l ≤ L), giving the L×dk text matrix E = [e_1 … e_l … e_L];
(2) Inputting the text matrix into the multi-head attention model gives the attention-optimized text matrix representation Head ∈ R^(L×dk); the multi-head attention model is shown in FIG. 2:
Attention(Q, K, V) = softmax(QK^T/√dk)V
head_i = Attention(EW_i^Q, EW_i^K, EW_i^V)
Head = MultiHead(EW^Q, EW^K, EW^V) = Concat(head_1, …, head_h)W^O
where Q, K, V are the input matrices, √dk is the scaling factor, Attention is the scaled dot-product attention operation, MultiHead is the multi-head attention function, Concat is the concatenation function, head_i is the result of the i-th self-attention head, and W^Q, W^K, W^V, W^O are linear transformation matrices. The text matrix optimized by multi-head attention is shown in Table 3.
(3) Table 3. Attention text matrix (numeric values omitted here)
Comparing Table 2 and Table 3 shows that the keyword vectors of the text matrix optimized by the multi-head attention mechanism are enhanced in different dimensions;
(4) The attention text matrix is fused with the original text matrix, and the matrix E_1 ∈ R^(L×dk) is output:
E_1′ = Residual_Connect(E, Head)
E_1 = LayerNorm(E_1′)
where E_1′ is the matrix after the residual connection, Residual_Connect is the residual connection operation, and LayerNorm is layer normalization.
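The attention-and-fusion pipeline of steps (1)–(4) can be condensed into a small numerical sketch. The matrices are random toy stand-ins; the head count, dimensions, and weight initialization are illustrative assumptions, not the patent's trained parameters:

```python
import numpy as np

rng = np.random.default_rng(0)
L, dk, h = 5, 8, 2            # toy sentence length, vector dimension, head count
E = rng.normal(size=(L, dk))  # stand-in text matrix from the embedding layer

def attention(Q, K, V):
    # Scaled dot-product attention: softmax(QK^T / sqrt(d)) V
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

def multi_head(E):
    d = dk // h
    heads = []
    for _ in range(h):
        WQ, WK, WV = (rng.normal(size=(dk, d)) for _ in range(3))
        heads.append(attention(E @ WQ, E @ WK, E @ WV))
    WO = rng.normal(size=(dk, dk))
    return np.concatenate(heads, axis=-1) @ WO   # Concat(head_1 … head_h) W^O

def layer_norm(X, eps=1e-5):
    mu, var = X.mean(-1, keepdims=True), X.var(-1, keepdims=True)
    return (X - mu) / np.sqrt(var + eps)

Head = multi_head(E)
E1 = layer_norm(E + Head)     # residual connection, then layer normalization
print(E1.shape)               # (5, 8): same shape as the original text matrix
```

The residual sum E + Head keeps the original word vectors flowing through the network, which is the degradation-prevention role described above.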
The specific process of using the RCNN model to extract the characteristics of the fused text matrix and outputting the final classification result is as follows:
(1) The recurrent neural network part of the RCNN adopts a bidirectional GRU network, whose schematic diagram is shown in FIG. 3; the network is composed of a forward-input GRU and a backward-input GRU, which learn, for the current word w_i, the left context representation cl(w_i) and the right context representation cr(w_i). These are then concatenated with the attention word vector e(w_i) ∈ E_1 of the current word to form the input x_i of the subsequent convolutional layer:
cl(w_i) = f(W^(l) cl(w_{i-1}) + W^(sl) e(w_{i-1}))
cr(w_i) = f(W^(r) cr(w_{i+1}) + W^(sr) e(w_{i+1}))
x_i = [cl(w_i); e(w_i); cr(w_i)]
where W^(l), W^(r) are matrices transforming one hidden layer into the next, W^(sl), W^(sr) are matrices combining the semantics of the current word with the left or right context of the neighboring word, and f is a non-linear activation function;
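The left/right context recurrences can be sketched as follows, with plain recurrent updates standing in for the GRU cells. All matrices are small random stand-ins; this only illustrates how the convolutional-layer input x_i = [cl(w_i); e(w_i); cr(w_i)] is assembled, not the patent's trained network:

```python
import numpy as np

rng = np.random.default_rng(1)
L, d = 4, 3                        # toy sentence length and context/vector dimension
E1 = rng.normal(size=(L, d))       # stand-in attention word vectors e(w_i)
Wl, Wr, Wsl, Wsr = (rng.normal(size=(d, d)) * 0.1 for _ in range(4))
f = np.tanh                        # non-linear activation

# Left contexts run forward over the sentence, right contexts run backward.
cl = [np.zeros(d)]
for i in range(1, L):
    cl.append(f(Wl @ cl[i - 1] + Wsl @ E1[i - 1]))
cr = [np.zeros(d) for _ in range(L)]
for i in range(L - 2, -1, -1):
    cr[i] = f(Wr @ cr[i + 1] + Wsr @ E1[i + 1])

# x_i = [cl(w_i); e(w_i); cr(w_i)] becomes the convolutional-layer input.
X = np.stack([np.concatenate([cl[i], E1[i], cr[i]]) for i in range(L)])
print(X.shape)  # (4, 9): each row has dimension 3d
```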
(2) The convolutional layer uses convolution kernels W^(2) whose number of columns equals the dimension of x_i and whose number of rows is 1, with tanh as the activation function; convolving the output of the Bi-GRU network gives the convolution result y_i^(2):
y_i^(2) = tanh(W^(2) x_i + b)
where b is a bias.
(3) The pooling layer applies Global Average Pooling (GAP) to sample features from the convolution output, and y^(3) ∈ R^(3m) is the extracted feature vector:
y^(3) = (1/L) ∑_i y_i^(2)
(4) The feature vector is input into a softmax function, and the final classification result is output:
P_i = exp(y_i)/∑_{j=1}^{n} exp(y_j)
where P_i is the probability that the text belongs to class i, n is the number of classes, and the class with the highest probability is the classification result of the text.
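The softmax step can be sketched as follows; the three class scores are made-up numbers standing in for the pooled feature vector after the output layer:

```python
import math

# Toy softmax classification over the three defect grades (scores are made up).
def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]  # subtract the max for stability
    z = sum(exps)
    return [e / z for e in exps]

grades = ["normal", "important", "urgent"]
scores = [2.1, 0.4, -0.3]                     # hypothetical class scores y_i
probs = softmax(scores)
pred = grades[probs.index(max(probs))]        # class with the highest probability
print(pred)  # → normal
```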
(5) Model testing adopts a five-fold cross validation method, and model training adopts the Adam gradient-descent method to update the model weight parameters; the prediction result of the tuned model on the example text sequence is shown in Table 4.
(6) Table 4. Classification result (numeric values omitted here)
The algorithm predicts the defect grade of the example text "the knife switch cannot be electrically operated on site" as a general (normal) defect, which is consistent with the actual defect grade.
(7) To verify the effect of the classification algorithm of this embodiment, comparative experiments with other models were also designed. The experimental environment is an Intel Core i7-8550U CPU, the experimental framework is TensorFlow, and the evaluation index is the macro-averaged F1 score (Macro-F1, MF1). The experimental results are shown in FIG. 4.
FIG. 4 shows the comparison between the multi-head attention and RCNN classification algorithm (MAT-RCNN) and other classification algorithms. As can be seen from the figure, the model of the invention outperforms the compared classification algorithms, with MF1 reaching 94.51%.
According to the method, a deep semantic learning algorithm is applied to the classification of power grid defect texts, and the defect texts are classified quickly, so that equipment defects are graded rapidly, the maintenance efficiency of power grid equipment is improved and the fault-elimination time is shortened; the method is therefore highly practical. It not only saves labor cost but also classifies power grid defect texts well.
The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. The device disclosed by the embodiment corresponds to the method disclosed by the embodiment, so that the description is simple, and the relevant points can be referred to the method part for description.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (7)

1. A power grid equipment defect text classification method based on a multi-head attention mechanism and an RCNN (Recurrent Convolutional Neural Network), characterized by comprising the following steps:
firstly, preprocessing a power grid defect text by word segmentation and stop-word removal;
step two, embedding word vectors into the text after word segmentation to obtain a text matrix;
inputting the text matrix into a multi-head attention model to obtain a text matrix containing attention, and fusing the attention text matrix with the original text matrix;
step four, using an RCNN network model to extract the characteristics of the fused text matrix, and outputting the final classification result;
and fifthly, testing and optimizing the multi-head attention model and the RCNN model by using the power grid primary equipment defect text.
2. The method as claimed in claim 1, wherein in the first step, the text preprocessing is performed as follows:
(1) acquiring a data set file for text classification in advance, wherein the data set file comprises a power equipment defect text and a corresponding labeled defect grade class label;
(2) and establishing a proper-noun lexicon and a stop-word lexicon for the text content, segmenting the text with a Python Chinese word segmentation component, and converting each text into a word sequence.
3. The method for classifying the defect texts of the power grid equipment based on the multi-head attention mechanism and the RCNN is characterized in that word vector embedding is performed on the text after word segmentation, and a text matrix is obtained by the following method:
(1) performing unsupervised training on the word sequences obtained by word segmentation with the CBOW algorithm of the word2vec component in the gensim library to obtain the word vector corresponding to each word;
(2) and performing word embedding on the trained word vectors by using an embedding layer to obtain a text matrix.
4. The method for classifying the defect texts of the power grid equipment based on the multi-head attention mechanism and the RCNN network as claimed in claim 3, wherein the CBOW algorithm predicts the probability p(w|Context(w)) of generating the word w from its context Context(w), and the word vectors are trained by maximizing the objective function T:
T = ∑ log p(w|Context(w)).
5. the method for classifying the defect texts of the power grid equipment based on the multi-head attention mechanism and the RCNN is characterized in that the text matrix is input into a multi-head attention model to obtain a text matrix containing attention, and the method for fusing the attention text matrix and the original text matrix comprises the following steps:
(1) the word-vector dimension is dk and the sentence length is L; the l-th word vector in the sentence is denoted e_l (1 ≤ l ≤ L), giving the L×dk text matrix E = [e_1 … e_l … e_L];
(2) Inputting the text matrix into the multi-head attention model gives the attention-optimized text matrix representation Head ∈ R^(L×dk):
Attention(Q, K, V) = softmax(QK^T/√dk)V
head_i = Attention(EW_i^Q, EW_i^K, EW_i^V)
Head = MultiHead(EW^Q, EW^K, EW^V) = Concat(head_1, …, head_h)W^O
where Q, K, V are the input matrices, √dk is the scaling factor, Attention is the scaled dot-product attention operation, MultiHead is the multi-head attention function, Concat is the concatenation function, head_i is the result of the i-th self-attention head, and W^Q, W^K, W^V, W^O are linear transformation matrices;
(3) fusing the attention text matrix with the original text matrix and outputting the matrix E_1 ∈ R^(L×dk):
E_1′ = Residual_Connect(E, Head)
E_1 = LayerNorm(E_1′)
where E_1′ is the matrix after the residual connection, Residual_Connect is the residual connection operation, and LayerNorm is layer normalization.
6. The method for classifying the defect texts of the power grid equipment based on the multi-head attention mechanism and the RCNN is characterized in that the method for extracting the features of the fused text matrix by using the RCNN model and outputting the final classification result comprises the following steps:
(1) the recurrent neural network part of the RCNN adopts a bidirectional GRU network composed of a forward-input GRU and a backward-input GRU; the two GRUs learn, for the current word w_i, the left context representation cl(w_i) and the right context representation cr(w_i), which are then concatenated with the attention word vector e(w_i) ∈ E_1 of the current word to form the input x_i of the subsequent convolutional layer:
cl(w_i) = f(W^(l) cl(w_{i-1}) + W^(sl) e(w_{i-1}))
cr(w_i) = f(W^(r) cr(w_{i+1}) + W^(sr) e(w_{i+1}))
x_i = [cl(w_i); e(w_i); cr(w_i)]
where W^(l), W^(r) are matrices transforming one hidden layer into the next, W^(sl), W^(sr) are matrices combining the semantics of the current word with the left or right context of the neighboring word, and f is a non-linear activation function;
(2) the convolutional layer uses convolution kernels W^(2) whose number of columns equals the dimension of x_i and whose number of rows is 1, with tanh as the activation function; convolving the output of the Bi-GRU network gives the convolution result y_i^(2):
y_i^(2) = tanh(W^(2) x_i + b)
where b is a bias;
(3) the pooling layer applies Global Average Pooling (GAP) to sample features from the convolution output, and y^(3) ∈ R^(3m) is the extracted feature vector:
y^(3) = (1/L) ∑_i y_i^(2)
(4) inputting the feature vector into a softmax function and outputting the final classification result:
P_i = exp(y_i)/∑_{j=1}^{n} exp(y_j)
where P_i is the probability that the text belongs to class i, n is the number of classes, and the class with the highest probability is the classification result of the text.
7. The power grid equipment defect text classification method based on the multi-head attention mechanism and RCNN, characterized in that the method for testing and tuning the classification model with the power grid primary equipment defect texts is as follows:
the model is tested on the data set by five-fold cross-validation, the macro-averaged comprehensive index Macro-F1 is adopted as the evaluation index, and the Adam gradient descent method is used during model training to update the model weight parameters, thereby achieving test tuning.
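Neither Macro-F1 nor the fold split is spelled out in the claim; the sketch below shows one common reading of each (all labels and sizes are toy values):

```python
import numpy as np

def macro_f1(y_true, y_pred, n_classes):
    """Macro-F1: the unweighted mean of the per-class F1 scores."""
    f1s = []
    for c in range(n_classes):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        p = tp / (tp + fp) if tp + fp else 0.0
        r = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * p * r / (p + r) if p + r else 0.0)
    return float(np.mean(f1s))

# Toy labels (hypothetical), 3 classes
y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0])
print(round(macro_f1(y_true, y_pred, 3), 3))       # 0.656

# Five-fold split sketch: each fold serves once as the held-out test set
folds = np.array_split(np.random.default_rng(2).permutation(20), 5)
print([len(f) for f in folds])                     # [4, 4, 4, 4, 4]
```

Macro averaging weights every class equally, which suits defect corpora where some defect categories are rare.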
CN202010778393.4A 2020-08-05 2020-08-05 Power grid equipment defect text classification method based on multi-head attention mechanism and RCNN (Recurrent Convolutional Neural Network) Pending CN112199496A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010778393.4A CN112199496A (en) 2020-08-05 2020-08-05 Power grid equipment defect text classification method based on multi-head attention mechanism and RCNN (Recurrent Convolutional Neural Network)

Publications (1)

Publication Number Publication Date
CN112199496A true CN112199496A (en) 2021-01-08

Family

ID=74006164

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010778393.4A Pending CN112199496A (en) 2020-08-05 2020-08-05 Power grid equipment defect text classification method based on multi-head attention mechanism and RCNN (Recurrent Convolutional Neural Network)

Country Status (1)

Country Link
CN (1) CN112199496A (en)

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109472024A (en) * 2018-10-25 2019-03-15 安徽工业大学 A kind of file classification method based on bidirectional circulating attention neural network
CN109885673A (en) * 2019-02-13 2019-06-14 北京航空航天大学 A kind of Method for Automatic Text Summarization based on pre-training language model
CN110209824A (en) * 2019-06-13 2019-09-06 中国科学院自动化研究所 Text emotion analysis method based on built-up pattern, system, device
CN110619034A (en) * 2019-06-27 2019-12-27 中山大学 Text keyword generation method based on Transformer model
CN110532386A (en) * 2019-08-12 2019-12-03 新华三大数据技术有限公司 Text sentiment classification method, device, electronic equipment and storage medium
CN110569508A (en) * 2019-09-10 2019-12-13 重庆邮电大学 Method and system for classifying emotional tendencies by fusing part-of-speech and self-attention mechanism
CN110727824A (en) * 2019-10-11 2020-01-24 浙江大学 Method for solving question-answering task of object relationship in video by using multiple interaction attention mechanism
CN110781305A (en) * 2019-10-30 2020-02-11 北京小米智能科技有限公司 Text classification method and device based on classification model and model training method
CN111079532A (en) * 2019-11-13 2020-04-28 杭州电子科技大学 Video content description method based on text self-encoder
CN111259666A (en) * 2020-01-15 2020-06-09 上海勃池信息技术有限公司 CNN text classification method combined with multi-head self-attention mechanism
CN111461190A (en) * 2020-03-24 2020-07-28 华南理工大学 Deep convolutional neural network-based non-equilibrium ship classification method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SIWEI LAI et al.: "Recurrent Convolutional Neural Networks for Text Classification", Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112765353A (en) * 2021-01-22 2021-05-07 重庆邮电大学 Scientific research text-based biomedical subject classification method and device
CN112765353B (en) * 2021-01-22 2022-11-04 重庆邮电大学 Scientific research text-based biomedical subject classification method and device
CN113297380A (en) * 2021-05-27 2021-08-24 长春工业大学 Text classification algorithm based on self-attention mechanism and convolutional neural network
CN113886524A (en) * 2021-09-26 2022-01-04 四川大学 Network security threat event extraction method based on short text
CN115617990A (en) * 2022-09-28 2023-01-17 浙江大学 Electric power equipment defect short text classification method and system based on deep learning algorithm
CN115617990B (en) * 2022-09-28 2023-09-05 浙江大学 Power equipment defect short text classification method and system based on deep learning algorithm
CN116484262A (en) * 2023-05-06 2023-07-25 南通大学 Textile equipment fault auxiliary processing method based on text classification
CN116484262B (en) * 2023-05-06 2023-12-08 南通大学 Textile equipment fault auxiliary processing method based on text classification

Similar Documents

Publication Publication Date Title
CN112199496A (en) Power grid equipment defect text classification method based on multi-head attention mechanism and RCNN (Recurrent Convolutional Neural Network)
CN108304468B (en) Text classification method and text classification device
CN106250934B (en) Defect data classification method and device
Halibas et al. Application of text classification and clustering of Twitter data for business analytics
CN106021410A (en) Source code annotation quality evaluation method based on machine learning
CN108536756A (en) Mood sorting technique and system based on bilingual information
CN110377901B (en) Text mining method for distribution line trip filling case
CN111767398A (en) Secondary equipment fault short text data classification method based on convolutional neural network
CN113590764B (en) Training sample construction method and device, electronic equipment and storage medium
CN110895565A (en) Method and system for classifying fault defect texts of power equipment
CN109446423B (en) System and method for judging sentiment of news and texts
CN111309859B (en) Scenic spot network public praise emotion analysis method and device
CN110910175A (en) Tourist ticket product portrait generation method
CN112966708A (en) Chinese crowdsourcing test report clustering method based on semantic similarity
CN113886562A (en) AI resume screening method, system, equipment and storage medium
CN112417893A (en) Software function demand classification method and system based on semantic hierarchical clustering
TWI734085B (en) Dialogue system using intention detection ensemble learning and method thereof
CN115757695A (en) Log language model training method and system
CN114416991A (en) Method and system for analyzing text emotion reason based on prompt
CN117768618A (en) Method for analyzing personnel violation based on video image
CN113378024A (en) Deep learning-based public inspection field-oriented related event identification method
CN110362828B (en) Network information risk identification method and system
CN117390198A (en) Method, device, equipment and medium for constructing scientific and technological knowledge graph in electric power field
CN115357718B (en) Method, system, device and storage medium for discovering repeated materials of theme integration service
CN111160756A (en) Scenic spot assessment method and model based on secondary artificial intelligence algorithm

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210108