CN111274386A - Work order text classification algorithm based on convolutional neural network and multi-attention mechanism


Info

Publication number
CN111274386A
CN111274386A (application CN201911147815.1A)
Authority
CN
China
Prior art keywords
word
sentence
neural network
level
vector
Prior art date
Legal status
Pending
Application number
CN201911147815.1A
Other languages
Chinese (zh)
Inventor
王晓峰
周艳
范华
尉耀稳
霍凯龙
陈杰
翁利国
施凌震
徐舒妍
姜川
陶燕增
Current Assignee
Hangzhou Power Supply Co of State Grid Zhejiang Electric Power Co Ltd
Zhejiang Zhongxin Electric Power Engineering Construction Co Ltd
Original Assignee
Hangzhou Power Supply Co of State Grid Zhejiang Electric Power Co Ltd
Zhejiang Zhongxin Electric Power Engineering Construction Co Ltd
Priority date
Filing date
Publication date
Application filed by Hangzhou Power Supply Co of State Grid Zhejiang Electric Power Co Ltd and Zhejiang Zhongxin Electric Power Engineering Construction Co Ltd
Priority to CN201911147815.1A
Publication of CN111274386A
Legal status: Pending

Classifications

    • G06F 16/35: Information retrieval; clustering or classification of unstructured textual data
    • G06F 18/2415: Pattern recognition; classification techniques based on parametric or probabilistic models, e.g. likelihood ratio
    • G06N 3/045: Computing arrangements based on biological models; neural network architectures; combinations of networks

Abstract

The invention provides a work order text classification algorithm based on a convolutional neural network and a multi-attention mechanism. The algorithm comprises a training set acquisition step, a text word segmentation step, a word vector training step, a sentence splitting step, a word vector conversion step, a sentence-level convolutional neural network step, a sentence-level attention mechanism step, a sentence-level fully-connected step, a document processing step, and a classification step in which the document feature vector obtained in step S9 is linearly transformed and a softmax function then generates the probability of each class in the class set C. The model is designed in two parts, a sentence level and a document level: sentence features are first extracted at the sentence level, then document features are extracted at the document level for classification. This structure ensures that the entire text can be fed into the model while avoiding the computational waste caused by an oversized model.

Description

Work order text classification algorithm based on convolutional neural network and multi-attention mechanism
Technical Field
The invention relates to the field of text classification in computer natural language processing, and in particular to a work order text classification algorithm based on a convolutional neural network and a multi-attention mechanism.
Background
For a power supply company, user complaints represent both an opportunity and a huge challenge. The sheer volume of complaint work orders forces dispatch staff to read every work order while trying to route each one to the proper handling department, so efficiency is low and accuracy cannot be guaranteed. Handling user complaints promptly and efficiently improves the company's image, builds word of mouth, allows the business direction to be adjusted in time, raises service quality, and greatly strengthens competitiveness. Conversely, if a user's complaint is not resolved in time, or is routed to the wrong handler, the company's image suffers, complaint work orders pile up, word of mouth declines, and in the worst case large-scale complaints and negative publicity follow. How to classify power complaint texts quickly and accurately is therefore a major challenge facing every power supply company today.
With the rise of artificial neural network methods in recent years, various neural-network-based algorithms have been applied to text classification in natural language processing and have outperformed traditional classification methods. The most common approach classifies text with a convolutional neural network, which has strong sentence-modeling capability because its convolution windows capture local features at different positions in the text, and which captures the semantics of text better than other artificial neural network models. However, this approach has drawbacks. First, because the convolution window is of limited size, a convolutional neural network cannot capture dependencies between distant words, so information is lost when extracting text features. Second, a convolutional neural network by default treats all extracted text features as equally important, so the features that matter for classification cannot play a role commensurate with their importance, while irrelevant noise features interfere with the classification result. Third, the input length of a convolutional neural network is fixed, so part of a longer text is discarded. Such methods therefore have serious shortcomings for classifying power complaint texts.
Disclosure of Invention
The invention aims to provide a work order text classification algorithm based on a convolutional neural network and a multi-attention mechanism, so as to solve the many problems of manually classifying power complaint work order texts. The method is divided into a sentence level and a document level: the features of each sentence of the text are extracted first, and the document features are then extracted. The specific steps are as follows:
S1, a training set acquisition step, which comprises obtaining in advance a training set file for text classification, the training set file comprising power complaint work order texts and the corresponding labeled complaint category labels;
S2, a text word segmentation step, which comprises segmenting each text obtained in step S1 with a Python Chinese word segmentation component, converting each text into a word sequence;
S3, a word vector training step, which comprises performing unsupervised training on the word sequences obtained in step S2 with the skip-gram algorithm of the word2vec component in the gensim library, obtaining the word vector corresponding to each word;
S4, a sentence splitting step, which comprises splitting the word sequence obtained in step S2 into sentences, each of which is a word sequence of its own;
S5, a word vector conversion step, which comprises converting each word in the word sequences obtained in S4 into the corresponding word vector trained in S3;
S6, a sentence-level convolutional neural network step, which comprises using the 2-dimensional matrices obtained in step S5 as the first layer of the sentence-level convolutional neural network;
S7, a sentence-level attention mechanism step, which comprises assigning, through the attention mechanism formulas, a different attention weight to each word feature vector in the final output of the convolutional neural network obtained in step S6;
S8, a sentence-level fully-connected step, which comprises linearly transforming the output vector S of the S7 sentence-level attention step through a fully-connected neural network to obtain the sentence feature vectors;
S9, a document processing step, which comprises concatenating the sentence feature vectors obtained in step S8 into one vector as the input of the document-level part;
S10, a classification step, which comprises linearly transforming the document feature vector obtained in step S9 and then generating the probability of each class in the class set C with a softmax function.
Optionally, performing unsupervised training on the word sequences obtained in step S2 with the skip-gram algorithm of the word2vec component in the gensim library to obtain the word vector corresponding to each word comprises:
performing unsupervised training on the word sequences obtained in step S2 with the skip-gram algorithm of the word2vec component of the gensim library in Python to obtain a word vector of dimension d for each word, wherein the principle of the skip-gram algorithm is to train the word vectors of the words w_1, w_2, …, w_N by maximizing the average log probability of Formula 1, where N is the number of distinct words in the training set,

$$\frac{1}{N}\sum_{t=1}^{N}\sum_{-c\le j\le c,\ j\ne 0}\log p(w_{t+j}\mid w_t) \qquad \text{(Formula 1)}$$

where c is the context range of the word w_t, covering the c words before and the c words after w_t in the word sequence, and t ∈ (1, N);

$$p(w_{t+j}\mid w_t)=\frac{\exp\!\left(e(w_{t+j})^{\top}e(w_t)\right)}{\sum_{w=1}^{N}\exp\!\left(e(w)^{\top}e(w_t)\right)} \qquad \text{(Formula 2)}$$

where e(w_t) is the word vector corresponding to w_t; the word vectors of all words are initialized randomly, and the vector parameters are then updated iteratively by gradient descent until the objective (Formula 1) converges.
Optionally, using each 2-dimensional matrix obtained in step S5 as the first layer of the sentence-level convolutional neural network comprises:
using a convolution window matrix W ∈ R^{h×d}, where h is the size of the convolution window and d is the dimension of the word vectors, to extract text features from X_{layer 1} by Formula 3, the result X_{layer 2} again being m 2-dimensional matrices of shape n × d, and so on up to layer L, whose output X_{layer L} serves as the final output of the convolutional neural network,

$$X_{layer\,l+1}=(X_{layer\,l}*W+b)\otimes\sigma(X_{layer\,l}*V+c) \qquad \text{(Formula 3)}$$

where W ∈ R^{h×d} is the convolution window matrix, b ∈ R^d and c ∈ R^d are biases, σ is a nonlinear activation function, and V ∈ R^{h×d} is the gate unit matrix.
Optionally, assigning through the attention mechanism formulas a different attention weight to each word feature vector in the final output of the convolutional neural network obtained in step S6 comprises:
assigning, by Formulas 4 and 5, a different attention weight to each word feature vector in the output X_{layer L} = [x_{L1}, x_{L2}, …, x_{Ln}] of the convolutional neural network of step S6, where x_{Li} is the feature vector of the i-th word,

$$\alpha_i=\frac{\exp\!\left(a^{\top}x_{Li}\right)}{\sum_{j=1}^{n}\exp\!\left(a^{\top}x_{Lj}\right)} \qquad \text{(Formula 4)}$$

$$s=\sum_{i=1}^{n}\alpha_i\,x_{Li} \qquad \text{(Formula 5)}$$

where a is the attention vector, α_i is the attention weight assigned to the word feature vector x_{Li}, n is the manually set sentence length, and s is the sentence feature vector obtained as the attention-weighted sum of the word feature vectors.
Compared with the prior art, the invention has the following beneficial effects:
The method not only effectively solves the classification of power complaint work order texts, but also overcomes the defects of prior-art convolutional neural networks. First, to break through the limitation of the convolution window size and capture dependencies between distant words, the depth of the network is increased and a convolution operation is performed at every layer; as the number of layers grows, the features output by each layer capture more words than the previous layer. Second, so that the important classification-relevant features extracted by the convolutional neural network carry a higher weight during classification while irrelevant noise features are selectively ignored, a multi-attention mechanism is introduced that assigns more attention, i.e. a higher weight, to classification-relevant features. Third, the number of neurons at the input of a convolutional neural network is fixed; guaranteeing that a long text fits entirely would require so many input neurons that the time complexity of the model would rise sharply. To solve this, the model is designed in two parts, a sentence level and a document level: sentence features are first extracted at the sentence level, then document features are extracted at the document level for classification. This structure ensures that the entire text can be fed into the model while avoiding the computational waste caused by an oversized model.
Drawings
FIG. 1 is an overall flow chart of the method of the present invention.
Detailed Description
The preferred embodiments of the present invention are described below with reference to the accompanying drawings:
the work order text classification algorithm based on the convolutional neural network and the multi-attention machine system, as shown in fig. 1, includes the following steps:
S1, a training set acquisition step, which comprises obtaining in advance a training set file for text classification, the training set file comprising power complaint work order texts and the corresponding labeled complaint category labels;
S2, a text word segmentation step, which comprises segmenting each text obtained in step S1 with a Python Chinese word segmentation component, converting each text into a word sequence;
S3, a word vector training step, which comprises performing unsupervised training on the word sequences obtained in step S2 with the skip-gram algorithm of the word2vec component in the gensim library, obtaining the word vector corresponding to each word;
S4, a sentence splitting step, which comprises splitting the word sequence obtained in step S2 into sentences, each of which is a word sequence of its own;
S5, a word vector conversion step, which comprises converting each word in the word sequences obtained in S4 into the corresponding word vector trained in S3;
S6, a sentence-level convolutional neural network step, which comprises using the 2-dimensional matrices obtained in step S5 as the first layer of the sentence-level convolutional neural network;
S7, a sentence-level attention mechanism step, which comprises assigning, through the attention mechanism formulas, a different attention weight to each word feature vector in the final output of the convolutional neural network obtained in step S6;
S8, a sentence-level fully-connected step, which comprises linearly transforming the output vector S of the S7 sentence-level attention step through a fully-connected neural network to obtain the sentence feature vectors;
S9, a document processing step, which comprises concatenating the sentence feature vectors obtained in step S8 into one vector as the input of the document-level part;
S10, a classification step, which comprises linearly transforming the document feature vector obtained in step S9 and then generating the probability of each class in the class set C with a softmax function.
In implementation, word vector training step S3 is specifically: perform unsupervised training on the word sequences obtained in step S2 with the skip-gram algorithm of the word2vec component of the gensim library in Python, obtaining a word vector of dimension d for each word. The principle of the skip-gram algorithm is to train the word vectors of the words w_1, w_2, …, w_N by maximizing the average log probability (Formula 1), where N is the number of distinct words in the training set.

$$\frac{1}{N}\sum_{t=1}^{N}\sum_{-c\le j\le c,\ j\ne 0}\log p(w_{t+j}\mid w_t) \qquad \text{(Formula 1)}$$

where c is the context range of the word w_t, covering the c words before and the c words after w_t in the word sequence, and t ∈ (1, N).

$$p(w_{t+j}\mid w_t)=\frac{\exp\!\left(e(w_{t+j})^{\top}e(w_t)\right)}{\sum_{w=1}^{N}\exp\!\left(e(w)^{\top}e(w_t)\right)} \qquad \text{(Formula 2)}$$

where e(w_t) is the word vector corresponding to w_t.
The word vectors of all words are initialized randomly, and the vector parameters are then updated iteratively by gradient descent until the objective (Formula 1) converges.
S4, sentence splitting step: the word sequence obtained through step S2 is split into several sentences at sentence-ending punctuation (e.g. a period). If the word sequence contains m sentences, it is split into m parts;
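A minimal sketch of this splitting rule, assuming the segmenter keeps punctuation marks as their own tokens (the set of end marks below is illustrative):

```python
SENTENCE_END = {"。", "！", "？", ".", "!", "?"}   # illustrative end-of-sentence marks

def split_sentences(word_sequence):
    """Split one segmented text into m sentence word sequences (step S4)."""
    sentences, current = [], []
    for word in word_sequence:
        current.append(word)
        if word in SENTENCE_END:       # sentence boundary reached
            sentences.append(current)
            current = []
    if current:                        # words after the last end mark
        sentences.append(current)
    return sentences
```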
S5, word vector conversion step: each word w_t in the m word sequences obtained in step S4 is converted into the corresponding word vector e(w_t) trained in S3. This finally yields m 2-dimensional matrices of shape n × d for training the subsequent neural network, where n is the manually set sentence length: if a sentence is longer than n, the excess is truncated, and if it is shorter than n, it is padded with a placeholder symbol;
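A sketch of the truncate-or-pad logic under stated assumptions: wv is a mapping from word to vector (e.g. model.wv from the earlier sketch), and using a zero vector for the pad placeholder and unknown words is this sketch's choice, not specified in the patent:

```python
import numpy as np

PAD = "pad"

def sentence_to_matrix(sentence, wv, n=12, d=128):
    """Step S5: unify a sentence to length n, then look up each word vector,
    producing one n x d matrix for the sentence-level CNN."""
    words = sentence[:n]                      # truncate anything beyond n
    words = words + [PAD] * (n - len(words))  # pad shorter sentences with `pad`
    rows = [wv[w] if w != PAD and w in wv else np.zeros(d) for w in words]
    return np.stack(rows)                     # shape (n, d)
```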
S6, sentence-level convolutional neural network step: the m 2-dimensional matrices of shape n × d obtained in step S5 are used as the first layer X_{layer 1} of the sentence-level convolutional neural network. A convolution window matrix W ∈ R^{h×d}, where h is the size of the convolution window and d is the dimension of the word vectors, extracts text features from X_{layer 1} by Formula 3; the result X_{layer 2} is again m 2-dimensional matrices of shape n × d. This repeats up to layer L, and the output X_{layer L} serves as the final output of the convolutional neural network. The range of text the convolution window can capture increases with the number of layers.

$$X_{layer\,l+1}=(X_{layer\,l}*W+b)\otimes\sigma(X_{layer\,l}*V+c) \qquad \text{(Formula 3)}$$

where W ∈ R^{h×d} is the convolution window matrix, b ∈ R^d and c ∈ R^d are biases, σ is a nonlinear activation function, and V ∈ R^{h×d} is the gate unit matrix; the gate unit decides whether a text feature is passed to the next layer, thereby preserving the text features relevant to classification and screening out irrelevant noise features.
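For illustration, a numpy sketch of one layer of Formula 3. The R^{h×d} shapes of W and V admit several readings; this sketch uses a depthwise one (each output dimension is a weighted sum over the h-word window in that dimension), with zero padding so the output stays n × d and layers can be stacked up to layer L:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gated_conv_layer(X, W, b, V, c):
    """One gated convolutional layer (Formula 3).
    X: (n, d) feature matrix; W, V: (h, d) window matrices; b, c: (d,) biases."""
    n, d = X.shape
    h = W.shape[0]
    left = h // 2
    Xp = np.vstack([np.zeros((left, d)), X, np.zeros((h - 1 - left, d))])
    out = np.empty_like(X)
    for i in range(n):
        window = Xp[i:i + h]                          # h consecutive word features
        feat = (window * W).sum(axis=0) + b           # candidate text features
        gate = sigmoid((window * V).sum(axis=0) + c)  # gate unit in (0, 1)
        out[i] = feat * gate                          # gate decides what passes on
    return out
```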
S7, sentence-level attention mechanism step: through the attention mechanism Formulas 4 and 5, each word feature vector in the output X_{layer L} = [x_{L1}, x_{L2}, …, x_{Ln}] of the convolutional neural network of step S6 is assigned a different attention weight, where x_{Li} is the i-th word feature vector.

$$\alpha_i=\frac{\exp\!\left(a^{\top}x_{Li}\right)}{\sum_{j=1}^{n}\exp\!\left(a^{\top}x_{Lj}\right)} \qquad \text{(Formula 4)}$$

$$s=\sum_{i=1}^{n}\alpha_i\,x_{Li} \qquad \text{(Formula 5)}$$

where a is the attention vector, α_i is the attention weight assigned to the word feature vector x_{Li}, n is the manually set sentence length, and s is the sentence feature vector obtained as the attention-weighted sum of the word feature vectors. Because a multi-attention mechanism is used, there are several attention vectors, each examining the relevance of the word feature vectors to classification from a different angle, and each attention vector produces a corresponding sentence feature vector. Finally, all sentence feature vectors are concatenated into one vector S as the output of the sentence-level attention step.
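For illustration, a numpy sketch of Formulas 4 and 5 with k attention heads; in the patent the attention vectors are learned parameters, so the matrix A below is a placeholder:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())              # shift for numerical stability
    return e / e.sum()

def multi_attention(X_L, A):
    """X_L: (n, d) final CNN output [x_L1, ..., x_Ln];
    A: (k, d) matrix whose k rows each play the role of one attention vector a.
    Returns S, the concatenation of the k sentence feature vectors."""
    heads = []
    for a in A:
        alpha = softmax(X_L @ a)         # Formula 4: one weight per word
        heads.append(alpha @ X_L)        # Formula 5: weighted sum, shape (d,)
    return np.concatenate(heads)         # output S of the attention step, (k*d,)
```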
S8, sentence-level fully-connected step: the output vector S of the S7 sentence-level attention step is linearly transformed through a fully-connected neural network (Formula 6) to obtain m sentence feature vectors of dimension d.

$$y_s = S\,W_s + b_s \qquad \text{(Formula 6)}$$

where W_s is a linear transformation matrix, b_s is a bias, and y_s is the d-dimensional sentence feature vector.
S9, document-level part: the m sentence feature vectors of dimension d obtained in step S8 are concatenated into one vector as the input of the document-level part. The document-level part operates exactly like the sentence-level part, comprising a document-level convolutional neural network step, a document-level attention mechanism step and a document-level fully-connected step, and finally outputs the document feature vector, which is not described in detail here.
S10, classification step: the document feature vector D obtained in step S9 is linearly transformed by Formula 7, and the softmax function of Formula 8 then generates the probability of each class in the class set C.

$$y_c = D\,W_c + b_c \qquad \text{(Formula 7)}$$

$$P_c=\frac{\exp(y_c)}{\sum_{c'\in C}\exp(y_{c'})} \qquad \text{(Formula 8)}$$

where W_c and b_c are learned parameters and P_c is the probability that the text belongs to class c; the class with the highest probability is the classification result of the text. In the model training stage, the Adam gradient descent method is used to update the model weight parameters. The data of this example are the power complaint work order texts accepted over the whole of 2018 by the 95598 hotline of a power supply bureau; their characteristics are shown in the table below.
Table 1. Data set characteristics

Name                                  Classes   Training set size   Test set size
2018 power supply bureau complaints   5         1200                400
S2, text word segmentation step: the text "The customer reports that multiple households have no electricity" is converted into the word sequence: customer / reports / multiple households / no electricity.
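The patent does not name the Python Chinese word segmentation component; jieba is a common choice and is assumed in this sketch, and the Chinese sentence is a hypothetical reconstruction of the example:

```python
import jieba

text = "客户反映多户没电"          # "The customer reports multiple households have no electricity"
words = list(jieba.cut(text))     # step S2: text -> word sequence
print(words)                      # e.g. ['客户', '反映', '多户', '没电']
```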
S3, word vector training step: the word sequence customer / reports / multiple households / no electricity is trained into the corresponding d-dimensional word vectors, as shown in Table 2. In the example d is 128.

[Table 2 (image in the original): the word vectors corresponding to the words]
S4, sentence splitting step: the text "In the last 2 months there have been 3 power outages. The customer reports that multiple households have no electricity. The cause of the outages needs to be verified." is split into three sentences: 1) In the last 2 months there have been 3 power outages. 2) The customer reports that multiple households have no electricity. 3) The cause of the outages needs to be verified.
S5, word vector conversion step: since the 3 word sequences obtained in step S4 have different lengths, they are unified to length n, padding with the placeholder pad where a sequence is shorter than n. In the example n is 12; the result is shown in Table 3.

1) last | 2 | months | within | occurred | 3 | times | power outage | pad | pad | pad | pad
2) customer | reports | multiple households | no electricity | pad | pad | pad | pad | pad | pad | pad | pad
3) needs | verification | power outage | cause | pad | pad | pad | pad | pad | pad | pad | pad

Table 3. Unifying the sentence length
All words are then converted into the corresponding word vectors trained in S3, finally yielding 3 two-dimensional matrices of shape 12 × 128 for the subsequent training of the neural network.
S6, sentence-level convolutional neural network step: the 3 two-dimensional matrices of shape 12 × 128 obtained through step S5 are fed into an L-layer sentence-level convolutional neural network. A convolution window W ∈ R^{h×d} extracts the text features, where h is the size of the convolution window and d is the dimension of the word vectors; in the example h is 3 and L is 5. Thus text features with a dependency distance of 1 can be captured at the 2nd layer of the convolutional neural network, and text features with a dependency distance of 3 at the 3rd layer; by analogy, at the last layer, i.e. the 5th layer, text features with a dependency distance of 7 can be captured.
S7, sentence-level attention mechanism step: different words in a text differ in importance, and the attention mechanism can assign a different attention weight to each word feature vector. For example, in the word sequence customer / reports / multiple households / no electricity, the weight of "no electricity" will be higher; the specific weight assignment is shown in Table 4:

Word     customer   reports   multiple households   no electricity
Weight   0.03       0.01      0.14                  0.82

Table 4. Attention weights assigned to the words
Because a multi-attention mechanism is used, there are several attention vectors, each examining the relevance of the word feature vectors to classification from a different angle, and each attention vector produces a corresponding sentence feature vector. Finally, all sentence feature vectors are concatenated into one vector as the output of the sentence-level attention step.
S8, sentence-level fully-connected step: the output of the S7 sentence-level attention step is linearly transformed through a fully-connected neural network to obtain 3 sentence feature vectors of dimension 128.
S9, document-level part: the 3 sentence feature vectors of dimension 128 obtained in step S8 are concatenated into one vector as the input of the document-level part. The document-level part operates exactly like the sentence-level part, comprising a document-level convolutional neural network step, a document-level attention mechanism step and a document-level fully-connected step, and finally outputs the document feature vector, which is not described in detail here.
S10, classification step: the document feature vector obtained in step S9 is linearly transformed, and the softmax function then generates the probability of each class in the class set C; the class with the highest probability is the classification result of the text, as shown in Table 5.

[Table 5 (image in the original): the classification results, i.e. the probability of each class]
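A minimal numpy sketch of this classification step, i.e. Formula 7 followed by the softmax of Formula 8; the class names and weights below are placeholders, not trained values from the patent:

```python
import numpy as np

def classify(D, Wc, bc, classes):
    """Step S10: linear transform (Formula 7), then softmax (Formula 8)."""
    y = D @ Wc + bc                      # Formula 7: one score per class in C
    e = np.exp(y - y.max())              # shift scores for numerical stability
    P = e / e.sum()                      # Formula 8: probability of each class
    return classes[int(P.argmax())], P   # highest-probability class is the result

# Hypothetical usage: a 128-d document vector and 5 placeholder class names.
rng = np.random.default_rng(0)
D, Wc, bc = rng.normal(size=128), rng.normal(size=(128, 5)), np.zeros(5)
label, probs = classify(D, Wc, bc, ["billing", "outage", "service", "meter", "other"])
```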
To examine the effect of the classification algorithm of this embodiment, the following comparative experiment was also designed. The hardware configuration of the experimental environment was 4 GB of RAM and an Nvidia GeForce GTX 970M with 3 GB of video memory, and the experimental framework was TensorFlow 1.1.0.
The classification algorithm of this embodiment achieves better results than the other algorithms on the power supply bureau's 2018 95598 power complaint data set.
The above description is only an embodiment of the present invention and is not intended to limit the scope of the invention; all equivalent structural or process modifications made using the contents of this specification and the accompanying drawings, whether applied directly or indirectly in other related technical fields, are likewise included in the scope of patent protection of the present invention.

Claims (4)

1. A work order text classification algorithm based on a convolutional neural network and a multi-attention mechanism, characterized by comprising the following steps:
S1, a training set acquisition step, which comprises obtaining in advance a training set file for text classification, the training set file comprising power complaint work order texts and the corresponding labeled complaint category labels;
S2, a text word segmentation step, which comprises segmenting each text obtained in step S1 with a Python Chinese word segmentation component, converting each text into a word sequence;
S3, a word vector training step, which comprises performing unsupervised training on the word sequences obtained in step S2 with the skip-gram algorithm of the word2vec component in the gensim library, obtaining the word vector corresponding to each word;
S4, a sentence splitting step, which comprises splitting the word sequence obtained in step S2 into sentences, each of which is a word sequence of its own;
S5, a word vector conversion step, which comprises converting each word in the word sequences obtained in S4 into the corresponding word vector trained in S3;
S6, a sentence-level convolutional neural network step, which comprises using the 2-dimensional matrices obtained in step S5 as the first layer of the sentence-level convolutional neural network;
S7, a sentence-level attention mechanism step, which comprises assigning, through the attention mechanism formulas, a different attention weight to each word feature vector in the final output of the convolutional neural network obtained in step S6;
S8, a sentence-level fully-connected step, which comprises linearly transforming the output vector S of the S7 sentence-level attention step through a fully-connected neural network to obtain the sentence feature vectors;
S9, a document processing step, which comprises concatenating the sentence feature vectors obtained in step S8 into one vector as the input of the document-level part;
S10, a classification step, which comprises linearly transforming the document feature vector obtained in step S9 and then generating the probability of each class in the class set C with a softmax function.
2. The work order text classification algorithm based on a convolutional neural network and a multi-attention mechanism according to claim 1, wherein performing unsupervised training on the word sequences obtained in step S2 with the skip-gram algorithm of the word2vec component in the gensim library to obtain the word vector corresponding to each word comprises:
performing unsupervised training on the word sequences obtained in step S2 with the skip-gram algorithm of the word2vec component of the gensim library in Python to obtain a word vector of dimension d for each word, wherein the principle of the skip-gram algorithm is to train the word vectors of the words w_1, w_2, …, w_N by maximizing the average log probability of Formula 1, where N is the number of distinct words in the training set,

$$\frac{1}{N}\sum_{t=1}^{N}\sum_{-c\le j\le c,\ j\ne 0}\log p(w_{t+j}\mid w_t) \qquad \text{(Formula 1)}$$

where c is the context range of the word w_t, covering the c words before and the c words after w_t in the word sequence, and t ∈ (1, N);

$$p(w_{t+j}\mid w_t)=\frac{\exp\!\left(e(w_{t+j})^{\top}e(w_t)\right)}{\sum_{w=1}^{N}\exp\!\left(e(w)^{\top}e(w_t)\right)} \qquad \text{(Formula 2)}$$

where e(w_t) is the word vector corresponding to w_t, the word vectors of all words being initialized randomly and the vector parameters then updated iteratively by gradient descent until the objective (Formula 1) converges.
3. The work order text classification algorithm based on a convolutional neural network and a multi-attention mechanism according to claim 1, wherein using each 2-dimensional matrix obtained in step S5 as the first layer of the sentence-level convolutional neural network comprises:
using a convolution window matrix W ∈ R^{h×d}, where h is the size of the convolution window and d is the dimension of the word vectors, to extract text features from X_{layer 1} by Formula 3, the result X_{layer 2} again being m 2-dimensional matrices of shape n × d, and so on up to layer L, whose output X_{layer L} serves as the final output of the convolutional neural network,

$$X_{layer\,l+1}=(X_{layer\,l}*W+b)\otimes\sigma(X_{layer\,l}*V+c) \qquad \text{(Formula 3)}$$

where W ∈ R^{h×d} is the convolution window matrix, b ∈ R^d and c ∈ R^d are biases, σ is a nonlinear activation function, and V ∈ R^{h×d} is the gate unit matrix.
4. The work order text classification algorithm based on a convolutional neural network and a multi-attention mechanism according to claim 1, wherein assigning through the attention mechanism formulas a different attention weight to each word feature vector in the final output of the convolutional neural network obtained in step S6 comprises:
assigning, by Formulas 4 and 5, a different attention weight to each word feature vector in the output X_{layer L} = [x_{L1}, x_{L2}, …, x_{Ln}] of the convolutional neural network of step S6, where x_{Li} is the feature vector of the i-th word,

$$\alpha_i=\frac{\exp\!\left(a^{\top}x_{Li}\right)}{\sum_{j=1}^{n}\exp\!\left(a^{\top}x_{Lj}\right)} \qquad \text{(Formula 4)}$$

$$s=\sum_{i=1}^{n}\alpha_i\,x_{Li} \qquad \text{(Formula 5)}$$

where a is the attention vector, α_i is the attention weight assigned to the word feature vector x_{Li}, n is the manually set sentence length, and s is the sentence feature vector obtained as the attention-weighted sum of the word feature vectors.
CN201911147815.1A 2019-11-21 2019-11-21 Work order text classification algorithm based on convolutional neural network and multi-attention mechanism Pending CN111274386A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911147815.1A CN111274386A (en) 2019-11-21 2019-11-21 Work order text classification algorithm based on convolutional neural network and multi-attention mechanism

Publications (1)

Publication Number Publication Date
CN111274386A true CN111274386A (en) 2020-06-12

Family

ID=71002926

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911147815.1A Pending CN111274386A (en) 2019-11-21 2019-11-21 Work order text classification algorithm based on convolutional neural network and multi-attention mechanism

Country Status (1)

Country Link
CN (1) CN111274386A (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107895051A (en) * 2017-12-08 2018-04-10 宏谷信息科技(珠海)有限公司 A kind of stock news quantization method and system based on artificial intelligence
US20190188277A1 (en) * 2017-12-18 2019-06-20 Fortia Financial Solutions Method and device for processing an electronic document
CN108388554A (en) * 2018-01-04 2018-08-10 中国科学院自动化研究所 Text emotion identifying system based on collaborative filtering attention mechanism
CN108681537A (en) * 2018-05-08 2018-10-19 中国人民解放军国防科技大学 Chinese entity linking method based on neural network and word vector
CN109558487A (en) * 2018-11-06 2019-04-02 华南师范大学 Document Classification Method based on the more attention networks of hierarchy
CN109902174A (en) * 2019-02-18 2019-06-18 山东科技大学 A kind of feeling polarities detection method of the memory network relied on based on aspect

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Haizhou Du et al.: "Hierarchical Gated Convolutional Networks with Multi-Head Attention for Text Classification" *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114332872A (en) * 2022-03-14 2022-04-12 四川国路安数据技术有限公司 Contract document fault-tolerant information extraction method based on graph attention network
CN114332872B (en) * 2022-03-14 2022-05-24 四川国路安数据技术有限公司 Contract document fault-tolerant information extraction method based on graph attention network
CN115731243A (en) * 2022-11-29 2023-03-03 北京长木谷医疗科技有限公司 Spine image segmentation method and device based on artificial intelligence and attention mechanism
CN115731243B (en) * 2022-11-29 2024-02-09 北京长木谷医疗科技股份有限公司 Spine image segmentation method and device based on artificial intelligence and attention mechanism

Legal Events

Code   Title
PB01   Publication
SE01   Entry into force of request for substantive examination
RJ01   Rejection of invention patent application after publication (application publication date: 20200612)