CN112800756B - Entity identification method based on PRADO - Google Patents
Entity identification method based on PRADO
- Publication number
- CN112800756B CN202011334119.4A
- Authority
- CN
- China
- Prior art keywords
- gate
- projection
- output
- word
- vector
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
Abstract
The invention relates to the technical field of computer networks, in particular to an entity identification method based on PRADO, which comprises the steps of: obtaining original data and performing word segmentation and labeling on the original data; at the PRADO layer, based on a projection Embedding model, constructing a projection network by using locality-sensitive hashing and converting each word in a sentence into a low-dimensional Embedding vector; extracting Embedding vector features by using the context-association property of the BiLSTM neural network; assigning different attention weights, by an attention mechanism, to the feature vectors acquired by the BiLSTM layer; and completing the task of sequence labeling by using the CRF. The invention adopts the LSH algorithm to construct the projection network so as to reduce the number of word-embedding parameters, and at the same time uses an attention mechanism to preserve the relation between each feature vector and the whole text, eliminating the hidden danger that the LSH algorithm cannot relate well to context.
Description
Technical Field
The invention relates to the technical field of computer networks, in particular to an entity identification method based on PRADO.
Background
In recent years, with the continuous development of internet technology, a large amount of data from all walks of life has appeared on the network. This data has high value, and how to efficiently acquire, store, analyze and apply it is a problem to be researched in the big-data era. The data include not only structured data that have already been organized, but also a large amount of unorganized unstructured and semi-structured data, which natural language processing technology can be used to process and classify. With the rapid growth of the total amount of internet information, the traditional semantic network is no longer suitable, and the appearance of the knowledge graph provides a new idea for solving this problem.
The extraction of entities and relations is an indispensable link in constructing a knowledge graph: the quality of the extracted entities and relations determines the quality of the graph. The technology is used not only in search engines but also in other industries, including the fields of medical treatment, education, securities investment and finance. In general, every field involves relations, whose existence provides the foundation for constructing a knowledge graph, from which the value of the knowledge graph can in turn be extracted.
An existing entity-relationship extraction model such as the Skip-Gram model predicts context word vectors from a selected target word vector: a word in the sequence is first selected as a reference point, and a sliding window is then used to pick other words near the reference point as labels, yielding a number of reference-point/label pairs that serve as input to the model. However, the vector dimensions trained by these conventional word-vector techniques are large, so the input parameters of the network are extremely numerous and training the model is extremely difficult.
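The reference-point/label pairing described above can be sketched as follows; the window size and sample tokens are illustrative choices, not taken from the patent.

```python
# Sketch of skip-gram training-pair generation: pick each word as a
# reference point and pair it with its neighbours inside a sliding window.

def skipgram_pairs(tokens, window=2):
    """Return (reference_point, label) pairs for a token sequence."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:                       # a word is never its own label
                pairs.append((center, tokens[j]))
    return pairs

pairs = skipgram_pairs(["the", "cat", "sat", "down"], window=1)
```

With a window of 1, each word is paired only with its immediate neighbours, so "cat" yields the pairs ("cat", "the") and ("cat", "sat").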
Disclosure of Invention
In order to reduce the size of the parameters in the Embedding stage, cutting the number of parameters on the premise of ensuring comprehensive word-vector description information so that training of the model becomes simpler and lighter, the invention provides an entity identification method based on PRADO, which specifically comprises the following steps, as shown in FIG. 1:
acquiring original data, and performing word segmentation and labeling processing on the original data;
at the PRADO layer, based on a projection Embedding model, constructing a projection network by using locality-sensitive hashing and converting each word in a sentence into a low-dimensional Embedding vector;
extracting Embedding vector features by using the context-association property of the BiLSTM neural network;
assigning different attention weights, by an attention mechanism, to the feature vectors acquired by the BiLSTM layer;
and completing the task of sequence labeling by using the CRF.
Further, the process of converting each word in the sentence into a low-dimensional Embedding vector includes:
using a projection matrix P generated from initial random numbers to project the binary hash vector Bi of the word, obtaining a d-dimensional vector êi;
applying an activation function to êi to obtain the low-dimensional Embedding vector ei of the word.
Further, the process of optimizing the projection matrix P generated by using an initial random number includes:
comparing the final output result of the model with the actual value, performing a back-propagation algorithm, and adaptively updating the projection matrix P through gradient checking.
wherein êi^(k) = sgn(⟨Bi, Pk⟩) = sgn(|Bi||Pk| cos θk), k = 1, 2, ..., d; Pk is the kth projection function, θk represents the angle between the vector Bi and the vector Pk, and êi is the projection of Bi.
further, the ith word is a low-dimensional Embedding word list eiExpressed as:
wherein, WpA weight parameter for the projection network; b ispIs the bias parameter of the projection network.
Further, assigning different attention weights to the feature vectors obtained from the projection layer by an attention mechanism includes:
αi,t′ = exp(ei,t′) / Σ_{t″=1}^{Tx} exp(ei,t″);
wherein αi,t′ indicates how much attention the generated result yi needs to put on et′, i.e. the attention weight factor; ei,t′ is an auxiliary parameter ensuring that the sum of the weights is 1; yi is the output result; and Tx is the length of the input sequence.
Further, the context-association property of the BiLSTM neural network is used to extract the Embedding vector features; that is, at each moment the data to be deleted are removed, the newly added content is added, the memory cell is updated, and the data of the current moment are output. The BiLSTM neural network comprises a forgetting gate, an input gate and an output gate: the forgetting gate selects which information is discarded and which is kept in the memory cell, the input gate updates the control factor and the content, and the output gate determines the final output content. The forgetting gate is expressed as:
Γf=σ(Wf[a<t-1>,x<t>,c<t-1>]+bf);
the input gates are represented as:
Γu=σ(Wu[a<t-1>,x<t>,c<t-1>]+bu);
the output gate is represented as:
Γo=σ(Wo[a<t-1>,x<t>,c<t-1>]+bo);
a<t>=Γo*c<t>;
wherein Γf is the factor of the forgetting gate, Wf is the weight of the forgetting gate, and bf is the bias value of the forgetting gate; a<t-1> is the activation value of the previous moment; c<t-1> is the value of the memory cell at the previous moment; Γu is the factor of the input gate, Wu is the weight of the input gate, and bu is the bias value of the input gate; c̃<t>=tanh(Wc[a<t-1>,x<t>]+bc) is the content to be newly added; c<t>=Γu*c̃<t>+Γf*c<t-1> is the memory cell at the current moment; x<t> is the t-th input parameter; Γo is the factor of the output gate, Wo is the weight of the output gate, and bo is the bias value of the output gate; and bc is the bias value corresponding to c̃<t>.
Further, completing the task of sequence labeling by using the CRF includes:
P(y|x) = (1/Z(x)) · exp( Σ_{i,k} λk·tk(yi-1, yi, x, i) + Σ_{i,l} μl·sl(yi, x, i) );
wherein the transition term represents the transition probability from label yi-1 to label yi; the emission term is the score that the prediction result is the yi-th label; Z(x) is a normalization factor; tk and sl are characteristic functions; and λk and μl are weight parameters.
According to the entity recognition model provided by the invention, the idea of the PRADO algorithm is borrowed at the word-embedding layer, and a projection network is constructed with the LSH algorithm, so that the number of word-embedding parameters is reduced; at the same time, an attention mechanism preserves the relation between each feature vector and the whole text, eliminating the hidden danger that the LSH algorithm cannot relate well to context. The BiLSTM layer then uses the network's strong association within a local scope, so that the trained result is better associated with both the whole text and its local parts. Finally, the task of sequence labeling is completed at the CRF layer, and throughout the whole model the weight parameters of each layer are continuously adjusted through a back-propagation mechanism.
Drawings
FIG. 1 is a flow chart of a PRADO-based entity identification method of the present invention;
FIG. 2 is a schematic diagram of the PRADO-BilSTM-CRF model employed in the present invention;
FIG. 3 is a schematic structural diagram of an attention model employed in the present invention;
FIG. 4 is a schematic structural diagram of a BilSTM model employed in the present invention;
FIG. 5 is a schematic diagram of an LSTM cell unit according to the present invention;
FIG. 6 is a schematic view of a CRF structure according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments in the present invention, belong to the protection scope of the present invention.
The invention provides a PRADO-based entity identification method, as shown in figure 1, which specifically comprises the following steps:
acquiring original data, and performing word segmentation and labeling processing on the original data;
on the PRADO layer, based on a projection Embedding model, constructing a projection network by using locality-sensitive hashing and converting each word in a sentence into a low-dimensional Embedding vector;
extracting Embedding vector features by using the context-association property of the BiLSTM neural network;
assigning different attention weights, by an attention mechanism, to the feature vectors acquired by the BiLSTM layer;
and completing the task of sequence labeling by using the CRF.
As shown in fig. 2, the original data first undergo operations such as word segmentation and labeling and are then fed into the PRADO layer. This layer uses the idea of a projection Embedding model, builds a projection network with locality-sensitive hashing (LSH), and converts each word in a sentence into a low-dimensional Embedding vector; an attention mechanism then assigns different attention weights to the feature vectors, eliminating the defect that the LSH algorithm cannot relate to the whole text. The second layer is the BiLSTM layer, which uses the context-association property of the BiLSTM neural network to extract the Embedding vector features, remedying the first layer's shortcoming that the LSH cannot fully consider preceding and following relations. The third layer is the CRF layer, which completes the task of sequence labeling. Next, this embodiment describes in detail how the model is used in each layer.
(I) PRADO
In the traditional embedding concept, assume that the input text has T tokens or words, and wi represents the ith word, where i ∈ {0, 1, ..., T-1}. If V is the number of words in the vocabulary, including the out-of-vocabulary token representing all missing words, then each word wi is mapped to a one-hot vector δi ∈ V. In most linguistic neural networks, words are mapped to fixed-length d-dimensional vectors using an embedding layer with trainable parameters W ∈ R^(d·V): ei = W·δi, where ei ∈ R^d is the word vector. Since most parameters in the network come mainly from the word vectors trained through W, and a word-vector matrix that describes words in detail requires a complete vocabulary V, the dimension of V must be particularly large; only then do the trained word vectors perform relatively well. However, when the dimension of V is large, the dimension of W is also large, so the number of parameters of the whole neural network is extremely large and training the network is particularly difficult. Therefore, in the Embedding stage, a projection-Embedding mode is proposed for training the word vectors, so as to reduce the number of network parameters and make network training faster.
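As a rough illustration of the lookup ei = W·δi and of why the parameter count scales with d·V, here is a toy sketch; the sizes V and d are made-up examples, far smaller than a real vocabulary.

```python
# Toy illustration of the conventional embedding lookup e_i = W . delta_i:
# with a one-hot delta_i over a vocabulary of size V, the trainable matrix
# W has d*V entries, which is the quantity PRADO tries to shrink.
import random

V, d = 1000, 50                 # illustrative vocabulary size, embedding dim
random.seed(0)
W = [[random.random() for _ in range(V)] for _ in range(d)]

def embed(word_index):
    """Multiplying W by a one-hot vector just selects one column of W."""
    return [row[word_index] for row in W]

e = embed(42)                   # d-dimensional vector for word 42
n_params = d * V                # every entry of W is trainable
```

Even at these toy sizes W already holds 50,000 trainable values; at realistic sizes (V in the hundreds of thousands, d in the hundreds) the count reaches tens of millions, which is the difficulty the passage describes.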
In the Embedding stage, if the dimension of the trained W is too large, the representation of the word vectors is complete but the parameters trained by the network explode; if the dimension is too small, the description of the word vectors is inaccurate and the network cannot be trained correctly. The mode adopted by PRADO is therefore a compromise: by using a projection network, a word does not need to be represented with particular accuracy, and it is only necessary that the trained word vector describe the attributes of the word to a certain extent. For example, in entity classification, the specific differences between Chongqing University and Chongqing University of Posts and Telecommunications do not need to be known; it is enough to understand that both refer to universities. That is, in some specific fields, the exact meaning of an entity's designation need not be completely known, only the class to which the entity belongs.
In this embodiment, a basic projection model is constructed using locality-sensitive hashing (LSH). The size and precision of the word vectors trained by the traditional word2vec method depend mainly on the dimension of the vocabulary; LSH, as a dimensionality-reduction technique from clustering algorithms, can control the dimension and sparsity of the word vectors more independently, so that vocabularies that would otherwise need high-dimensional representation can have their dimension held within a certain range, reducing the number of parameters and producing compact embeddings, thereby optimizing the training effect of the whole model. The main steps are as follows:
1. For each wi in the input text, iteratively perform binary hashing to obtain a hash vector Bi; here, assume that max(i) = N;
2. Using the projection matrix P generated from initial random numbers (P can be optimally adjusted through the back-propagation mechanism), convert Bi into êi as shown in equation (1), obtaining a d-dimensional vector:
êi^(k) = sgn(⟨Bi, Pk⟩), k = 1, 2, ..., d (1)
This yields a d-dimensional vector representation, and each dimension corresponds to one projection vector Pk, k = 1, 2, ..., d.
The projected vector is then passed through the projection network, ei = σ(Wp·êi + Bp), wherein Wp and Bp represent the weights and biases of the projection network respectively. From the above formulas there are N × d values in total, which are mapped into N d-dimensional word-embedding vectors ei, giving the feature-vector matrix (e1, e2, ..., en-1, en).
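Since the patent's equation (1) survives only as a figure, the following is a hedged sketch of a random-hyperplane LSH projection consistent with the angle-based description above: each output dimension takes the sign of the dot product between the word's binary hash vector and one random projection vector. The vector sizes, seed and hash vector are illustrative assumptions.

```python
# Minimal random-hyperplane (LSH) projection sketch: output dimension k is
# sign(<B_i, P_k>), i.e. which side of the random hyperplane P_k the hash
# vector B_i falls on (equivalently, the sign of cos(theta_k)).
import random

def project(b, P):
    """Map an N-dim hash vector b to a d-dim vector of +/-1 signs."""
    out = []
    for p_k in P:                              # one row of P per output dim
        dot = sum(x * y for x, y in zip(b, p_k))
        out.append(1.0 if dot >= 0 else -1.0)
    return out

random.seed(1)
N, d = 8, 3                                    # illustrative sizes
P = [[random.gauss(0, 1) for _ in range(N)] for _ in range(d)]
b = [1, 0, 1, 1, 0, 0, 1, 0]                   # toy binary hash vector
e_hat = project(b, P)                          # d-dimensional projection
```

The appeal is that P is random and cheap: the d output dimensions can be chosen independently of the vocabulary size, which is how the parameter reduction described above is obtained.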
Through this method, a feature-vector representation can be obtained that is compressed relative to the traditional word-embedding method, without a one-hot vector having to describe each token in great detail; meanwhile, the coarse granularity of word segmentation keeps the dimensions N and d within a relatively small range, so that fewer parameters enter the neural network and the network runs faster. However, because the LSH algorithm is used, the feature vectors obtained by this training can only describe a word within a certain range and cannot relate well to the preceding and following context. Therefore, before the feature vectors obtained at this stage are fed into the BiLSTM model, they are processed by means of an attention mechanism to reduce the disadvantages of the LSH algorithm, as shown in fig. 3, specifically comprising the following steps:
1. Use αi,t′ to indicate how much attention the generated result yi needs to put on et′, satisfying the constraint that the weights sum to 1; to ensure this, softmax is used, introducing an auxiliary score ei,t′ such that:
αi,t′ = exp(ei,t′) / Σ_{t″=1}^{Tx} exp(ei,t″);
2. The above formula requires calculating ei,t′, so a simple neural network model is established, and ei,t′ is then computed using a gradient-descent algorithm;
3. The result Y = {y1, y2, ..., yn-1, yn} output in the above steps, together with the eigenvector matrix E = (e1, e2, ..., en-1, en), is taken as input to the BiLSTM; by means of the BiLSTM the network better captures the preceding and following related characteristics of the passage, improving the accuracy of the final output.
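The softmax weighting above can be sketched as follows; the scores and feature vectors are arbitrary illustrative numbers, and `attend` shows how the resulting weights combine the feature vectors.

```python
# Softmax attention weighting: positive weights that sum to 1, larger
# auxiliary scores receiving larger weights.
import math

def attention_weights(scores):
    """Softmax over auxiliary scores e_{i,t'}."""
    exps = [math.exp(s) for s in scores]
    z = sum(exps)
    return [x / z for x in exps]

def attend(vectors, scores):
    """Weighted sum of feature vectors under softmax attention."""
    alphas = attention_weights(scores)
    dim = len(vectors[0])
    return [sum(a * v[k] for a, v in zip(alphas, vectors)) for k in range(dim)]

alphas = attention_weights([1.0, 2.0, 3.0])
ctx = attend([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [1.0, 2.0, 3.0])
```

Because the weights are a softmax, the vector with score 3.0 dominates the combined context vector, which is exactly the "different attention weights" behaviour the text describes.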
(II) BiLSTM layer
In the field of natural language processing, the named entity recognition problem is often expressed as a sequence model, and there are two problems when using a standard neural network model. The first is that different text sequences have different lengths, so a model whose input and output are fixed cannot treat a changed sequence the same way as the previous one; its inputs and outputs are not equal across sequences. The second is that, due to the particularity of text sequences, the context information of a sequence is related, but an ordinary model cannot relate the context of the sequence. The general neural network is thus at a natural disadvantage in solving sequence problems, and the recurrent neural network (RNN) was proposed to solve the sequence-model problem.
In the task of named entity recognition, compared with the early rule-matching and machine-learning methods, using a recurrent neural network model can obviously improve the accuracy of entity recognition. However, because the model is large, two defects generally exist: (1) in the back-propagation process, due to the sequential nature of the RNN model, its many hidden layers and the many weights of each layer, the problem of gradient explosion or vanishing is particularly likely to occur; (2) for longer sequences, the model is not good at capturing long-term dependencies across the sequence.
To solve the above problems, gate-control units are added to the hidden units of the RNN, namely long short-term memory (LSTM), changing the hidden layer of the RNN so that it can better capture deep connections and alleviate the vanishing-gradient problem. The main role of these gate structures is to control how much information flows through: in the training phase of the model, the RNN produces more and more intermediate data, and the gate structures decide which of these intermediate data are important and need to be retained, and which are relatively unimportant and can be discarded. The LSTM structure has three gates to control and regulate the transmission of information: a forgetting gate, an input gate and an output gate. But because the LSTM only remembers a text sequence in a single direction, the BiLSTM model is finally chosen to solve this problem.
The previous layer network obtained the word-vector part after data preprocessing, using a locality-sensitive hashing algorithm (LSH) in the projection network. Because that algorithm cannot sufficiently link two words that are relatively far apart, the BiLSTM network is needed to relate preceding and following words better. Next we input the word vectors into the constructed sequence-processing model.
The forward propagation formula of the LSTM model is as follows:
(1) Forgetting gate: the forgetting gate determines which information is discarded and which is kept in the memory cell. Its output value Γf lies between 0 and 1: the closer Γf is to 0, the more should be discarded; the closer Γf is to 1, the more should be kept. The forward-propagation formula of Γf is:
Γf=σ(Wf[a<t-1>,x<t>,c<t-1>]+bf) (7)
(2) Input gate: the input gate determines the newly updated content, which has two parts: the update control factor and the updated content. First is the update factor, i.e. the update gate Γu, whose value range is [0, 1]; different update values retain different amounts of information, with the importance of the information rising from low to high as the value goes from 0 to 1. The formula is:
Γu=σ(Wu[a<t-1>,x<t>,c<t-1>]+bu) (8)
The content to be newly added is:
c̃<t>=tanh(Wc[a<t-1>,x<t>]+bc) (9)
Finally, the memory cell c<t> at moment t is obtained by combining the update gate Γu, the forgetting gate Γf and the newly added value c̃<t>; the formula is:
c<t>=Γu*c̃<t>+Γf*c<t-1> (10)
(3) Output gate: the output gate determines the final output content, with value range [0, 1]; the final output content is expressed by the following formulas:
Γo=σ(Wo[a<t-1>,x<t>,c<t-1>]+bo) (11)
a<t>=Γo*c<t> (12)
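A minimal scalar sketch of one LSTM time step following the gate equations above; the weights are illustrative, the c<t-1> term inside each gate follows the patent's formulas, and the output a<t> = Γo*c<t> is taken exactly as written.

```python
# One LSTM time step: forget gate, input (update) gate, candidate content,
# memory-cell update, and output gate. Scalars stand in for vectors and
# matrices to keep the arithmetic readable.
import math

def sigma(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(a_prev, c_prev, x_t, W, b):
    """W[g] holds per-gate weights on [a_prev, x_t, c_prev]; b[g] the biases."""
    z = lambda g: W[g][0] * a_prev + W[g][1] * x_t + W[g][2] * c_prev + b[g]
    gf = sigma(z("f"))                          # forgetting gate, eq. (7)
    gu = sigma(z("u"))                          # input/update gate, eq. (8)
    go = sigma(z("o"))                          # output gate, eq. (11)
    c_tilde = math.tanh(W["c"][0] * a_prev + W["c"][1] * x_t + b["c"])
    c_t = gu * c_tilde + gf * c_prev            # memory-cell update
    a_t = go * c_t                              # a<t> = Gamma_o * c<t>, eq. (12)
    return a_t, c_t

W = {g: (0.5, 0.5, 0.1) for g in ("f", "u", "o")}
W["c"] = (0.5, 0.5)
b = {g: 0.0 for g in ("f", "u", "o", "c")}
a1, c1 = lstm_step(0.0, 0.0, 1.0, W, b)         # first step from zero state
```

With a zero previous state, the forget-gate term contributes nothing and the new cell is just the gated candidate content, matching equation (10) above.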
the main principle of the network is that data to be deleted is removed at each moment, then new content is added and memory cells are updated, and finally data at the current moment are output. In this layer, the main steps are as follows:
1. building the BiLSTM model, and inputting the word-vector matrix E = (e1, e2, ..., en-1, en) obtained in the first step into the BiLSTM model;
2. training network weights through a back propagation algorithm;
3. mitigating overfitting by using techniques such as Dropout and L2 regularization as required;
4. outputting a sentence-level feature vector matrix (y)1,y2,...,yn-1,yn)。
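The bidirectional pass behind the steps above can be sketched generically: run a recurrence left-to-right and right-to-left and pair the two hidden states per position. A toy decaying-sum recurrence stands in for the real LSTM cell here; it only illustrates the "Bi" wiring, not LSTM internals.

```python
# Bidirectional wrapper: forward states, backward states (computed on the
# reversed sequence, then re-reversed), and a per-position pairing of both.
def bidirectional(xs, step):
    fwd, h = [], 0.0
    for x in xs:                      # left-to-right pass
        h = step(h, x)
        fwd.append(h)
    bwd, h = [], 0.0
    for x in reversed(xs):            # right-to-left pass
        h = step(h, x)
        bwd.append(h)
    bwd.reverse()                     # align backward states with positions
    return list(zip(fwd, bwd))

# Toy cell: new state = 0.5 * old state + input.
feats = bidirectional([1.0, 2.0, 3.0], lambda h, x: 0.5 * h + x)
```

Each output position thus sees a summary of everything to its left and everything to its right, which is why the BiLSTM can relate a word to both preceding and following context.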
(III) CRF layer
Generally speaking, a softmax model could be selected to obtain the desired result directly. However, the sentence-level feature vectors obtained from the BiLSTM model may exhibit label bias, a problem the traditional softmax model handles poorly, so a CRF model is selected to obtain the optimal output over the global sequence; the effect is better than using the BiLSTM model alone or a softmax layer directly.
The LSTM output vector Y = {y1, y2, ..., yn-1, yn} of the previous layer is input into the model, and the constraints of the conditional probability distribution are combined with the input-output sequence to obtain the final result, reducing the error of the data. The specific principle is as follows:
first, let us set the output sequence Y of the BiLSTM layer as { Y ═ Y1,y2,...,yn-1,ynThe input sequence X ═ X for CRF1,x2,...,xn-1,xnThen let the correct notation sequence Y ═ Y1,y2,...,yn-1,ynAnd constructing a conditional probability P ═ y |, x, and the main formula is as follows:
whereinFor transfer matrix, representing slave label yi-1To yiThe transition probability of (a) is,is that the predicted result is yiScore of label, Z (x) is normalization factor, tkAnd siAs a characteristic function, muiAnd λkIs a weight parameter.
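A brute-force sketch of the linear-chain CRF idea described above: a tag sequence is scored by per-position emission scores plus transition scores A[y_{i-1}][y_i], then normalized by Z(x) summed over all sequences. The transition matrix and emission scores are illustrative numbers, not trained values, and the exhaustive enumeration of Z(x) is only feasible at toy sizes; a real implementation would use the forward algorithm.

```python
# Linear-chain CRF scoring and (brute-force) normalization over all tag
# sequences, so sequence probabilities sum to exactly 1.
import itertools, math

def seq_score(emissions, A, tags):
    """Emission scores per position plus transition scores between tags."""
    s = sum(emissions[i][t] for i, t in enumerate(tags))
    s += sum(A[a][b] for a, b in zip(tags, tags[1:]))
    return s

def crf_prob(emissions, A, tags):
    n_tags = len(A)
    n = len(emissions)
    Z = sum(math.exp(seq_score(emissions, A, y))            # Z(x)
            for y in itertools.product(range(n_tags), repeat=n))
    return math.exp(seq_score(emissions, A, tags)) / Z

emissions = [[2.0, 0.0], [0.0, 2.0]]   # 2 time steps, 2 tags (toy BiLSTM scores)
A = [[0.0, 1.0], [1.0, 0.0]]           # transitions favouring tag changes
p = crf_prob(emissions, A, (0, 1))
```

Because the transition matrix rewards tag changes here, the sequence (0, 1) gets the highest probability; this coupling between adjacent labels is what lets the CRF correct the label bias mentioned above.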
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.
Claims (4)
1. A PRADO-based entity identification method is characterized by comprising the following steps:
acquiring original data, and performing word segmentation and labeling processing on the original data;
on the PRADO layer, based on a projection Embedding model, a projection network is constructed by using locality-sensitive hashing, and each word in a sentence is converted into a low-dimensional Embedding vector, namely the method comprises the following steps:
a projection matrix P generated from initial random numbers is utilized, wherein the optimization of the projection matrix P comprises comparing the final output result of the model with the actual value, carrying out a back-propagation algorithm, and adaptively updating the projection matrix P through gradient checking;
and the projection matrix is used to project the binary hash vector Bi of the word, obtaining a d-dimensional vector êi, comprising:
êi^(k) = sgn(⟨Bi, Pk⟩) = sgn(|Bi||Pk| cos θk), k = 1, 2, ..., d;
wherein Pk is the kth projection function, θk represents the angle between the vector Bi and the vector Pk, and êi is the projection of Bi;
an activation function is applied to êi to obtain the low-dimensional Embedding vector ei of the word, expressed as:
ei = σ(Wp·êi + Bp);
wherein σ is the activation function, Wp is the weight parameter of the projection network, and Bp is the bias parameter of the projection network;
extracting Embedding vector features by using the context-association property of a BiLSTM neural network;
assigning different attention weights, by an attention mechanism, to the feature vectors acquired by the BiLSTM layer;
and completing the task of sequence labeling by using the CRF.
2. The PRADO-based entity recognition method of claim 1, wherein the assigning the feature vectors obtained from the projection layer with different attention weights by an attention mechanism method comprises:
wherein αi,t′ = exp(ei,t′) / Σ_{t″=1}^{Tx} exp(ei,t″); αi,t′ indicates how much attention the generated result yi needs to put on et′, i.e. the attention weight factor; ei,t′ is an auxiliary parameter ensuring that the sum of the weights is 1; yi is the output result; and Tx is the length of the input sequence.
3. The PRADO-based entity recognition method according to claim 1, wherein the context-association property of the BiLSTM neural network is used to extract the Embedding vector features, i.e., at each moment the data to be deleted are removed, new content is added, the memory cell is updated, and the data of the current moment are output; the BiLSTM neural network comprises a forgetting gate, an input gate and an output gate, wherein the forgetting gate selects the information to be discarded and kept in the memory cell, the input gate updates the control factor and the content, and the output gate determines the final output content; the forgetting gate is expressed as:
Γf=σ(Wf[a<t-1>,x<t>,c<t-1>]+bf);
the input gates are represented as:
Γu=σ(Wu[a<t-1>,x<t>,c<t-1>]+bu);
the output gate is represented as:
Γo=σ(Wo[a<t-1>,x<t>,c<t-1>]+bo);
a<t>=Γo*c<t>;
wherein Γf is the factor of the forgetting gate, Wf is the weight of the forgetting gate, and bf is the bias value of the forgetting gate; a<t-1> is the activation value of the previous moment; c<t-1> is the value of the memory cell at the previous moment; Γu is the factor of the input gate, Wu is the weight of the input gate, and bu is the bias value of the input gate; c̃<t>=tanh(Wc[a<t-1>,x<t>]+bc) is the content to be newly added; c<t>=Γu*c̃<t>+Γf*c<t-1> is the memory cell at the current moment; x<t> is the t-th input parameter; Γo is the factor of the output gate, Wo is the weight of the output gate, and bo is the bias value of the output gate; and bc is the bias value corresponding to c̃<t>.
4. The PRADO-based entity identification method of claim 1, wherein the task of performing sequence labeling by using CRF comprises:
the output sequence Y = {y1, y2, ..., yn-1, yn} of the BiLSTM layer is taken as the input sequence X = {x1, x2, ..., xn-1, xn} of the CRF;
the correct labeling sequence of the training network is y = {y1, y2, ..., yn-1, yn}, and the conditional probability P(y|x) is constructed, specifically:
P(y|x) = (1/Z(x)) · exp( Σ_{i,k} λk·tk(yi-1, yi, x, i) + Σ_{i,l} μl·sl(yi, x, i) ).
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011334119.4A CN112800756B (en) | 2020-11-25 | 2020-11-25 | Entity identification method based on PRADO |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112800756A CN112800756A (en) | 2021-05-14 |
CN112800756B true CN112800756B (en) | 2022-05-10 |
Family
ID=75806276
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011334119.4A Active CN112800756B (en) | 2020-11-25 | 2020-11-25 | Entity identification method based on PRADO |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112800756B (en) |
Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107194414A (en) * | 2017-04-25 | 2017-09-22 | Zhejiang University of Technology | Fast SVM incremental learning algorithm based on locality-sensitive hashing |
CN108628823A (en) * | 2018-03-14 | 2018-10-09 | Sun Yat-sen University | Named entity recognition method combining an attention mechanism with multi-task joint training |
EP3398115A1 (en) * | 2016-03-01 | 2018-11-07 | Google LLC | Compressed recurrent neural network models |
CN109902145A (en) * | 2019-01-18 | 2019-06-18 | Institute of Information Engineering, Chinese Academy of Sciences | Entity-relation joint extraction method and system based on an attention mechanism |
CN110263332A (en) * | 2019-05-28 | 2019-09-20 | East China Normal University | Neural-network-based natural language relation extraction method |
CN110825845A (en) * | 2019-10-23 | 2020-02-21 | Central South University | Hierarchical text classification method based on characters and a self-attention mechanism, and Chinese text classification method |
CN110832596A (en) * | 2017-10-16 | 2020-02-21 | Illumina, Inc. | Deep convolutional neural network training method based on deep learning |
WO2020093761A1 (en) * | 2018-11-05 | 2020-05-14 | Yangzhou University | Entity and relationship joint extraction method oriented to software bug knowledge |
CN111291556A (en) * | 2019-12-17 | 2020-06-16 | Donghua University | Chinese entity relation extraction method based on character- and word-feature fusion with entity sense items |
CN111522965A (en) * | 2020-04-22 | 2020-08-11 | Chongqing University of Posts and Telecommunications | Question-answering method and system for entity relationship extraction based on transfer learning |
CN111611775A (en) * | 2020-05-14 | 2020-09-01 | Shenyang Neusoft Xikang Medical Systems Co., Ltd. | Entity recognition model generation method, entity recognition method, apparatus and device |
CN111914097A (en) * | 2020-07-13 | 2020-11-10 | Jilin University | Entity extraction method and device based on attention mechanism and multi-level feature fusion |
- 2020-11-25: Application CN202011334119.4A filed in China (CN); granted as patent CN112800756B, legal status Active
Non-Patent Citations (5)
Title |
---|
PRADO: Projection Attention Networks for Document Classification On-Device; Kaliamoorthi Prabhu et al.; Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP); 2019-11-30; 5012-5021 * |
Quantization and training of neural networks for efficient integer-arithmetic-only inference; Jacob Benoit et al.; Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; 2018-12-31; 2704-2713 * |
The Prediction Model of Saccade Target Based on LSTM-CRF for Chinese Reading; Wan Xiaoming et al.; International Conference on Brain Inspired Cognitive Systems; 2018-07-31; 44-53 * |
Distributed data mining based on an improved random decision tree algorithm; Shi Hongjiao; Computer & Digital Engineering; 2017-09-20; Vol. 45, No. 9; 1802-1808 * |
A character recognition algorithm using preprocessing techniques and deep-learning feature fusion; Feng Wei; China Master's Theses Full-text Database, Information Science & Technology; 2019-01-15; No. 01; I138-5329 * |
Also Published As
Publication number | Publication date |
---|---|
CN112800756A (en) | 2021-05-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11631007B2 (en) | Method and device for text-enhanced knowledge graph joint representation learning | |
CN108733792B (en) | Entity relation extraction method | |
CN111476294B (en) | Zero-shot image recognition method and system based on a generative adversarial network | |
CN109902293B (en) | Text classification method based on local and global mutual attention mechanism | |
CN110929030B (en) | Joint training method for text summarization and sentiment classification | |
WO2021212749A1 (en) | Method and apparatus for labelling named entity, computer device, and storage medium | |
CN112579778B (en) | Aspect-level sentiment classification method based on multi-level feature attention | |
CN112270379A (en) | Training method of classification model, sample classification method, device and equipment | |
CN111506732B (en) | Text multi-level label classification method | |
CN110046223B (en) | Movie review sentiment analysis method based on an improved convolutional neural network model | |
CN113591483A (en) | Document-level event argument extraction method based on sequence labeling | |
CN115081437B (en) | Machine-generated text detection method and system based on linguistic feature contrast learning | |
CN110580287A (en) | Sentiment classification method based on transfer learning and ON-LSTM | |
CN111984791B (en) | Attention mechanism-based long text classification method | |
WO2023134083A1 (en) | Text-based sentiment classification method and apparatus, and computer device and storage medium | |
CN110276396B (en) | Image description generation method based on object saliency and cross-modal fusion features | |
CN113051914A (en) | Enterprise hidden-label extraction method and device based on multi-feature dynamic profiling | |
CN114186063A (en) | Training method and classification method for a cross-domain text sentiment classification model | |
CN114417851A (en) | Sentiment analysis method based on keyword-weighted information | |
CN112347245A (en) | Opinion mining method and device for institutions in the investment and financing field, and electronic device | |
CN114048314A (en) | Natural language steganalysis method | |
CN110569355A (en) | Joint opinion-target extraction and target sentiment classification method and system based on word blocks | |
CN113627550A (en) | Image-text sentiment analysis method based on multimodal fusion | |
CN113761885A (en) | Bayesian LSTM-based language identification method | |
CN116956228A (en) | Text mining method for technical transaction platform |
Legal Events
Code | Title |
---|---|
PB01 | Publication |
SE01 | Entry into force of request for substantive examination |
GR01 | Patent grant |