CN111090724B - Entity extraction method capable of judging relevance between text content and entity based on deep learning - Google Patents
- Publication number
- CN111090724B (Application CN201911148302.2A)
- Authority
- CN
- China
- Prior art keywords
- entity
- text
- attention
- correlation
- determined
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G06F16/3344 — Information retrieval; querying of unstructured textual data; query execution using natural language analysis
- G06F16/35 — Information retrieval of unstructured textual data; clustering; classification
- G06F18/241 — Pattern recognition; classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415 — Classification techniques based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Databases & Information Systems (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Probability & Statistics with Applications (AREA)
- Computational Linguistics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Machine Translation (AREA)
Abstract
The invention relates to the technical field of entity extraction, and in particular to an entity extraction method that judges the relevance between text content and an entity based on deep learning. The method comprises the following steps: represent the text as a sequence of n words containing m entities; encode the literal semantics of the text with an LSTM; concatenate the text's literal semantic representation with the average of the representations of all entities other than the one whose relevance is to be determined, producing the text-semantic context; apply an attention mechanism from the entity whose relevance is to be determined over the text-semantic context to obtain an attention vector; and compute a further representation of the text semantics from the attention vector. The end-to-end deep learning model avoids writing a large number of complicated rules, improving the model's generality, and avoids the heavy feature-engineering work of classical machine learning, speeding up model iteration and making the model easy to adapt.
Description
Technical Field
The invention relates to the technical field of entity extraction, and in particular to an entity extraction method that judges the relevance between text content and an entity based on deep learning.
Background
An entity, also called a "named entity," refers to an item with a specific meaning in text, mainly including persons, places, institutions, and the like. Named entity recognition aims to identify the entities occurring in a text and their types, and this technology is now well developed. However, named entity recognition does not indicate how strongly an entity is related to the article in which it appears. Entity relevance measures the degree of association between an entity and an article: an article may mention many entities, but not all of them are strongly related to it. In practice, one is often concerned only with the entities strongly related to the article, so finding and judging the strength of the correlation between an entity and an article is very important. At this stage there is little research on the relevance of entities to articles, and existing work is based only on rules and classical machine learning. The invention provides a deep learning network structure that solves the problem of strong versus weak entity-article relevance end to end, avoids the poor generality caused by hand-written rules, and performs feature selection automatically, thereby reducing the heavy feature-engineering work of classical machine learning and improving the iteration speed of the model.
Disclosure of Invention
The invention aims to provide an entity extraction method that judges the relevance of text content and an entity based on deep learning, so as to solve the problems described in the background section.
To achieve the above object, the present invention provides an entity extraction method for judging the relevance between text content and entities based on deep learning, comprising the following steps:
step one: indicating that text is composed of n wordsComposition, containing m entities [ E 1 ,E 2 ,E 3 ,…,E m ];
Step two: representing text literal semantics in step one using LSTM;
step three: splicing text literal semantic representations and averaging other entity representations except for the entity of the correlation to be determined, and finally generating text semantic context;
step four: carrying out attention mechanism operation on text semantic context by a correlation entity to be determined to obtain an attention vector;
step five: calculating further expression of text semantics according to the attention vectors in the step four, multiplying the attention vectors with elements of context respectively, and then adding to obtain text semantic expression aiming at entity attention:
step six: text semantic representation C based on entity attention in step five r And a correlation entity representation to be determined E m And (5) splicing the vector d, and sending the vector d into a classifier to finally obtain the probability of the correlation strength between the entity and the text.
Preferably, in step one, w is the word2vec vector of the corresponding word and E is the TransH representation of the corresponding entity.
Preferably, in step two, the LSTM is defined as follows: given the word vector w_k, the previous cell state c_{k-1}, and the previous hidden state h_{k-1}, the current cell state is c_k and the current hidden state is h_k; the LSTM network is:
i_k = σ(W_i^w · w_k + W_i^h · h_{k-1} + b_i)   (1)
f_k = σ(W_f^w · w_k + W_f^h · h_{k-1} + b_f)   (2)
o_k = σ(W_o^w · w_k + W_o^h · h_{k-1} + b_o)   (3)
c̃_k = tanh(W_c^w · w_k + W_c^h · h_{k-1} + b_c)   (4)
c_k = f_k ⊙ c_{k-1} + i_k ⊙ c̃_k   (5)
h_k = o_k ⊙ tanh(c_k)   (6)
where i, f, o are the input, forget, and output gates, σ is the activation function, and ⊙ is element-wise multiplication; the literal semantic representation of the text is H = [h_1, h_2, …, h_n].
Preferably, in step three, the averaging of the representations of the entities other than the entity whose relevance is to be determined is defined as follows: let E_m be the entity whose relevance is to be determined; the semantics of the other entities are averaged as
E_avg = (1 / (m − 1)) · Σ_{j≠m} E_j,
and the spliced text representation (the text-semantic context) concatenates each hidden state with this average: x_k = [h_k ; E_avg].
preferably, in the fourth step, the attention vector of the entity to the text is:
where γ is a attention score function, defined as:
W a ,b a is a weight matrix and an offset.
Preferably, in step six, d = C_r + E_m, and the classifier is
x = tanh(W_l · d + b_l),
where W_l and b_l are a weight matrix and an offset, respectively.
Compared with the prior art, the invention has the following beneficial effects:
1. In the entity extraction method that judges the relevance of text content and entities based on deep learning, the end-to-end deep learning model avoids writing a large number of complicated rules, improving the model's generality, and avoids the heavy feature-engineering work of classical machine learning, improving the iteration speed of the model and making it easy to adapt.
2. In the entity extraction method that judges the relevance of text content and entities based on deep learning, TransH is introduced to represent entity information, so the implicit relations between entities can be captured.
3. In the entity extraction method that judges the relevance of text content and entities based on deep learning, the entity-to-text attention mechanism extracts text information related to the entity more efficiently.
Drawings
FIG. 1 is a diagram of an algorithm architecture of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Referring to fig. 1, the present invention provides a technical solution:
the invention provides a method for extracting entities, which is based on deep learning and can judge the relativity between text contents and entities, and the method comprises the following steps:
step one: indicating that text is composed of n wordsComposition, containing m entities [ E 1 ,E 2 ,E 3 ,…,E m ]The method comprises the steps of carrying out a first treatment on the surface of the Where w is word2vec vector of the corresponding word and E represents the TransH representation of the corresponding entity. word2vec is a tool developed by Google in 2013 that converts words into vectors, which in this patent is used to convert words in text into corresponding word vectors. TransH is a distributed vector representation based on entities and relationships, acquiring dense representations of entities and relationships in a low-dimensional space, used in the present invention to captureSemantic relationships between entities and text.
step two: encode the literal semantics of the text from step one using an LSTM. LSTM (Long Short-Term Memory), originally proposed by Sepp Hochreiter and Jürgen Schmidhuber in 1997, is a specific form of RNN (recurrent neural network). It mainly aims to solve the problems of vanishing and exploding gradients when training on long sequences, and can perform better than an ordinary RNN on longer sequences; in this patent it is used to capture the literal semantics of the text. Given the word vector w_k, the previous cell state c_{k-1}, and the previous hidden state h_{k-1}, the current cell state is c_k and the current hidden state is h_k; the LSTM network is:
i_k = σ(W_i^w · w_k + W_i^h · h_{k-1} + b_i)   (1)
f_k = σ(W_f^w · w_k + W_f^h · h_{k-1} + b_f)   (2)
o_k = σ(W_o^w · w_k + W_o^h · h_{k-1} + b_o)   (3)
c̃_k = tanh(W_c^w · w_k + W_c^h · h_{k-1} + b_c)   (4)
c_k = f_k ⊙ c_{k-1} + i_k ⊙ c̃_k   (5)
h_k = o_k ⊙ tanh(c_k)   (6)
where i, f, o are the input, forget, and output gates, σ is the activation function, and ⊙ is element-wise multiplication; the literal semantic representation of the text is H = [h_1, h_2, …, h_n].
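A minimal sketch of an LSTM cell following the gate structure of equations (1)-(6); the weight shapes and random initialization are illustrative, since the text does not specify the model's dimensions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class LSTMCell:
    """Minimal LSTM cell: input gate i, forget gate f, output gate o,
    candidate cell state, cell update, and hidden state, matching the
    six equations of the text. Weights are random for illustration."""
    def __init__(self, in_dim, hid_dim, seed=0):
        rng = np.random.default_rng(seed)
        def mk():  # one (W^w, W^h, b) triple per gate
            return (rng.normal(scale=0.1, size=(hid_dim, in_dim)),
                    rng.normal(scale=0.1, size=(hid_dim, hid_dim)),
                    np.zeros(hid_dim))
        self.i, self.f, self.o, self.c = mk(), mk(), mk(), mk()

    def step(self, w_k, h_prev, c_prev):
        gate = lambda p, act: act(p[0] @ w_k + p[1] @ h_prev + p[2])
        i_k = gate(self.i, sigmoid)          # (1) input gate
        f_k = gate(self.f, sigmoid)          # (2) forget gate
        o_k = gate(self.o, sigmoid)          # (3) output gate
        c_tilde = gate(self.c, np.tanh)      # (4) candidate cell state
        c_k = f_k * c_prev + i_k * c_tilde   # (5) cell update
        h_k = o_k * np.tanh(c_k)             # (6) hidden state
        return h_k, c_k

# Run the cell over a short random word-vector sequence.
cell = LSTMCell(in_dim=4, hid_dim=3)
h, c = np.zeros(3), np.zeros(3)
for w_k in np.random.default_rng(1).normal(size=(5, 4)):
    h, c = cell.step(w_k, h, c)
```

Collecting the hidden states h_1 … h_n over the sequence yields the text's literal semantic representation H.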
step three: concatenate the text's literal semantic representation with the average of the representations of all entities except the entity whose relevance is to be determined, producing the text-semantic context. Let E_m be the entity whose relevance is to be determined; the semantics of the other entities are averaged as
E_avg = (1 / (m − 1)) · Σ_{j≠m} E_j,
and the spliced text representation (the text-semantic context) concatenates each hidden state with this average: x_k = [h_k ; E_avg].
step four: let the entity E_m whose relevance is to be determined, from step three, attend over the text-semantic context. The attention weight of the entity over each context row is obtained by normalizing the scores of an attention score function γ with softmax, where W_a and b_a are the weight matrix and offset of γ.
Step five: calculating further expression of text semantics according to the attention vectors in the step four, multiplying the attention vectors with elements of context respectively, and then adding to obtain text semantic expression aiming at entity attention:
step six: splice the entity-attention text-semantic representation C_r from step five with the representation E_m of the entity whose relevance is to be determined into a vector d (the formula reads d = C_r + E_m), and feed d into the classifier
x = tanh(W_l · d + b_l),
where W_l and b_l are a weight matrix and an offset, respectively. The probability of the correlation strength between the entity and the text is then obtained by passing x through a softmax function, where the number of classes C is 2, indicating whether the correlation is strong or weak. The softmax function, also known as the normalized exponential function, is a generalization of the logistic function: it "compresses" a K-dimensional vector z of arbitrary real numbers into a K-dimensional real vector y(z) whose elements lie in (0, 1) and sum to 1.
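A sketch of the step-six classifier; the text says "splicing" while the formula reads d = C_r + E_m, so element-wise addition is used here under the assumption that C_r and E_m share a dimension. All weights are random stand-ins.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())  # stable normalized exponential
    return e / e.sum()

def classify(C_r, E_m, W_l, b_l):
    """Combine C_r and E_m into d = C_r + E_m, apply
    x = tanh(W_l . d + b_l), and softmax over C = 2 classes
    (strong vs. weak relevance)."""
    d = C_r + E_m
    x = np.tanh(W_l @ d + b_l)
    return softmax(x)

rng = np.random.default_rng(0)
C_r = rng.normal(size=8)
E_m = rng.normal(size=8)
W_l = rng.normal(size=(2, 8))  # 2 output classes
b_l = np.zeros(2)
probs = classify(C_r, E_m, W_l, b_l)
```

The two entries of `probs` are the probabilities of strong and weak correlation between the entity and the text.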
The foregoing has shown and described the basic principles, principal features and advantages of the invention. It will be understood by those skilled in the art that the present invention is not limited to the above-described embodiments, and that the above-described embodiments and descriptions are only preferred embodiments of the present invention, and are not intended to limit the invention, and that various changes and modifications may be made therein without departing from the spirit and scope of the invention as claimed. The scope of the invention is defined by the appended claims and equivalents thereof.
Claims (3)
1. An entity extraction method capable of judging relevance between text content and an entity based on deep learning, comprising the following steps:
step one: represent the text as composed of n words [w_1, w_2, …, w_n], containing m entities [E_1, E_2, E_3, …, E_m];
step two: encode the literal semantics of the text from step one using an LSTM;
step three: concatenate the text's literal semantic representation with the average of the representations of all entities except the entity whose relevance is to be determined, producing the text-semantic context;
step four: apply an attention mechanism from the entity whose relevance is to be determined over the text-semantic context, obtaining an attention vector;
step five: compute a further representation of the text semantics from the attention vector of step four by multiplying the attention weights element-wise with the context rows and summing, obtaining the text-semantic representation under entity attention;
step six: splice the entity-attention text-semantic representation C_r from step five with the representation E_m of the entity whose relevance is to be determined into a vector d, and feed d into a classifier to finally obtain the probability of the correlation strength between the entity and the text;
in step three, the averaging of the representations of the entities other than the entity whose relevance is to be determined is defined as follows: let E_m be the entity whose relevance is to be determined; the semantics of the other entities are averaged as E_avg = (1 / (m − 1)) · Σ_{j≠m} E_j, and the spliced text representation concatenates each hidden state with this average: x_k = [h_k ; E_avg];
in step four, the attention vector of the entity over the text is obtained by normalizing the scores of an attention score function γ with softmax, where W_a and b_a are the weight matrix and offset of γ;
in step six, d = C_r + E_m, and the classifier is
x = tanh(W_l · d + b_l),
where W_l and b_l are a weight matrix and an offset, respectively.
2. The deep learning based entity extraction method for judging relevance of text content to an entity of claim 1, wherein: in the first step, w is word2vec vector of the corresponding word, and E represents the TransH representation of the corresponding entity.
3. The deep learning based entity extraction method for judging relevance of text content to an entity of claim 1, wherein: in step two, the LSTM is defined as: given the word vector w_k, the previous cell state c_{k-1}, and the previous hidden state h_{k-1}, the current cell state is c_k and the current hidden state is h_k; the LSTM network is:
i_k = σ(W_i^w · w_k + W_i^h · h_{k-1} + b_i)   (1)
f_k = σ(W_f^w · w_k + W_f^h · h_{k-1} + b_f)   (2)
o_k = σ(W_o^w · w_k + W_o^h · h_{k-1} + b_o)   (3)
c̃_k = tanh(W_c^w · w_k + W_c^h · h_{k-1} + b_c)   (4)
c_k = f_k ⊙ c_{k-1} + i_k ⊙ c̃_k   (5)
h_k = o_k ⊙ tanh(c_k)   (6)
where i, f, o are the input, forget, and output gates, σ is the activation function, and ⊙ is element-wise multiplication; the literal semantic representation of the text is H = [h_1, h_2, …, h_n].
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911148302.2A CN111090724B (en) | 2019-11-21 | 2019-11-21 | Entity extraction method capable of judging relevance between text content and entity based on deep learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111090724A CN111090724A (en) | 2020-05-01 |
CN111090724B true CN111090724B (en) | 2023-05-12 |
Family
ID=70393523
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911148302.2A Active CN111090724B (en) | 2019-11-21 | 2019-11-21 | Entity extraction method capable of judging relevance between text content and entity based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111090724B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112307761A (en) * | 2020-11-19 | 2021-02-02 | 新华智云科技有限公司 | Event extraction method and system based on attention mechanism |
CN113569559B (en) * | 2021-07-23 | 2024-02-02 | 北京智慧星光信息技术有限公司 | Short text entity emotion analysis method, system, electronic equipment and storage medium |
CN113743104B (en) * | 2021-08-31 | 2024-04-16 | 合肥智能语音创新发展有限公司 | Entity linking method, related device, electronic equipment and storage medium |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR102146261B1 (en) * | 2014-02-14 | 2020-08-20 | 삼성전자 주식회사 | Electronic Device And Method For Extracting And Using Semantic Entity In Conversation Message Of The Same |
CN107391706B (en) * | 2017-07-28 | 2020-06-23 | 湖北文理学院 | Urban tourism question-answering system based on mobile internet |
CN109408812A (en) * | 2018-09-30 | 2019-03-01 | 北京工业大学 | A method of the sequence labelling joint based on attention mechanism extracts entity relationship |
CN109522547B (en) * | 2018-10-23 | 2020-09-18 | 浙江大学 | Chinese synonym iteration extraction method based on pattern learning |
CN109902145B (en) * | 2019-01-18 | 2021-04-20 | 中国科学院信息工程研究所 | Attention mechanism-based entity relationship joint extraction method and system |
CN110110324B (en) * | 2019-04-15 | 2022-12-02 | 大连理工大学 | Biomedical entity linking method based on knowledge representation |
CN110196978A (en) * | 2019-06-04 | 2019-09-03 | 重庆大学 | A kind of entity relation extraction method for paying close attention to conjunctive word |
- 2019-11-21: CN application CN201911148302.2A filed; granted as patent CN111090724B (active)
Also Published As
Publication number | Publication date |
---|---|
CN111090724A (en) | 2020-05-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111783462B (en) | Chinese named entity recognition model and method based on double neural network fusion | |
CN110765775B (en) | Self-adaptive method for named entity recognition field fusing semantics and label differences | |
CN109582956B (en) | Text representation method and device applied to sentence embedding | |
CN110427461B (en) | Intelligent question and answer information processing method, electronic equipment and computer readable storage medium | |
CN111090724B (en) | Entity extraction method capable of judging relevance between text content and entity based on deep learning | |
CN111401061A (en) | Method for identifying news opinion involved in case based on BERT and Bi L STM-Attention | |
CN110263325B (en) | Chinese word segmentation system | |
CN110347847A (en) | Knowledge mapping complementing method neural network based | |
CN112232053B (en) | Text similarity computing system, method and storage medium based on multi-keyword pair matching | |
CN112069831A (en) | Unreal information detection method based on BERT model and enhanced hybrid neural network | |
CN110717330A (en) | Word-sentence level short text classification method based on deep learning | |
CN112163089B (en) | High-technology text classification method and system integrating named entity recognition | |
CN113627550A (en) | Image-text emotion analysis method based on multi-mode fusion | |
CN111858878A (en) | Method, system and storage medium for automatically extracting answer from natural language text | |
CN110619045A (en) | Text classification model based on convolutional neural network and self-attention | |
CN113987167A (en) | Dependency perception graph convolutional network-based aspect-level emotion classification method and system | |
CN113283524A (en) | Anti-attack based deep neural network approximate model analysis method | |
CN110276396A (en) | Picture based on object conspicuousness and cross-module state fusion feature describes generation method | |
CN116246279A (en) | Graphic and text feature fusion method based on CLIP background knowledge | |
WO2022228127A1 (en) | Element text processing method and apparatus, electronic device, and storage medium | |
CN116226357B (en) | Document retrieval method under input containing error information | |
CN110321565B (en) | Real-time text emotion analysis method, device and equipment based on deep learning | |
CN115033689B (en) | Original network Euclidean distance calculation method based on small sample text classification | |
CN116680407A (en) | Knowledge graph construction method and device | |
CN115659242A (en) | Multimode emotion classification method based on mode enhanced convolution graph |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||