CN111090724A - Entity extraction method capable of judging relevance between text content and entity based on deep learning - Google Patents

Info

Publication number
CN111090724A
Authority
CN
China
Prior art keywords
entity
text
relevance
deep learning
attention
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911148302.2A
Other languages
Chinese (zh)
Other versions
CN111090724B (en)
Inventor
李举
刘方然
李金波
徐常亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xinhua Zhiyun Technology Co ltd
Original Assignee
Xinhua Zhiyun Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xinhua Zhiyun Technology Co ltd
Priority to CN201911148302.2A
Publication of CN111090724A
Application granted
Publication of CN111090724B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computational Linguistics (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the technical field of entity extraction, and in particular to a deep-learning-based entity extraction method capable of judging the relevance between text content and an entity. The method comprises the following steps: representing the text as n words containing m entities; using an LSTM to represent the literal semantics of the text; splicing (concatenating) the literal semantic representation of the text with the average of the representations of all entities other than the entity whose relevance is to be determined, thereby generating the text semantic context; applying an attention mechanism between the entity to be determined and the text semantic context to obtain an attention vector; and computing a further representation of the text semantics from the attention vector. Because the design uses an end-to-end deep learning model, it avoids writing a large number of complicated rules and improves the generality of the model. It also avoids the extensive feature engineering of conventional machine learning, speeds up model iteration, and is easy to transfer into practical use.

Description

Entity extraction method capable of judging relevance between text content and entity based on deep learning
Technical Field
The invention relates to the technical field of entity extraction, in particular to an entity extraction method capable of judging the relevance between text content and an entity based on deep learning.
Background
An entity, also called a "proper name", is an item with a specific meaning in a text, mainly people, places, organizations, and the like. Named entity recognition aims to identify the entities present in a text and their types, and the technology is now well developed. However, named entity recognition does not indicate how relevant each recognized entity is to the article in which it appears. Entity relevance refers to how strongly or weakly an entity is related to an article: an article generally mentions several entities, but not all of them are strongly related to it. In practice, only the entities strongly related to an article are of interest, so finding and judging the relevance between entities and articles is very important. At present there is little research on the relevance of entities to articles, and the research that exists relies on rules and conventional machine learning. The deep learning network structure provided by the invention solves the problem of strong versus weak relevance between an entity and an article end to end. It avoids the poor generality caused by rules and screens features automatically, which reduces the large amount of feature engineering required by conventional machine learning and speeds up model iteration.
Disclosure of Invention
The invention aims to provide an entity extraction method capable of judging the relevance between text content and an entity based on deep learning, so as to solve the problems in the background technology.
In order to achieve the above object, the present invention provides an entity extraction method based on deep learning and capable of judging the relevance between text content and an entity, wherein the method comprises the following steps:
Step one: represent the text as n words $[w_1, w_2, \ldots, w_n]$ containing m entities $[E_1, E_2, E_3, \ldots, E_m]$;
Step two: use an LSTM to represent the literal semantics of the text from step one;
Step three: splice (concatenate) the literal semantic representation of the text with the average of the representations of all entities other than the entity whose relevance is to be determined, thereby generating the text semantic context;
Step four: apply an attention mechanism between the entity to be determined and the text semantic context to obtain an attention vector;
Step five: compute a further representation of the text semantics from the attention vector of step four, in which each component of the attention vector is multiplied by the corresponding element of the context and the products are summed, giving the entity-attentive text semantic representation $C_r$:
[Equation image not reproduced: $C_r$, the attention-weighted sum of the context elements]
Step six: splice the entity-attentive text semantic representation $C_r$ of step five and the representation $E_m$ of the entity to be determined into a vector d, and feed d into a classifier to obtain the probability that the relevance between the entity and the text is strong or weak.
Preferably, in step one, w is the word2vec vector of the corresponding word, and E is the TransH representation of the corresponding entity.
Preferably, in step two, the LSTM is defined as follows: given the word vector $w_k$, the previous cell state $c_{k-1}$, and the previous hidden state $h_{k-1}$, with current cell state $c_k$ and current hidden state $h_k$, the LSTM network is:
[Equation images (1) to (5) not reproduced: the LSTM gate and cell-state updates]
$h_k = o_k \odot \tanh(c_k)$ (6)
where i, f, and o are the input gate, forget gate, and output gate, respectively, and σ is the activation function. This yields the literal semantic representation of the text:
[Equation image not reproduced: the literal semantic representation of the text]
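Since the images for equations (1) to (5) are not available in this text, the following is offered only for orientation: a commonly used LSTM formulation that is consistent with equation (6) and with the gate names i, f, and o given above. The weight names $W_\ast$, $U_\ast$, $b_\ast$ and the candidate state $\tilde{c}_k$ are assumptions and may differ from the notation and ordering used in the patent's own equations.

```latex
\begin{aligned}
i_k &= \sigma(W_i w_k + U_i h_{k-1} + b_i) \\
f_k &= \sigma(W_f w_k + U_f h_{k-1} + b_f) \\
o_k &= \sigma(W_o w_k + U_o h_{k-1} + b_o) \\
\tilde{c}_k &= \tanh(W_c w_k + U_c h_{k-1} + b_c) \\
c_k &= f_k \odot c_{k-1} + i_k \odot \tilde{c}_k \\
h_k &= o_k \odot \tanh(c_k)
\end{aligned}
```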
Preferably, in step three, the average of the representations of the entities other than the entity to be determined is defined as follows: let $E_m$ be the entity whose relevance is to be determined; the semantics of the other entities are then represented as:
[Equation image not reproduced: the average of the entity representations other than $E_m$]
and the spliced text is represented as:
[Equation image not reproduced: the spliced text semantic context]
Preferably, in step four, the attention vector of the entity over the text is:
[Equation image not reproduced: the attention vector]
where γ is the attention score function, defined as:
[Equation image not reproduced: the attention score function γ]
where $W_a$ and $b_a$ are a weight matrix and an offset (bias), respectively.
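Preferably, in step six, $d = C_r + E_m$, where + denotes the splicing (concatenation) of the two vectors, and the classifier is:
$x = \tanh(W_l \cdot d + b_l)$,
where $W_l$ and $b_l$ are a weight matrix and an offset (bias), respectively.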
Compared with the prior art, the invention has the beneficial effects that:
1. The method uses an end-to-end deep learning model, which avoids writing a large number of complicated rules and improves the generality of the model. It also avoids the extensive feature engineering of conventional machine learning, speeds up model iteration, and is easy to transfer into practical use.
2. The method introduces TransH to represent entity information, which makes it possible to capture implicit relations between entities.
3. The attention mechanism of the entity over the text extracts the text information related to the entity more efficiently.
Drawings
FIG. 1 is a diagram of the algorithm architecture of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, the present invention provides a technical solution:
the invention provides an entity extraction method capable of judging the relevance between text content and an entity based on deep learning, which comprises the following steps:
Step one: represent the text as n words $[w_1, w_2, \ldots, w_n]$ containing m entities $[E_1, E_2, E_3, \ldots, E_m]$, where w is the word2vec vector of the corresponding word and E is the TransH representation of the corresponding entity. word2vec is a tool for converting words into vectors, released by Google in 2013; in this patent it is used to convert the words of the text into the corresponding word vectors. TransH is a distributed vector representation of entities and relations that yields dense representations of both in a low-dimensional space; here it is used to capture the semantic relations between the entities and the text.
Step two: use an LSTM to represent the literal semantics of the text from step one. LSTM stands for Long Short-Term Memory; it was originally proposed by Sepp Hochreiter and Jürgen Schmidhuber in 1997 and is a specific form of RNN (recurrent neural network), designed mainly to address the vanishing-gradient and exploding-gradient problems that arise when training on long sequences. Given the word vector $w_k$, the previous cell state $c_{k-1}$, and the previous hidden state $h_{k-1}$, with current cell state $c_k$ and current hidden state $h_k$, the LSTM network is:
[Equation images (1) to (5) not reproduced: the LSTM gate and cell-state updates]
$h_k = o_k \odot \tanh(c_k)$ (6)
where i, f, and o are the input gate, forget gate, and output gate, respectively, and σ is the activation function. This yields the literal semantic representation of the text:
[Equation image not reproduced: the literal semantic representation of the text]
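As a rough illustration of step two, the sketch below runs a standard LSTM cell over a sequence of word vectors and collects the hidden states. It follows the textbook formulation ending in equation (6); the parameter layout (`params`, with each weight matrix acting on the concatenation of the previous hidden state and the word vector) and the toy values are assumptions, not details taken from the patent.

```python
import math

def sigmoid(x):
    """Logistic activation, used here for the three gates."""
    return 1.0 / (1.0 + math.exp(-x))

def affine(W, x, b):
    """Return W·x + b, with W given as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + bias for row, bias in zip(W, b)]

def lstm_step(w_k, h_prev, c_prev, params):
    """One LSTM step; params[name] = (W, b), with W acting on [h_prev; w_k]."""
    z = h_prev + w_k  # list concatenation

    def gate(name, act):
        W, b = params[name]
        return [act(v) for v in affine(W, z, b)]

    i_k = gate("i", sigmoid)      # input gate
    f_k = gate("f", sigmoid)      # forget gate
    o_k = gate("o", sigmoid)      # output gate
    c_tilde = gate("c", math.tanh)  # candidate cell state
    c_k = [f * c + i * g for f, c, i, g in zip(f_k, c_prev, i_k, c_tilde)]
    h_k = [o * math.tanh(c) for o, c in zip(o_k, c_k)]  # equation (6)
    return h_k, c_k

def encode(words, params, hidden_size):
    """Run the LSTM over all word vectors and return every hidden state."""
    h, c = [0.0] * hidden_size, [0.0] * hidden_size
    states = []
    for w_k in words:
        h, c = lstm_step(w_k, h, c, params)
        states.append(h)
    return states

# Toy parameters (hypothetical): hidden size 2, word dimension 2.
hidden = 2
zeros = [[0.0] * 4 for _ in range(hidden)]
params = {
    "i": (zeros, [0.0, 0.0]),
    "f": (zeros, [0.0, 0.0]),
    "o": (zeros, [0.0, 0.0]),
    "c": ([[0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]], [0.0, 0.0]),
}
words = [[1.0, -1.0], [0.5, 0.5], [0.0, 2.0]]
states = encode(words, params, hidden)
```

The list `states` holds one hidden vector per word, which is one plausible reading of the "literal semantic representation" of the text.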
Step three: splice (concatenate) the literal semantic representation of the text with the average of the representations of all entities other than the entity whose relevance is to be determined, thereby generating the text semantic context. Let $E_m$ be the entity whose relevance is to be determined; the semantics of the other entities are then represented as:
[Equation image not reproduced: the average of the entity representations other than $E_m$]
and the spliced text is represented as:
[Equation image not reproduced: the spliced text semantic context]
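A minimal sketch of step three follows. The averaging matches the description above; how exactly the average is spliced onto the LSTM output is not recoverable from this text, so the code assumes that the averaged entity vector is appended to every hidden state. The function names, the zero-vector fallback for a text with a single entity, and the toy values are all hypothetical.

```python
def average_other_entities(entities, target_index):
    """Average the representations of all entities except the target one."""
    others = [e for j, e in enumerate(entities) if j != target_index]
    if not others:
        # Assumption: fall back to a zero vector when no other entity exists.
        return [0.0] * len(entities[target_index])
    dim = len(others[0])
    return [sum(e[d] for e in others) / len(others) for d in range(dim)]

def build_context(hidden_states, entities, target_index):
    """Append the averaged entity vector to each hidden state (assumed splicing)."""
    e_avg = average_other_entities(entities, target_index)
    return [h + e_avg for h in hidden_states]

# Toy example: three hidden states, three entities, the last one is E_m.
hidden_states = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
entities = [[1.0, 3.0], [3.0, 5.0], [9.0, 9.0]]
context = build_context(hidden_states, entities, target_index=2)
```

Here the two non-target entities average to [2.0, 4.0], so each context element is a hidden state followed by those two values.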
Step four: compute the attention of the entity to be determined, $E_m$ from step three, over the text semantic context. Let the attention vector of the entity over the text be:
[Equation image not reproduced: the attention vector]
where γ is the attention score function, defined as:
[Equation image not reproduced: the attention score function γ]
where $W_a$ and $b_a$ are a weight matrix and an offset (bias), respectively.
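The exact form of γ is not recoverable from this text. The sketch below therefore assumes a common additive form, a tanh of a learned linear map applied to the context element concatenated with $E_m$, followed by softmax normalization over the context positions. The single-row weight `w_a`, the scalar `b_a`, and the toy values are illustrative assumptions only.

```python
import math

def score(context_i, e_m, w_a, b_a):
    """Assumed score: tanh(w_a · [context_i; E_m] + b_a), a single scalar."""
    z = context_i + e_m  # list concatenation
    return math.tanh(sum(w * x for w, x in zip(w_a, z)) + b_a)

def attention_weights(context, e_m, w_a, b_a):
    """Softmax-normalize the scores so the weights are positive and sum to 1."""
    scores = [score(c_i, e_m, w_a, b_a) for c_i in context]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Toy example with hypothetical values.
context = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
e_m = [0.5, 0.5]
w_a = [1.0, -1.0, 0.2, 0.2]
b_a = 0.0
alpha = attention_weights(context, e_m, w_a, b_a)
```

With these values the first context element receives the largest weight and the second the smallest, because their raw scores are tanh(1.2), tanh(-0.8), and tanh(0.2) respectively.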
Step five: compute a further representation of the text semantics from the attention vector of step four, in which each component of the attention vector is multiplied by the corresponding element of the context and the products are summed, giving the entity-attentive text semantic representation $C_r$:
[Equation image not reproduced: $C_r$, the attention-weighted sum of the context elements]
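The weighted sum described in step five can be sketched as follows, assuming the attention vector holds one weight per context element. The function name `attend` and the toy values are hypothetical.

```python
def attend(alpha, context):
    """Return the sum over i of alpha[i] * context[i], computed per dimension."""
    dim = len(context[0])
    return [sum(a * c[d] for a, c in zip(alpha, context)) for d in range(dim)]

# Toy example: three context elements of dimension 2.
alpha = [0.5, 0.25, 0.25]
context = [[2.0, 0.0], [0.0, 4.0], [4.0, 4.0]]
c_r = attend(alpha, context)
```

The result has the same dimension as a single context element, so its size does not depend on the length of the text.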
Step six: splice the entity-attentive text semantic representation $C_r$ of step five and the representation $E_m$ of the entity to be determined into a vector d, and feed d into a classifier to obtain the probability that the relevance between the entity and the text is strong or weak. Here $d = C_r + E_m$, where + denotes the splicing (concatenation) of the two vectors, and the classifier is:
$x = \tanh(W_l \cdot d + b_l)$,
where $W_l$ and $b_l$ are a weight matrix and an offset (bias), respectively. Finally, the probability of strong or weak relevance between the entity and the text is obtained through a softmax function:
[Equation image not reproduced: softmax over the C classes]
where C = 2, indicating whether the relevance is strong or weak. The softmax function, also called the normalized exponential function, is a generalization of the logistic function. It "compresses" a K-dimensional vector z of arbitrary real numbers into another K-dimensional real vector y(z) such that each element lies in the range (0, 1) and all elements sum to 1.
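Step six can be sketched as below: the two vectors are concatenated, passed through the tanh layer $x = \tanh(W_l \cdot d + b_l)$, and normalized with softmax over C = 2 classes. The weight values are toy numbers, and treating index 0 as "strong" and index 1 as "weak" is an assumption made only for this illustration.

```python
import math

def classify(c_r, e_m, W_l, b_l):
    """Return softmax(tanh(W_l · d + b_l)) with d = [C_r; E_m]."""
    d = c_r + e_m  # splicing (list concatenation)
    x = [math.tanh(sum(w * v for w, v in zip(row, d)) + b)
         for row, b in zip(W_l, b_l)]
    peak = max(x)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in x]
    total = sum(exps)
    return [e / total for e in exps]

# Toy example with hypothetical values; W_l has one row per class.
c_r = [2.0, 2.0]
e_m = [0.5, -0.5]
W_l = [[0.1, 0.1, 0.2, 0.0], [-0.1, -0.1, 0.0, 0.2]]
b_l = [0.0, 0.0]
probs = classify(c_r, e_m, W_l, b_l)
```

Because tanh bounds each component of x to (-1, 1), the two softmax probabilities can never differ by more than a fixed factor, a property that follows from the classifier as written in the patent.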
The foregoing shows and describes the general principles, essential features, and advantages of the invention. Those skilled in the art will understand that the invention is not limited to the embodiments described above; the embodiments and the description merely set out preferred embodiments of the invention and are not intended to limit it. The scope of the invention is defined by the appended claims and their equivalents.

Claims (6)

1. An entity extraction method based on deep learning and capable of judging the relevance between text content and an entity, comprising the following steps:
Step one: represent the text as n words $[w_1, w_2, \ldots, w_n]$ containing m entities $[E_1, E_2, E_3, \ldots, E_m]$;
Step two: use an LSTM to represent the literal semantics of the text from step one;
Step three: splice (concatenate) the literal semantic representation of the text with the average of the representations of all entities other than the entity whose relevance is to be determined, thereby generating the text semantic context;
Step four: apply an attention mechanism between the entity to be determined and the text semantic context to obtain an attention vector;
Step five: compute a further representation of the text semantics from the attention vector of step four, in which each component of the attention vector is multiplied by the corresponding element of the context and the products are summed, giving the entity-attentive text semantic representation $C_r$:
[Equation image not reproduced: $C_r$, the attention-weighted sum of the context elements]
Step six: splice the entity-attentive text semantic representation $C_r$ of step five and the representation $E_m$ of the entity to be determined into a vector d, and feed d into a classifier to obtain the probability that the relevance between the entity and the text is strong or weak.
2. The entity extraction method for judging the relevance of text content and entity based on deep learning according to claim 1, wherein: in the first step, w is a word2vec vector of a corresponding word, and E represents a TransH representation of a corresponding entity.
3. The entity extraction method for judging the relevance of text content and entity based on deep learning according to claim 1, wherein: in step two, the LSTM is defined as follows: given the word vector $w_k$, the previous cell state $c_{k-1}$, and the previous hidden state $h_{k-1}$, with current cell state $c_k$ and current hidden state $h_k$, the LSTM network is:
[Equation images (1) to (5) not reproduced: the LSTM gate and cell-state updates]
$h_k = o_k \odot \tanh(c_k)$ (6)
where i, f, and o are the input gate, forget gate, and output gate, respectively, and σ is the activation function. This yields the literal semantic representation of the text:
[Equation image not reproduced: the literal semantic representation of the text]
4. The entity extraction method for judging the relevance of text content and entity based on deep learning according to claim 1, wherein: in step three, the average of the representations of the entities other than the entity to be determined is defined as follows: let $E_m$ be the entity whose relevance is to be determined; the semantics of the other entities are then represented as:
[Equation image not reproduced: the average of the entity representations other than $E_m$]
and the spliced text is represented as:
[Equation image not reproduced: the spliced text semantic context]
5. The entity extraction method for judging the relevance of text content and entity based on deep learning according to claim 1, wherein: in step four, the attention vector of the entity over the text is:
[Equation image not reproduced: the attention vector]
where γ is the attention score function, defined as:
[Equation image not reproduced: the attention score function γ]
where $W_a$ and $b_a$ are a weight matrix and an offset (bias), respectively.
6. The entity extraction method for judging the relevance of text content and entity based on deep learning according to claim 1, wherein: in step six, $d = C_r + E_m$, where + denotes the splicing (concatenation) of the two vectors, and the classifier is:
$x = \tanh(W_l \cdot d + b_l)$,
where $W_l$ and $b_l$ are a weight matrix and an offset (bias), respectively.
CN201911148302.2A 2019-11-21 2019-11-21 Entity extraction method capable of judging relevance between text content and entity based on deep learning Active CN111090724B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911148302.2A CN111090724B (en) 2019-11-21 2019-11-21 Entity extraction method capable of judging relevance between text content and entity based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911148302.2A CN111090724B (en) 2019-11-21 2019-11-21 Entity extraction method capable of judging relevance between text content and entity based on deep learning

Publications (2)

Publication Number Publication Date
CN111090724A true CN111090724A (en) 2020-05-01
CN111090724B CN111090724B (en) 2023-05-12

Family

ID=70393523

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911148302.2A Active CN111090724B (en) 2019-11-21 2019-11-21 Entity extraction method capable of judging relevance between text content and entity based on deep learning

Country Status (1)

Country Link
CN (1) CN111090724B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150236991A1 (en) * 2014-02-14 2015-08-20 Samsung Electronics Co., Ltd. Electronic device and method for extracting and using sematic entity in text message of electronic device
CN107391706A (en) * 2017-07-28 2017-11-24 湖北文理学院 A kind of city tour's question answering system based on mobile Internet
CN109408812A (en) * 2018-09-30 2019-03-01 北京工业大学 A method of the sequence labelling joint based on attention mechanism extracts entity relationship
CN109522547A (en) * 2018-10-23 2019-03-26 浙江大学 Chinese synonym iteration abstracting method based on pattern learning
CN109902145A (en) * 2019-01-18 2019-06-18 中国科学院信息工程研究所 A kind of entity relationship joint abstracting method and system based on attention mechanism
CN110110324A (en) * 2019-04-15 2019-08-09 大连理工大学 A kind of biomedical entity link method that knowledge based indicates
CN110196978A (en) * 2019-06-04 2019-09-03 重庆大学 A kind of entity relation extraction method for paying close attention to conjunctive word

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112307761A (en) * 2020-11-19 2021-02-02 新华智云科技有限公司 Event extraction method and system based on attention mechanism
CN113569559A (en) * 2021-07-23 2021-10-29 北京智慧星光信息技术有限公司 Short text entity emotion analysis method and system, electronic equipment and storage medium
CN113569559B (en) * 2021-07-23 2024-02-02 北京智慧星光信息技术有限公司 Short text entity emotion analysis method, system, electronic equipment and storage medium
CN113743104A (en) * 2021-08-31 2021-12-03 合肥智能语音创新发展有限公司 Entity linking method and related device, electronic equipment and storage medium
CN113743104B (en) * 2021-08-31 2024-04-16 合肥智能语音创新发展有限公司 Entity linking method, related device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN111090724B (en) 2023-05-12

Similar Documents

Publication Publication Date Title
CN107273355B (en) Chinese word vector generation method based on word and phrase joint training
Gan et al. Sparse attention based separable dilated convolutional neural network for targeted sentiment analysis
CN110032641B (en) Method and device for extracting event by using neural network and executed by computer
CN112347268A (en) Text-enhanced knowledge graph joint representation learning method and device
CN107944559B (en) Method and system for automatically identifying entity relationship
CN109325229B (en) Method for calculating text similarity by utilizing semantic information
CN111090724A (en) Entity extraction method capable of judging relevance between text content and entity based on deep learning
CN113505200B (en) Sentence-level Chinese event detection method combined with document key information
CN109933792A (en) Viewpoint type problem based on multi-layer biaxially oriented LSTM and verifying model reads understanding method
CN110489554B (en) Attribute-level emotion classification method based on location-aware mutual attention network model
CN111222330B (en) Chinese event detection method and system
CN111460132A (en) Generation type conference abstract method based on graph convolution neural network
CN112232053A (en) Text similarity calculation system, method and storage medium based on multi-keyword pair matching
CN110569355B (en) Viewpoint target extraction and target emotion classification combined method and system based on word blocks
CN116246279A (en) Graphic and text feature fusion method based on CLIP background knowledge
CN111914553A (en) Financial information negative subject judgment method based on machine learning
CN113627550A (en) Image-text emotion analysis method based on multi-mode fusion
CN112560440A (en) Deep learning-based syntax dependence method for aspect-level emotion analysis
CN111737467A (en) Object-level emotion classification method based on segmented convolutional neural network
CN116483979A (en) Dialog model training method, device, equipment and medium based on artificial intelligence
CN116467452A (en) Chinese complaint classification method based on multi-task learning hybrid neural network
CN115659242A (en) Multimode emotion classification method based on mode enhanced convolution graph
CN115934883A (en) Entity relation joint extraction method based on semantic enhancement and multi-feature fusion
CN113190681B (en) Fine granularity text classification method based on capsule network mask memory attention
CN116127954A (en) Dictionary-based new work specialized Chinese knowledge concept extraction method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant