CN112347771A - Method and equipment for extracting entity relationship - Google Patents
- Publication number
- CN112347771A (application CN202011402086.2A)
- Authority
- CN
- China
- Prior art keywords
- entity
- medical
- sentence
- vector
- cls
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/047—Probabilistic or stochastic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H10/00—ICT specially adapted for the handling or processing of patient-related medical or healthcare data
- G16H10/60—ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Computation (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Probability & Statistics with Applications (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Epidemiology (AREA)
- Medical Informatics (AREA)
- Primary Health Care (AREA)
- Public Health (AREA)
- Medical Treatment And Welfare Office Work (AREA)
Abstract
The invention relates to a method and a device for extracting entity relationships. The method comprises the following steps: identifying the category of each medical entity in a medical text, and splitting the medical text into single sentences; for each category, selecting one medical entity and combining it with a single sentence to form a sentence that is input into a pre-trained BERT model, and obtaining a vector of the sentence and a vector of each medical entity name from the output layer of the BERT model; inputting the vector of the sentence and the averaged vector of each medical entity name separately into a feedforward neural network to obtain a plurality of intermediate vectors; concatenating the intermediate vectors, feeding the result into a fully-connected neural network, and classifying it with softmax in the fully-connected neural network to obtain classification probabilities; and selecting the relationship category with the highest classification probability as the final relationship between the medical entities. In this scheme, the pre-trained BERT model extracts the contextual semantic features of the entities, and the entity types are added to the relationship prediction, which improves recognition accuracy.
Description
Technical Field
The invention relates to the technical field of data processing, in particular to a method and equipment for extracting entity relationships.
Background
In many fields, such as the medical field, a large number of entities are involved, and the relationships between these entities need to be known to support subsequent applications. At present, entity relationships are generally obtained by extracting features of different dimensions with various CNN (Convolutional Neural Network) and LSTM (Long Short-Term Memory) deep learning networks, and then combining the outputs of these networks to select the entity relationship.
However, the current methods take into account neither the contextual semantic information between different entities nor the type information of the entities, which leads to inaccurate identification.
For this reason, there is a need for a better solution to the problems of the prior art.
Disclosure of Invention
The invention provides an entity relationship extraction method and device, which can solve the technical problem of inaccurate identification in the prior art.
The technical scheme for solving the above technical problem is as follows:
An embodiment of the invention provides an entity relationship extraction method, which comprises the following steps:
identifying the category of each medical entity in a medical text, and splitting the medical text into single sentences;
for each category, selecting one medical entity and combining it with the single sentence to form a sentence that is input into a pre-trained BERT model, and obtaining a vector of the sentence and a vector of each medical entity name from the output layer of the BERT model;
inputting the vector of the sentence and the averaged vector of each medical entity name separately into a feedforward neural network to obtain a plurality of intermediate vectors;
concatenating the intermediate vectors, feeding the result into a fully-connected neural network, and classifying it with softmax in the fully-connected neural network to obtain classification probabilities;
selecting the relationship category with the highest classification probability as the final relationship between the medical entities.
In a specific embodiment, before identifying the category of the medical entity in the medical text, the method further comprises: acquiring the medical text.
In a particular embodiment, the medical text includes any combination of one or more of the following: medical teaching materials, clinical guidelines, and medical records.
In a specific embodiment, the "identifying the category of the medical entity in the medical text" includes:
identifying the category of the medical entity in the medical text using a combination of BERT and CRF.
In a specific embodiment, the categories include: diseases, examinations, symptoms, treatments, and drugs.
In a specific embodiment, in the sentence, an identifier of the category corresponding to each medical entity is placed in front of that medical entity, and a sentence identifier is placed before the sentence.
In a specific embodiment, the vector of the sentence is determined by the following formula:
$H'_{CLS} = W_{CLS}(\tanh(H_{CLS})) + b_{CLS}$;
where $H'_{CLS}$ is the vector of the sentence; $W_{CLS}$ is the weight parameter for the sentence; $H_{CLS}$ is the BERT output for the sentence; and $b_{CLS}$ is the bias parameter for the sentence.
In a specific embodiment, when the number of the medical entities is 2, the vector of each medical entity name is determined by the following formula:
[formula image omitted in the source text]
where $i_{e1}$, $j_{e1}$, $i_{e2}$, $j_{e2}$ are the first and last character positions of entity e1 and entity e2, respectively; $H'_{e1}$ is the vector of entity e1; $H'_{e2}$ is the vector of entity e2; $W_{e1}$ is the weight parameter for entity e1; $W_{e2}$ is the weight parameter for entity e2; $b_{e1}$ is the bias parameter for entity e1; and $b_{e2}$ is the bias parameter for entity e2.
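One form consistent with the definitions above is sketched below. It rests on two assumptions, since the formula itself is not reproduced in this text: the same tanh activation as in the sentence-vector formula is applied, and $H_t$ denotes the BERT output vector at character position $t$.

```latex
H'_{e1} = W_{e1}\left[\tanh\left(\frac{1}{j_{e1}-i_{e1}+1}\sum_{t=i_{e1}}^{j_{e1}} H_t\right)\right] + b_{e1}

H'_{e2} = W_{e2}\left[\tanh\left(\frac{1}{j_{e2}-i_{e2}+1}\sum_{t=i_{e2}}^{j_{e2}} H_t\right)\right] + b_{e2}
```

The factor $1/(j - i + 1)$ is the averaging over the characters of the entity name described in the text; the outer affine map plays the role of the feedforward neural network.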
In a specific embodiment, when the number of the medical entities is 2, the classification probability is determined by the following formula:
$p = \mathrm{softmax}(W[\mathrm{concat}(H'_{CLS}, H'_{e1}, H'_{e2})] + b)$;
where $p$ is the classification probability; $W$ is the weight parameter of the hidden layer; $b$ is the bias parameter of the hidden layer; $H'_{CLS}$ is the vector of the sentence; $H'_{e1}$ is the vector of entity e1; and $H'_{e2}$ is the vector of entity e2.
An embodiment of the present invention further provides an entity relationship extraction device, including:
an identification module, configured to identify the category of each medical entity in a medical text and split the medical text into single sentences;
an obtaining module, configured to select, for each category, one medical entity and combine it with the single sentence to form a sentence, input the sentence into a pre-trained BERT model, and obtain a vector of the sentence and a vector of each medical entity name from the output layer of the BERT model;
an intermediate module, configured to input the vector of the sentence and the averaged vector of each medical entity name separately into a feedforward neural network to obtain a plurality of intermediate vectors;
an input module, configured to concatenate the intermediate vectors, feed the result into a fully-connected neural network, and classify it with softmax in the fully-connected neural network to obtain classification probabilities;
a determining module, configured to select the relationship category with the highest classification probability as the final relationship between the medical entities.
The beneficial effects of the invention are as follows:
In this scheme, a pre-trained BERT model is used, and the vector of the sentence and the vector of each medical entity name are obtained from the output layer of the BERT model; these vectors are then fed into a feedforward neural network to obtain a plurality of intermediate vectors, which are fed into a fully-connected neural network to obtain classification probabilities; and the final relationship between the medical entities is determined from the classification probabilities. The pre-trained BERT model thus extracts the contextual semantic features of the entities, and the entity types are added to the relationship prediction, which improves recognition accuracy.
Drawings
Fig. 1 is a schematic flowchart of an entity relationship extraction method according to an embodiment of the present invention;
Fig. 2 is a schematic flowchart of an entity relationship extraction method according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of an extraction device for entity relationships according to an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of an extraction device for entity relationships according to an embodiment of the present invention.
Detailed Description
The principles and features of the invention are described below with reference to the accompanying drawings; the examples are given for illustration only and are not intended to limit the scope of the invention.
The method for extracting entity relationships provided by the embodiment of the invention, as shown in Fig. 1 or Fig. 2, includes the following steps:
101, identifying the category of each medical entity in a medical text, and splitting the medical text into single sentences;
Specifically, the medical text comprises any combination of one or more of the following: medical teaching materials, clinical guidelines, and medical records.
The "identifying the category of the medical entity in the medical text" in step 101 includes:
identifying the category of the medical entity in the medical text using a combination of BERT (Bidirectional Encoder Representations from Transformers) and CRF (Conditional Random Field).
The categories include: diseases, examinations, symptoms, treatments, and drugs.
Other categories may be provided as needed; the categories are not limited to those listed above.
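The sentence-splitting part of this step can be sketched as follows. This is a minimal illustration, not the patent's implementation: the source does not say which delimiters are used, so the set of sentence-ending punctuation marks and the function name `split_into_sentences` are assumptions.

```python
import re

# Assumed sentence-ending punctuation (full-width and ASCII '!' and '?',
# plus the Chinese full stop). The ASCII '.' is left out on purpose so that
# decimal numbers such as lab values are not split.
_SENTENCE = re.compile(r"[^。!?!?]+[。!?!?]?")


def split_into_sentences(text: str) -> list[str]:
    """Split a medical text into single sentences, keeping the end mark."""
    pieces = _SENTENCE.findall(text)
    # Drop fragments that contain only whitespace.
    return [piece.strip() for piece in pieces if piece.strip()]
```

Each returned sentence can then be processed on its own in the following steps.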
102, for each category, selecting one medical entity and combining it with the single sentence to form a sentence that is input into a pre-trained BERT model, and obtaining a vector of the sentence and a vector of each medical entity name from the output layer of the BERT model;
In the sentence, an identifier of the category corresponding to each medical entity is placed in front of that medical entity, and a sentence identifier is placed before the sentence.
For example, the identifier of the disease category is "& disease", the identifier of the examination category is "# examination", and the identifier of the entire sentence is "[CLS]".
Specifically, the sentence identifier is different from the category identifiers, and the identifiers of different categories are different from each other.
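The marking described above can be sketched as a small string transformation. The exact layout of the markers (symbol, category, symbol, entity name, symbol) is an assumption modelled on the worked example later in the description, and the names `Entity` and `mark_sentence` are hypothetical.

```python
from typing import NamedTuple


class Entity(NamedTuple):
    start: int      # index of the first character of the entity name
    end: int        # index one past the last character of the entity name
    category: str   # e.g. "disease" or "examination"
    marker: str     # special symbol assumed for this category, e.g. "&" or "#"


def mark_sentence(sentence: str, entities: list[Entity]) -> str:
    """Prefix each entity with its category identifier and the sentence with [CLS]."""
    pieces = []
    cursor = 0
    for ent in sorted(entities, key=lambda e: e.start):
        pieces.append(sentence[cursor:ent.start])   # text before the entity
        name = sentence[ent.start:ent.end]
        pieces.append(f"{ent.marker} {ent.category} {ent.marker} {name} {ent.marker}")
        cursor = ent.end
    pieces.append(sentence[cursor:])                # text after the last entity
    # BERT tokenizers usually add [CLS] themselves; it is written out here
    # only to mirror the sentence identifier described in the text.
    return "[CLS] " + "".join(pieces)
```

For instance, marking "COPD" as a disease and "WBC count" as an examination in "COPD raises WBC count" yields a sentence in which both entities are wrapped by their category symbols.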
103, inputting the vector of the sentence and the averaged vector of each medical entity name separately into a feedforward neural network to obtain a plurality of intermediate vectors;
Specifically, the vector of the sentence is determined by the following formula:
$H'_{CLS} = W_{CLS}(\tanh(H_{CLS})) + b_{CLS}$;
where $H'_{CLS}$ is the vector of the sentence; $W_{CLS}$ is the weight parameter for the sentence; $H_{CLS}$ is the BERT output for the sentence; and $b_{CLS}$ is the bias parameter for the sentence.
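The sentence-vector formula above is a tanh activation followed by an affine map. A minimal NumPy sketch follows; the function name `project_cls` and the array shapes are assumptions, and in practice the weights would be learned during training.

```python
import numpy as np


def project_cls(h_cls: np.ndarray, w_cls: np.ndarray, b_cls: np.ndarray) -> np.ndarray:
    """Compute H'_CLS = W_CLS (tanh(H_CLS)) + b_CLS.

    h_cls: BERT output for the sentence, shape (d,)
    w_cls: weight parameter, shape (d_out, d)
    b_cls: bias parameter, shape (d_out,)
    """
    return w_cls @ np.tanh(h_cls) + b_cls
```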
Taking two medical entities that belong to different categories as an example, i.e. when the number of the medical entities is 2, the vector of each medical entity name is determined by the following formula:
[formula image omitted in the source text]
where $i_{e1}$, $j_{e1}$, $i_{e2}$, $j_{e2}$ are the first and last character positions of entity e1 and entity e2, respectively; $H'_{e1}$ is the vector of entity e1; $H'_{e2}$ is the vector of entity e2; $W_{e1}$ is the weight parameter for entity e1; $W_{e2}$ is the weight parameter for entity e2; $b_{e1}$ is the bias parameter for entity e1; and $b_{e2}$ is the bias parameter for entity e2.
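The averaging of an entity name's vectors between its first and last character positions, followed by the feedforward step, can be sketched in NumPy as below. The tanh activation is an assumption carried over from the sentence-vector formula, and `entity_vector` and its argument names are hypothetical.

```python
import numpy as np


def entity_vector(hidden: np.ndarray, first: int, last: int,
                  w_e: np.ndarray, b_e: np.ndarray) -> np.ndarray:
    """Average the output vectors of one entity name and project the result.

    hidden: BERT output layer for the whole sentence, shape (seq_len, d)
    first, last: first and last character positions of the entity (inclusive)
    w_e, b_e: weight and bias parameters for this entity
    """
    mean = hidden[first:last + 1].mean(axis=0)   # average over the entity's characters
    return w_e @ np.tanh(mean) + b_e             # assumed activation, then affine map
```

The same function would be called once for e1 and once for e2, each with its own parameters.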
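104, concatenating the intermediate vectors, feeding the result into a fully-connected neural network, and classifying it with softmax in the fully-connected neural network to obtain classification probabilities;
Still taking the above as an example, when the number of the medical entities is 2, the classification probability is determined by the following formula:
$p = \mathrm{softmax}(W[\mathrm{concat}(H'_{CLS}, H'_{e1}, H'_{e2})] + b)$;
where $p$ is the classification probability; $W$ is the weight parameter of the hidden layer; $b$ is the bias parameter of the hidden layer; $H'_{CLS}$ is the vector of the sentence; $H'_{e1}$ is the vector of entity e1; and $H'_{e2}$ is the vector of entity e2.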
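The classification formula above concatenates the three intermediate vectors and applies one affine layer followed by softmax. A minimal NumPy sketch, with hypothetical function names and assumed shapes:

```python
import numpy as np


def softmax(z: np.ndarray) -> np.ndarray:
    """Numerically stable softmax over a 1-D array."""
    shifted = z - np.max(z)
    exp = np.exp(shifted)
    return exp / exp.sum()


def relation_probabilities(h_cls: np.ndarray, h_e1: np.ndarray, h_e2: np.ndarray,
                           w: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Compute p = softmax(W [concat(H'_CLS, H'_e1, H'_e2)] + b).

    w has shape (num_relations, len(h_cls) + len(h_e1) + len(h_e2));
    b has shape (num_relations,).
    """
    features = np.concatenate([h_cls, h_e1, h_e2])
    return softmax(w @ features + b)
```

The result has one probability per relationship category and sums to 1.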
105, selecting the relationship category with the highest classification probability as the final relationship between the medical entities.
In this scheme, sentences containing the medical entities of a medical text are input into a pre-trained BERT model, and the vector of each sentence and the vector of each medical entity name are obtained from the output layer of the BERT model; these vectors are then fed into a feedforward neural network to obtain a plurality of intermediate vectors, which are fed into a fully-connected neural network to obtain classification probabilities; and the final relationship between the medical entities is determined from the classification probabilities. The pre-trained BERT model thus extracts the contextual semantic features of the entities, and the entity types are added to the relationship prediction, which improves recognition accuracy.
In a specific embodiment, before identifying the category of the medical entity in the medical text, the method further comprises: acquiring the medical text.
In a specific example, as shown in Fig. 2, the scheme includes the following steps:
1. Medical documents are collected, e.g. medical textbooks, clinical guidelines, and medical records.
2. Medical entity recognition is performed on the medical text from step 1 using the pre-trained BERT + CRF model, the text is split into single sentences, and two categories of entities are selected from each single sentence for relationship recognition, e.g. disease and examination.
3. From each of the two entity categories in step 2, one entity is selected at random, and the selections are combined pairwise to obtain a plurality of entity pairs (e1, e2); each entity pair is used as one input. In the input sentence, the two entities are distinguished by special symbols (e.g. "&" and "#"), and the entity type is added in front of each entity. An example of a processed sentence is:
[CLS] & disease & chronic obstructive pulmonary disease & acute exacerbation is often induced by microbial infection; when a bacterial infection is also present, # examination # blood leukocyte count # is increased and the neutrophil nuclei shift to the left.
4. The single sentence processed in step 3 is input into the pre-trained BERT model, and the [CLS] vector and the vectors of the two entity names are extracted from the output layer of BERT. Each is fed into a feedforward neural network; the three updated vectors are then concatenated, fed into a fully-connected neural network, and classified with softmax, as follows:
(1) The [CLS] vector is fed into a feedforward neural network:
$H'_{CLS} = W_{CLS}(\tanh(H_{CLS})) + b_{CLS}$
(2) The vectors of the two entity names are each averaged and fed into a feedforward neural network:
[formula image omitted in the source text]
where the averaged vector of an entity name is the average of the vectors of the characters of that entity name, and $i_{e1}$, $j_{e1}$, $i_{e2}$, $j_{e2}$ are the first and last character positions of entity e1 and entity e2, respectively.
(3) The three vectors obtained in (1) and (2) are concatenated and fed into a fully-connected neural network, and the classification probability is obtained through softmax:
$p = \mathrm{softmax}(W[\mathrm{concat}(H'_{CLS}, H'_{e1}, H'_{e2})] + b)$
5. Based on the probability p from step 4, the relationship category with the highest probability is selected as the final relationship between entities e1 and e2.
In this scheme, the pre-trained BERT model extracts the contextual semantic features of the entities, and the entity types are added to the relationship prediction, which effectively improves recognition accuracy.
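Two of the bookkeeping steps in the example above, forming entity pairs across two categories and picking the most probable relationship, can be sketched as follows. This is one reading of the text under stated assumptions: the sketch enumerates every cross-category pair rather than sampling, and the function names, the dictionary layout, and the relation labels are hypothetical (the source does not name any relationship categories).

```python
from itertools import product


def entity_pairs(entities_by_category: dict[str, list[str]],
                 category_1: str, category_2: str) -> list[tuple[str, str]]:
    """Combine one entity from each of two categories into (e1, e2) pairs."""
    return list(product(entities_by_category[category_1],
                        entities_by_category[category_2]))


def pick_relation(probabilities: list[float], labels: list[str]) -> str:
    """Return the label whose classification probability is highest."""
    best = max(range(len(labels)), key=lambda k: probabilities[k])
    return labels[best]
```

Each pair returned by `entity_pairs` would be marked in its sentence and scored separately, and `pick_relation` applied to the resulting probabilities.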
Example 2
Embodiment 2 of the present invention further discloses an extraction device for entity relationships which, as shown in Fig. 3, includes:
an identification module 201, configured to identify the category of each medical entity in a medical text and split the medical text into single sentences;
an obtaining module 202, configured to select, for each category, one medical entity and combine it with the single sentence to form a sentence, input the sentence into a pre-trained BERT model, and obtain a vector of the sentence and a vector of each medical entity name from the output layer of the BERT model;
an intermediate module 203, configured to input the vector of the sentence and the averaged vector of each medical entity name separately into a feedforward neural network to obtain a plurality of intermediate vectors;
an input module 204, configured to concatenate the intermediate vectors, feed the result into a fully-connected neural network, and classify it with softmax in the fully-connected neural network to obtain classification probabilities;
a determining module 205, configured to select the relationship category with the highest classification probability as the final relationship between the medical entities.
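How the five modules hand data to one another can be sketched as a simple composition. The interfaces below are assumptions chosen for illustration (the source only states what each module is for), and the class name `RelationExtractionDevice` and its field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Sequence


@dataclass
class RelationExtractionDevice:
    identify: Callable[[str], Sequence[Any]]        # module 201: text -> single sentences with entity categories
    obtain: Callable[[Any], Any]                    # module 202: sentence -> sentence and entity-name vectors
    intermediate: Callable[[Any], Any]              # module 203: vectors -> intermediate vectors
    classify: Callable[[Any], Sequence[float]]      # module 204: intermediate vectors -> probabilities
    determine: Callable[[Sequence[float]], str]     # module 205: probabilities -> relationship category

    def run(self, medical_text: str) -> list[str]:
        """Pass each single sentence through modules 202 to 205 in order."""
        relations = []
        for sentence in self.identify(medical_text):
            vectors = self.obtain(sentence)
            hidden = self.intermediate(vectors)
            probabilities = self.classify(hidden)
            relations.append(self.determine(probabilities))
        return relations
```

Keeping each module as an injected callable lets the BERT encoder or the classifier be swapped without touching the control flow.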
In a specific embodiment, as shown in Fig. 4, the device further includes:
a text module 206, configured to acquire the medical text before the category of the medical entity in the medical text is identified.
In a particular embodiment, the medical text includes any combination of one or more of the following: medical teaching materials, clinical guidelines, and medical records.
In a specific embodiment, the identification module 201 is configured to:
identify the category of the medical entity in the medical text using a combination of BERT and CRF.
For the identification module 201, the categories include: diseases, examinations, symptoms, treatments, and drugs.
The identification module 201 is configured to place, in the sentence, an identifier of the category corresponding to each medical entity in front of that medical entity, and to place a sentence identifier before the sentence.
For the identification module 201, the vector of the sentence is determined by the following formula:
$H'_{CLS} = W_{CLS}(\tanh(H_{CLS})) + b_{CLS}$;
where $H'_{CLS}$ is the vector of the sentence; $W_{CLS}$ is the weight parameter for the sentence; $H_{CLS}$ is the BERT output for the sentence; and $b_{CLS}$ is the bias parameter for the sentence.
For the identification module 201, when the number of the medical entities is 2, the vector of each medical entity name is determined by the following formula:
[formula image omitted in the source text]
where $i_{e1}$, $j_{e1}$, $i_{e2}$, $j_{e2}$ are the first and last character positions of entity e1 and entity e2, respectively; $H'_{e1}$ is the vector of entity e1; $H'_{e2}$ is the vector of entity e2; $W_{e1}$ is the weight parameter for entity e1; $W_{e2}$ is the weight parameter for entity e2; $b_{e1}$ is the bias parameter for entity e1; and $b_{e2}$ is the bias parameter for entity e2.
For the identification module 201, when the number of the medical entities is 2, the classification probability is determined by the following formula:
$p = \mathrm{softmax}(W[\mathrm{concat}(H'_{CLS}, H'_{e1}, H'_{e2})] + b)$;
where $p$ is the classification probability; $W$ is the weight parameter of the hidden layer; $b$ is the bias parameter of the hidden layer; $H'_{CLS}$ is the vector of the sentence; $H'_{e1}$ is the vector of entity e1; and $H'_{e2}$ is the vector of entity e2.
While the invention has been described with reference to specific embodiments, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (10)
1. An extraction method of entity relationships, comprising:
identifying the category of each medical entity in a medical text, and splitting the medical text into single sentences;
for each category, selecting one medical entity and combining it with the single sentence to form a sentence that is input into a pre-trained BERT model, and obtaining a vector of the sentence and a vector of each medical entity name from the output layer of the BERT model;
inputting the vector of the sentence and the averaged vector of each medical entity name separately into a feedforward neural network to obtain a plurality of intermediate vectors;
concatenating the intermediate vectors, feeding the result into a fully-connected neural network, and classifying it with softmax in the fully-connected neural network to obtain classification probabilities;
selecting the relationship category with the highest classification probability as the final relationship between the medical entities.
2. The method of claim 1, further comprising, prior to identifying the category of the medical entity in the medical text: acquiring the medical text.
3. The method of claim 1 or 2, wherein the medical text comprises any combination of one or more of: medical teaching materials, clinical guidelines, and medical records.
4. The method of claim 1, wherein the identifying the category of the medical entity in the medical text comprises:
identifying the category of the medical entity in the medical text using a combination of BERT and CRF.
5. The method of claim 1 or 4, wherein the categories include: diseases, examinations, symptoms, treatments, and drugs.
6. The method according to claim 1, wherein in the sentence, each medical entity is preceded by an identifier of the category corresponding to that medical entity, and a sentence identifier is placed before the sentence.
7. The method of claim 1, wherein the vector of the sentence is determined by the following formula:
$H'_{CLS} = W_{CLS}(\tanh(H_{CLS})) + b_{CLS}$;
where $H'_{CLS}$ is the vector of the sentence; $W_{CLS}$ is the weight parameter for the sentence; $H_{CLS}$ is the BERT output for the sentence; and $b_{CLS}$ is the bias parameter for the sentence.
8. The method of claim 1, wherein, when the number of medical entities is 2, the vector of each medical entity name is determined by the following formula:
[formula image omitted in the source text]
where $i_{e1}$, $j_{e1}$, $i_{e2}$, $j_{e2}$ are the first and last character positions of entity e1 and entity e2, respectively; $H'_{e1}$ is the vector of entity e1; $H'_{e2}$ is the vector of entity e2; $W_{e1}$ is the weight parameter for entity e1; $W_{e2}$ is the weight parameter for entity e2; $b_{e1}$ is the bias parameter for entity e1; and $b_{e2}$ is the bias parameter for entity e2.
9. The method of any one of claims 1, 7, 8, wherein, when the number of medical entities is 2, the classification probability is determined by the following formula:
$p = \mathrm{softmax}(W[\mathrm{concat}(H'_{CLS}, H'_{e1}, H'_{e2})] + b)$;
where $p$ is the classification probability; $W$ is the weight parameter of the hidden layer; $b$ is the bias parameter of the hidden layer; $H'_{CLS}$ is the vector of the sentence; $H'_{e1}$ is the vector of entity e1; and $H'_{e2}$ is the vector of entity e2.
10. An entity relationship extraction device, comprising:
an identification module, configured to identify the category of each medical entity in a medical text and split the medical text into single sentences;
an obtaining module, configured to select, for each category, one medical entity and combine it with the single sentence to form a sentence, input the sentence into a pre-trained BERT model, and obtain a vector of the sentence and a vector of each medical entity name from the output layer of the BERT model;
an intermediate module, configured to input the vector of the sentence and the averaged vector of each medical entity name separately into a feedforward neural network to obtain a plurality of intermediate vectors;
an input module, configured to concatenate the intermediate vectors, feed the result into a fully-connected neural network, and classify it with softmax in the fully-connected neural network to obtain classification probabilities;
a determining module, configured to select the relationship category with the highest classification probability as the final relationship between the medical entities.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011402086.2A CN112347771A (en) | 2020-12-03 | 2020-12-03 | Method and equipment for extracting entity relationship |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011402086.2A CN112347771A (en) | 2020-12-03 | 2020-12-03 | Method and equipment for extracting entity relationship |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112347771A (en) | 2021-02-09 |
Family
ID=74428052
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011402086.2A Pending CN112347771A (en) | 2020-12-03 | 2020-12-03 | Method and equipment for extracting entity relationship |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112347771A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113360582A (en) * | 2021-06-04 | 2021-09-07 | 中国人民解放军战略支援部队信息工程大学 | Relation classification method and system based on BERT model fusion multi-element entity information |
CN113609868A (en) * | 2021-09-01 | 2021-11-05 | 首都医科大学宣武医院 | Multi-task question-answer driven medical entity relationship extraction method |
CN116894436A (en) * | 2023-09-06 | 2023-10-17 | 神州医疗科技股份有限公司 | Data enhancement method and system based on medical named entity recognition |
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20200065374A1 (en) * | 2018-08-23 | 2020-02-27 | Shenzhen Keya Medical Technology Corporation | Method and system for joint named entity recognition and relation extraction using convolutional neural network |
CN110570920A (en) * | 2019-08-20 | 2019-12-13 | 华东理工大学 | Entity and relationship joint learning method based on attention focusing model |
CN110705293A (en) * | 2019-08-23 | 2020-01-17 | 中国科学院苏州生物医学工程技术研究所 | Electronic medical record text named entity recognition method based on pre-training language model |
CN111090988A (en) * | 2019-12-31 | 2020-05-01 | 南京新一代人工智能研究院有限公司 | Medical record symptom identification method and system based on dependency syntax analysis |
CN111522915A (en) * | 2020-04-20 | 2020-08-11 | 北大方正集团有限公司 | Extraction method, device and equipment of Chinese event and storage medium |
CN111666350A (en) * | 2020-05-28 | 2020-09-15 | 浙江工业大学 | Method for extracting medical text relation based on BERT model |
Non-Patent Citations (3)
Title |
---|
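Jian Ni et al.: "Cross-Lingual Relation Extraction with Transformers", https://arxiv.org/pdf/2010.08652.pdf, page 3 * |
Ding Long; Wen Wen; Lin Qiang: "Domain Entity Recognition Based on a Pre-trained BERT Character Embedding Model", Technology Intelligence Engineering (情报工程), vol. 5, no. 6, pages 65-74 * |
Li Dongmei et al.: "A Survey of Entity Relation Extraction Methods", Journal of Computer Research and Development (计算机研究与发展), vol. 57, no. 7, pages 1424-1448 * |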
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113360582A (en) * | 2021-06-04 | 2021-09-07 | 中国人民解放军战略支援部队信息工程大学 | Relation classification method and system based on BERT model fusion multi-element entity information |
CN113609868A (en) * | 2021-09-01 | 2021-11-05 | 首都医科大学宣武医院 | Multi-task question-answer driven medical entity relationship extraction method |
CN116894436A (en) * | 2023-09-06 | 2023-10-17 | 神州医疗科技股份有限公司 | Data enhancement method and system based on medical named entity recognition |
CN116894436B (en) * | 2023-09-06 | 2023-12-15 | 神州医疗科技股份有限公司 | Data enhancement method and system based on medical named entity recognition |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Boenninghoff et al. | Explainable authorship verification in social media via attention-based similarity learning | |
CN109460473B (en) | Electronic medical record multi-label classification method based on symptom extraction and feature representation | |
CN111708873B (en) | Intelligent question-answering method, intelligent question-answering device, computer equipment and storage medium | |
CN111738003B (en) | Named entity recognition model training method, named entity recognition method and medium | |
CN112347771A (en) | Method and equipment for extracting entity relationship | |
CN109726745B (en) | Target-based emotion classification method integrating description knowledge | |
CN109960727B (en) | Personal privacy information automatic detection method and system for unstructured text | |
WO2019232893A1 (en) | Method and device for text emotion analysis, computer apparatus and storage medium | |
CN113593661B (en) | Clinical term standardization method, device, electronic equipment and storage medium | |
Nabil et al. | Labr: A large scale arabic sentiment analysis benchmark | |
CN113806531A (en) | Drug relationship classification model construction method, drug relationship classification method and system | |
CN111950283A (en) | Chinese word segmentation and named entity recognition system for large-scale medical text mining | |
Marasović et al. | Multilingual modal sense classification using a convolutional neural network | |
CN113095081A (en) | Disease identification method and device, storage medium and electronic device | |
Wyner et al. | Passing a USA national bar exam: a first corpus for experimentation | |
Pratiwi et al. | Implementation of rumor detection on twitter using the svm classification method | |
CN111159405B (en) | Irony detection method based on background knowledge | |
Liu et al. | Revisit word embeddings with semantic lexicons for modeling lexical contrast | |
Salem et al. | Refining semantic similarity of paraphasias using a contextual language model | |
CN113782123A (en) | Online medical patient satisfaction measuring method based on network data | |
CN112784601A (en) | Key information extraction method and device, electronic equipment and storage medium | |
Saikh et al. | COVIDRead: A large-scale question answering dataset on COVID-19 | |
CN113869051B (en) | Named entity recognition method based on deep learning | |
CN116072306A (en) | Drug interaction information extraction method based on BioBERT and improved Focal loss | |
Huangfu et al. | OCC model-based emotion extraction from online reviews |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20210209 |