CN115392256A - Drug adverse event relation extraction method based on semantic segmentation - Google Patents
Drug adverse event relation extraction method based on semantic segmentation
- Publication number
- CN115392256A
- Authority
- CN
- China
- Prior art keywords
- drug
- mention
- adverse
- adverse event
- local context
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
- G06F16/353—Clustering; Classification into predefined classes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Biomedical Technology (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Evolutionary Computation (AREA)
- Biophysics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Databases & Information Systems (AREA)
- Machine Translation (AREA)
Abstract
The invention provides a method for extracting adverse drug event relations based on semantic segmentation, comprising the following steps: establishing an adverse drug event relation extraction model having a local context information feature extractor, a semantic feature fusion device, a classifier and a sample imbalance processor; preprocessing the data; training the model and optimizing its parameters; and extracting the adverse drug event relations. By marking each drug mention with special symbols before and after it, and by appending the markers of the adverse event mentions behind the text as suspension marks, the method identifies mention boundaries more reliably. A U-shaped semantic segmentation network is introduced to fuse local context information and capture the global interdependencies among adverse drug events, so that key information is located more accurately. In addition, a balanced softmax method is used to handle the imbalanced relation distribution, which prevents irrelevant triples from affecting the model, so that adverse drug event relations in medical text are extracted more accurately.
Description
Technical Field
The invention relates to the technical field of medical text data mining, and in particular to a method for extracting adverse drug event relations based on semantic segmentation.
Background
An adverse drug event (ADE) is an adverse clinical event that occurs during drug therapy and that is not necessarily causally related to the drug. Adverse drug events have two main causes: problems with drug quality and medication errors. They seriously endanger patients' health and cause huge economic losses to the medical system and to society as a whole. According to statistics, the emergency rate of medical adverse events accounts for 28% of the total rate of medical adverse events; because of their importance and harmfulness, such events have drawn the attention of researchers in fields such as life sciences, biology and general medicine. In addition, although the ultimate goal of drug discovery is to develop chemicals that treat specific diseases, identifying the correspondence between a chemical and the adverse drug reactions it causes is critical for improving chemical safety and toxicity studies and for facilitating screening methods for new drug compounds.
After long-term exploration by researchers, text-mining-based research on adverse drug events has gradually developed from early template-based and rule-based methods to data-driven methods based on traditional machine learning, with major breakthroughs in both theory and practice. With the rise of deep learning, neural-network-based deep learning frameworks have also offered new ideas for text mining. Because a neural network model can automatically learn the internal features of data from raw data through large-scale training, it has achieved breakthroughs in speech and image recognition and shows great potential in natural language processing. Text mining based on deep learning is therefore expected to become a trend in future research, and applying it to the study of adverse drug events is of great value for advancing related biomedical research.
Through research, the inventors of the present application have found that, because of the specific definition of adverse drug events, identifying all drug and adverse event mentions in natural language text and extracting the relations between drugs and their corresponding adverse events faces the following problems: (1) As drug development in the biomedical field accelerates, the limited test conditions of pre-market clinical trials make the adverse events of many drugs hard to discover and to list in adverse event reports; moreover, because some adverse drug events occur only some time after the drug is taken, or only in specific populations, many potential adverse events are not covered by existing dictionaries or databases, and such mentions are hard to find with dictionary-based and rule-based methods alone. (2) The same condition mention may be an adverse drug event in one context and an indication in another, so identifying adverse drug event mentions depends heavily on understanding contextual semantic relations. (3) There is no uniform naming scheme for adverse drug events, and the same disease may be expressed in multiple ways; mention names are therefore sparse, hard to learn fully from a limited labeled corpus, and hard to identify. (4) In some natural language texts, adverse drug events are expressed in non-medical terms that combine with neighboring common words or adjectives to form a mention, which makes the mention boundaries hard to determine and leads to inaccurate recognition.
Disclosure of Invention
In view of the technical problems in existing approaches to extracting drugs and their corresponding adverse event relations, the invention provides a method for extracting adverse drug event relations based on semantic segmentation. The method marks each drug mention with special symbols before and after it and appends the markers of the adverse event mentions behind the text as suspension marks, so that the boundaries of drug mentions and adverse event mentions are identified more reliably; meanwhile, a U-shaped semantic segmentation network is introduced to fuse local context information and capture the global interdependencies among adverse drug events, so that key information is located more accurately; in addition, a balanced softmax method is used to handle the imbalanced relation distribution, which prevents irrelevant triples from affecting the model, so that adverse drug event relations in medical text are extracted more accurately.
In order to solve the above technical problems, the invention adopts the following technical solution:
A method for extracting adverse drug event relations based on semantic segmentation comprises the following steps:
S1, establishing an adverse drug event relation extraction model:
the adverse drug event relation extraction model is used for extracting the drugs in a medical text and the adverse events caused by those drugs, and the model structure comprises a local context information feature extractor, a semantic feature fusion device, a classifier and a sample imbalance processor; wherein,
the local context information feature extractor is used for extracting the local context features of different mentions from the input medical text, specifically: given an adverse drug event document containing N tokens, the fixed markers <s> and </s> are first inserted at the beginning and the end of the drug mention to mark its location; the corresponding candidate adverse event mentions are then appended behind the text as suspension marks <o> and </o>, where <o> and </o> use the same position encoding as the corresponding adverse event mention; the combined sequence of tokens and inserted suspension marks is then fed to the BERT pre-trained model to obtain the local context representation $e_s$ of the drug mention marker and the local context representation $e_o$ of the adverse event mention marker; $e_s$ and $e_o$ are concatenated as the embedded representation $H$ of the corresponding drug mention and adverse event mention pair, where M denotes the maximum number of mention pairs formed by drug mentions and adverse event mentions in a sample; finally, an attention representation $A$ is obtained from the BERT pre-trained model, where $A$ is the average of the attention heads in the last Encoder layer of the BERT pre-trained model, and the attention matrix $A$ from the BERT pre-trained model and an affine transformation are used to obtain the mention-pair relation matrix of drugs and adverse events:
wherein $\odot$ is the Hadamard product, $W_1$ is a learnable parameter matrix, $H$ is the embedded representation of the drug mention and adverse event mention pairs, $A_s$ denotes the attention of the drug mention $e_s$ to all tokens of the document, obtained by averaging the attention heads of the drug mention in the last Encoder layer, $A_o$ denotes the attention of the adverse event mention $e_o$ to all tokens of the document, obtained by averaging the attention heads of the adverse event mention in the last Encoder layer, and $F(s,o)$ denotes the relation matrix of the drug and adverse event mention pair $(e_s, e_o)$;
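A form of the relation matrix that is consistent with the symbols defined above can be written as follows. It is offered as an assumption, not as the exact equation of the invention: the softmax normalization and the intermediate symbol $a^{(s,o)}$ are not named in the text, and $H$ is assumed to be arranged so that the product with $a^{(s,o)}$ is defined.

```latex
a^{(s,o)} = \operatorname{softmax}\left(A_s \odot A_o\right), \qquad
F(s,o) = W_1 \, H \, a^{(s,o)}
```

Read this way, the Hadamard product keeps only the tokens that both the drug mention and the adverse event mention attend to, and $W_1$ applies the affine transformation to the resulting context.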
the semantic feature fusion device is used for fusing the global dependencies of the mentions into the local context information through an encoding module and a U-shaped semantic segmentation network, specifically: first, the mention-pair relation matrix $F \in \mathbb{R}^{M \times M \times D}$ containing local context information is treated as a D-channel image and combined with the encoding module; rich global features are then obtained with the U-shaped semantic segmentation network, which comprises a global feature extraction block, two upsampling blocks with skip connections and a feature output layer arranged in sequence, thereby yielding the local context and global dependency information matrix:
$$Y = U(W_2 F)$$
wherein $Y \in \mathbb{R}^{M \times M \times D'}$ denotes the local context and global dependency information matrix, $U$ denotes the U-shaped semantic segmentation network, whose output lies in $\mathbb{R}^{M \times M \times D'}$, $W_2$ is a learnable weight matrix that reduces the dimension of $F$, $D'$ is much smaller than $D$, and $W_2 F$ represents the encoding module;
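The encoding module $W_2 F$ can be illustrated with a minimal sketch. It assumes that $W_2$ acts as a linear projection on the channel dimension of every cell of the M×M matrix, much like a 1×1 convolution; the function name `encode` and all numeric values are hypothetical, and the U-shaped network itself is not modeled here.

```python
def encode(F, W2):
    """Project every D-dim cell of the M x M x D matrix F to D' dims.

    W2 has shape D' x D. Assumption: the encoding module is a per-cell
    linear projection over the channel dimension.
    """
    return [[[sum(w * x for w, x in zip(row_w, cell)) for row_w in W2]
             for cell in row]
            for row in F]


M, D = 2, 4
# Toy relation matrix: the cell at (i, j) holds [i+j, i+j+1, i+j+2, i+j+3].
F = [[[float(i + j + k) for k in range(D)] for j in range(M)] for i in range(M)]
# Hypothetical weights with D' = 2, so the channel dimension shrinks from 4 to 2.
W2 = [[1.0, 0.0, 0.0, 0.0],
      [0.0, 0.25, 0.25, 0.5]]
reduced = encode(F, W2)
```

The spatial layout (M×M mention pairs) is untouched; only the channel depth changes, which is what lets the reduced matrix be passed to the segmentation network as a smaller image.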
the classifier is used for predicting adverse drug event relations from the local context and global dependency information matrix and the smooth embedded representations, specifically: first, using the local context embeddings $m$ of a mention at its different positions in the document, the smooth embedded representation $E_i$ of that mention is obtained with a smooth version of max pooling:
wherein $E_i$ denotes the smooth embedded representation of the mention $e_i$, and the pooling is taken over the total number of occurrences of the drug or adverse event mention $e_i$ in the document;
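The pooling step can be sketched as follows, under the assumption that the "smooth version of max pooling" is log-sum-exp pooling, a common choice for this purpose; the text does not name the function. The helper name `logsumexp_pool` and the sample vectors are hypothetical.

```python
import math


def logsumexp_pool(mention_embeddings):
    """Pool the embeddings of all occurrences of one mention, per dimension.

    Assumption: smooth max pooling is log(sum(exp(m_j))) over occurrences j.
    """
    pooled = []
    for values in zip(*mention_embeddings):
        peak = max(values)  # subtract the max before exponentiating, for stability
        pooled.append(peak + math.log(sum(math.exp(v - peak) for v in values)))
    return pooled


# Two occurrences of the same mention, each with a 2-dimensional embedding.
occurrences = [[0.0, 5.0], [0.0, -5.0]]
E_i = logsumexp_pool(occurrences)
```

When one occurrence dominates a dimension, the pooled value stays close to the maximum; when occurrences are similar, each of them contributes, which is the sense in which the pooling is smooth.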
after the smooth embedded representations $E_s$ and $E_o$ of the drug and the adverse event and the local context and global dependency information matrix $Y$ have been obtained, the classifier first uses a feedforward neural network to map $E_s$, $E_o$ and $Y$ to hidden representations $z$, and then obtains the relation probability through a bilinear function, as follows:
$$z_s = \tanh(W_s E_s + Y_{s,o})$$
$$z_o = \tanh(W_o E_o + Y_{s,o})$$
$$P(r \mid E_s, E_o) = \sigma(z_s W_r z_o + b_r)$$
wherein $z_s$ is the hidden representation of the drug, $z_o$ is the hidden representation of the adverse event, $P$ is the relation probability, $Y_{s,o}$ is the local context and global dependency information representation of the drug and adverse event mention pair $(e_s, e_o)$ in the matrix $Y$, $\tanh$ is a nonlinear activation function, $\sigma$ is the bilinear function, and $W_s$, $W_o$, $W_r$ and $b_r$ are learnable parameters;
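The three classifier equations can be traced with a small sketch for a single relation type r. It assumes that the outer function σ is the logistic sigmoid, which the text does not state explicitly; the helper names and all numeric values are hypothetical.

```python
import math


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def relation_probability(E_s, E_o, Y_so, W_s, W_o, W_r, b_r):
    # Hidden representations: z = tanh(W E + Y_{s,o}).
    z_s = [math.tanh(a + y) for a, y in zip(matvec(W_s, E_s), Y_so)]
    z_o = [math.tanh(a + y) for a, y in zip(matvec(W_o, E_o), Y_so)]
    # Bilinear score z_s^T W_r z_o + b_r for one relation type.
    score = sum(a * b for a, b in zip(z_s, matvec(W_r, z_o))) + b_r
    # Assumption: sigma is the logistic sigmoid.
    return 1.0 / (1.0 + math.exp(-score))


identity = [[1.0, 0.0], [0.0, 1.0]]
swap = [[0.0, 1.0], [1.0, 0.0]]
E_s, E_o, Y_so = [1.0, 0.0], [0.0, 1.0], [0.0, 0.0]

p_identity = relation_probability(E_s, E_o, Y_so, identity, identity, identity, 0.0)
p_swap = relation_probability(E_s, E_o, Y_so, identity, identity, swap, 0.0)
```

With the identity as $W_r$ the two hidden vectors are orthogonal and the score is zero; the swap matrix aligns them and raises the probability, which shows how $W_r$ decides which feature combinations signal a relation.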
the sample imbalance processor is used for handling the class imbalance problem in the sample set by training with a balanced softmax method and introducing an additional class 0, so that the scores of the target classes are all expected to be larger than a threshold $t_0$ and the scores of the non-target classes are all expected to be smaller than the threshold $t_0$:
wherein $L$ denotes the target loss function, $\log$ denotes the natural logarithm (base $e$), $e$ denotes the natural constant, $t_i$ denotes the probability of the i-th positive label, $t_j$ denotes the probability of the j-th negative label, $\Omega_{pos}$ denotes the relations between a drug and its corresponding adverse event mentions, namely the positive labels, and $\Omega_{neg}$ denotes the relations between a drug and non-corresponding adverse event mentions, namely the negative labels;
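A loss of this kind can be sketched as follows. The sketch assumes the widely used threshold-class form of the balanced softmax loss, $L = \log(e^{t_0} + \sum_{j \in \Omega_{neg}} e^{t_j}) + \log(e^{-t_0} + \sum_{i \in \Omega_{pos}} e^{-t_i})$; this form matches the symbols defined above but is an assumption, and the function name is hypothetical.

```python
import math


def balanced_softmax_loss(pos_scores, neg_scores, t0=0.0):
    """Push positive scores above the threshold t0 and negative scores below it.

    Assumed form:
      log(exp(t0) + sum_neg exp(t_j)) + log(exp(-t0) + sum_pos exp(-t_i))
    """
    neg_term = math.log(math.exp(t0) + sum(math.exp(t) for t in neg_scores))
    pos_term = math.log(math.exp(-t0) + sum(math.exp(-t) for t in pos_scores))
    return neg_term + pos_term


good = balanced_softmax_loss(pos_scores=[5.0], neg_scores=[-5.0, -4.0])
bad = balanced_softmax_loss(pos_scores=[-5.0], neg_scores=[5.0, 4.0])
empty = balanced_softmax_loss([], [])
```

Because every negative label enters through a single log-sum term instead of its own binary loss, a large number of negative pairs does not swamp the few positive ones, which is the imbalance the processor is meant to handle.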
S2, data preprocessing, in which mentions are unified as follows:
first, stop words are removed from the mentions in the medical text; regular-expression matching is then performed, and mentions whose matching degree is higher than 90% are classified as the same mention;
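The mention unification step can be sketched as below. The text only specifies stop-word removal and a 90% matching threshold, so two assumptions are made: the matching degree is approximated by the similarity ratio of Python's `difflib.SequenceMatcher`, and the stop-word list is a hypothetical placeholder.

```python
from difflib import SequenceMatcher

STOP_WORDS = {"the", "a", "an", "of"}  # hypothetical stop-word list


def normalize(mention):
    """Lower-case a mention and drop stop words."""
    return " ".join(w for w in mention.lower().split() if w not in STOP_WORDS)


def unify_mentions(mentions, threshold=0.9):
    """Group mentions whose similarity to a group's first member exceeds threshold."""
    groups = []
    for mention in mentions:
        key = normalize(mention)
        for group in groups:
            # Assumption: SequenceMatcher.ratio() stands in for the matching degree.
            if SequenceMatcher(None, key, normalize(group[0])).ratio() > threshold:
                group.append(mention)
                break
        else:
            groups.append([mention])  # no similar group found, start a new one
    return groups


groups = unify_mentions(["the headache", "Headache", "headaches", "nausea"])
```

Here "headaches" joins the "headache" group because the two normalized strings share 16 of their 17 characters in total (ratio about 0.94), while "nausea" starts a group of its own.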
S3, model training and parameter optimization: the extraction model is trained with the processed data, and an objective optimization function is designed to optimize the network parameters and generate the optimal extraction model, specifically comprising the following steps:
S31, dividing the data set into a training set, a validation set and a test set in a ratio of 7:2:1;
S32, adopting the categorical cross-entropy loss function of the balanced softmax as the optimization objective, where the objective function uses the same formula as the target loss function L computed in the sample imbalance processor of step S1;
S33, optimizing the objective function with the stochastic gradient descent algorithm, and updating the network model parameters by error backpropagation;
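The 7:2:1 division of step S31 can be sketched as follows. Shuffling before the split and the fixed seed are assumptions added for reproducibility; the function name is hypothetical.

```python
import random


def split_dataset(samples, seed=0):
    """Shuffle the samples and split them 7:2:1 into train, validation and test."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # assumption: shuffle before splitting
    n_train = len(shuffled) * 7 // 10
    n_val = len(shuffled) * 2 // 10
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]      # the remainder goes to the test set
    return train, val, test


train, val, test = split_dataset(range(100))
```

The three subsets are disjoint and together cover the whole data set, so the validation and test scores are not computed on samples seen during training.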
S4, extracting adverse drug event relations:
S41, preprocessing the medical text data to be extracted to obtain standardized sample data, and defining the category of relation pairs between a drug and non-corresponding adverse event mentions as 0;
S42, forming a training sample from a medical sample and all the drug mentions and adverse event mentions it contains, directly inserting the two fixed markers <s> and </s> before and after every drug mention, and appending the adverse event mentions behind the text as suspension marks represented by <o> and </o>;
S43, feeding the sample into the BERT pre-trained model and, for each pair of drug and adverse event mention markers, concatenating the local context representation of the drug mention marker with the local context representation of the adverse event mention marker as the embedded representation of the corresponding drug mention and adverse event mention pair;
S44, after the embedded representations of all drug mention and adverse event mention pairs of the sample, containing local context information, have been obtained, applying an affine transformation to the embedded representations together with the attention layer of the BERT pre-trained model to obtain the mention-pair relation matrix of drugs and adverse events;
S45, combining the mention-pair relation matrix containing local context information with the encoding module, and obtaining rich global features with the U-shaped semantic segmentation network, so as to output the complete local context and global dependency information matrix;
S46, obtaining the smooth embedded representations of the drug and the adverse event, mapping the smooth embedded representations and the local context and global dependency information matrix to hidden representations with a feedforward neural network, and then obtaining the relation probability, namely the relation score, through a bilinear function;
S47, calculating the scores of the positive sample relations and the negative sample relations with the balanced softmax method, such that the scores of the positive sample relations are larger than 0.
Further, in the U-shaped semantic segmentation network used by the semantic feature fusion device in step S1, the global feature extraction block comprises three convolution modules and two max pooling layers; the first max pooling layer follows the first convolution module, the second max pooling layer follows the second convolution module, each convolution module comprises two convolution layers, and the number of channels doubles from one convolution module to the next. Each of the two upsampling blocks comprises a deconvolution layer and two convolution layers arranged in sequence; the first upsampling block follows the third convolution module, the second upsampling block follows the first upsampling block, and the number of channels is halved in each upsampling block. The output of the deconvolution layer in the second upsampling block is joined by a skip connection to the output of the second convolution layer in the first convolution module, and the output of the deconvolution layer in the first upsampling block is joined by a skip connection to the output of the second convolution layer in the second convolution module.
Further, the two convolution layers in each convolution module have a kernel size of 3 × 3 and a stride of 1; the two max pooling layers have a window size of 2 × 2 and a stride of 2; the deconvolution layer in each of the two upsampling blocks has a kernel size of 2 × 2 and a stride of 2, and the two convolution layers in each upsampling block have a kernel size of 3 × 3 and a stride of 1; the feature output layer has a kernel size of 1 × 1 and a stride of 1.
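The layer arrangement above can be checked with a small shape-tracing sketch. It uses the channel counts given in the detailed description (64, 128 and 256 in the convolution modules, 1 in the output layer) and assumes that the 3 × 3 convolutions use a padding of 1, which the text does not state but which is required for the skip connections to line up. The function name is hypothetical, and the input size must be divisible by 4.

```python
def trace_shapes(size, base_channels=64):
    """List (layer type, channels, spatial size) for the 15 layers of the network."""
    shapes = []
    ch = base_channels
    # Global feature extraction block: three convolution modules, two poolings.
    for module in range(3):
        # Assumption: padding 1, so a 3x3 convolution keeps the spatial size.
        shapes += [("conv3x3", ch, size), ("conv3x3", ch, size)]
        if module < 2:
            size //= 2                       # 2x2 max pooling with stride 2
            shapes.append(("maxpool2x2", ch, size))
            ch *= 2                          # channels double in the next module
    # Two upsampling blocks: one deconvolution followed by two convolutions.
    for _ in range(2):
        size *= 2                            # 2x2 deconvolution with stride 2
        ch //= 2                             # channels halve in each block
        shapes.append(("deconv2x2", ch, size))
        shapes += [("conv3x3", ch, size), ("conv3x3", ch, size)]
    shapes.append(("conv1x1", 1, size))      # feature output layer
    return shapes


shapes = trace_shapes(size=8)
```

For an 8 × 8 input, layer 12 (the second deconvolution) and layer 2 both come out at 64 channels and 8 × 8, and layer 9 and layer 5 both at 128 channels and 4 × 4, so each skip connection joins feature maps of identical shape.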
Further, in the sample imbalance processor of step S1, the threshold $t_0$ is set to 0, and the formula for calculating the target loss function $L$ is simplified as follows:
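With the threshold set to 0, a simplified loss that agrees with the symbols $t_i$, $t_j$, $\Omega_{pos}$ and $\Omega_{neg}$ used for the sample imbalance processor can be written as below. This is a reconstruction under the assumption that the balanced softmax takes its usual threshold-class form; it is not quoted from the invention.

```latex
L = \log\Bigl(1 + \sum_{i \in \Omega_{pos}} e^{-t_i}\Bigr)
  + \log\Bigl(1 + \sum_{j \in \Omega_{neg}} e^{t_j}\Bigr)
```

The first term is small only when every positive score $t_i$ is well above 0, and the second only when every negative score $t_j$ is well below 0, so 0 acts as the decision boundary between related and unrelated mention pairs.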
Compared with the prior art, the method for extracting adverse drug event relations based on semantic segmentation provided by the invention has the following beneficial effects:
1. When the embedded representations are obtained, suspension marks are used, so that the embedded representations of different mentions are distinguished more effectively and prediction accuracy is clearly improved.
2. The U-shaped semantic segmentation network captures the global interdependencies among triples (drug, adverse event, adverse drug event relation), so that the extraction model copes better with mention pairs that lie too far apart for key information to be found.
3. An encoding module is introduced to capture the local context information of mentions, and the global interdependencies are fused with it, so that the extraction model understands the global semantics more fully and distinguishes specific adverse drug events better.
4. A balanced softmax method is used to handle the imbalanced relation distribution, which avoids "undersampling" in the extraction model and thereby improves the accuracy of relation classification.
Drawings
Fig. 1 is a schematic flow chart of a drug adverse event relationship extraction system provided by the present invention.
Fig. 2 is a schematic diagram of a span of a suspension mark provided by the present invention.
FIG. 3 is a schematic diagram of a mention-pair relationship matrix provided by the present invention.
Fig. 4 is a schematic diagram of a drug adverse event relationship extraction network provided by the present invention.
FIG. 5 is a schematic diagram of a U-shaped semantic segmentation network structure provided by the present invention.
Detailed Description
To make the technical means, inventive features, objectives and effects of the invention easy to understand, the invention is further explained below with reference to the drawings.
Referring to Fig. 1 to Fig. 5, the invention provides a method for extracting adverse drug event relations based on semantic segmentation, which comprises the following steps:
S1, establishing an adverse drug event relation extraction model:
the adverse drug event relation extraction model is used for extracting the drugs in a medical text and the adverse events caused by those drugs, and the model structure comprises a local context information feature extractor, a semantic feature fusion device, a classifier and a sample imbalance processor; wherein,
the local context information feature extractor is used for extracting the local context features of different mentions from the input medical text, specifically: thanks to the parallelism of suspension marks, a series of related mentions can be flexibly packed into one training instance. Given a document containing N tokens (the tokens identify which characters of the text are ordinary, undefined characters and which belong to a drug mention or an adverse event mention), first, as shown in Fig. 2, the fixed markers <s> and </s> are inserted at the beginning and the end of the drug mention to mark its location; the corresponding candidate adverse event mentions are then appended behind the text as suspension marks <o> and </o>, that is, the <o> and </o> markers of each candidate adverse event mention are placed after the text, where <o> and </o> use the same position encoding as the corresponding adverse event mention, so that the shared position encoding ties <o> and </o> to that mention; the combined sequence of tokens and inserted suspension marks is then fed to the BERT pre-trained model to obtain the local context representation $e_s$ of the drug mention marker and the local context representation $e_o$ of the adverse event mention marker; $e_s$ and $e_o$ are concatenated as the embedded representation $H$ of the corresponding drug mention and adverse event mention pair, where M denotes the maximum number of mention pairs formed by drug mentions and adverse event mentions in a sample; finally, an attention representation $A$ is obtained from the BERT pre-trained model, where $A$ is the average of the attention heads in the last Encoder layer of the BERT pre-trained model, and the attention matrix $A$ from the BERT pre-trained model and an affine transformation are used to obtain the mention-pair relation matrix of drugs and adverse events:
wherein $\odot$ is the Hadamard product, $W_1$ is a learnable parameter matrix, $H$ is the embedded representation of the drug mention and adverse event mention pairs, $A_s$ denotes the attention of the drug mention $e_s$ to all tokens of the document, obtained by averaging the attention heads of the drug mention in the last Encoder layer, $A_o$ denotes the attention of the adverse event mention $e_o$ to all tokens of the document, obtained by averaging the attention heads of the adverse event mention in the last Encoder layer, and $F(s,o)$ denotes the relation matrix of the drug and adverse event mention pair $(e_s, e_o)$, as shown in Fig. 2;
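The marker-insertion step described for the feature extractor can be sketched as follows. The sketch assumes that <o> shares the position index of the first token of its adverse event mention and </o> that of the last token, which is one way to realize "the same position encoding"; the function name, the span convention (inclusive token indices) and the sample sentence are hypothetical, and the attention masking of the suspension marks is not modeled.

```python
def pack_markers(tokens, drug_span, event_spans):
    """Insert <s>/</s> around the drug mention and append <o>/</o> pairs.

    Spans are (start, end) token indices, end inclusive. Returns the packed
    token list and one position index per packed token.
    """
    d_start, d_end = drug_span
    packed, new_index = [], {}
    for i, tok in enumerate(tokens):
        if i == d_start:
            packed.append("<s>")            # fixed marker before the drug mention
        new_index[i] = len(packed)          # where original token i ends up
        packed.append(tok)
        if i == d_end:
            packed.append("</s>")           # fixed marker after the drug mention
    position_ids = list(range(len(packed)))
    for start, end in event_spans:
        # Suspension marks go behind the text but reuse the positions of the
        # adverse event mention they refer to (assumed: first and last token).
        packed += ["<o>", "</o>"]
        position_ids += [new_index[start], new_index[end]]
    return packed, position_ids


tokens = ["aspirin", "caused", "stomach", "bleeding"]
packed, position_ids = pack_markers(tokens, drug_span=(0, 0), event_spans=[(2, 3)])
```

Because the suspension marks sit after the text, any number of candidate adverse event mentions can be packed into one instance without changing the text itself; only the shared position indices link each pair of marks to its mention.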
the semantic feature fusion device is used for fusing the global dependencies of the mentions into the local context information through the encoding module and the U-shaped semantic segmentation network (see Table 1 below), specifically: first, the mention-pair relation matrix $F \in \mathbb{R}^{M \times M \times D}$ containing local context information is treated as a D-channel image, that is, document-level relation prediction is formulated as pixel-level mask prediction on $F$; rich global features are then obtained with the U-shaped semantic segmentation network, which comprises, arranged in sequence, a global feature extraction block (sequence numbers 1-8), two upsampling blocks with skip connections (sequence numbers 9-14) and a feature output layer (sequence number 15).
As a specific implementation, the global feature extraction block comprises three convolution modules and two max pooling layers; the first max pooling layer follows the first convolution module and the second max pooling layer follows the second convolution module. Each convolution module comprises two convolution layers with a kernel size of 3 × 3 and a stride of 1, and the two max pooling layers have a window size of 2 × 2 and a stride of 2. The number of channels doubles from one convolution module to the next: the first convolution module and the first max pooling layer have 64 channels, the second convolution module and the second max pooling layer have 128 channels, and the third convolution module has 256 channels. A segmented region in the mention-pair relation matrix corresponds to the co-occurrence of relations between mention pairs, and the U-shaped semantic segmentation network promotes the exchange of information between the mention pairs within the receptive field, similar to implicit reasoning; specifically, the feature extraction block enlarges the receptive field of the current mention-pair embedding in $F$, thereby providing rich global information for representation learning.
Each of the two upsampling blocks comprises a deconvolution layer and two convolution layers arranged in sequence; the first upsampling block follows the third convolution module, and the second upsampling block follows the first upsampling block. The deconvolution layer in each upsampling block has a kernel size of 2 × 2 and a stride of 2, and the two convolution layers have a kernel size of 3 × 3 and a stride of 1. The number of channels is halved in each upsampling block, so that the aggregated information is distributed to every pixel: the first upsampling block has 128 channels and the second has 64 channels. The output of the deconvolution layer (sequence number 12) in the second upsampling block is joined by a skip connection to the output of the second convolution layer (sequence number 2) in the first convolution module, and the output of the deconvolution layer (sequence number 9) in the first upsampling block is joined by a skip connection to the output of the second convolution layer (sequence number 5) in the second convolution module. Taking the last upsampling block as an example, its features come not only from the output of the first convolution module (same-scale features) but also from the output of the first upsampling block (large-scale features), so that multi-scale features are effectively fused. The feature output layer has a kernel size of 1 × 1, a stride of 1 and 1 channel. The specific parameters of the U-shaped semantic segmentation network model are shown in Table 1 below.
TABLE 1 Hyperparameters of the global dependency network model architecture
Then the encoding module and the U-shaped semantic segmentation network are combined to obtain the local context and global dependency information matrix Y:
$Y = U(W_2 F)$
wherein $Y \in \mathbb{R}^{M \times M \times D'}$ denotes the local context and global dependency information matrix, $U$ denotes the U-shaped semantic segmentation network, whose output lies in $\mathbb{R}^{M \times M \times D'}$, $W_2$ is a learnable weight matrix that reduces the dimension of F, $D'$ is much smaller than $D$, and $W_2 F$ represents the encoding module;
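The layer arrangement described for the U-shaped network can be checked with a small shape trace. The sketch below is illustrative only: the layer list is transcribed from the description (sequence numbers 1-15), while the "same" padding of the 3 × 3 convolutions, the helper names `out_size`, `trace_shapes` and `check_skips`, and the example size M = 32 are assumptions that do not come from the patent.

```python
# Layer list transcribed from the description: (kind, kernel, stride, output channels).
LAYERS = [
    ("conv", 3, 1, 64), ("conv", 3, 1, 64), ("maxpool", 2, 2, 64),      # seq 1-3
    ("conv", 3, 1, 128), ("conv", 3, 1, 128), ("maxpool", 2, 2, 128),   # seq 4-6
    ("conv", 3, 1, 256), ("conv", 3, 1, 256),                           # seq 7-8
    ("deconv", 2, 2, 128), ("conv", 3, 1, 128), ("conv", 3, 1, 128),    # seq 9-11
    ("deconv", 2, 2, 64), ("conv", 3, 1, 64), ("conv", 3, 1, 64),       # seq 12-14
    ("conv", 1, 1, 1),                                                  # seq 15
]

# Skip connections from the description: deconvolution seq no. -> encoder conv seq no.
SKIPS = {9: 5, 12: 2}


def out_size(kind, size, kernel, stride):
    """Spatial output size of one layer for a square input of side `size`."""
    if kind == "conv":
        pad = (kernel - 1) // 2  # ASSUMPTION: "same" padding, not stated in the text
        return (size + 2 * pad - kernel) // stride + 1
    if kind == "maxpool":
        return (size - kernel) // stride + 1
    if kind == "deconv":
        return (size - 1) * stride + kernel
    raise ValueError(kind)


def trace_shapes(m):
    """Return {sequence number: (height, width, channels)} for an M x M input."""
    shapes = {}
    size = m
    for seq, (kind, kernel, stride, channels) in enumerate(LAYERS, start=1):
        size = out_size(kind, size, kernel, stride)
        shapes[seq] = (size, size, channels)
    return shapes


def check_skips(shapes):
    """True when every skip connection joins feature maps of equal spatial size."""
    return all(shapes[d][:2] == shapes[e][:2] for d, e in SKIPS.items())


shapes = trace_shapes(32)
for seq in sorted(shapes):
    print(seq, LAYERS[seq - 1][0], shapes[seq])
print("skip connections aligned:", check_skips(shapes))
```

Under these assumptions the skip connections only line up when M is divisible by 4 (two 2 × 2 poolings), so a real implementation would probably pad the mention pair matrix to such a size.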
the classifier is used for predicting adverse drug event relations from the local context and global dependency information matrix and the smooth embedded representations, specifically as follows: the same mention may occur multiple times in the document, so the local context embeddings $m$ of the mention at its different positions in the document are first aggregated with a smooth version of max pooling to obtain a single smooth embedded representation $E_i$ of that mention:
wherein $E_i$ denotes the smooth embedded representation of mention $e_i$, taken over all occurrences of the drug or adverse event mention $e_i$ in the document;
after the smooth embedded representations $E_s$ and $E_o$ of the drug and the adverse event and the local context and global dependency information matrix $Y$ have been obtained, the classifier uses a feedforward neural network to map $E_s$, $E_o$ and $Y$ to hidden representations $z$, and then obtains the relation probability through a bilinear function, as follows:
$z_s = \tanh(W_s E_s + Y_{s,o})$
$z_o = \tanh(W_o E_o + Y_{s,o})$
$P(r \mid E_s, E_o) = \sigma(z_s W_r z_o + b_r)$
wherein $z_s$ is the hidden representation of the drug, $z_o$ is the hidden representation of the adverse event, $P$ is the relation probability, $Y_{s,o}$ is the local context and global dependency information representation in matrix $Y$ of the drug and adverse event mention pair $(e_s, e_o)$, tanh is a nonlinear activation function used for the nonlinear transformation, $\sigma$ is the sigmoid function, which maps the output of the bilinear function to the probability of the predicted relation, and $W_s$, $W_o$, $W_r$, $b_r$ are learnable parameters;
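The classifier step can be illustrated with plain Python lists. This is a minimal sketch, not the patented implementation: it assumes the "smooth version of max pooling" is the log-sum-exp over a mention's occurrences (a common choice for this phrase) and that $\sigma$ is the logistic sigmoid; the function names and the toy numbers are hypothetical.

```python
import math


def logsumexp_pool(mention_embeddings):
    """ASSUMED smooth max pooling: per-dimension log-sum-exp over all occurrences."""
    dims = len(mention_embeddings[0])
    pooled = []
    for d in range(dims):
        column = [vec[d] for vec in mention_embeddings]
        peak = max(column)  # subtract the max for numerical stability
        pooled.append(peak + math.log(sum(math.exp(v - peak) for v in column)))
    return pooled


def matvec(matrix, vector):
    """Matrix-vector product on nested lists."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def hidden(weight, embedding, pair_feature):
    """z = tanh(W E + Y_{s,o}), element-wise over the hidden dimension."""
    return [math.tanh(a + b) for a, b in zip(matvec(weight, embedding), pair_feature)]


def relation_probability(z_s, z_o, w_r, b_r):
    """P = sigmoid(z_s W_r z_o + b_r) for a single relation type r."""
    score = sum(a * b for a, b in zip(z_s, matvec(w_r, z_o))) + b_r
    return 1.0 / (1.0 + math.exp(-score))


# Toy example: a drug mentioned twice, an adverse event mentioned once.
E_s = logsumexp_pool([[0.2, -0.1], [0.4, 0.3]])
E_o = logsumexp_pool([[0.1, 0.5]])
Y_so = [0.05, -0.02]                      # row of Y for the pair (e_s, e_o)
W_s = [[0.5, 0.0], [0.0, 0.5]]
W_o = [[0.3, 0.1], [0.1, 0.3]]
W_r = [[1.0, 0.0], [0.0, 1.0]]
z_s = hidden(W_s, E_s, Y_so)
z_o = hidden(W_o, E_o, Y_so)
print(relation_probability(z_s, z_o, W_r, b_r=0.0))
```

Log-sum-exp never falls below the plain maximum, so a mention that occurs several times keeps its strongest signal while every occurrence still contributes to the result.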
the sample imbalance processor is used for handling the class imbalance problem in the sample set by training with a balanced softmax method and introducing an additional class 0, so that the scores of the target classes are all greater than a threshold $t_0$ and the scores of the non-target classes are all less than the threshold $t_0$:
wherein $L$ denotes the target loss function, log denotes the natural logarithm (base e), e denotes the base of the natural logarithm, $t_i$ denotes the probability of the i-th positive label, $t_j$ denotes the probability of the j-th negative label, $\Omega_{pos}$ denotes the relations between a drug and its corresponding adverse event mentions, namely the positive labels, and $\Omega_{neg}$ denotes the relations between a drug and non-corresponding adverse event mentions, namely the negative labels;
as a specific embodiment, the threshold $t_0$ is set to 0 for simplicity, and the formula for the target loss function $L$ simplifies as follows:
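The behaviour of such a threshold-based loss can be sketched numerically. The formula used below is an assumption: it is the widely used threshold-class form $L = \log(1 + \sum_{i \in \Omega_{pos}} e^{t_0 - t_i}) + \log(1 + \sum_{j \in \Omega_{neg}} e^{t_j - t_0})$, which matches the stated goal (positive scores above $t_0$, negative scores below it) but is not confirmed by the text; the function name `balanced_softmax_loss` is hypothetical.

```python
import math


def balanced_softmax_loss(pos_scores, neg_scores, t0=0.0):
    """ASSUMED form of the loss: penalise positives below t0 and negatives above t0.

    pos_scores: scores t_i of the labels in Omega_pos
    neg_scores: scores t_j of the labels in Omega_neg
    """
    pos_term = math.log1p(sum(math.exp(t0 - t) for t in pos_scores))
    neg_term = math.log1p(sum(math.exp(t - t0) for t in neg_scores))
    return pos_term + neg_term


# One positive relation and many negative ones, as in an imbalanced sample set.
well_separated = balanced_softmax_loss([4.0], [-3.0] * 20)
badly_separated = balanced_softmax_loss([-1.0], [0.5] * 20)
print(well_separated, badly_separated)
```

Because each side enters through a single log-sum term, a large number of easy negatives adds little to the loss, which is the property that makes this family of losses attractive for imbalanced relation sets.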
S2, data preprocessing: in medical texts the same mention may be written in different ways, for example some mentions appear only as initials and some as letter abbreviations, so the names of entity mentions need to be unified, specifically by the following method:
firstly, stop words are removed from the mentions in the medical text, then regular-expression matching is performed, and mentions whose matching degree is higher than 90% are classified as the same mention.
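The unification step can be approximated as follows. This is a sketch under stated assumptions: the patent does not define how the 90% matching degree is computed, so the example substitutes the similarity ratio of Python's `difflib.SequenceMatcher`; the stop-word list and the function names `normalize`, `same_mention` and `unify` are hypothetical.

```python
import re
from difflib import SequenceMatcher

# Hypothetical stop-word list; a real system would use a domain-specific one.
STOP_WORDS = {"the", "of", "a", "an", "and"}


def normalize(mention):
    """Lower-case, tokenize with a regular expression and drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", mention.lower())
    return " ".join(t for t in tokens if t not in STOP_WORDS)


def same_mention(a, b, threshold=0.9):
    """ASSUMPTION: 'matching degree' is taken to be difflib's similarity ratio."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio() > threshold


def unify(mentions, threshold=0.9):
    """Greedily group mentions; the first member of each group is its representative."""
    groups = []
    for mention in mentions:
        for group in groups:
            if same_mention(group[0], mention, threshold):
                group.append(mention)
                break
        else:
            groups.append([mention])
    return groups


print(unify(["Aspirin", "aspirin", "the Aspirin tablets", "aspirin tablet", "Ibuprofen"]))
```

Greedy grouping depends on input order, so a production pipeline might prefer clustering against a curated drug and adverse event vocabulary.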
S3, model training and parameter optimization: the model is trained with the processed data, and an objective function is designed to optimize the network parameters and generate the optimal extraction model, specifically comprising the following steps:
S31, the data set is divided into a training set, a validation set and a test set in a 7:2:1 ratio; as a specific embodiment, the inventors of the present application obtained 505 pieces of medical document data in total;
S32, a categorical cross entropy loss function is adopted as the optimization target, the objective function using the same formula as the target loss function L calculated in the sample imbalance processor in step S1, namely:
S33, the objective function is optimized with the conventional stochastic gradient descent algorithm, and the network model parameters are updated by error backpropagation.
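The 7:2:1 division in step S31 can be sketched as below. The shuffling, the fixed seed and the function name `split_dataset` are assumptions for illustration; the text only states the ratio and the total of 505 documents.

```python
import random


def split_dataset(items, ratios=(7, 2, 1), seed=0):
    """Shuffle and split into train / validation / test by integer ratios."""
    items = list(items)
    random.Random(seed).shuffle(items)  # ASSUMPTION: random split with a fixed seed
    total = sum(ratios)
    n_train = len(items) * ratios[0] // total
    n_val = len(items) * ratios[1] // total
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]      # the remainder goes to the test set
    return train, val, test


documents = list(range(505))            # stand-ins for the 505 medical documents
train, val, test = split_dataset(documents)
print(len(train), len(val), len(test))
```

With integer arithmetic the 505 documents split into 353, 101 and 51 items; how the patent rounds the fractional shares is not stated.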
S4, extracting adverse drug event relations:
S41, the medical text data to be extracted is preprocessed to obtain standardized sample data (see the data preprocessing step), and the category of relation pairs between a drug and a non-corresponding adverse event mention is defined as 0;
S42, a training sample is formed from a medical text and all the drug mentions and adverse event mentions it contains, wherein the drug mentions use fixed markers, that is, the two fixed markers <s> and </s> are inserted directly before and after every drug mention, and the adverse event mentions are spliced behind the text in the form of suspension markers represented by <o> and </o>;
S43, the sample is fed into the BERT pre-training model, and for each pair of drug and adverse event mention markers, the local context representation of the drug mention markers and the local context representation of the adverse event mention markers are spliced together as the embedded representation of the corresponding drug mention and adverse event mention pair;
S44, after the embedded representations of all drug mention and adverse event mention pairs in the sample, which contain local context information, have been obtained, an affine transformation is applied to them together with the attention of the BERT pre-training model to obtain the mention pair relation matrix of drugs and adverse events;
S45, the mention pair relation matrix containing local context information is combined with the encoding module, and rich global features are obtained by the U-shaped semantic segmentation network, so as to output the local context and global dependency information matrix;
S46, the smooth embedded representations of the drug and the adverse event are obtained, a feedforward neural network maps them together with the local context and global dependency information matrix to hidden representations, and the relation probability, namely the relation score, is then obtained through a bilinear function;
S47, the scores of the positive sample relations and the negative sample relations are calculated with the balanced softmax method, such that the scores of the positive sample relations are greater than 0.
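The marker construction in step S42 can be sketched on a tokenized sentence. This is an illustrative reading, not the patented code: spans are given as inclusive token indices, the suspension markers are appended after the text and reuse the position ids of the start and end tokens of their adverse event mention, and the function name `build_marked_sequence` is hypothetical. Any attention masking a real BERT input would need is omitted.

```python
def build_marked_sequence(tokens, drug_span, event_spans):
    """Insert fixed markers around one drug mention and append suspension markers.

    tokens:      list of str
    drug_span:   (start, end) inclusive token indices of the drug mention
    event_spans: list of (start, end) inclusive indices of candidate adverse events
    Returns the marked token list and one position id per token.
    """
    start, end = drug_span
    marked = tokens[:start] + ["<s>"] + tokens[start:end + 1] + ["</s>"] + tokens[end + 1:]
    position_ids = list(range(len(marked)))

    def shifted(index):
        """Index of an original token inside the marked sequence."""
        if index < start:
            return index
        if index <= end:
            return index + 1   # moved right by the inserted <s>
        return index + 2       # moved right by <s> and </s>

    for ev_start, ev_end in event_spans:
        marked += ["<o>", "</o>"]
        # Suspension markers share the positions of the mention they point at.
        position_ids += [shifted(ev_start), shifted(ev_end)]
    return marked, position_ids


tokens = ["Aspirin", "caused", "severe", "nausea", "."]
marked, position_ids = build_marked_sequence(tokens, drug_span=(0, 0), event_spans=[(2, 3)])
print(marked)
print(position_ids)
```

Because the suspension markers sit behind the text, several candidate adverse events can be packed into one forward pass for the same drug mention without altering the original token order.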
Compared with the prior art, the drug adverse event relation extraction method based on semantic segmentation provided by the invention has the following beneficial effects:
1. When obtaining the embedded representations, the invention uses suspension marker representations, so that the embedded representations of different mentions can be distinguished more effectively and the prediction accuracy is markedly improved.
2. The U-shaped semantic segmentation network is used to capture the global interdependencies among triples (drug, adverse event, drug-adverse event relation), so that the extraction model can more effectively handle cases where the mentions of a pair are too far apart for the key information to be found.
3. An encoding module is introduced to capture the local context information of mentions and fuse it with the global interdependencies, so that the extraction model understands the global semantics more fully and can better distinguish specific adverse drug events.
4. A balanced softmax method is used to handle the imbalanced relation distribution and avoid "undersampling" by the extraction model, thereby improving the accuracy of relation classification.
Finally, the above embodiments are intended only to illustrate the technical solutions of the present invention and not to limit them. Although the present invention has been described in detail with reference to the preferred embodiments, those skilled in the art should understand that modifications or equivalent substitutions may be made to the technical solutions of the present invention without departing from their spirit and scope, and all such modifications and substitutions should be covered by the claims of the present invention.
Claims (4)
1. A drug adverse event relation extraction method based on semantic segmentation, characterized by comprising the following steps:
S1, establishing a drug adverse event relation extraction model:
the drug adverse event relation extraction model is used for extracting drugs and the adverse events caused by them from a medical text, and the model structure comprises a local context information feature extractor, a semantic feature fusion device, a classifier and a sample imbalance processor; wherein,
the local context information feature extractor is used for extracting the local context features of different mentions from the input medical text, specifically as follows: given a drug adverse event document containing N text tokens, fixed markers <s> and </s> are first inserted at the beginning and the end of each drug mention to mark the position of the drug mention; the corresponding candidate adverse event mentions are then spliced behind the text in the form of suspension markers <o> and </o>, where <o> and </o> share the same position encoding as the corresponding adverse event mention; the combined sequence of text tokens and inserted suspension markers is then fed to the BERT pre-training model to obtain the local context representation $e_s$ of the drug mention markers and the local context representation $e_o$ of the adverse event mention markers; $e_s$ and $e_o$ are spliced together as the embedded representation of the corresponding drug mention and adverse event mention pair, where M denotes the maximum number of mention pairs formed by drug mentions and adverse event mentions in the sample; finally, the attention representation is obtained from the BERT pre-training model, where A is the average of the attention heads in the last Encoder layer of the BERT pre-training model, and the attention matrix A from the BERT pre-training model and an affine transformation are used to obtain the mention pair relation matrix of drugs and adverse events:
wherein the product operator is the Hadamard product, $W_1$ is a learnable parameter matrix, $H$ is the embedded representation of the drug mention and adverse event mention pairs, $A_s$ denotes the attention of the drug mention $e_s$ to all tokens of the document, obtained by averaging the attention heads of the last Encoder layer over the drug's mentions, $A_o$ denotes the attention of the adverse event mention $e_o$ to all tokens of the document, obtained by averaging the attention heads of the last Encoder layer over the adverse event's mentions, and $F(s,o)$ denotes the relation matrix entry of the drug and adverse event mention pair $(e_s, e_o)$;
the semantic feature fusion device is used for fusing the local context information with the global dependencies between mentions through an encoding module and a U-shaped semantic segmentation network, specifically as follows: firstly, the mention pair relation matrix $F \in \mathbb{R}^{M \times M \times D}$ containing local context information is treated as a D-channel image and combined with the encoding module, and rich global features are then obtained by the U-shaped semantic segmentation network, which comprises, arranged in sequence, a global feature extraction block, two upsampling blocks with skip connections and a feature output layer, thereby obtaining the local context and global dependency information matrix:
$Y = U(W_2 F)$
wherein $Y \in \mathbb{R}^{M \times M \times D'}$ denotes the local context and global dependency information matrix, $U$ denotes the U-shaped semantic segmentation network, whose output lies in $\mathbb{R}^{M \times M \times D'}$, $W_2$ is a learnable weight matrix that reduces the dimension of F, $D'$ is much smaller than $D$, and $W_2 F$ represents the encoding module;
the classifier is used for predicting adverse drug event relations from the local context and global dependency information matrix and the smooth embedded representations, specifically as follows: the local context embeddings $m$ of a mention at its different positions in the document are first aggregated with a smooth version of max pooling to obtain a single smooth embedded representation $E_i$ of that mention:
wherein $E_i$ denotes the smooth embedded representation of mention $e_i$, taken over all occurrences of the drug or adverse event mention $e_i$ in the document;
after the smooth embedded representations $E_s$ and $E_o$ of the drug and the adverse event and the local context and global dependency information matrix $Y$ have been obtained, the classifier uses a feedforward neural network to map $E_s$, $E_o$ and $Y$ to hidden representations $z$, and then obtains the relation probability through a bilinear function, as follows:
$z_s = \tanh(W_s E_s + Y_{s,o})$
$z_o = \tanh(W_o E_o + Y_{s,o})$
$P(r \mid E_s, E_o) = \sigma(z_s W_r z_o + b_r)$
wherein $z_s$ is the hidden representation of the drug, $z_o$ is the hidden representation of the adverse event, $P$ is the relation probability, $Y_{s,o}$ is the local context and global dependency information representation in matrix $Y$ of the drug and adverse event mention pair $(e_s, e_o)$, tanh is a nonlinear activation function, $\sigma$ is the sigmoid function applied to the output of the bilinear function, and $W_s$, $W_o$, $W_r$, $b_r$ are learnable parameters;
the sample imbalance processor is used for handling the class imbalance problem in the sample set by training with a balanced softmax method and introducing an additional class 0, so that the scores of the target classes are all greater than a threshold $t_0$ and the scores of the non-target classes are all less than the threshold $t_0$:
wherein $L$ denotes the target loss function, log denotes the natural logarithm (base e), e denotes the base of the natural logarithm, $t_i$ denotes the probability of the i-th positive label, $t_j$ denotes the probability of the j-th negative label, $\Omega_{pos}$ denotes the relations between a drug and its corresponding adverse event mentions, namely the positive labels, and $\Omega_{neg}$ denotes the relations between a drug and non-corresponding adverse event mentions, namely the negative labels;
S2, data preprocessing, in which mentions are unified specifically by the following method:
firstly, stop words are removed from the mentions in the medical text, then regular-expression matching is performed, and mentions whose matching degree is higher than 90% are classified as the same mention;
S3, model training and parameter optimization: the extraction model is trained with the processed data, and an objective function is designed to optimize the network parameters and generate the optimal extraction model, specifically comprising the following steps:
S31, the data set is divided into a training set, a validation set and a test set in a 7:2:1 ratio;
S32, a balanced softmax categorical cross entropy loss function is adopted as the optimization target, the objective function using the same formula as the target loss function L calculated in the sample imbalance processor in step S1;
S33, the objective function is optimized with a stochastic gradient descent algorithm, and the network model parameters are updated by error backpropagation;
S4, extracting adverse drug event relations:
S41, the medical text data to be extracted is preprocessed to obtain standardized sample data, and the category of relation pairs between a drug and a non-corresponding adverse event mention is defined as 0;
S42, a training sample is formed from a medical text and all the drug mentions and adverse event mentions it contains, the two fixed markers <s> and </s> are inserted directly before and after every drug mention, and the adverse event mentions are spliced behind the text in the form of suspension markers represented by <o> and </o>;
S43, the sample is fed into the BERT pre-training model, and for each pair of drug and adverse event mention markers, the local context representation of the drug mention markers and the local context representation of the adverse event mention markers are spliced together as the embedded representation of the corresponding drug mention and adverse event mention pair;
S44, after the embedded representations of all drug mention and adverse event mention pairs in the sample, which contain local context information, have been obtained, an affine transformation is applied to them together with the attention of the BERT pre-training model to obtain the mention pair relation matrix of drugs and adverse events;
S45, the mention pair relation matrix containing local context information is combined with the encoding module, and rich global features are obtained by the U-shaped semantic segmentation network, so as to output the local context and global dependency information matrix;
S46, the smooth embedded representations of the drug and the adverse event are obtained, a feedforward neural network maps them together with the local context and global dependency information matrix to hidden representations, and the relation probability, namely the relation score, is then obtained through a bilinear function;
S47, the scores of the positive sample relations and the negative sample relations are calculated with the balanced softmax method, such that the scores of the positive sample relations are greater than 0.
2. The drug adverse event relation extraction method based on semantic segmentation according to claim 1, wherein in the U-shaped semantic segmentation network used by the semantic feature fusion device in step S1, the global feature extraction block comprises three convolution modules and two max pooling layers, the first max pooling layer following the first convolution module and the second max pooling layer following the second convolution module, each convolution module comprises two convolution layers, and the number of channels doubles from one stage of the feature extraction block to the next; each of the two upsampling blocks comprises a deconvolution layer followed by two convolution layers, the first upsampling block following the third convolution module and the second upsampling block following the first upsampling block, and the number of channels is halved in each upsampling block; the output of the deconvolution layer in the second upsampling block is joined by a skip connection with the output of the second convolution layer in the first convolution module, and the output of the deconvolution layer in the first upsampling block is joined by a skip connection with the output of the second convolution layer in the second convolution module.
3. The method according to claim 2, wherein the two convolution layers in each convolution module each have a 3 × 3 kernel and a stride of 1, the two max pooling layers each have a 2 × 2 window and a stride of 2, the deconvolution layer in each of the two upsampling blocks has a 2 × 2 kernel and a stride of 2, the two convolution layers in each upsampling block each have a 3 × 3 kernel and a stride of 1, and the feature output layer has a 1 × 1 kernel and a stride of 1.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211040440.0A CN115392256A (en) | 2022-08-29 | 2022-08-29 | Drug adverse event relation extraction method based on semantic segmentation |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115392256A true CN115392256A (en) | 2022-11-25 |
Family
ID=84121722
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211040440.0A Pending CN115392256A (en) | 2022-08-29 | 2022-08-29 | Drug adverse event relation extraction method based on semantic segmentation |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115392256A (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116521888A (en) * | 2023-03-20 | 2023-08-01 | 麦博(上海)健康科技有限公司 | Method for extracting medical long document cross-sentence relation based on DocRE model |
CN116628577A (en) * | 2023-07-26 | 2023-08-22 | 安徽通灵仿生科技有限公司 | Adverse event detection method and device for ventricular assist device |
CN116628577B (en) * | 2023-07-26 | 2023-10-31 | 安徽通灵仿生科技有限公司 | Adverse event detection method and device for ventricular assist device |
CN117095782A (en) * | 2023-10-20 | 2023-11-21 | 上海森亿医疗科技有限公司 | Medical text quick input method, system, terminal and editor |
CN117095782B (en) * | 2023-10-20 | 2024-02-06 | 上海森亿医疗科技有限公司 | Medical text quick input method, system, terminal and editor |
CN117744657A (en) * | 2023-12-26 | 2024-03-22 | 广东外语外贸大学 | Medicine adverse event detection method and system based on neural network model |
CN118428471A (en) * | 2024-07-02 | 2024-08-02 | 湖南董因信息技术有限公司 | Atlas relation extraction method based on pre-training model enhancement |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||