CN115392256A - Drug adverse event relation extraction method based on semantic segmentation - Google Patents

Drug adverse event relation extraction method based on semantic segmentation

Info

Publication number
CN115392256A
Authority
CN
China
Prior art keywords
drug
mention
adverse
adverse event
local context
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211040440.0A
Other languages
Chinese (zh)
Inventor
崔少国
陈俊桦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing Normal University
Original Assignee
Chongqing Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing Normal University filed Critical Chongqing Normal University
Priority to CN202211040440.0A priority Critical patent/CN115392256A/en
Publication of CN115392256A publication Critical patent/CN115392256A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • G06F16/353Clustering; Classification into predefined classes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Databases & Information Systems (AREA)
  • Machine Translation (AREA)

Abstract

The invention provides a method for extracting adverse drug event relations based on semantic segmentation. The method comprises: establishing a drug adverse event relation extraction model that has a local context information feature extractor, a semantic feature fusion device, a classifier, and a sample imbalance processor; preprocessing the data; training the model and optimizing its parameters; and extracting the adverse drug event relations. By inserting special symbols before and after each drug mention and splicing the adverse event mention marks behind the text as suspension marks, the method identifies mention boundaries more reliably. A U-shaped semantic segmentation network is introduced to fuse local context information and capture the global interdependencies among adverse drug events, so that key information is located more accurately. In addition, a balanced softmax method handles the unbalanced relation distribution, which keeps irrelevant triples from influencing the model, so that adverse drug event relations in medical text are extracted more accurately.

Description

Drug adverse event relation extraction method based on semantic segmentation
Technical Field
The invention relates to the technical field of medical text data mining, and in particular to a method for extracting adverse drug event relations based on semantic segmentation.
Background
An adverse drug event (ADE) is an adverse clinical event that occurs during drug therapy and is not necessarily caused by the drug. Adverse drug events have two main causes: problems with drug quality and medication errors. They seriously endanger patients' health and cause huge economic losses to the medical system and to society as a whole. According to statistics, the emergency rate of adverse medical events accounts for 28% of the total rate of adverse medical events; because such events are both important and harmful, they have drawn the attention of researchers in fields such as the life sciences, biology, and general medicine. In addition, although the ultimate goal of drug discovery is to develop chemicals that treat specific diseases, identifying the correspondence between a chemical and the adverse drug reactions it causes is critical to improving chemical safety and toxicity studies and to facilitating new screening methods for drug compounds.
After long-term exploration by researchers, text-mining techniques for studying adverse drug events have gradually developed from early template- and rule-based methods to data-driven methods based on traditional machine learning, with major breakthroughs in both theory and practice. With the rise of deep learning, neural-network-based frameworks have also provided new ideas for text mining. Because a neural network model can automatically learn the internal features of raw data through large-scale training, it has achieved breakthrough progress in speech and image recognition and has shown great potential in natural language processing. Text mining based on deep learning is therefore expected to become a trend in future research, and applying it to the study of adverse drug events is of great value in promoting related biomedical research.
Through research, the inventors of the present application have found that, because of the specific definition of adverse drug events, identifying all drug and adverse event mentions in natural language text and extracting the relations between drugs and their corresponding adverse events faces the following problems:
(1) As drug development in the biomedical field accelerates, the limited conditions of pre-market clinical trials make the adverse events of many drugs difficult to discover and to list in adverse event reports. In addition, some adverse drug events occur only some time after the drug is taken, or only in specific populations, so many potential adverse events are not covered by existing dictionaries or databases, and such potential adverse drug event mentions are difficult to find with dictionary and rule methods alone.
(2) The same condition mention may be an adverse drug event in one context and an indication in another, so identifying adverse drug event mentions depends heavily on understanding contextual semantic relations in order to distinguish specific adverse drug events.
(3) There is no uniform naming convention for adverse drug events, and the same condition may be expressed in multiple ways. This makes mention names sparse, difficult to learn fully from a limited labeled corpus, and difficult to identify.
(4) In some natural language texts, adverse drug events are expressed in non-medical terms, which are often combined with neighboring common words or adjectives to form an adverse drug event mention. The boundaries of such mentions are therefore difficult to judge, which leads to inaccurate recognition.
Disclosure of Invention
To address the technical problems in existing methods for extracting drugs and their corresponding adverse event relations, the invention provides a method for extracting adverse drug event relations based on semantic segmentation. The method inserts special symbols before and after each drug mention and splices the adverse event mention marks behind the text as suspension marks, so as to better identify the boundaries of drug and adverse event mentions. A U-shaped semantic segmentation network is introduced to fuse local context information and capture the global interdependencies among adverse drug events, so that key information is located more accurately. In addition, a balanced softmax method handles the unbalanced relation distribution, which keeps irrelevant triples from influencing the model, so that adverse drug event relations in medical text are extracted more accurately.
To solve the above technical problems, the invention adopts the following technical solution:
A method for extracting adverse drug event relations based on semantic segmentation comprises the following steps:
S1, establishing a drug adverse event relation extraction model:
The drug adverse event relation extraction model is used to extract the drugs in a medical text and the adverse events they cause. The model structure comprises a local context information feature extractor, a semantic feature fusion device, a classifier, and a sample imbalance processor, wherein:
The local context information feature extractor is used to extract the local context features of the different mentions from the input medical text, specifically as follows. Given a drug adverse event document containing N text labels
[Formula image BDA0003820147590000031]
fixed marks <s> and </s> are first inserted at the beginning and the end of the drug mention to mark its location. Each corresponding candidate adverse event mention is then spliced behind the text as a pair of suspension marks <o> and </o>, where <o> and </o> share the position encoding of the corresponding adverse event mention. The combined sequence of text labels and inserted suspension marks is then fed to the pre-trained BERT model to obtain the local context representation $e_s$ of the drug mention marks and the local context representation $e_o$ of the adverse event mention marks, and $e_s$ and $e_o$ are concatenated as the embedded representation of the corresponding drug mention and adverse event mention pair
[Formula image BDA0003820147590000032]
where $M$ denotes the maximum number of mention pairs formed by drug mentions and adverse event mentions in the sample. Finally, an attention representation is obtained from the pre-trained BERT model
[Formula image BDA0003820147590000033]
where $A$ is the average of the attention heads in the last Encoder layer of the pre-trained BERT model. The attention matrix $A$ from the pre-trained BERT model and an affine transformation are used to obtain the mention-pair relation matrix of drugs and adverse events:
[Formula image BDA0003820147590000034]
where $\odot$ is the Hadamard product, $W_1$ is a learnable parameter matrix, $H$ is the embedded representation of the drug mention and adverse event mention pairs, $A_s$ denotes the attention of the drug mention $e_s$ to all labels of the document, obtained by averaging the attention heads of the drug mention in the last Encoder layer, $A_o$ denotes the attention of the adverse event mention $e_o$ to all labels of the document, obtained by averaging the attention heads of the adverse event mention in the last Encoder layer, and $F(s, o)$ denotes the relation matrix of the drug and adverse event mention pair $(e_s, e_o)$;
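The marking scheme described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the patent's implementation: the helper name `build_marked_sequence`, the example sentence, and the convention that `<o>` and `</o>` reuse the position ids of the first and last label of the adverse event mention are assumptions made here to illustrate "sharing the position encoding".

```python
from typing import Dict, List, Tuple


def build_marked_sequence(
    tokens: List[str],
    drug_span: Tuple[int, int],
    event_spans: List[Tuple[int, int]],
) -> Tuple[List[str], List[int]]:
    """Insert <s>...</s> around the drug mention and append <o>, </o> per candidate event.

    Spans are inclusive (start, end) indices into `tokens`.
    Returns the marked sequence and one position id per element.
    """
    d_start, d_end = drug_span
    marked: List[str] = []
    new_index: Dict[int, int] = {}  # original index -> index in the marked text
    for i, tok in enumerate(tokens):
        if i == d_start:
            marked.append("<s>")
        new_index[i] = len(marked)
        marked.append(tok)
        if i == d_end:
            marked.append("</s>")
    position_ids = list(range(len(marked)))
    # Assumption: suspension marks copy the position ids of the mention boundaries.
    for e_start, e_end in event_spans:
        marked += ["<o>", "</o>"]
        position_ids += [new_index[e_start], new_index[e_end]]
    return marked, position_ids


tokens = ["aspirin", "caused", "severe", "stomach", "bleeding"]
marked, position_ids = build_marked_sequence(tokens, (0, 0), [(3, 4)])
print(marked)
print(position_ids)
```

Because the suspension marks are appended rather than inserted, several candidate adverse events can be attached to the same text without shifting the positions of the original labels.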
The semantic feature fusion device is used to fuse local context information with the global dependencies between mentions through an encoding module and a U-shaped semantic segmentation network, specifically as follows. The mention-pair relation matrix $F \in \mathbb{R}^{M \times M \times D}$, which contains local context information, is first treated as a D-channel image and combined with the encoding module. Rich global features are then obtained with the U-shaped semantic segmentation network, which comprises a global feature extraction block, two upsampling blocks with skip connections, and a feature output layer arranged in sequence. This yields the local context and global dependency information matrix:
$$Y = U(W_2 F)$$
where $Y \in \mathbb{R}^{M \times M \times D'}$ denotes the local context and global dependency information matrix, $U$ denotes the U-shaped semantic segmentation network with output in $\mathbb{R}^{M \times M \times D'}$, $W_2$ is a learnable weight matrix that reduces the dimension of $F$, $D'$ is much smaller than $D$, and $W_2 F$ represents the encoding module;
The classifier is used to predict adverse drug event relations from the local context and global dependency information matrix and the smooth embedded representations, specifically as follows. Using the local context embeddings $m$ of a mention at its different positions in the document,
[Formula image BDA0003820147590000041]
the smooth embedded representation $E_i$ of the same mention is obtained with a smooth version of max pooling:
[Formula image BDA0003820147590000042]
where $E_i$ denotes the smooth embedded representation of mention $e_i$, and
[Formula image BDA0003820147590000043]
denotes the total number of occurrences of the drug or adverse event mention $e_i$ in the document;
After the smooth embedded representations $E_s$ and $E_o$ of the drug and the adverse event and the local context and global dependency information matrix $Y$ have been obtained, the classifier first uses a feedforward neural network to map $E_s$, $E_o$, and $Y$ to hidden representations $z$, and then obtains the relation probability through a bilinear function, as follows:
$$z_s = \tanh(W_s E_s + Y_{s,o})$$
$$z_o = \tanh(W_o E_o + Y_{s,o})$$
$$P(r \mid E_s, E_o) = \sigma(z_s W_r z_o + b_r)$$
where $z_s$ is the hidden representation of the drug, $z_o$ is the hidden representation of the adverse event, $P$ is the relation probability, $Y_{s,o}$ is the local context and global dependency information representation of the drug and adverse event mention pair $(e_s, e_o)$ in the matrix $Y$, $\tanh$ is a nonlinear activation function, $\sigma$ is the function applied to the bilinear form $z_s W_r z_o + b_r$ to yield the relation probability, and $W_s$, $W_o$, $W_r$, $b_r$ are learnable parameters;
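The classifier equations above can be traced with a small plain-Python sketch. Two points are assumptions rather than statements from the text: the "smooth version of max pooling" is taken to be element-wise log-sum-exp pooling (the original formula is only available as an image), and $\sigma$ is taken to be the logistic sigmoid. All function names and toy values are hypothetical.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def logsumexp_pool(embeddings: List[Vector]) -> Vector:
    """Element-wise log-sum-exp over all occurrences of one mention (assumed smooth max)."""
    pooled = []
    for dim_values in zip(*embeddings):
        peak = max(dim_values)  # subtract the maximum for numerical stability
        pooled.append(peak + math.log(sum(math.exp(v - peak) for v in dim_values)))
    return pooled


def matvec(W: Matrix, x: Vector) -> Vector:
    """Plain matrix-vector product."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def hidden(W: Matrix, E: Vector, Y_so: Vector) -> Vector:
    """z = tanh(W E + Y_{s,o})."""
    return [math.tanh(a + b) for a, b in zip(matvec(W, E), Y_so)]


def relation_probability(z_s: Vector, W_r: Matrix, z_o: Vector, b_r: float) -> float:
    """P = sigma(z_s W_r z_o + b_r), with sigma assumed to be the logistic sigmoid."""
    score = sum(a * b for a, b in zip(z_s, matvec(W_r, z_o))) + b_r
    return 1.0 / (1.0 + math.exp(-score))


# Toy values: a drug mention occurring twice, an adverse event mention occurring once.
E_s = logsumexp_pool([[0.0, 1.0], [0.0, -1.0]])
E_o = logsumexp_pool([[0.5, 0.5]])
identity = [[1.0, 0.0], [0.0, 1.0]]
Y_so = [0.1, -0.2]
z_s = hidden(identity, E_s, Y_so)
z_o = hidden(identity, E_o, Y_so)
p = relation_probability(z_s, identity, z_o, b_r=0.0)
print(round(p, 4))
```

Log-sum-exp never falls below the true maximum and approaches it when one occurrence dominates, which is why it is a common smooth stand-in for max pooling.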
The sample imbalance processor handles the class imbalance problem in the sample set by training with a balanced softmax method and introducing an additional class 0. The scores of the target classes are expected to be larger than a threshold $t_0$, and the scores of the non-target classes are expected to be smaller than $t_0$:
[Formula image BDA0003820147590000051]
where $L$ denotes the target loss function, $\log$ denotes the logarithm to base $e$, $e$ denotes the base of the natural logarithm, $t_i$ denotes the probability of the i-th positive label, $t_j$ denotes the probability of the j-th negative label, $\Omega_{pos}$ denotes the relations between a drug and its corresponding adverse event mentions (the positive labels), and $\Omega_{neg}$ denotes the relations between a drug and non-corresponding adverse event mentions (the negative labels);
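Since the loss formula itself is only available as an image, the following sketch assumes a commonly used threshold form of the balanced softmax loss, $L = \log\big(e^{-t_0} + \sum_{i \in \Omega_{pos}} e^{-t_i}\big) + \log\big(e^{t_0} + \sum_{j \in \Omega_{neg}} e^{t_j}\big)$. This form is an assumption chosen because it matches the stated goal (positive scores above $t_0$, negative scores below $t_0$); it may differ from the formula in the original publication. The function name is hypothetical.

```python
import math
from typing import Sequence


def balanced_softmax_loss(
    pos_scores: Sequence[float],
    neg_scores: Sequence[float],
    t0: float = 0.0,
) -> float:
    """Assumed threshold form: pushes positive scores above t0 and negative scores below t0."""
    pos_term = math.log(math.exp(-t0) + sum(math.exp(-t) for t in pos_scores))
    neg_term = math.log(math.exp(t0) + sum(math.exp(t) for t in neg_scores))
    return pos_term + neg_term


# Scores on the correct side of the threshold give a small loss ...
well_separated = balanced_softmax_loss([5.0], [-5.0, -4.0])
# ... and scores on the wrong side give a large one.
confused = balanced_softmax_loss([-5.0], [5.0, 4.0])
print(well_separated, confused)
```

Each label contributes one exponential term regardless of how many negative pairs exist, so a large number of class 0 pairs does not force the sums to be rebalanced by resampling.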
S2, data preprocessing, in which mentions are unified as follows:
Stop words are first removed from the mentions in the medical text, regular-expression matching is then performed, and mentions whose matching degree is higher than 90% are classified as the same mention;
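Step S2 can be illustrated with the standard library. The text does not say how the matching degree is computed, so this sketch substitutes `difflib.SequenceMatcher.ratio()` as a stand-in similarity measure; the stop-word list, the function names, and the example mentions are likewise hypothetical.

```python
from difflib import SequenceMatcher
from typing import Dict, List

STOP_WORDS = {"the", "a", "an", "of", "in"}  # hypothetical stop-word list


def normalize(mention: str) -> str:
    """Lower-case the mention and drop stop words."""
    return " ".join(w for w in mention.lower().split() if w not in STOP_WORDS)


def unify_mentions(mentions: List[str], threshold: float = 0.9) -> Dict[str, str]:
    """Map each mention to a canonical form; matches above `threshold` share one form."""
    canonical: List[str] = []
    mapping: Dict[str, str] = {}
    for mention in mentions:
        norm = normalize(mention)
        # Assumption: SequenceMatcher.ratio() plays the role of the "matching degree".
        match = next(
            (c for c in canonical if SequenceMatcher(None, norm, c).ratio() > threshold),
            None,
        )
        if match is None:
            canonical.append(norm)
            match = norm
        mapping[mention] = match
    return mapping


mapping = unify_mentions(["the stomach bleeding", "Stomach bleeding", "headache"])
print(mapping)
```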
S3, model training and parameter optimization: the extraction model is trained with the processed data, and an objective function is designed to optimize the network parameters and generate the optimal extraction model, specifically comprising:
S31, dividing the data set into a training set, a validation set, and a test set in the ratio 7:2:1;
S32, adopting the balanced softmax categorical cross-entropy loss function as the optimization objective, the objective function being computed with the same formula as the target loss function L of the sample imbalance processor in step S1;
S33, optimizing the objective function with a stochastic gradient descent algorithm and updating the network model parameters by error backpropagation;
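Steps S31 and S33 can be sketched as follows. The shuffling seed, the function names, and the learning rate are illustrative assumptions; the text specifies only the 7:2:1 ratio and the use of stochastic gradient descent.

```python
import random
from typing import List, Sequence, Tuple


def split_7_2_1(samples: Sequence[int], seed: int = 0) -> Tuple[List[int], List[int], List[int]]:
    """Shuffle the samples and split them into training, validation, and test sets (7:2:1)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * 7 // 10
    n_val = len(shuffled) * 2 // 10
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],
    )


def sgd_step(params: List[float], grads: List[float], lr: float = 0.01) -> List[float]:
    """One gradient descent update: theta <- theta - lr * gradient."""
    return [p - lr * g for p, g in zip(params, grads)]


train_set, val_set, test_set = split_7_2_1(range(100))
updated = sgd_step([1.0, -2.0], [0.5, -0.5], lr=0.1)
print(len(train_set), len(val_set), len(test_set), updated)
```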
S4, extracting adverse drug event relations:
S41, preprocessing the medical text data to be extracted to obtain standardized sample data, and defining the category of the relation pairs between a drug and non-corresponding adverse event mentions as 0;
S42, forming a training sample from a medical sample and all the drug mentions and adverse event mentions it contains, inserting the two fixed marks <s> and </s> directly before and after every drug mention, and splicing the adverse event mentions behind the text as suspension marks represented by <o> and </o>;
S43, feeding the sample into the pre-trained BERT model and, for each pair of drug and adverse event mention marks, concatenating the local context representation of the drug mention marks and the local context representation of the adverse event mention marks as the embedded representation of the corresponding drug mention and adverse event mention pair;
S44, after the embedded representations of all drug mention and adverse event mention pairs of the sample, which contain local context information, have been obtained, applying an affine transformation to the embedded representations and the attention layer of the pre-trained BERT model to obtain the mention-pair relation matrix of drugs and adverse events;
S45, combining the mention-pair relation matrix containing local context information with the encoding module, and obtaining rich global features with the U-shaped semantic segmentation network so as to output the local context and global dependency information matrix;
S46, obtaining the smooth embedded representations of the drug and the adverse event, mapping these representations and the local context and global dependency information matrix to hidden representations with a feedforward neural network, and then obtaining the relation probability, that is, the relation score, through a bilinear function;
S47, calculating the scores of the positive and negative sample relations with the balanced softmax method, so that the scores of the positive sample relations are larger than 0.
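The final decision in S41 and S47 can be read as a threshold test on the relation score. The sketch below assumes that a pair whose score is not larger than 0 falls into category 0 (no relation); the mention names and scores are invented for illustration.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]  # (drug mention, adverse event mention)


def extract_relations(scores: Dict[Pair, float], threshold: float = 0.0) -> List[Pair]:
    """Keep the pairs whose relation score exceeds the threshold; the rest are category 0."""
    return [pair for pair, score in scores.items() if score > threshold]


# Made-up scores for two candidate pairs of one document.
scores = {
    ("aspirin", "stomach bleeding"): 2.3,
    ("aspirin", "headache"): -1.7,
}
predicted = extract_relations(scores)
print(predicted)
```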
Further, in the U-shaped semantic segmentation network used by the semantic feature fusion device in step S1, the global feature extraction block includes three convolution modules and two max-pooling layers. The first max-pooling layer follows the first convolution module, and the second max-pooling layer follows the second convolution module. Each convolution module includes two convolution layers, and the number of channels doubles from one stage of the feature extraction block to the next. Each of the two upsampling blocks includes a deconvolution layer followed by two convolution layers; the first upsampling block follows the third convolution module, the second upsampling block follows the first upsampling block, and the number of channels is halved in each upsampling block. The output of the deconvolution layer in the second upsampling block is joined by a skip connection to the output of the second convolution layer in the first convolution module, and the output of the deconvolution layer in the first upsampling block is joined by a skip connection to the output of the second convolution layer in the second convolution module.
Further, the two convolution layers in each convolution module have a 3 × 3 kernel and a stride of 1, the two max-pooling layers have a 2 × 2 kernel and a stride of 2, the deconvolution layer in each upsampling block has a 2 × 2 kernel and a stride of 2, the two convolution layers in each upsampling block have a 3 × 3 kernel and a stride of 1, and the feature output layer has a 1 × 1 kernel and a stride of 1.
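The kernel sizes and strides above determine how the spatial size M of the relation matrix changes through the network. The sketch below traces that arithmetic. The padding of 1 for the 3 × 3 convolutions is an assumption (the text does not state any padding), and the function names are hypothetical.

```python
from typing import Dict


def conv_out(size: int, kernel: int, stride: int, padding: int) -> int:
    """Output size of a convolution or pooling layer along one spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1


def deconv_out(size: int, kernel: int, stride: int) -> int:
    """Output size of a transposed convolution without padding."""
    return (size - 1) * stride + kernel


def trace_sizes(m: int) -> Dict[str, int]:
    """Spatial size after each stage for an m x m input."""
    sizes: Dict[str, int] = {}
    size = m
    for idx in (1, 2, 3):
        for _ in range(2):
            size = conv_out(size, 3, 1, 1)  # assumed padding of 1 keeps the size
        sizes[f"conv module {idx}"] = size
        if idx < 3:
            size = conv_out(size, 2, 2, 0)
            sizes[f"max pool {idx}"] = size
    for idx in (1, 2):
        size = deconv_out(size, 2, 2)
        sizes[f"deconv {idx}"] = size
        for _ in range(2):
            size = conv_out(size, 3, 1, 1)
        sizes[f"upsampling block {idx}"] = size
    sizes["feature output layer"] = conv_out(size, 1, 1, 0)
    return sizes


def skips_align(m: int) -> bool:
    """True when both skip connections join feature maps of equal spatial size."""
    s = trace_sizes(m)
    return s["deconv 1"] == s["conv module 2"] and s["deconv 2"] == s["conv module 1"]


print(trace_sizes(8))
print(skips_align(8), skips_align(6))
```

Under these assumptions the skip connections line up when M is a multiple of 4, so a practical implementation would likely pad the relation matrix to such a size.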
Further, in the sample imbalance processor of step S1, the threshold $t_0$ is set to 0, and the formula for calculating the target loss function $L$ simplifies to:
[Formula image BDA0003820147590000071]
Compared with the prior art, the method for extracting adverse drug event relations based on semantic segmentation provided by the invention has the following beneficial effects:
1. When the embedded representations are obtained, the suspension mark representation is used, so the embedded representations of different mentions are distinguished more effectively and prediction accuracy can be significantly improved.
2. The U-shaped semantic segmentation network captures the global interdependencies among triples (drug, adverse event, drug adverse event relation), so the extraction model copes better with the problem that key information cannot be found when the mentions of a pair are too far apart.
3. An encoding module is introduced to capture the local context information of mentions and fuse it with the global interdependencies, so the extraction model understands global semantics more fully and distinguishes specific adverse drug events better.
4. A balanced softmax method handles the unbalanced relation distribution and avoids "undersampling" in the extraction model, which improves the accuracy of relation classification.
Drawings
Fig. 1 is a schematic flow chart of a drug adverse event relationship extraction system provided by the present invention.
Fig. 2 is a schematic diagram of a span of a suspension mark provided by the present invention.
FIG. 3 is a schematic diagram of a mention-pair relationship matrix provided by the present invention.
Fig. 4 is a schematic diagram of a drug adverse event relationship extraction network provided by the present invention.
FIG. 5 is a schematic diagram of a U-shaped semantic segmentation network structure provided by the present invention.
Detailed Description
To make the technical means, inventive features, objectives, and effects of the invention easy to understand, the invention is further explained below with reference to the drawings.
Referring to FIGS. 1 to 5, the invention provides a method for extracting adverse drug event relations based on semantic segmentation, which comprises the following steps:
S1, establishing a drug adverse event relation extraction model:
The drug adverse event relation extraction model is used to extract the drugs in a medical text and the adverse events they cause. The model structure comprises a local context information feature extractor, a semantic feature fusion device, a classifier, and a sample imbalance processor, wherein:
The local context information feature extractor is used to extract the local context features of the different mentions from the input medical text, specifically as follows. Thanks to the parallelism of the suspension marks, a series of related mentions can be flexibly packed into one training instance. Given a document containing N text labels (the labels indicate which characters of the text are ordinary, undefined characters and which belong to a drug or adverse event mention)
[Formula image BDA0003820147590000081]
fixed marks <s> and </s> are first inserted at the beginning and the end of the drug mention, as shown in FIG. 2, to mark its location. Each corresponding candidate adverse event mention is then spliced behind the text as a pair of suspension marks <o> and </o>; that is, the <o> and </o> marks of the corresponding adverse event mention are placed behind the text, and <o> and </o> use the same position encoding as the corresponding adverse event mention. The combined sequence of text labels and inserted suspension marks is then fed to the pre-trained BERT model to obtain the local context representation $e_s$ of the drug mention marks and the local context representation $e_o$ of the adverse event mention marks, and $e_s$ and $e_o$ are concatenated as the embedded representation of the corresponding drug mention and adverse event mention pair
[Formula image BDA0003820147590000091]
where $M$ denotes the maximum number of mention pairs formed by drug mentions and adverse event mentions in the sample. Finally, an attention representation is obtained from the pre-trained BERT model
[Formula image BDA0003820147590000092]
where $A$ is the average of the attention heads in the last Encoder layer of the pre-trained BERT model. The attention matrix $A$ from the pre-trained BERT model and an affine transformation are used to obtain the mention-pair relation matrix of drugs and adverse events:
[Formula image BDA0003820147590000093]
where $\odot$ is the Hadamard product, $W_1$ is a learnable parameter matrix, $H$ is the embedded representation of the drug mention and adverse event mention pairs, $A_s$ denotes the attention of the drug mention $e_s$ to all labels of the document, obtained by averaging the attention heads of the drug mention in the last Encoder layer, $A_o$ denotes the attention of the adverse event mention $e_o$ to all labels of the document, obtained by averaging the attention heads of the adverse event mention in the last Encoder layer, and $F(s, o)$ denotes the relation matrix of the drug and adverse event mention pair $(e_s, e_o)$, as shown in FIG. 2;
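The attention quantities named above can be illustrated with toy numbers. The sketch covers only what the text states directly, namely averaging the attention heads to obtain $A_s$ and $A_o$ and combining them with a Hadamard product. The final normalization to a distribution is an added assumption, and how the result enters the relation matrix formula (available only as an image) is not reproduced here. All values are invented.

```python
from typing import List

Vector = List[float]


def average_heads(head_attention: List[Vector]) -> Vector:
    """Average the per-head attention of one mention over all heads."""
    n_heads = len(head_attention)
    return [sum(column) / n_heads for column in zip(*head_attention)]


def hadamard(a: Vector, b: Vector) -> Vector:
    """Element-wise (Hadamard) product of two vectors."""
    return [x * y for x, y in zip(a, b)]


def normalize(weights: Vector) -> Vector:
    """Rescale non-negative weights to sum to 1 (assumption, not stated in the text)."""
    total = sum(weights)
    return [w / total for w in weights] if total > 0 else list(weights)


# Toy attention of two heads over a four-label document.
drug_heads = [[0.1, 0.5, 0.3, 0.1], [0.3, 0.3, 0.3, 0.1]]
event_heads = [[0.0, 0.6, 0.2, 0.2], [0.2, 0.4, 0.2, 0.2]]

A_s = average_heads(drug_heads)
A_o = average_heads(event_heads)
joint = normalize(hadamard(A_s, A_o))
print(joint)
```

The product is large only for labels that both mentions attend to, so it highlights the shared local context of the pair.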
The semantic feature fusion device is used to fuse local context information with the global dependencies between mentions through the encoding module and the U-shaped semantic segmentation network (see Table 1 below), specifically as follows. The mention-pair relation matrix $F \in \mathbb{R}^{M \times M \times D}$, which contains local context information, is first treated as a D-channel image; that is, document-level relation prediction is formulated as predicting a pixel-level mask in $F$. Rich global features are then obtained with the U-shaped semantic segmentation network, which comprises a global feature extraction block (serial numbers 1-8), two upsampling blocks with skip connections (serial numbers 9-14), and a feature output layer (serial number 15) arranged in sequence.
As a specific implementation, the global feature extraction block includes three convolution modules and two max-pooling layers. The first max-pooling layer follows the first convolution module, and the second max-pooling layer follows the second convolution module. Each convolution module includes two convolution layers with a 3 × 3 kernel and a stride of 1, and the two max-pooling layers have a 2 × 2 kernel and a stride of 2. The number of channels doubles through the feature extraction block: 64 for the first convolution module and the first max-pooling layer, 128 for the second convolution module and the second max-pooling layer, and 256 for the third convolution module. A segmented region in the mention-pair relation matrix refers to the occurrence of relations between mention pairs, and the U-shaped semantic segmentation network promotes information exchange between mention pairs within the receptive field, similar to implicit reasoning. Specifically, the feature extraction block enlarges the receptive field of the current mention-pair embedding $F$, thereby providing rich global information for representation learning.
Each of the two upsampling blocks includes a deconvolution layer followed by two convolution layers. The first upsampling block follows the third convolution module, and the second upsampling block follows the first upsampling block. The deconvolution layer in each upsampling block has a 2 × 2 kernel and a stride of 2, and the two convolution layers have a 3 × 3 kernel and a stride of 1. The number of channels is halved in each upsampling block, which distributes the aggregated information to each pixel: the first upsampling block has 128 channels and the second has 64. The output of the deconvolution layer (serial number 12) in the second upsampling block is joined by a skip connection to the output of the second convolution layer (serial number 2) in the first convolution module, and the output of the deconvolution layer (serial number 9) in the first upsampling block is joined by a skip connection to the output of the second convolution layer (serial number 5) in the second convolution module. Taking the last upsampling block as an example, its features come not only from the output of the first convolution module (same-scale features) but also from the output of the first upsampling block (large-scale features), so multi-scale features are effectively fused.
The feature output layer has a 1 × 1 kernel, a stride of 1, and 1 channel. The specific parameters of the U-shaped semantic segmentation network model are shown in Table 1 below.
Table 1. Hyperparameters of the global dependency network model architecture
[Table images BDA0003820147590000101 and BDA0003820147590000111]
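Because Table 1 survives only as images, the layer list below is rebuilt from the prose description (serial numbers, kernel sizes, strides, and channel counts). It is a reconstruction under that assumption, not a transcription of the table, and any columns of the original table that the prose does not mention (for example input and output sizes) are omitted. The `Layer` type and `build_layers` helper are hypothetical.

```python
from typing import List, NamedTuple


class Layer(NamedTuple):
    serial: int
    block: str
    kind: str
    kernel: int
    stride: int
    channels: int


def build_layers() -> List[Layer]:
    """Rebuild the 15-layer list from the prose description of the network."""
    layers: List[Layer] = []

    def add(block: str, kind: str, kernel: int, stride: int, channels: int) -> None:
        layers.append(Layer(len(layers) + 1, block, kind, kernel, stride, channels))

    # Global feature extraction block (serial numbers 1-8).
    for idx, channels in enumerate((64, 128, 256), start=1):
        add(f"conv module {idx}", "conv", 3, 1, channels)
        add(f"conv module {idx}", "conv", 3, 1, channels)
        if idx < 3:
            add(f"max pool {idx}", "maxpool", 2, 2, channels)
    # Two upsampling blocks (serial numbers 9-14).
    for idx, channels in enumerate((128, 64), start=1):
        add(f"upsampling block {idx}", "deconv", 2, 2, channels)
        add(f"upsampling block {idx}", "conv", 3, 1, channels)
        add(f"upsampling block {idx}", "conv", 3, 1, channels)
    # Feature output layer (serial number 15).
    add("feature output layer", "conv", 1, 1, 1)
    return layers


LAYERS = build_layers()
for layer in LAYERS:
    print(f"{layer.serial:>2}  {layer.block:<22} {layer.kind:<8} "
          f"{layer.kernel}x{layer.kernel}  stride {layer.stride}  {layer.channels} ch")
```

The serial numbers produced this way agree with the ones quoted in the description: layers 2 and 5 are the second convolution layers of the first two convolution modules, layers 9 and 12 are the deconvolution layers, and layer 15 is the output layer.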
The encoding module and the U-shaped semantic segmentation network are then combined to obtain the local context and global dependency information matrix $Y$:
$$Y = U(W_2 F)$$
where $Y \in \mathbb{R}^{M \times M \times D'}$ denotes the local context and global dependency information matrix, $U$ denotes the U-shaped semantic segmentation network with output in $\mathbb{R}^{M \times M \times D'}$, $W_2$ is a learnable weight matrix that reduces the dimension of $F$, $D'$ is much smaller than $D$, and $W_2 F$ represents the encoding module;
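One way to read the encoding module $W_2 F$ is as the same linear map applied to the D-dimensional feature of every pixel of $F$, which reduces the channel count from D to D'. That reading is an assumption; the sketch below uses tiny made-up dimensions and a hypothetical helper `reduce_channels` to show the shape change.

```python
from typing import List

Tensor3 = List[List[List[float]]]  # M x M x D
Matrix = List[List[float]]         # D' x D


def reduce_channels(F: Tensor3, W2: Matrix) -> Tensor3:
    """Apply the linear map W2 to the feature vector of every pixel of F."""
    return [
        [[sum(w * f for w, f in zip(row, pixel)) for row in W2] for pixel in line]
        for line in F
    ]


M, D = 2, 4
# Made-up relation matrix: the feature of pixel (i, j) is [i+j, i+j+1, i+j+2, i+j+3].
F = [[[float(i + j + k) for k in range(D)] for j in range(M)] for i in range(M)]
# Made-up projection with D' = 2 that keeps the first and last channel.
W2 = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0]]

reduced = reduce_channels(F, W2)
print(reduced)
```

Reducing the channel count before the segmentation network keeps the convolutions cheap, which is consistent with the remark that D' is much smaller than D.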
The classifier predicts adverse drug event relations from the local context and global dependency information matrix and the smooth embedded representations, specifically as follows. Because the same mention may occur multiple times in a document, the local context embeddings $m$ of the mention at its different positions in the document are taken first,
Figure BDA0003820147590000112
and a smooth version of max pooling is then applied to obtain the smooth embedded representation $E_i$ of the mention:
Figure BDA0003820147590000113
where $E_i$ denotes the smooth embedded representation of mention $e_i$, and
Figure BDA0003820147590000114
denotes the total number of occurrences of the drug or adverse event mention $e_i$ in the document;
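The pooling formula itself survives only as an image. A common smooth version of max pooling is log-sum-exp, and the sketch below assumes that form; it should not be read as the exact formula of the patent. The function name and sample values are illustrative.

```python
import math


def smooth_max_pool(occurrence_embeddings):
    """Element-wise log-sum-exp over the embeddings of one mention's occurrences."""
    pooled = []
    for values in zip(*occurrence_embeddings):
        peak = max(values)  # subtract the maximum for numerical stability
        pooled.append(peak + math.log(sum(math.exp(v - peak) for v in values)))
    return pooled


# Two occurrences of the same mention, each with a 3-dimensional embedding.
m = [[0.2, -1.0, 3.0], [0.1, 2.0, 3.0]]
E_i = smooth_max_pool(m)
print(E_i)
```

Each pooled value lies between the plain maximum and the maximum plus the logarithm of the number of occurrences, which is what makes log-sum-exp a smooth stand-in for max pooling.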
After the smooth embedded representations $E_s$ and $E_o$ of the drug and the adverse event and the local context and global dependency information matrix $Y$ have been obtained, the classifier uses a feedforward neural network to map $E_s$, $E_o$, and $Y$ to hidden representations $z$, and then obtains the relation probability through a bilinear function, as follows:
$$z_s = \tanh(W_s E_s + Y_{s,o})$$
$$z_o = \tanh(W_o E_o + Y_{s,o})$$
$$P(r \mid E_s, E_o) = \sigma(z_s W_r z_o + b_r)$$
where $z_s$ is the hidden representation of the drug, $z_o$ is the hidden representation of the adverse event, $P$ is the relation probability, $Y_{s,o}$ is the local context and global dependency information representation of the drug and adverse event mention pair $(e_s, e_o)$ in matrix $Y$, $\tanh$ is a nonlinear activation function used for the nonlinear transformation, $\sigma$ converts the output of the bilinear function into the probability of the predicted result, and $W_s$, $W_o$, $W_r$, and $b_r$ are learnable parameters;
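The three classifier equations can be sketched in plain Python as follows. This is a minimal illustration, assuming that $\sigma$ is the logistic sigmoid and that $W_s E_s$ and $W_o E_o$ have the same dimension as $Y_{s,o}$; the helper names and all numeric values are hypothetical.

```python
import math


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def hidden(W, E, Y_so):
    """z = tanh(W E + Y_so), applied element-wise."""
    return [math.tanh(a + b) for a, b in zip(matvec(W, E), Y_so)]


def relation_probability(E_s, E_o, Y_so, W_s, W_o, W_r, b_r):
    z_s = hidden(W_s, E_s, Y_so)
    z_o = hidden(W_o, E_o, Y_so)
    # Bilinear score z_s^T W_r z_o + b_r for a single relation r.
    score = sum(a * b for a, b in zip(z_s, matvec(W_r, z_o))) + b_r
    return 1.0 / (1.0 + math.exp(-score))  # sigma assumed to be the sigmoid


identity = [[1.0, 0.0], [0.0, 1.0]]
p_neutral = relation_probability([0.0, 0.0], [0.0, 0.0], [0.0, 0.0],
                                 identity, identity, identity, 0.0)
p_aligned = relation_probability([1.0, 1.0], [1.0, 1.0], [0.5, 0.5],
                                 identity, identity, identity, 0.0)
print(p_neutral, p_aligned)
```

With all-zero inputs the score is 0 and the probability is 0.5; when the two hidden representations point in the same direction the bilinear score is positive and the probability rises above 0.5.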
The sample imbalance processor handles the class imbalance problem in the sample set by training with a balanced softmax method and introducing an additional class 0; the scores of the target classes are expected to be greater than a threshold $t_0$, and the scores of all non-target classes less than $t_0$:
Figure BDA0003820147590000121
where $L$ denotes the target loss function, $\log$ denotes the natural logarithm (base $e$), $e$ denotes the mathematical constant, $t_i$ denotes the probability of the $i$-th positive label, $t_j$ denotes the probability of the $j$-th negative label, $\Omega_{pos}$ denotes the relations between a drug and its corresponding adverse event mentions, i.e. the positive labels, and $\Omega_{neg}$ denotes the relations between a drug and non-corresponding adverse event mentions, i.e. the negative labels;
As a specific embodiment, for simplicity the threshold $t_0$ is set to 0, and the formula for calculating the target loss function $L$ simplifies to:
Figure BDA0003820147590000122
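The loss formulas above survive only as images. The sketch below assumes one widely used threshold-based softmax loss that has the stated property (positive scores pushed above $t_0$, negative scores pushed below it), namely $L = \log\big(e^{-t_0} + \sum_{i \in \Omega_{pos}} e^{-t_i}\big) + \log\big(e^{t_0} + \sum_{j \in \Omega_{neg}} e^{t_j}\big)$. This form is an assumption rather than a transcription of the patent's formula, and the function name is hypothetical.

```python
import math


def balanced_softmax_loss(pos_scores, neg_scores, t0=0.0):
    """Assumed threshold-based loss: positives should score above t0, negatives below."""
    pos_term = math.log(math.exp(-t0) + sum(math.exp(-t) for t in pos_scores))
    neg_term = math.log(math.exp(t0) + sum(math.exp(t) for t in neg_scores))
    return pos_term + neg_term


well_separated = balanced_softmax_loss(pos_scores=[5.0], neg_scores=[-5.0])
badly_separated = balanced_softmax_loss(pos_scores=[-5.0], neg_scores=[5.0])
print(well_separated, badly_separated)
```

With $t_0 = 0$ the two constant terms become 1, which corresponds to the simplification described in the embodiment. The loss is close to 0 when positives score well above the threshold and negatives well below it, and it grows when the ordering is reversed.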
S2, data preprocessing: in medical texts the same mention may be written in different ways, for example only with its initials or as a letter abbreviation, so the names of entity mentions need to be unified. The unification is performed as follows:
First, stop words are removed from the mentions in the medical text; regular-expression matching is then performed, and mentions whose matching degree is higher than 90% are classified as the same mention.
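The unification step can be sketched as follows. The source does not say how the matching degree is computed, so this example swaps in `difflib.SequenceMatcher.ratio()` from the Python standard library as a stand-in; the stop-word list, the function names, and the greedy grouping strategy are likewise assumptions.

```python
import re
from difflib import SequenceMatcher

STOP_WORDS = {"the", "a", "an", "of"}  # hypothetical list; the source gives none


def normalize(mention):
    """Lower-case, keep alphanumeric tokens, and drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", mention.lower())
    return " ".join(t for t in tokens if t not in STOP_WORDS)


def unify_mentions(mentions, threshold=0.9):
    """Greedily group mentions whose normalized forms match above the threshold."""
    groups = []  # list of (normalized representative, original mentions)
    for mention in mentions:
        norm = normalize(mention)
        for representative, members in groups:
            if SequenceMatcher(None, representative, norm).ratio() > threshold:
                members.append(mention)
                break
        else:
            groups.append((norm, [mention]))
    return [members for _, members in groups]


groups = unify_mentions(["the Aspirin", "aspirin", "Headache", "aspirin."])
print(groups)
```

A string-similarity score of this kind handles spelling and punctuation variants; matching initials or abbreviations to their full forms, which the text also mentions, would need an additional rule or lookup table.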
S3, model training and parameter optimization: the model is trained with the processed data, and an objective optimization function is designed to optimize the network parameters and generate the optimal extraction model, specifically comprising the following steps:
S31, the data set is divided into a training set, a validation set, and a test set in a 7:2:1 ratio; as a specific embodiment, the inventors of the present application obtained 505 pieces of medical document data in total;
S32, a categorical cross-entropy loss function is adopted as the optimization target, where the objective function uses the same formula as the target loss function L calculated in the sample imbalance processor in step S1, namely:
Figure BDA0003820147590000131
S33, the objective function is optimized with a conventional stochastic gradient descent algorithm, and the network model parameters are updated by error backpropagation.
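The 7:2:1 division in step S31 can be sketched as below. The shuffling seed and the rounding rule (floor for the training and validation sets, remainder to the test set) are assumptions; the source states only the ratio and the total of 505 documents, so the resulting subset sizes are a consequence of this particular rounding rule.

```python
import random


def split_dataset(documents, ratios=(7, 2, 1), seed=0):
    """Shuffle and split into training, validation and test subsets."""
    docs = list(documents)
    random.Random(seed).shuffle(docs)
    total = sum(ratios)
    n_train = len(docs) * ratios[0] // total
    n_val = len(docs) * ratios[1] // total
    return docs[:n_train], docs[n_train:n_train + n_val], docs[n_train + n_val:]


# 505 document identifiers stand in for the medical documents.
train, val, test = split_dataset(range(505))
print(len(train), len(val), len(test))
```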
S4, extracting adverse drug event relations:
S41, the medical text data to be extracted is preprocessed to obtain standardized sample data (see the data preprocessing step), and the category of drug and non-corresponding adverse event mention relation pairs is defined as 0;
S42, a training sample is formed from a medical sample and all the drug mentions and adverse event mentions it contains; the drug mentions use fixed markers, i.e. the two fixed markers <s> and </s> are inserted directly before and after each drug mention, and the adverse event mentions are appended after the text as suspension markers represented by <o> and </o>;
S43, the sample is fed into the pre-trained BERT model, and for each drug and adverse event mention marker pair, the local context representation of the drug mention marker and that of the adverse event mention marker are concatenated as the embedded representation of the corresponding drug mention and adverse event mention pair;
S44, after the embedded representations of all drug mention and adverse event mention pairs in the sample, which contain local context information, have been obtained, they are combined with the attention layer of the pre-trained BERT model through an affine transformation to obtain the mention-pair relation matrix of drugs and adverse events;
S45, the mention-pair relation matrix containing local context information is combined with the encoding module, and rich global features are acquired with the U-shaped semantic segmentation network, which outputs the local context and global dependency information matrix;
S46, the smooth embedded representations of the drug and the adverse event are obtained; a feedforward neural network maps the smooth embedded representations and the local context and global dependency information matrix to hidden representations, and a bilinear function then yields the relation probability, i.e. the relation score;
S47, the scores of the positive-sample and negative-sample relations are calculated with the softmax method, such that the scores of positive-sample relations are greater than 0.
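The marker scheme of step S42 can be sketched as follows. This is a simplified illustration on a word-level token list with a single drug mention; the function name, the inclusive span convention, and the way shared positions are represented (a list of position ids) are assumptions, and a real pipeline would operate on BERT subword tokens.

```python
def build_marked_sequence(tokens, drug_span, event_spans):
    """Insert <s>, </s> around the drug mention and append <o>, </o> per adverse event.

    Spans are (start, end) token indices, inclusive. Each appended <o>/</o> pair
    reuses the position ids of the first and last token of its adverse event mention.
    """
    s, e = drug_span
    marked = tokens[:s] + ["<s>"] + tokens[s:e + 1] + ["</s>"] + tokens[e + 1:]

    def shifted(i):
        # Index of original token i after the two fixed markers were inserted.
        if i < s:
            return i
        if i <= e:
            return i + 1
        return i + 2

    position_ids = list(range(len(marked)))
    for start, end in event_spans:
        marked += ["<o>", "</o>"]
        position_ids += [shifted(start), shifted(end)]
    return marked, position_ids


marked, position_ids = build_marked_sequence(
    ["aspirin", "caused", "severe", "nausea"], drug_span=(0, 0), event_spans=[(2, 3)]
)
print(marked)
print(position_ids)
```

Because the appended markers share position ids with the mention they refer to, several candidate adverse events can be attached to one text without altering the text itself.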
Compared with the prior art, the semantic-segmentation-based method for extracting adverse drug event relations provided by the invention has the following beneficial effects:
1. When obtaining embedded representations, the invention uses suspension marker representations, so the embedded representations of different mentions can be distinguished more effectively, which can significantly improve prediction accuracy.
2. The U-shaped semantic segmentation network captures the global interdependencies among triples (drug, adverse event, and drug adverse event relation), so the extraction model can more effectively handle cases where the two mentions of a pair are too far apart for key information to be found.
3. An encoding module is introduced to capture the local context information of mentions and fuse it with the global interdependencies, so the extraction model understands global semantics more fully and can better distinguish specific adverse drug events.
4. A balanced softmax method is used to handle the imbalanced distribution of relations and to avoid "undersampling" by the extraction model, thereby improving the accuracy of relation classification.
Finally, it should be noted that the above embodiments only illustrate the technical solutions of the present invention and do not limit them. Although the present invention has been described in detail with reference to the preferred embodiments, those skilled in the art should understand that the technical solutions of the present invention may be modified or equivalently substituted without departing from their spirit and scope, and all such modifications and substitutions shall be covered by the claims of the present invention.

Claims (4)

1. A method for extracting adverse drug event relations based on semantic segmentation, characterized by comprising the following steps:
S1, establishing a drug adverse event relation extraction model:
the drug adverse event relation extraction model is used to extract drugs in a medical text and the adverse events caused by the drugs, and the model structure comprises a local context information feature extractor, a semantic feature fusion device, a classifier, and a sample imbalance processor; wherein,
the local context information feature extractor is used to extract the local context features of different mentions from the input medical text, specifically as follows: given a drug adverse event document containing N text tokens
Figure FDA0003820147580000011
first, the fixed markers <s> and </s> are inserted at the beginning and end of each drug mention to mark its position; the corresponding candidate adverse event mentions are then appended after the text as suspension markers <o> and </o>, where <o> and </o> share the position encoding of the corresponding adverse event mention; the combined sequence of text tokens and inserted suspension markers is then fed to the pre-trained BERT model to obtain the local context representation $e_s$ of the drug mention marker and the local context representation $e_o$ of the adverse event mention marker, and $e_s$ and $e_o$ are concatenated as the embedded representation of the corresponding drug mention and adverse event mention pair
Figure FDA0003820147580000012
where M denotes the maximum number of mention pairs formed by drug mentions and adverse event mentions in a sample; finally, the pre-trained BERT model is used to obtain the attention representation
Figure FDA0003820147580000013
where A is the average of the attention heads in the last Encoder layer of the pre-trained BERT model; the attention matrix A from the pre-trained BERT model and an affine transformation are used to obtain the mention-pair relation matrix of drugs and adverse events:
Figure FDA0003820147580000014
wherein,
Figure FDA0003820147580000015
is the Hadamard product, $W_1$ is a learnable parameter matrix, $H$ is the embedded representation of the drug mention and adverse event mention pairs, $A_s$ denotes the attention of drug mention $e_s$ to all tokens of the document, obtained by averaging the attention heads of the last Encoder layer over the drug mention, $A_o$ denotes the attention of adverse event mention $e_o$ to all tokens of the document, obtained by averaging the attention heads of the last Encoder layer over the adverse event mention, and $F(s, o)$ denotes the relation matrix of the drug and adverse event mention pair $(e_s, e_o)$;
the semantic feature fusion device is used to fuse the global dependencies of the local context information of mentions through an encoding module and a U-shaped semantic segmentation network, specifically as follows: first, the mention-pair relation matrix $F \in \mathbb{R}^{M \times M \times D}$ containing local context information is treated as a D-channel image and combined with the encoding module; rich global features are then obtained with the U-shaped semantic segmentation network, which comprises a global feature extraction block, two upsampling blocks with skip connections, and a feature output layer arranged in sequence, thereby yielding the local context and global dependency information matrix:
$$Y = U(W_2 F)$$
where $Y \in \mathbb{R}^{M \times M \times D'}$ denotes the local context and global dependency information matrix, $U \in \mathbb{R}^{M \times M \times D'}$ denotes the U-shaped semantic segmentation network, $W_2$ is a learnable weight matrix that reduces the dimension of $F$, $D'$ is much smaller than $D$, and $W_2 F$ represents the encoding module;
the classifier is used to predict adverse drug event relations from the local context and global dependency information matrix and the smooth embedded representations, specifically as follows: the local context embeddings $m$ of a mention at its different positions in the document are taken first,
Figure FDA0003820147580000021
and a smooth version of max pooling is then applied to obtain the smooth embedded representation $E_i$ of the mention:
Figure FDA0003820147580000022
where $E_i$ denotes the smooth embedded representation of mention $e_i$, and
Figure FDA0003820147580000023
denotes the total number of occurrences of the drug or adverse event mention $e_i$ in the document;
after the smooth embedded representations $E_s$ and $E_o$ of the drug and the adverse event and the local context and global dependency information matrix $Y$ have been obtained, the classifier uses a feedforward neural network to map $E_s$, $E_o$, and $Y$ to hidden representations $z$, and then obtains the relation probability through a bilinear function, as follows:
$$z_s = \tanh(W_s E_s + Y_{s,o})$$
$$z_o = \tanh(W_o E_o + Y_{s,o})$$
$$P(r \mid E_s, E_o) = \sigma(z_s W_r z_o + b_r)$$
where $z_s$ is the hidden representation of the drug, $z_o$ is the hidden representation of the adverse event, $P$ is the relation probability, $Y_{s,o}$ is the local context and global dependency information representation of the drug and adverse event mention pair $(e_s, e_o)$ in matrix $Y$, $\tanh$ is a nonlinear activation function, $\sigma$ is a bilinear function, and $W_s$, $W_o$, $W_r$, and $b_r$ are learnable parameters;
the sample imbalance processor is used to handle the class imbalance problem in the sample set by training with a balanced softmax method and introducing an additional class 0; the scores of the target classes are expected to be greater than a threshold $t_0$, and the scores of all non-target classes less than $t_0$:
Figure FDA0003820147580000031
where $L$ denotes the target loss function, $\log$ denotes the natural logarithm (base $e$), $e$ denotes the mathematical constant, $t_i$ denotes the probability of the $i$-th positive label, $t_j$ denotes the probability of the $j$-th negative label, $\Omega_{pos}$ denotes the relations between a drug and its corresponding adverse event mentions, i.e. the positive labels, and $\Omega_{neg}$ denotes the relations between a drug and non-corresponding adverse event mentions, i.e. the negative labels;
S2, data preprocessing, in which mention unification is performed as follows:
first, stop words are removed from the mentions in the medical text; regular-expression matching is then performed, and mentions whose matching degree is higher than 90% are classified as the same mention;
S3, model training and parameter optimization: the extraction model is trained with the processed data, and an objective optimization function is designed to optimize the network parameters and generate the optimal extraction model, specifically comprising the following steps:
S31, the data set is divided into a training set, a validation set, and a test set in a 7:2:1 ratio;
S32, a balanced softmax categorical cross-entropy loss function is adopted as the optimization target, where the objective function uses the same formula as the target loss function L calculated in the sample imbalance processor in step S1;
S33, the objective function is optimized with a stochastic gradient descent algorithm, and the network model parameters are updated by error backpropagation;
S4, extracting adverse drug event relations:
S41, the medical text data to be extracted is preprocessed to obtain standardized sample data, and the category of drug and non-corresponding adverse event mention relation pairs is defined as 0;
S42, a training sample is formed from a medical sample and all the drug mentions and adverse event mentions it contains; the two fixed markers <s> and </s> are inserted directly before and after each drug mention, and the adverse event mentions are appended after the text as suspension markers represented by <o> and </o>;
S43, the sample is fed into the pre-trained BERT model, and for each drug and adverse event mention marker pair, the local context representation of the drug mention marker and that of the adverse event mention marker are concatenated as the embedded representation of the corresponding drug mention and adverse event mention pair;
S44, after the embedded representations of all drug mention and adverse event mention pairs in the sample, which contain local context information, have been obtained, they are combined with the attention layer of the pre-trained BERT model through an affine transformation to obtain the mention-pair relation matrix of drugs and adverse events;
S45, the mention-pair relation matrix containing local context information is combined with the encoding module, and rich global features are acquired with the U-shaped semantic segmentation network, which outputs the local context and global dependency information matrix;
S46, the smooth embedded representations of the drug and the adverse event are obtained; a feedforward neural network maps the smooth embedded representations and the local context and global dependency information matrix to hidden representations, and a bilinear function then yields the relation probability, i.e. the relation score;
S47, the scores of the positive-sample and negative-sample relations are calculated with the softmax method, such that the scores of positive-sample relations are greater than 0.
2. The method for extracting adverse drug event relations based on semantic segmentation according to claim 1, wherein in the U-shaped semantic segmentation network used by the semantic feature fusion device in step S1, the global feature extraction block comprises three convolution modules and two max pooling layers, the first max pooling layer being located after the first convolution module and the second max pooling layer after the second convolution module; each convolution module comprises two convolution layers, and the number of channels is doubled within the feature extraction block; the two upsampling blocks each comprise one deconvolution layer followed by two convolution layers, the first upsampling block being located after the third convolution module and the second upsampling block after the first upsampling block, and the number of channels is halved in each upsampling block; the output of the deconvolution layer in the second upsampling block is joined by a skip connection with the output of the second convolution layer in the first convolution module, and the output of the deconvolution layer in the first upsampling block is joined by a skip connection with the output of the second convolution layer in the second convolution module.
3. The method according to claim 2, wherein the two convolution layers in each convolution module have a 3 × 3 kernel and a stride of 1, the two max pooling layers have a 2 × 2 kernel and a stride of 2, the deconvolution layers in the two upsampling blocks have a 2 × 2 kernel and a stride of 2, the two convolution layers in each upsampling block have a 3 × 3 kernel and a stride of 1, and the feature output layer has a 1 × 1 kernel and a stride of 1.
4. The method for extracting adverse drug event relations based on semantic segmentation according to claim 1, wherein the sample imbalance processor in step S1 sets the threshold $t_0$ to 0, and the formula for calculating the target loss function $L$ simplifies to:
Figure FDA0003820147580000051
CN202211040440.0A 2022-08-29 2022-08-29 Drug adverse event relation extraction method based on semantic segmentation Pending CN115392256A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211040440.0A CN115392256A (en) 2022-08-29 2022-08-29 Drug adverse event relation extraction method based on semantic segmentation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211040440.0A CN115392256A (en) 2022-08-29 2022-08-29 Drug adverse event relation extraction method based on semantic segmentation

Publications (1)

Publication Number Publication Date
CN115392256A true CN115392256A (en) 2022-11-25

Family

ID=84121722

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211040440.0A Pending CN115392256A (en) 2022-08-29 2022-08-29 Drug adverse event relation extraction method based on semantic segmentation

Country Status (1)

Country Link
CN (1) CN115392256A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination