CN113065350A - Biomedical text word sense disambiguation method based on attention neural network - Google Patents

Info

Publication number
CN113065350A
Authority
CN
China
Prior art keywords
biomedical
semantic
ambiguous
neural network
disambiguation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110395920.8A
Other languages
Chinese (zh)
Inventor
逄淑阳
张春祥
王明磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin University of Science and Technology
Original Assignee
Harbin University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2021-04-13
Filing date
2021-04-13
Publication date
2021-07-02
Application filed by Harbin University of Science and Technology
Priority to CN202110395920.8A
Publication of CN113065350A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Machine Translation (AREA)

Abstract

The invention relates to a biomedical text word sense disambiguation method based on an attention mechanism, an Asymmetric Convolutional Neural Network (ACNN) and a Bidirectional Long Short-Term Memory network (Bi-LSTM). First, the biomedical MSH corpus is processed: English sentences containing ambiguous words undergo word segmentation, part-of-speech tagging and semantic tagging, yielding a processed training corpus and test corpus. The model is then trained on the training corpus to obtain an optimized attention neural network model. The optimized model disambiguates the test corpus, producing the probability distribution of each ambiguous word over its semantic categories; the semantic category with the highest probability is taken as the semantic category of the ambiguous word. The invention disambiguates biomedical ambiguous words well and determines their true meaning more accurately.

Description

Biomedical text word sense disambiguation method based on attention neural network
Technical field:
The invention relates to a biomedical text word sense disambiguation method based on an attention neural network, which is applicable to the field of natural language processing.
Background art:
The volume of biomedical text is now so large that automated tools are needed to process it efficiently. Automated processing is difficult, however, because the biomedical field contains many ambiguous words. Determining the semantic category of biomedical words facilitates the automatic processing of biomedical articles. Biomedical word sense disambiguation is already widely applied in biomedical natural language processing tasks such as text indexing, text classification and named entity extraction.
Biomedical word sense disambiguation methods fall into three categories: supervised, unsupervised and knowledge-based. In the supervised approach, a classifier is trained on a labeled dataset, using lexical and syntactic information from the context, to predict the correct sense of a biomedical word in a test dataset. In the unsupervised approach, unlabeled biomedical text is used to select among the candidate senses of a biomedical word. In the knowledge-based approach, thesauri and semantic tables are used to determine the semantic category of a biomedical word. In recent years, deep learning algorithms such as convolutional neural networks and recurrent neural networks have been widely applied to biomedical word sense disambiguation. In a convolutional neural network, neurons share weights, which reduces the complexity of the network model and helps prevent overfitting. Recurrent neural networks are very effective for text processing. Deep learning algorithms can therefore be applied to disambiguate biomedical ambiguous words and assign them the correct semantic category.
Summary of the invention:
The invention discloses a biomedical text word sense disambiguation method based on an attention neural network, which aims to resolve word ambiguity in the field of natural language processing.
To this end, the invention provides the following technical solution:
1. An attention neural network-based biomedical text word sense disambiguation method, comprising the following steps:
Step 1: perform word segmentation, part-of-speech tagging and semantic information tagging on all sentences in the MSH corpus that contain biomedical ambiguous words, and select the morphological, part-of-speech and semantic information of the four word units adjacent to the biomedical ambiguous word on its left and right as the disambiguation features.
Step 2: extract the morphological, part-of-speech and semantic information of the four word units adjacent to the biomedical ambiguous word on its left and right, and generate the corresponding word vectors using Word2vec trained on the processed corpus. Select a small portion of the resulting sentences as test data and use the rest as training data.
Step 3: training comprises two processes, forward propagation and back propagation. The training data is used as the input for training the attention neural network model, and training yields the optimized attention neural network model.
Step 4: the testing process is a forward propagation process, i.e. a semantic classification process. The test data is input into the optimized attention neural network model and the probability distribution of the biomedical ambiguous word over the semantic categories is calculated; the semantic category with the maximum probability is the semantic category of the biomedical ambiguous word.
2. The biomedical text word sense disambiguation method based on the attention neural network as claimed in claim 1, wherein in step 1, word segmentation, part-of-speech tagging and semantic information tagging are performed on an English sentence and disambiguation features are extracted, with the following specific steps:
Step 1-1: segment the English sentence into words at the spaces in the sentence;
Step 1-2: perform part-of-speech tagging on the segmented words using a part-of-speech tagging tool;
Step 1-3: perform semantic tagging on the segmented words using a semantic tagging tool;
Part-of-speech tagging and semantic tagging are performed on all English sentences in the corpus using an English part-of-speech tagging tool and an English semantic tagging tool, and the morphological, part-of-speech and semantic information of the four word units adjacent to the biomedical ambiguous word on its left and right is selected as the disambiguation features.
3. The biomedical text word sense disambiguation method based on the attention neural network as claimed in claim 1, wherein in step 2, Word2vec is trained on the biomedical MSH corpus to generate the corresponding word vectors, with the following specific steps:
Step 2-1: extract the morphological, part-of-speech and semantic information of the four word units adjacent to the biomedical ambiguous word on its left and right;
Step 2-2: use the CBOW model in Word2vec to obtain the word vector corresponding to each disambiguation feature, select a small portion of the processed sentences as test data, and use the rest as training data.
4. The biomedical text word sense disambiguation method based on the attention neural network as claimed in claim 1, wherein in step 3 the attention neural network model is trained, with the following specific steps:
Forward propagation process:
Step 3-1: input the training data into the initialized attention neural network model;
Step 3-2: extract disambiguation features through the attention layer, dynamically capturing the relations between words;
Step 3-3: extract further disambiguation features through the asymmetric convolution layer. Asymmetric convolution obtains different feature information from convolution kernels of different sizes while reducing the amount of computation, which speeds up the model and helps prevent overfitting;
Step 3-4: obtain effective feature information from the forward and backward networks through the bidirectional long short-term memory network layer, concatenate it and feed it into a fully connected layer, which reduces the dimension of the extracted disambiguation features and joins them into a one-dimensional disambiguation feature vector;
Step 3-5: use the softmax layer to calculate the probability of the biomedical ambiguous word $m$ under each semantic category $s_i$ ($i = 1, 2, \ldots, n$). The softmax function is as follows:
$$P(s_i \mid m) = \frac{e^{a_i}}{\sum_{k=1}^{n} e^{a_k}}$$
where $a_i$ denotes the input data of the softmax layer and $P(s_i \mid m)$ denotes the probability of the biomedical ambiguous word $m$ under semantic category $s_i$ ($i = 1, 2, \ldots, n$).
Step 3-6: select the maximum probability among $P(s_1 \mid m), P(s_2 \mid m), \ldots, P(s_n \mid m)$ as the predicted probability:
$$y\_predicted_j = \max\big(P(s_1 \mid m), P(s_2 \mid m), \ldots, P(s_n \mid m)\big)$$
where $y\_predicted_j$ denotes the predicted probability of the biomedical ambiguous word $m$.
Step 3-7: compare the predicted probability $y\_predicted_j$ with the true probability $y_j$ and calculate the error loss using the cross-entropy loss function.
The error loss is calculated as follows:
$$loss = -\big(y_j \log(y\_predicted_j) + (1 - y_j)\log(1 - y\_predicted_j)\big)$$
where $y_j$ denotes the true probability that the biomedical ambiguous word $m$ belongs to semantic category $s_i$.
Back propagation process:
The error loss is propagated backward and the parameters are updated layer by layer. The parameter update is as follows:
$$\theta' = \theta - \alpha \frac{\partial\, loss}{\partial \theta}$$
where $\theta$ denotes the parameter set, $\theta'$ denotes the updated parameter set, and $\alpha$ denotes the learning rate.
The attention neural network model is iterated continuously to obtain the optimized attention neural network model.
5. The biomedical text word sense disambiguation method based on the attention neural network as claimed in claim 1, wherein in step 4 the biomedical ambiguous word $m$ is semantically classified, with the following specific steps:
Semantic classification process:
Step 4-1: input the test data into the optimized attention neural network model;
Step 4-2: dynamically capture the relations between words through the attention layer;
Step 4-3: extract more effective information and reduce the amount of computation through the asymmetric convolution layer;
Step 4-4: obtain information from the forward and backward networks through the bidirectional long short-term memory network layer, concatenate it and feed it into a fully connected layer, which reduces the dimension of the extracted disambiguation features and joins them into a one-dimensional disambiguation feature vector;
Step 4-5: use the softmax layer to calculate the probability distribution of the biomedical ambiguous word $m$ over the semantic categories. The semantic category $s'$ with the maximum probability is the semantic category of the biomedical ambiguous word.
The semantic category $s'$ is determined as follows:
$$s' = \arg\max_{i = 1, \ldots, n} P(s_i \mid m)$$
where $s'$ denotes the semantic category with the highest probability, $n$ denotes the number of semantic categories, and $P(s_1 \mid m), \ldots, P(s_i \mid m), \ldots, P(s_n \mid m)$ denotes the probability distribution of the biomedical ambiguous word $m$ over the semantic categories.
Advantageous effects:
1. The invention relates to a biomedical text word sense disambiguation method based on an attention neural network. English sentences undergo word segmentation, part-of-speech tagging and semantic information tagging. Based on the biomedical MSH corpus, word vectors are generated with Word2vec, and the trained word vectors are used as disambiguation features, so the extracted disambiguation features are of higher quality.
2. The model used by the invention consists mainly of an attention mechanism, an asymmetric convolutional neural network and a bidirectional long short-term memory network. The attention mechanism dynamically captures the relations between words. The asymmetric convolutional neural network retains the local perception and parameter sharing of a convolutional neural network while reducing the amount of computation, which speeds up training, and it handles high-dimensional data well. The bidirectional long short-term memory network obtains effective information from both the forward and backward directions and is very effective for text processing. Once the attention neural network model has been trained, a better classification result can be obtained.
3. The classifier used by the invention is a softmax classifier, which handles both binary classification and multi-class classification.
4. When the model is trained, the parameters are updated by stochastic gradient descent. The error is calculated and propagated back along the original path, i.e. from the output layer backward through each intermediate hidden layer, updating the parameters of each layer in turn until it reaches the input layer. Forward and backward propagation are repeated to reduce the error and update the model parameters until training of the attention neural network model is complete. As the parameters are updated with the back-propagated error, the disambiguation accuracy of the whole attention neural network model on the input data improves.
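The computational saving attributed to the asymmetric convolution in item 2 can be illustrated with a parameter count. The text does not specify how the asymmetric convolution layer is built; the sketch below assumes, purely for illustration, one common construction in which a k × k kernel is replaced by a 1 × k kernel followed by a k × 1 kernel. The function names and the channel counts are hypothetical.

```python
def square_conv_params(k, c_in, c_out):
    """Number of weights in one k x k convolution (biases ignored)."""
    return k * k * c_in * c_out


def asymmetric_conv_params(k, c_in, c_mid, c_out):
    """Number of weights in a 1 x k convolution followed by a k x 1 convolution.

    Assumed construction: the patent text does not define the layer.
    """
    return (1 * k * c_in * c_mid) + (k * 1 * c_mid * c_out)


k, channels = 3, 64  # hypothetical kernel size and channel count
square = square_conv_params(k, channels, channels)
asymmetric = asymmetric_conv_params(k, channels, channels, channels)
print(square, asymmetric)  # 36864 24576
```

Under this assumption the weight count grows with 2k rather than k², so the saving increases with kernel size.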
Brief description of the drawings:
FIG. 1 is a flowchart of a biomedical text word sense disambiguation method based on an attention neural network according to an embodiment of the present invention.
FIG. 2 shows the training process of a biomedical text word sense disambiguation method based on an attention neural network according to an embodiment of the present invention.
FIG. 3 shows the testing process of a biomedical text word sense disambiguation method based on an attention neural network according to an embodiment of the present invention.
Detailed description of the embodiments:
To describe the technical solutions in the embodiments of the present invention clearly and completely, the invention is described in further detail below with reference to the drawings of the embodiments.
The disambiguation of the ambiguous word "ADA" in the English sentence "A message from ADA president Feldman." is taken as an example.
The flowchart of the biomedical text word sense disambiguation method based on the attention neural network according to the embodiment of the invention is shown in FIG. 1 and comprises the following steps.
Step 1: the disambiguation features are extracted as follows:
English sentence: A message from ADA president Feldman.
Step 1-1: segment the English sentence into words at the spaces in the sentence. The word segmentation result is: A message from ADA president Feldman.
Step 1-2: perform part-of-speech tagging on the segmented words using a part-of-speech tagging tool. The part-of-speech tagging result is: A/DT message/NN from/IN ADA/NNP president/NN Feldman/NNP.
Step 1-3: perform semantic tagging on the segmented words using a semantic tagging tool. The semantic tagging result is: A/angstrom.n.01 message/message.n.01 from/-1 ADA/adenosine_deaminase.n.01 president/president.n.01 Feldman/-1.
The word segmentation, part-of-speech tagging and semantic tagging results for the English sentence containing the biomedical ambiguous word "ADA" are: A/DT/angstrom.n.01 message/NN/message.n.01 from/IN/-1 ADA/NNP/adenosine_deaminase.n.01 president/NN/president.n.01 Feldman/NNP/-1.
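The three annotation passes of step 1 can be sketched in Python as follows. This is a minimal illustration and not the tooling used in the embodiment: the lookup tables `POS_TAGS` and `SENSE_TAGS` are hypothetical stand-ins for a real part-of-speech tagger and semantic tagger, filled only with the tags of the example sentence, and `annotate` is an illustrative name.

```python
# Hypothetical lookup tables standing in for real tagging tools; the
# entries follow the worked example sentence only.
POS_TAGS = {"A": "DT", "message": "NN", "from": "IN",
            "ADA": "NNP", "president": "NN", "Feldman": "NNP"}
SENSE_TAGS = {"A": "angstrom.n.01", "message": "message.n.01",
              "ADA": "adenosine_deaminase.n.01",
              "president": "president.n.01"}


def annotate(sentence):
    """Return (word, pos, sense) triples; sense is "-1" when no tag is known."""
    tokens = sentence.rstrip(".").split()          # step 1-1: split on spaces
    return [(tok,
             POS_TAGS.get(tok, "UNK"),             # step 1-2: part of speech
             SENSE_TAGS.get(tok, "-1"))            # step 1-3: semantic tag
            for tok in tokens]


triples = annotate("A message from ADA president Feldman.")
annotated = " ".join("/".join(triple) for triple in triples)
print(annotated)
```

The printed string has the same word/part-of-speech/sense layout as the combined annotation result of step 1.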
Step 2: train Word2vec on the medical text to generate the disambiguation feature vector.
Step 2-1: from the English sentence containing the biomedical ambiguous word "ADA", extract the four word units adjacent to it on the left and right, namely "message/NN/message.n.01", "from/IN/-1", "president/NN/president.n.01" and "Feldman/NNP/-1". Each unit contributes three features (word form, part of speech and semantic tag), giving 12 disambiguation features in total.
Step 2-2: each generated word vector has 100 dimensions, so concatenating the 12 disambiguation features yields a 1200-dimensional vector.
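The feature count and vector size of step 2 can be checked with a short sketch. It assumes a window of two units on each side of the ambiguous word, which matches the four units listed in the example. The random `embedding` table is only a placeholder for a trained Word2vec (CBOW) model, and `context_features` is a hypothetical helper name.

```python
import random


def context_features(triples, target, window=2):
    """Collect word, POS and sense of `window` units on each side of `target`."""
    idx = next(i for i, unit in enumerate(triples) if unit[0] == target)
    left = triples[max(0, idx - window):idx]
    right = triples[idx + 1:idx + 1 + window]
    return [field for unit in left + right for field in unit]


triples = [("A", "DT", "angstrom.n.01"), ("message", "NN", "message.n.01"),
           ("from", "IN", "-1"), ("ADA", "NNP", "adenosine_deaminase.n.01"),
           ("president", "NN", "president.n.01"), ("Feldman", "NNP", "-1")]
features = context_features(triples, "ADA")

# Placeholder embedding: 100 random numbers per distinct feature. A real
# system would look each feature up in a trained Word2vec model instead.
DIM = 100
rng = random.Random(0)
embedding = {f: [rng.uniform(-1.0, 1.0) for _ in range(DIM)]
             for f in sorted(set(features))}

vector = [x for f in features for x in embedding[f]]  # concatenation
print(len(features), len(vector))  # 12 1200
```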
Step 3: the biomedical ambiguous word "ADA" has two semantic categories, namely American Dental Association and Adenosine Deaminase.
The training process and the testing process of the biomedical text word sense disambiguation method based on the attention neural network according to the embodiment of the invention are shown in FIG. 2 and FIG. 3, respectively. The specific steps are as follows:
Forward propagation process:
Step 3-1: input the feature vector formed by concatenating the 12 disambiguation features into the initialized attention neural network model as training data;
Step 3-2: extract disambiguation features through the attention layer, dynamically capturing the relations between words;
Step 3-3: extract further disambiguation features through the asymmetric convolution layer. Asymmetric convolution obtains different feature information from convolution kernels of different sizes while reducing the amount of computation, which speeds up the model and helps prevent overfitting;
Step 3-4: obtain effective feature information from the forward and backward networks through the bidirectional long short-term memory network layer, concatenate it and feed it into a fully connected layer, which reduces the dimension of the extracted disambiguation features and joins them into a one-dimensional disambiguation feature vector;
Step 3-5: use the softmax layer to calculate the predicted probability of the biomedical ambiguous word "ADA" under the semantic categories "American Dental Association" and "Adenosine Deaminase";
The softmax function is calculated as follows:
$$P(\text{American Dental Association} \mid \text{ADA}) = \frac{e^{a_{\text{American Dental Association}}}}{\sum_{s} e^{a_s}}$$
$$P(\text{Adenosine Deaminase} \mid \text{ADA}) = \frac{e^{a_{\text{Adenosine Deaminase}}}}{\sum_{s} e^{a_s}}$$
where $a_s$ denotes the input data of the softmax layer for semantic category $s$, the sums run over the two semantic categories, $P(\text{American Dental Association} \mid \text{ADA})$ denotes the probability of the biomedical ambiguous word "ADA" under the semantic category "American Dental Association", and $P(\text{Adenosine Deaminase} \mid \text{ADA})$ denotes the probability of the biomedical ambiguous word "ADA" under the semantic category "Adenosine Deaminase".
Step 3-6: select the maximum of $P(\text{American Dental Association} \mid \text{ADA})$ and $P(\text{Adenosine Deaminase} \mid \text{ADA})$ as the predicted probability:
$$y\_predicted = \max\big(P(\text{American Dental Association} \mid \text{ADA}),\ P(\text{Adenosine Deaminase} \mid \text{ADA})\big)$$
where $y\_predicted$ denotes the predicted probability of the ambiguous word "ADA", which is 94.47% in this example.
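Steps 3-5 and 3-6 can be sketched as a two-class softmax followed by a maximum. The `logits` values below are hypothetical softmax-layer inputs chosen for illustration; they are not taken from the patent and do not reproduce the 94.47% figure.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    peak = max(logits)                       # subtract the max to avoid overflow
    exps = [math.exp(a - peak) for a in logits]
    total = sum(exps)
    return [e / total for e in exps]


categories = ["American Dental Association", "Adenosine Deaminase"]
logits = [2.0, -0.5]                         # hypothetical inputs a_s
probs = softmax(logits)                      # step 3-5: P(s | ADA) for each s
y_predicted = max(probs)                     # step 3-6: largest probability

for category, p in zip(categories, probs):
    print(f"P({category} | ADA) = {p:.4f}")
```

Subtracting the largest score before exponentiating does not change the result, because the common factor cancels between numerator and denominator.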
Step 3-7: compare the predicted probability $y\_predicted$ of the attention neural network with the true probability $y$ and calculate the error using the cross-entropy loss function.
The error is calculated as follows:
$$loss_{ADA} = -\big(y \log(y\_predicted) + (1 - y)\log(1 - y\_predicted)\big)$$
where $loss_{ADA}$ denotes the error for the biomedical ambiguous word "ADA".
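The cross-entropy error can be evaluated numerically for the example. The sketch below writes the loss with the conventional leading minus sign, so that it is non-negative, and assumes a true probability of y = 1 (the sentence really does use the predicted sense); that value is an assumption for illustration, as is the function name `binary_cross_entropy`.

```python
import math


def binary_cross_entropy(y, y_predicted, eps=1e-12):
    """-(y*log(p) + (1-y)*log(1-p)), with p clipped away from 0 and 1."""
    p = min(max(y_predicted, eps), 1.0 - eps)
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))


# y = 1.0 is an assumed label; 0.9447 is the predicted probability
# quoted in the example.
loss_ada = binary_cross_entropy(1.0, 0.9447)
print(round(loss_ada, 4))
```

A prediction closer to the true label gives a smaller loss; for instance, a predicted probability of 0.5 with y = 1 yields a larger value than the one computed above.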
Back propagation process:
The error $loss_{ADA}$ is propagated backward and the parameters of each layer are updated layer by layer. The parameter update is as follows:
$$\theta'_{ADA} = \theta_{ADA} - \alpha \frac{\partial\, loss_{ADA}}{\partial \theta_{ADA}}$$
where $\theta_{ADA}$ denotes the parameter set for the biomedical ambiguous word "ADA", $\theta'_{ADA}$ denotes the updated parameter set, and $\alpha$ is the learning rate.
The attention neural network model is iterated continuously to obtain the optimized attention neural network model.
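The alternation of forward propagation, error calculation and parameter update can be illustrated on a deliberately tiny model. The sketch below is not the attention neural network of the embodiment: it is a single sigmoid unit with one parameter, and the training example, learning rate and iteration count are hypothetical values chosen to show the gradient descent update reducing the cross-entropy error.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sgd_step(theta, grad, learning_rate):
    """Gradient descent update: theta' = theta - learning_rate * grad."""
    return theta - learning_rate * grad


x, y = 1.5, 1.0      # hypothetical single training example and its label
theta = 0.0          # initial parameter
alpha = 0.1          # hypothetical learning rate
losses = []
for _ in range(50):
    p = sigmoid(theta * x)              # forward propagation
    losses.append(-math.log(p))         # cross-entropy error for y = 1
    grad = (p - y) * x                  # d(loss)/d(theta) for sigmoid + cross-entropy
    theta = sgd_step(theta, grad, alpha)  # back propagation update

print(losses[0] > losses[-1])
```

In a multi-layer network the same update is applied to every layer's parameters, with the gradients obtained by the chain rule from the output layer backward.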
Step 4: model testing, i.e. the semantic classification process, comprises the following steps:
Step 4-1: input the test data into the optimized attention neural network model;
Step 4-2: dynamically capture the relations between words through the attention layer;
Step 4-3: extract more effective information and reduce the amount of computation through the asymmetric convolution layer;
Step 4-4: obtain information from the forward and backward networks through the bidirectional long short-term memory network layer, concatenate it and feed it into a fully connected layer, which reduces the dimension of the extracted disambiguation features and joins them into a one-dimensional disambiguation feature vector;
Step 4-5: calculate the probability of the biomedical ambiguous word "ADA" under each semantic category through the softmax layer; the semantic category with the maximum probability is the semantic category of the ambiguous word.
The semantic category $s'$ of the biomedical ambiguous word "ADA" is determined as follows:
$$s' = \arg\max_{s} P(s \mid \text{ADA})$$
where $s'$ denotes the semantic category assigned to the biomedical ambiguous word "ADA", here American Dental Association, and $P(s \mid \text{ADA})$ denotes the probability distribution of the biomedical ambiguous word "ADA" over the semantic categories.
The attention neural network model thus disambiguates the English sentence "A message from ADA president Feldman." containing the biomedical ambiguous word "ADA": of the two candidate semantic categories, American Dental Association and Adenosine Deaminase, the model assigns "ADA" to American Dental Association.
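The final decision of step 4-5 is an argmax over the category probabilities. In the sketch below, 0.9447 is the predicted probability quoted in the example, assumed here to belong to American Dental Association; 0.0553 is the remainder implied by a two-category distribution summing to one. The helper name `classify` is illustrative.

```python
def classify(probabilities):
    """Return the semantic category with the highest probability (argmax)."""
    return max(probabilities, key=probabilities.get)


# Assumed distribution for the example sentence; the second value is
# 1 - 0.9447, which follows only from having exactly two categories.
probs = {"American Dental Association": 0.9447,
         "Adenosine Deaminase": 0.0553}
s_prime = classify(probs)
print(s_prime)  # American Dental Association
```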
The biomedical text word sense disambiguation method based on the attention neural network selects accurate disambiguation features and uses the attention neural network model to determine the semantic category of biomedical ambiguous words with high accuracy.
The foregoing is a detailed description of embodiments of the invention, taken in conjunction with the accompanying drawings, wherein the specific embodiments are merely provided to assist in understanding the method of the invention. For those skilled in the art, the invention can be modified and adapted within the scope of the embodiments and applications according to the spirit of the present invention, and therefore the present invention should not be construed as being limited thereto.

Claims (5)

1. An attention neural network-based biomedical text word sense disambiguation method, comprising the following steps:
Step 1: perform word segmentation, part-of-speech tagging and semantic information tagging on all sentences in the MSH corpus that contain biomedical ambiguous words, and select the morphological, part-of-speech and semantic information of the four word units adjacent to the biomedical ambiguous word on its left and right as the disambiguation features.
Step 2: extract the morphological, part-of-speech and semantic information of the four word units adjacent to the biomedical ambiguous word on its left and right, and generate the corresponding word vectors using Word2vec trained on the processed corpus. Select a small portion of the resulting sentences as test data and use the rest as training data.
Step 3: training comprises two processes, forward propagation and back propagation. The training data is used as the input for training the attention neural network model, and training yields the optimized attention neural network model.
Step 4: the testing process is a forward propagation process, i.e. a semantic classification process. The test data is input into the optimized attention neural network model and the probability distribution of the biomedical ambiguous word over the semantic categories is calculated; the semantic category with the maximum probability is the semantic category of the biomedical ambiguous word.
2. The biomedical text word sense disambiguation method based on the attention neural network as claimed in claim 1, wherein in step 1, word segmentation, part-of-speech tagging and semantic information tagging are performed on an English sentence and disambiguation features are extracted, with the following specific steps:
Step 1-1: segment the English sentence into words at the spaces in the sentence;
Step 1-2: perform part-of-speech tagging on the segmented words using a part-of-speech tagging tool;
Step 1-3: perform semantic tagging on the segmented words using a semantic tagging tool;
Part-of-speech tagging and semantic tagging are performed on all English sentences in the corpus using an English part-of-speech tagging tool and an English semantic tagging tool, and the morphological, part-of-speech and semantic information of the four word units adjacent to the biomedical ambiguous word on its left and right is selected as the disambiguation features.
3. The biomedical text word sense disambiguation method based on the attention neural network as claimed in claim 1, wherein in step 2, Word2vec is trained on the biomedical MSH corpus to generate the corresponding word vectors, with the following specific steps:
Step 2-1: extract the morphological, part-of-speech and semantic information of the four word units adjacent to the biomedical ambiguous word on its left and right;
Step 2-2: use the CBOW model in Word2vec to obtain the word vector corresponding to each disambiguation feature, select a small portion of the processed sentences as test data, and use the rest as training data.
4. The biomedical text word sense disambiguation method based on the attention neural network as claimed in claim 1, wherein in step 3 the attention neural network model is trained, with the following specific steps:
Forward propagation process:
Step 3-1: input the training data into the initialized attention neural network model;
Step 3-2: extract disambiguation features through the attention layer, dynamically capturing the relations between words;
Step 3-3: extract further disambiguation features through the asymmetric convolution layer. Asymmetric convolution obtains different feature information from convolution kernels of different sizes while reducing the amount of computation, which speeds up the model and helps prevent overfitting;
Step 3-4: obtain effective feature information from the forward and backward networks through the bidirectional long short-term memory network layer, concatenate it and feed it into a fully connected layer, which reduces the dimension of the extracted disambiguation features and joins them into a one-dimensional disambiguation feature vector;
Step 3-5: use the softmax layer to calculate the probability of the biomedical ambiguous word $m$ under each semantic category $s_i$ ($i = 1, 2, \ldots, n$). The softmax function is as follows:
$$P(s_i \mid m) = \frac{e^{a_i}}{\sum_{k=1}^{n} e^{a_k}}$$
where $a_i$ denotes the input data of the softmax layer and $P(s_i \mid m)$ denotes the probability of the biomedical ambiguous word $m$ under semantic category $s_i$ ($i = 1, 2, \ldots, n$).
Step 3-6: select the maximum probability among $P(s_1 \mid m), P(s_2 \mid m), \ldots, P(s_n \mid m)$ as the predicted probability:
$$y\_predicted_j = \max\big(P(s_1 \mid m), P(s_2 \mid m), \ldots, P(s_n \mid m)\big)$$
where $y\_predicted_j$ denotes the predicted probability of the biomedical ambiguous word $m$.
Step 3-7: compare the predicted probability $y\_predicted_j$ with the true probability $y_j$ and calculate the error loss using the cross-entropy loss function.
The error loss is calculated as follows:
$$loss = -\big(y_j \log(y\_predicted_j) + (1 - y_j)\log(1 - y\_predicted_j)\big)$$
where $y_j$ denotes the true probability that the biomedical ambiguous word $m$ belongs to semantic category $s_i$.
Back propagation process:
The error loss is propagated backward and the parameters are updated layer by layer. The parameter update is as follows:
$$\theta' = \theta - \alpha \frac{\partial\, loss}{\partial \theta}$$
where $\theta$ denotes the parameter set, $\theta'$ denotes the updated parameter set, and $\alpha$ denotes the learning rate.
The attention neural network model is iterated continuously to obtain the optimized attention neural network model.
5. The biomedical text word sense disambiguation method based on the attention neural network as claimed in claim 1, wherein in the step 4, the biomedical ambiguous word m is semantically classified by:
and (3) semantic classification process:
step 4-1, inputting the test data into the optimized attention neural network model;
step 4-2, dynamically capturing the relationships between words through the attention layer;
step 4-3, extracting more effective information and reducing the amount of computation through the asymmetric convolution layer;
step 4-4, acquiring information from the forward network and the backward network through the bidirectional long short-term memory network layer, concatenating it, passing it to the fully connected layer, reducing the dimensionality of the extracted disambiguation features, and joining them into a one-dimensional disambiguation feature vector;
step 4-5, calculating the probability distribution of the biomedical ambiguous word m over the semantic categories using the softmax layer. The semantic category s' with the maximum probability is taken as the semantic category of the biomedical ambiguous word m.
The semantic class s' is determined as follows:
$$s' = \arg\max_{s_i,\; i = 1, \ldots, n} P(s_i \mid m)$$
where s' denotes the semantic category with the highest probability, n denotes the number of semantic categories, and P(s_1 | m), ..., P(s_i | m), ..., P(s_n | m) denotes the probability distribution of the biomedical ambiguous word m over the semantic categories.
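The final selection of s' amounts to an argmax over the softmax output. A minimal sketch follows, in which the function name `classify`, the sense labels and the probabilities are hypothetical placeholders rather than values from the patent:

```python
def classify(probabilities, categories):
    """Return the category with the highest probability P(s_i | m) and that probability."""
    # max() over indices returns the first index in case of a tie.
    best_index = max(range(len(probabilities)), key=probabilities.__getitem__)
    return categories[best_index], probabilities[best_index]

# Hypothetical sense labels and softmax outputs for one ambiguous term.
senses = ["sense_A", "sense_B", "sense_C"]
label, prob = classify([0.15, 0.70, 0.15], senses)
```

Here `label` holds the predicted semantic category for the ambiguous word and `prob` the probability the model assigned to it.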
CN202110395920.8A 2021-04-13 2021-04-13 Biomedical text word sense disambiguation method based on attention neural network Pending CN113065350A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110395920.8A CN113065350A (en) 2021-04-13 2021-04-13 Biomedical text word sense disambiguation method based on attention neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110395920.8A CN113065350A (en) 2021-04-13 2021-04-13 Biomedical text word sense disambiguation method based on attention neural network

Publications (1)

Publication Number Publication Date
CN113065350A (en) 2021-07-02

Family

ID=76567245

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110395920.8A Pending CN113065350A (en) 2021-04-13 2021-04-13 Biomedical text word sense disambiguation method based on attention neural network

Country Status (1)

Country Link
CN (1) CN113065350A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113779987A (en) * 2021-08-23 2021-12-10 科大国创云网科技有限公司 Event co-reference disambiguation method and system based on self-attention enhanced semantics
CN116226362A (en) * 2023-05-06 2023-06-06 湖南德雅曼达科技有限公司 Word segmentation method for improving accuracy of searching hospital names

Similar Documents

Publication Publication Date Title
CN109753566B (en) Model training method for cross-domain emotion analysis based on convolutional neural network
CN107291795B (en) Text classification method combining dynamic word embedding and part-of-speech tagging
CN108446271B (en) Text emotion analysis method of convolutional neural network based on Chinese character component characteristics
CN110609897A (en) Multi-category Chinese text classification method fusing global and local features
CN111930942B (en) Text classification method, language model training method, device and equipment
CN112395393B (en) Remote supervision relation extraction method based on multitask and multiple examples
CN109308353B (en) Training method and device for word embedding model
CN113326374B (en) Short text emotion classification method and system based on feature enhancement
CN108733647B (en) Word vector generation method based on Gaussian distribution
CN111191442A (en) Similar problem generation method, device, equipment and medium
CN111339772B (en) Russian text emotion analysis method, electronic device and storage medium
CN110874411A (en) Cross-domain emotion classification system based on attention mechanism fusion
CN110705247A (en) Text similarity calculation method based on χ2-C
CN109271636B (en) Training method and device for word embedding model
US20220156489A1 (en) Machine learning techniques for identifying logical sections in unstructured data
CN112988970A (en) Text matching algorithm serving intelligent question-answering system
CN113065350A (en) Biomedical text word sense disambiguation method based on attention neural network
CN112528653A (en) Short text entity identification method and system
CN114417851A (en) Emotion analysis method based on keyword weighted information
Purba et al. Document level emotion detection from bangla text using machine learning techniques
Chan et al. Applying and optimizing NLP model with CARU
WO2021129410A1 (en) Method and device for text processing
O’Neill et al. Meta-embedding as auxiliary task regularization
CN113051892A (en) Chinese word sense disambiguation method based on transformer model
Aalaa Abdulwahab et al. Documents classification based on deep learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination