CN108491382A - A semi-supervised biomedical text semantic disambiguation method - Google Patents

Publication number
CN108491382A
Authority
CN
China
Prior art keywords
sentence
data
semantic disambiguation
words
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810207213.XA
Other languages
Chinese (zh)
Inventor
李智
罗曜儒
李健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan University
Original Assignee
Sichuan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2018-03-14
Filing date
2018-03-14
Publication date
2018-09-04
Application filed by Sichuan University
Priority to CN201810207213.XA
Publication of CN108491382A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/2155 Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naïve labelling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Machine Translation (AREA)

Abstract

The present invention is a semantic disambiguation method for polysemous words in biomedical text. It mainly comprises: representing the words of biomedical text as vectors using Word2Vec; building vector representations of context sentences from the word vectors with a language model based on a bidirectional LSTM; then, using similarity in the sentence vector space together with a label transfer method, passing the labels of already annotated medical data to the most similar unlabeled data according to probability; and finally combining all labeled data to perform semantic disambiguation on biomedical text. Because biomedical data is highly specialized and rich in terminology, manual processing of medical data is time-consuming, laborious and error-prone. The invention can greatly reduce the cost of manual labeling and, compared with traditional machine learning methods, can effectively improve the accuracy of semantic disambiguation.

Description

Semi-supervised biomedical text semantic disambiguation method
Technical Field
The invention belongs to the field of semantic disambiguation in natural language processing, and relates to a semi-supervised method and system for semantic disambiguation of biomedical text. Specifically, polysemous words in medical texts are disambiguated using a bidirectional long short-term memory model (Bi-LSTM) combined with a label transfer method.
Background
With the explosive growth of digital information in recent years, medical and health-care data has become increasingly available. In the biomedical field, text data contains a great deal of specialist knowledge and information, and extracting useful information from digitized text is becoming more and more important. Compared with general text data, medical text data is highly specialized and difficult to annotate. Understanding the semantics of biomedical text and automatically labeling medical data have therefore become research hotspots.
Traditional biomedical text semantic disambiguation methods include supervised learning methods, unsupervised learning methods, and knowledge-base methods. A supervised method learns a classifier from labeled data and then uses it to predict the semantics of unseen data. It usually requires a large amount of labeled data to keep the classifier accurate, and manual labeling is time-consuming and laborious, so it is not the best choice in biomedical subfields where little data is available. An unsupervised method requires no labeled data and groups samples only by their latent similarity. It greatly reduces the manual labeling effort, but its accuracy still needs improvement, and it is unsuitable for fields with low fault tolerance, such as medicine. A knowledge-base method uses established, openly available medical knowledge bases as training samples; its advantage is high data reliability, and its disadvantages are that knowledge bases are hard to extend and hard to maintain.
Biomedical text semantic disambiguation commonly uses a word vector model to vectorize each word in the text: the semantic information of a word is stored as a vector in a low-dimensional space, and words with similar meanings have similar vector representations. A common word vector technique is the Word2Vec model, which includes the Skip-gram model and the CBOW model. The Skip-gram model uses the target word to predict the words in the adjacent window, and the CBOW model uses the words in the adjacent window to predict the target word. Similarly, a sentence vector represents the semantic information of a sentence by fusing the word vectors of the words in it. Common traditional fusion methods include concatenation (cascading), averaging and weighted summation. Concatenation splices the word vectors of the words in a sentence directly, in sentence order; averaging takes the mean of all word vectors in the sentence as the sentence vector; weighted summation assigns each word a weight according to its importance to the semantic information and sums the word vectors according to those weights. Sentence vectors are often used as features to initialize language models for subsequent natural language processing tasks.
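The three traditional fusion methods described above can be illustrated with a minimal sketch in plain Python. The function names, the toy 2-dimensional word vectors and the weights are assumptions made for illustration only; they do not come from the patent.

```python
from typing import List

Vector = List[float]


def concatenate(word_vectors: List[Vector]) -> Vector:
    """Concatenation (cascading): splice the word vectors in sentence order."""
    return [x for vec in word_vectors for x in vec]


def average(word_vectors: List[Vector]) -> Vector:
    """Averaging: element-wise mean of all word vectors in the sentence."""
    n = len(word_vectors)
    return [sum(column) / n for column in zip(*word_vectors)]


def weighted_sum(word_vectors: List[Vector], weights: List[float]) -> Vector:
    """Weighted summation: each word vector is scaled by its weight, then summed."""
    return [
        sum(w * x for w, x in zip(weights, column))
        for column in zip(*word_vectors)
    ]


# Toy word vectors for a three-word sentence (made-up values).
sentence = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

print(concatenate(sentence))                       # length grows with sentence length
print(average(sentence))                           # same dimension as one word vector
print(weighted_sum(sentence, [0.5, 0.25, 0.25]))   # same dimension as one word vector
```

Note that concatenation produces a vector whose length depends on the number of words, whereas averaging and weighted summation keep the word vector dimension.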
The recurrent neural network (RNN) is a neural network model for processing text. It carries information from previous time steps into the task at the current time step and therefore has a certain memory. In theory, an RNN can handle long-term dependencies when processing long sentences. In practice, however, Bengio et al. (1994) studied the problem in depth and found that RNNs fail to learn such dependencies: when the relevant words are far apart, gradients may explode or vanish, back-propagation fails, and the text information is not retained effectively. To overcome this drawback, an improved RNN model, the long short-term memory model (LSTM), was proposed. It adds three "gate" structures to the internal structure of the RNN: the forget gate determines how much information from the previous time step is kept, the input gate determines how much information from the current time step is taken in, and the output gate determines how much information is output at the current time step. Through this gate structure the LSTM selectively uses the information from the previous and current time steps and effectively addresses the long-term dependency problem of the RNN.
In recent years, semi-supervised learning has been applied successfully to semantic disambiguation tasks, where the bootstrapping algorithm achieves good accuracy: a low-recall classifier is learned from a small set of labeled examples, and the labeled set is then expanded with those sentences from the unlabeled corpus that the classifier labels with high confidence. More recently, a label propagation algorithm for word sense disambiguation was proposed and compared with bootstrapping and with a supervised support vector machine (SVM) classifier. Label propagation achieves better performance because it assigns labels by optimizing a global objective, whereas traditional algorithms such as bootstrapping propagate labels based on the local similarity of instances.
Disclosure of Invention
The invention provides a biomedical text semantic disambiguation method and system based on semi-supervised learning and deep learning. It addresses, to a certain extent, the weak global view, difficult manual labeling and high cost of traditional disambiguation methods, and improves the accuracy of semantic disambiguation for both biomedical and general text.
The invention consists of two parts: 1. Word vectors are fused into sentence vectors by a bidirectional long short-term memory (Bi-LSTM) network model, generating the semantic features of sentences. 2. A semi-supervised semantic disambiguation model based on the label transfer method automatically labels unlabeled data using its similarity to labeled data, and resolves semantic ambiguity at the same time.
The technical scheme adopted by the invention comprises the following steps:
(I) Forming sentence vectors with the bidirectional long short-term memory (Bi-LSTM) network model and generating the semantic features of sentences
The Bi-LSTM model comprises an output layer, a backward hidden layer, a forward hidden layer and an input layer. Six weights are reused at every time step: input layer to the forward and backward hidden layers (w1, w3), each hidden layer to itself (w2, w5), and the forward and backward hidden layers to the output layer (w4, w6).
Each hidden layer is an LSTM unit composed of three gates (forget gate, input gate, output gate) and a memory cell.
The word vector of each word is used as the input of the Bi-LSTM and, together with the output of the previous time step, yields the current output. The process is divided into three stages. In the formulas below, W and b denote the weight matrix and bias of the corresponding layer.
Stage 1: the forget gate layer uses a sigmoid function to selectively filter the information of the previous time step:
$$f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)$$
where $h_{t-1}$ is the output at the previous time step, $x_t$ is the current input, i.e. the current word vector, and $f_t$ takes values between 0 and 1 and filters the information learned at the previous time step.
Stage 2: generating the new information that needs to be updated.
First, the input gate layer decides through a sigmoid which values to update:
$$i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)$$
Then a tanh layer generates a new candidate value:
$$\tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C)$$
The cell state is then updated with the candidate value $\tilde{C}_t$:
$$C_t = f_t * C_{t-1} + i_t * \tilde{C}_t$$
Stage 3: output of the model.
An initial output is obtained through a sigmoid layer:
$$o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)$$
Then the tanh function scales $C_t$, and the two are multiplied to obtain the output of the model:
$$h_t = o_t * \tanh(C_t)$$
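The three stages can be sketched as a single LSTM time step in plain Python. This is a minimal illustration of the standard LSTM formulation, not the patent's implementation: the parameter names (`W_f`, `b_f`, and so on), the dictionary layout and the all-zero toy weights are assumptions chosen so the arithmetic is easy to follow.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]
Matrix = List[Vector]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def affine(W: Matrix, b: Vector, v: Vector) -> Vector:
    """Compute W v + b for a small dense matrix."""
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]


def lstm_step(x_t: Vector, h_prev: Vector, c_prev: Vector,
              params: Dict[str, list]) -> Tuple[Vector, Vector]:
    """One LSTM time step; returns the new output h_t and cell state c_t."""
    z = h_prev + x_t  # list concatenation, i.e. [h_{t-1}, x_t]

    # Stage 1: forget gate, values in (0, 1), filters the previous cell state.
    f = [sigmoid(a) for a in affine(params["W_f"], params["b_f"], z)]

    # Stage 2: input gate and tanh candidate, then the cell state update.
    i = [sigmoid(a) for a in affine(params["W_i"], params["b_i"], z)]
    c_hat = [math.tanh(a) for a in affine(params["W_c"], params["b_c"], z)]
    c_t = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, c_hat)]

    # Stage 3: output gate, multiplied by the tanh-scaled cell state.
    o = [sigmoid(a) for a in affine(params["W_o"], params["b_o"], z)]
    h_t = [ok * math.tanh(ck) for ok, ck in zip(o, c_t)]
    return h_t, c_t


# Toy setup: hidden size 2, input size 3, all weights zero (illustrative only),
# so every gate outputs sigmoid(0) = 0.5 and the candidate is tanh(0) = 0.
hidden, inputs = 2, 3
zeros = [[0.0] * (hidden + inputs) for _ in range(hidden)]
params = {name: zeros for name in ("W_f", "W_i", "W_c", "W_o")}
params.update({name: [0.0] * hidden for name in ("b_f", "b_i", "b_c", "b_o")})

h_t, c_t = lstm_step([0.1, 0.2, 0.3], [0.0, 0.0], [1.0, -1.0], params)
print(h_t, c_t)
```

With zero weights the forget gate keeps exactly half of the previous cell state, which makes the role of each gate visible without any training.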
(II) Semi-supervised semantic disambiguation model based on the label transfer method
The label transfer method uses the similarity between samples to transfer the labels of labeled data to unlabeled data according to probability. First, a graph model is built over all samples, with each sample as a node, and a similarity is computed for each pair of nodes; the similarity formula contains one hyper-parameter.
Each node then propagates its label to the surrounding nodes with a probability determined by its similarity to them.
In the probability formula, n denotes the number of edges.
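The label transfer idea can be sketched in plain Python as follows. The similarity and probability formulas are not reproduced in this text, so the sketch makes explicit assumptions: a Gaussian kernel on squared Euclidean distance with hyper-parameter `sigma`, row-normalised transition probabilities, and clamping of the known labels after every iteration, which is one common formulation of label propagation. Claim 4 instead describes similarity as the reciprocal of the Euclidean distance; either similarity could be plugged in. All names and toy data are hypothetical.

```python
import math
from typing import Dict, List

Vector = List[float]


def gaussian_similarity(a: Vector, b: Vector, sigma: float) -> float:
    """Assumed similarity: exp(-squared Euclidean distance / sigma^2)."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-d2 / sigma ** 2)


def propagate_labels(vectors: List[Vector], labels: Dict[int, int],
                     num_classes: int, sigma: float = 1.0,
                     iterations: int = 50) -> List[int]:
    """Spread labels from labeled nodes (index -> class) to all nodes."""
    n = len(vectors)
    # Pairwise similarities; no self-loops.
    W = [[0.0 if i == j else gaussian_similarity(vectors[i], vectors[j], sigma)
          for j in range(n)] for i in range(n)]
    # Transition probabilities: each row is normalised to sum to 1.
    P = [[w / sum(row) for w in row] for row in W]

    def clamp(Y: List[List[float]]) -> None:
        # Labeled nodes always keep their known label.
        for idx, lab in labels.items():
            Y[idx] = [1.0 if c == lab else 0.0 for c in range(num_classes)]

    # Start from a uniform label distribution, then fix the labeled nodes.
    Y = [[1.0 / num_classes] * num_classes for _ in range(n)]
    clamp(Y)
    for _ in range(iterations):
        Y = [[sum(P[i][j] * Y[j][c] for j in range(n))
              for c in range(num_classes)] for i in range(n)]
        clamp(Y)
    # Each node takes its most probable class.
    return [max(range(num_classes), key=lambda c: Y[i][c]) for i in range(n)]


# Toy sentence vectors: two tight clusters, one labeled node in each (made-up data).
vectors = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
predicted = propagate_labels(vectors, labels={0: 0, 2: 1}, num_classes=2)
print(predicted)
```

In this toy graph each unlabeled node sits next to exactly one labeled node, so it inherits that node's class; on real data the result depends strongly on the choice of `sigma`.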
Drawings
FIG. 1 is a schematic diagram of the system of the present invention.
FIG. 2 is a view showing the internal structure of the LSTM according to the present invention.
Detailed Description
(1) The user inputs biomedical text and sentence vector features are generated
First, the biomedical text is segmented into word units, and the Word2Vec model generates a word vector for each of them. The word vectors of each sentence are then fed in sequence into the bidirectional long short-term memory model, which outputs two sentence vectors, one for each direction; these are concatenated to form a new sentence vector.
The new sentence vector is then fed into a multi-layer perceptron to obtain the final sentence vector.
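The data flow of this step (two directional passes, concatenation, then a perceptron layer) can be sketched in plain Python. To keep the example short, a simple tanh recurrent update stands in for the LSTM cell and a single dense layer stands in for the multi-layer perceptron; both substitutions, along with every name and numeric value below, are assumptions for illustration rather than the patent's model.

```python
import math
from typing import Callable, List

Vector = List[float]
Step = Callable[[Vector, Vector], Vector]


def make_step(w_in: float, w_rec: float) -> Step:
    """Toy recurrent update h_t = tanh(w_in * sum(x_t) + w_rec * h_{t-1}).

    Stand-in for an LSTM cell; each direction gets its own parameters.
    """
    def step(x: Vector, h: Vector) -> Vector:
        return [math.tanh(w_in * sum(x) + w_rec * h_k) for h_k in h]
    return step


def run_direction(word_vectors: List[Vector], step: Step, hidden_size: int) -> Vector:
    """Feed the word vectors in the given order and return the final hidden state."""
    h = [0.0] * hidden_size
    for x in word_vectors:
        h = step(x, h)
    return h


def dense(W: List[Vector], b: Vector, v: Vector) -> Vector:
    """One perceptron layer with tanh activation."""
    return [math.tanh(sum(w * x for w, x in zip(row, v)) + bias)
            for row, bias in zip(W, b)]


def sentence_vector(word_vectors: List[Vector], forward_step: Step,
                    backward_step: Step, hidden_size: int,
                    W: List[Vector], b: Vector) -> Vector:
    forward = run_direction(word_vectors, forward_step, hidden_size)
    backward = run_direction(word_vectors[::-1], backward_step, hidden_size)
    combined = forward + backward  # concatenation of the two directional vectors
    return dense(W, b, combined)


# Toy word vectors and weights (made-up values).
words = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
W = [[0.1, -0.2, 0.3, 0.4], [0.5, 0.1, -0.3, 0.2], [-0.4, 0.2, 0.1, 0.3]]
b = [0.0, 0.1, -0.1]
vec = sentence_vector(words, make_step(0.5, 0.5), make_step(0.3, 0.7),
                      hidden_size=2, W=W, b=b)
print(len(vec), vec)
```

Reading the sentence in both directions matters because the final hidden state depends on word order: the forward and backward passes generally produce different vectors for the same sentence.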
(2) Automatically labeling unlabeled data and disambiguating ambiguous words using label transfer
The sentence vector features obtained in step (1) are used as graph nodes, and the similarity between nodes is computed. Following the label transfer method, each unlabeled sample automatically receives the label of the samples most similar to it; for an ambiguous word, the sense that best fits the sentence vector is transferred according to similarity.
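A minimal sketch of picking the most similar label, using the similarity described in claim 4 (the reciprocal of the Euclidean distance between sentence vectors). The small `eps` term that guards against division by zero, the function names and the sense labels are assumptions added for illustration.

```python
import math
from typing import List, Tuple

Vector = List[float]


def reciprocal_similarity(a: Vector, b: Vector, eps: float = 1e-12) -> float:
    """Similarity as the reciprocal of the Euclidean distance.

    eps is an added assumption that avoids division by zero for identical vectors.
    """
    return 1.0 / (math.dist(a, b) + eps)


def most_similar_label(query: Vector, labeled: List[Tuple[Vector, str]]) -> str:
    """Return the label of the labeled sentence vector most similar to the query."""
    _, best_label = max(
        labeled, key=lambda item: reciprocal_similarity(query, item[0])
    )
    return best_label


# Hypothetical labeled sentence vectors for two senses of one ambiguous word.
labeled = [([0.0, 0.0], "sense_A"), ([3.0, 4.0], "sense_B")]
print(most_similar_label([2.5, 3.5], labeled))
```

This nearest-neighbour lookup only shows the similarity computation; the full method spreads labels over the whole graph according to probability rather than copying a single neighbour's label.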
(3) Experimental results
Following steps (1) and (2), experiments were run on two internationally used medical text datasets, the MSH WSD dataset and the NLM WSD dataset. The MSH WSD dataset contains 203 ambiguous medical entities and 37888 ambiguous sentences in total, of which 37090 samples are manually annotated; the NLM WSD dataset contains 50 ambiguous entities and 552153 common sentences, with 100 manually labeled samples for each ambiguous entity. The experiments used a 20:1 ratio: unlabeled data amounting to one twentieth of the original labeled data was randomly added from other medical corpora, and tests were carried out with the semi-supervised model based on the label transfer method. The test results are compared as follows:
Table 1. MSH WSD dataset experimental results.
Table 2. NLM WSD dataset experimental results.
Here SVM denotes a support vector machine model, LSTM a unidirectional long short-term memory model, and Bi-LSTM a bidirectional long short-term memory model; WE(Con) denotes concatenated word vectors as the sentence semantic feature, WE(Avg) averaged word vectors, WE(Wsum) weighted-sum word vectors, and Con the sentence semantic feature produced by the model of the invention; LP denotes the label transfer method proposed by the invention. The experimental results show that, once unlabeled data is added to the language model, no further manual labeling is needed, which reduces the labeling cost for medical personnel, and that the method achieves the best accuracy in semantic disambiguation of medical text, demonstrating that it is feasible and effective.

Claims (5)

1. A semi-supervised biomedical text semantic disambiguation method, characterized by comprising the following steps:
(1) representing the words of medical text as vectors based on the Word2Vec language model;
(2) representing medical text sentences as vectors, on the basis of the word vectors, with the bidirectional long short-term memory model Bi-LSTM;
(3) using similarity in the sentence vector space, automatically labeling unlabeled data by the label transfer method and performing semantic disambiguation on polysemous words.
2. The vectorized representation of the words of medical text based on the Word2Vec language model according to claim 1, characterized in that: the words may include both medical terms and general text words.
3. The vectorized representation of medical text sentences based on the bidirectional long short-term memory model Bi-LSTM according to claim 1, characterized in that: the input of the Bi-LSTM is the word vector representation of each word in the sentence, and its output is the vectorized representation of the sentence.
4. The sentence vector space similarity according to claim 1, characterized in that: the geometric distance between sentence vectors is computed with the Euclidean distance formula, and the similarity of the sentence vectors is computed as the reciprocal of the geometric distance.
5. The automatic labeling of unlabeled data and semantic disambiguation of ambiguous words based on label transfer according to claim 1, characterized in that: using the similarity between sentence vectors, the labels of labeled data are transferred to unlabeled data according to probability, and semantic disambiguation is performed automatically on the medical text data.
CN201810207213.XA 2018-03-14 2018-03-14 A kind of semi-supervised biomedical text semantic disambiguation method Pending CN108491382A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810207213.XA CN108491382A (en) 2018-03-14 2018-03-14 A kind of semi-supervised biomedical text semantic disambiguation method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810207213.XA CN108491382A (en) 2018-03-14 2018-03-14 A kind of semi-supervised biomedical text semantic disambiguation method

Publications (1)

Publication Number Publication Date
CN108491382A (en) 2018-09-04

Family

ID=63339234

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810207213.XA Pending CN108491382A (en) 2018-03-14 2018-03-14 A kind of semi-supervised biomedical text semantic disambiguation method

Country Status (1)

Country Link
CN (1) CN108491382A (en)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109377203A (en) * 2018-09-13 2019-02-22 平安医疗健康管理股份有限公司 Medical settlement data processing method, device, computer equipment and storage medium
CN110059185A (en) * 2019-04-03 2019-07-26 天津科技大学 A kind of medical files specialized vocabulary automation mask method
CN110287337A (en) * 2019-06-19 2019-09-27 上海交通大学 The system and method for medicine synonym is obtained based on deep learning and knowledge mapping
CN110705206A (en) * 2019-09-23 2020-01-17 腾讯科技(深圳)有限公司 Text information processing method and related device
CN111221960A (en) * 2019-10-28 2020-06-02 支付宝(杭州)信息技术有限公司 Text detection method, similarity calculation method, model training method and device
CN111414473A (en) * 2020-02-13 2020-07-14 合肥工业大学 Semi-supervised classification method and system
CN111597296A (en) * 2019-02-20 2020-08-28 阿里巴巴集团控股有限公司 Commodity data processing method, device and system
CN111881979A (en) * 2020-07-28 2020-11-03 复旦大学 Multi-modal data annotation device and computer-readable storage medium containing program
CN113158687A (en) * 2021-04-29 2021-07-23 新声科技(深圳)有限公司 Semantic disambiguation method and device, storage medium and electronic device
CN113742458A (en) * 2021-09-18 2021-12-03 苏州大学 Natural language instruction disambiguation method and system for mechanical arm grabbing
CN113779987A (en) * 2021-08-23 2021-12-10 科大国创云网科技有限公司 Event co-reference disambiguation method and system based on self-attention enhanced semantics
CN115293158A (en) * 2022-06-30 2022-11-04 撼地数智(重庆)科技有限公司 Disambiguation method and device based on label assistance

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2002010985A2 (en) * 2000-07-28 2002-02-07 Tenara Limited Method of and system for automatic document retrieval, categorization and processing
CN1916887A (en) * 2006-09-06 2007-02-21 哈尔滨工程大学 Method for eliminating ambiguity without directive word meaning based on technique of substitution words
US20140040275A1 (en) * 2010-02-09 2014-02-06 Siemens Corporation Semantic search tool for document tagging, indexing and search
CN104268200A (en) * 2013-09-22 2015-01-07 中科嘉速(北京)并行软件有限公司 Unsupervised named entity semantic disambiguation method based on deep learning
CN106569998A (en) * 2016-10-27 2017-04-19 浙江大学 Text named entity recognition method based on Bi-LSTM, CNN and CRF
CN106919646A (en) * 2017-01-18 2017-07-04 南京云思创智信息科技有限公司 Chinese text summarization generation system and method
CN106980608A (en) * 2017-03-16 2017-07-25 四川大学 A kind of Chinese electronic health record participle and name entity recognition method and system
CN106997379A (en) * 2017-03-20 2017-08-01 杭州电子科技大学 A kind of merging method of the close text based on picture text click volume
CN107301213A (en) * 2017-06-09 2017-10-27 腾讯科技(深圳)有限公司 Intelligent answer method and device

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Dayu Yuan et al.: "Semi-supervised Word Sense Disambiguation with Neural Models", arXiv:1603.07012v2 [cs.CL] *
Zheng-Yu Niu et al.: "Word Sense Disambiguation Using Label Propagation Based Semi-Supervised Learning", Proceedings of the 43rd Annual Meeting of the ACL *
Li Lishuang et al.: "Biomedical Named Entity Recognition Based on a CNN-BLSTM-CRF Model", Journal of Chinese Information Processing *

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109377203A (en) * 2018-09-13 2019-02-22 平安医疗健康管理股份有限公司 Medical settlement data processing method, device, computer equipment and storage medium
CN111597296A (en) * 2019-02-20 2020-08-28 阿里巴巴集团控股有限公司 Commodity data processing method, device and system
CN110059185A (en) * 2019-04-03 2019-07-26 天津科技大学 A kind of medical files specialized vocabulary automation mask method
CN110059185B (en) * 2019-04-03 2022-10-04 天津科技大学 Medical document professional vocabulary automatic labeling method
CN110287337A (en) * 2019-06-19 2019-09-27 上海交通大学 The system and method for medicine synonym is obtained based on deep learning and knowledge mapping
CN110705206A (en) * 2019-09-23 2020-01-17 腾讯科技(深圳)有限公司 Text information processing method and related device
CN111221960A (en) * 2019-10-28 2020-06-02 支付宝(杭州)信息技术有限公司 Text detection method, similarity calculation method, model training method and device
CN111414473A (en) * 2020-02-13 2020-07-14 合肥工业大学 Semi-supervised classification method and system
CN111414473B (en) * 2020-02-13 2021-09-07 合肥工业大学 Semi-supervised classification method and system
CN111881979A (en) * 2020-07-28 2020-11-03 复旦大学 Multi-modal data annotation device and computer-readable storage medium containing program
CN113158687A (en) * 2021-04-29 2021-07-23 新声科技(深圳)有限公司 Semantic disambiguation method and device, storage medium and electronic device
CN113779987A (en) * 2021-08-23 2021-12-10 科大国创云网科技有限公司 Event co-reference disambiguation method and system based on self-attention enhanced semantics
CN113742458A (en) * 2021-09-18 2021-12-03 苏州大学 Natural language instruction disambiguation method and system for mechanical arm grabbing
CN115293158A (en) * 2022-06-30 2022-11-04 撼地数智(重庆)科技有限公司 Disambiguation method and device based on label assistance
CN115293158B (en) * 2022-06-30 2024-02-02 撼地数智(重庆)科技有限公司 Label-assisted disambiguation method and device

Similar Documents

Publication Publication Date Title
CN108491382A (en) A kind of semi-supervised biomedical text semantic disambiguation method
Sharma et al. Literature survey of statistical, deep and reinforcement learning in natural language processing
CN109800437B (en) Named entity recognition method based on feature fusion
Yao et al. Bi-directional LSTM recurrent neural network for Chinese word segmentation
Gasmi et al. LSTM recurrent neural networks for cybersecurity named entity recognition
Nguyen et al. Recurrent neural network-based models for recognizing requisite and effectuation parts in legal texts
US20210141863A1 (en) Multi-perspective, multi-task neural network model for matching text to program code
US20200302118A1 (en) Korean Named-Entity Recognition Method Based on Maximum Entropy Model and Neural Network Model
CN110457682B (en) Part-of-speech tagging method for electronic medical record, model training method and related device
CN111782769B (en) Intelligent knowledge graph question-answering method based on relation prediction
Jabreel et al. Target-dependent sentiment analysis of tweets using bidirectional gated recurrent neural networks
CN111274829B (en) Sequence labeling method utilizing cross-language information
CN109960728A (en) A kind of open field conferencing information name entity recognition method and system
US20240233877A1 (en) Method for predicting reactant molecule, training method, apparatus, and electronic device
Popov Neural network models for word sense disambiguation: an overview
Ren et al. Detecting the scope of negation and speculation in biomedical texts by using recursive neural network
Thattinaphanich et al. Thai named entity recognition using Bi-LSTM-CRF with word and character representation
Deng et al. Self-attention-based BiGRU and capsule network for named entity recognition
Anandika et al. A study on machine learning approaches for named entity recognition
Zhang et al. Using a pre-trained language model for medical named entity extraction in Chinese clinic text
Foland et al. CU-NLP at SemEval-2016 task 8: AMR parsing using LSTM-based recurrent neural networks
Bhuyan et al. Textual entailment as an evaluation metric for abstractive text summarization
CN114373554A (en) Drug interaction relation extraction method using drug knowledge and syntactic dependency relation
Abd et al. A comparative study of word representation methods with conditional random fields and maximum entropy markov for bio-named entity recognition
Liu et al. Recognizing proper names in ur iii texts through supervised learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20180904