CN109002436A - Method and system for automatic recognition of medical text terms based on a long short-term memory network - Google Patents
- Publication number
- CN109002436A (application number CN201810762297.3A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
Abstract
The invention discloses a method and system for automatically recognizing medical text terms based on a long short-term memory (LSTM) network, designed to extract medical term entities from medical text automatically. The method comprises: representing each character in a medical text sentence with a pre-trained character vector to obtain training data; feeding the training data into a bidirectional LSTM network to obtain the label category with the highest probability for each character in the sentence; and feeding this output into a conditional random field (CRF), where the Viterbi algorithm computes the label sequence with the maximum joint probability. The invention combines the respective strengths of the bidirectional LSTM network and the conditional random field and can effectively improve character-labeling accuracy.
Description
Technical field
The present invention relates to the field of machine learning, and in particular to a method and system for automatically recognizing medical text terms based on a long short-term memory (LSTM) network.
Background
Traditional medical term recognition systems fall into two groups: systems based on lexicon matching and systems based on machine learning.
Lexicon-matching systems offer high precision and fast recognition, but they place heavy demands on the size and quality of the medical lexicon, and they cannot recognize terms that are absent from the lexicon, so recall is often insufficient.
Systems based on conventional machine learning methods learn the contextual information of medical terms from training data and use that context to recognize terms. This avoids the failure of lexicon matching on out-of-lexicon terms and substantially raises recall, but precision is often lower.
In view of the above, the inventors have actively pursued research and innovation to create a method and system for automatically recognizing medical text terms based on a long short-term memory network, so as to give it greater industrial value.
Summary of the invention
To solve the above technical problems, the object of the present invention is to provide a method and system for automatically recognizing medical text terms based on a long short-term memory network that achieve both high precision and high recall.
The method of the present invention for automatically recognizing medical text terms based on a long short-term memory network comprises:
representing each character in a medical text sentence with a pre-trained character vector to obtain training data;
feeding the training data into a bidirectional long short-term memory network to obtain the label category with the highest probability for each character in the medical text sentence;
feeding this output, the highest-probability label category of each character, into a conditional random field, and computing the label sequence with the maximum joint probability using the Viterbi algorithm.
Further, the character vectors are obtained with the word2vec vector training method. The generated character-vector matrix L is an n × m matrix, where n is the number of characters in the dictionary and m is the dimension of each character vector; m usually takes a value between 100 and 300.
The system of the present invention for automatically recognizing medical text terms based on a long short-term memory network comprises:
a character-vector model unit, for representing each character in a medical text sentence with a pre-trained character vector;
a bidirectional long short-term memory network unit, for feeding the training data into the bidirectional long short-term memory network to obtain the label category with the highest probability for each character in the medical text sentence;
a conditional random field model unit, for feeding this output, the highest-probability label category of each character, into a conditional random field and computing the label sequence with the maximum joint probability using the Viterbi algorithm.
Further, the system specifically comprises:
a text input layer, in which text is input split into single characters;
a character-vector embedding layer, which maps each input character through matrix L to its pre-trained character vector;
a bidirectional long short-term memory network layer, which extracts features from the character-vector embedding layer using a forward LSTM layer and a backward LSTM layer respectively;
a conditional random field layer, which takes the integrated information of the bidirectional LSTM as input and outputs a part-of-speech tag for each character of the medical text.
According to the above scheme, the method and system of the present invention for automatically recognizing medical text terms based on a long short-term memory network have at least the following advantages:
The invention uses a bidirectional long short-term memory network, takes the distributed representation of each character in the medical text as the network input, and outputs the highest-probability label category for each character. The bidirectional LSTM network takes full account of the context of each character and focuses on maximizing the probability of each character's label category. The conditional random field instead computes the joint probability from a linear weighted combination of local features over the whole sentence and optimizes the entire sequence directly. The bidirectional LSTM network and the conditional random field algorithm are used together to label the character sequence. Compared with traditional algorithms, this merges the respective strengths of the two: the bidirectional LSTM network makes fuller use of contextual information and effectively improves character-labeling accuracy, so that per-character classification accuracy in sequence labeling is greatly improved, which in turn improves the precision and recall of the automatic medical term recognition system.
The above is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and implemented according to the specification, preferred embodiments are described in detail below with reference to the accompanying drawings.
Brief description of the drawings
Fig. 1 is a framework diagram of the bidirectional long short-term memory network of the method and system of the present invention for automatically recognizing medical text terms based on a long short-term memory network;
Fig. 2 is a detailed structural diagram of the long short-term memory network units FL1-FL5 and BL1-BL5 in the method and system of the present invention for automatically recognizing medical text terms based on a long short-term memory network.
Detailed description of the embodiments
Specific embodiments of the present invention are described in further detail below with reference to the drawings and examples. The following embodiments illustrate the invention and do not limit its scope.
Medical texts such as textbooks, clinical guidelines, and electronic health records all contain large numbers of specialized medical terms, which play an important role in text structuring, knowledge extraction, and similar tasks. We divide key medical terms into 23 vocabulary categories, including symptom (SYM), disease (DIS), sign (SGN), location word (REG), organ (ORG), body fluid (BFL), examination (TES), drug (DRU), and surgery (SUR). This scheme converts the automatic recognition of medical terms into a character-sequence labeling problem over medical text: the character sequence is the observation sequence, and the sequence formed by the term category of each character is the state sequence.
Embodiment 1
A preferred embodiment of the method of the present invention for automatically recognizing medical text terms based on a long short-term memory network comprises:
representing each character in a medical text sentence with a pre-trained character vector to obtain training data;
feeding the training data into a bidirectional long short-term memory network to obtain the label category with the highest probability for each character in the medical text sentence;
feeding this output, the highest-probability label category of each character, into a conditional random field, and computing the label sequence with the maximum joint probability using the Viterbi algorithm.
This embodiment uses a long short-term memory network together with a conditional random field, which requires a large amount of annotated training data. During manual annotation of character sequences in medical text, the common character-level BIO scheme is used.
For example, the sentence '糖尿病的症状有多饮、多食和多尿。' ("The symptoms of diabetes include polydipsia, polyphagia, and polyuria.") is annotated in the following form:
糖 B_dis
尿 I_dis
病 I_dis
的 O
症 O
状 O
有 O
多 B_sym
饮 I_sym
、 O
多 B_sym
食 I_sym
和 O
多 B_sym
尿 I_sym
。 O
In the annotation above, disease (dis) and symptom (sym) entities are marked with dedicated labels such as 'B_dis', while all other characters and punctuation marks are labeled 'O'.
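The character-level BIO scheme can be sketched in a few lines of Python. This is an illustration only: the helper names `bio_tags` and `decode_entities` and the `(start, end, category)` span format are assumptions, not part of the patent, and the example sentence is assumed to be the Chinese original 糖尿病的症状有多饮、多食和多尿。

```python
def bio_tags(sentence, spans):
    """Return one BIO tag per character.

    spans: list of (start, end, category), end exclusive (hypothetical format).
    """
    tags = ["O"] * len(sentence)
    for start, end, category in spans:
        tags[start] = "B_" + category          # first character of the entity
        for i in range(start + 1, end):
            tags[i] = "I_" + category          # remaining characters of the entity
    return tags


def decode_entities(sentence, tags):
    """Recover (text, category) pairs from a BIO tag sequence."""
    entities = []
    current, category = "", None
    for ch, tag in zip(sentence, tags):
        if tag.startswith("B_"):
            if current:
                entities.append((current, category))
            current, category = ch, tag[2:]
        elif tag.startswith("I_") and current and tag[2:] == category:
            current += ch
        else:
            # 'O', or an 'I_' tag that does not continue the open entity
            if current:
                entities.append((current, category))
            current, category = "", None
    if current:
        entities.append((current, category))
    return entities


sentence = "糖尿病的症状有多饮、多食和多尿。"
spans = [(0, 3, "dis"), (7, 9, "sym"), (10, 12, "sym"), (13, 15, "sym")]
tags = bio_tags(sentence, spans)
entities = decode_entities(sentence, tags)
```

Here `tags` reproduces the annotation in the example, and `decode_entities` shows how a predicted tag sequence would be turned back into term entities.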
Embodiment 2
A preferred embodiment of the system of the present invention for automatically recognizing medical text terms based on a long short-term memory network comprises:
a character-vector model unit, for representing each character in a medical text sentence with a pre-trained character vector;
a bidirectional long short-term memory network unit, for feeding the training data into the bidirectional long short-term memory network to obtain the label category with the highest probability for each character in the medical text sentence;
a conditional random field model unit, for feeding this output, the highest-probability label category of each character, into a conditional random field and computing the label sequence with the maximum joint probability using the Viterbi algorithm.
As shown in Figs. 1 and 2, the bidirectional LSTM + conditional random field model is built as follows:
Model programming framework: Python, TensorFlow
Model training data: the large amount of annotated medical text from step 1
Model input: medical text, fed to the model character by character
Model output: character-by-character part-of-speech tags for the medical text
Model framework: text input layer, character-vector embedding layer, bidirectional LSTM layer, conditional random field layer, and output layer, arranged from bottom to top as in the structure diagram:
1. The bottom layer of the model is the Chinese character input layer; text enters the model split into single characters.
2. E1-E5 form the character-vector embedding layer, which maps each input character through matrix L to the character vector pre-trained in step 2.
3. FL1-FL5 form the forward LSTM layer, which extracts features from E1-E5.
4. BL1-BL5 form the backward LSTM layer, which extracts features from E1-E5.
5. O1-O5 form the output layer that integrates the information of the bidirectional LSTM and serves as the input of the subsequent CRF layer.
6. C1-C5 form the CRF layer.
7. The top layer is the final output layer of the model, which predicts the label of each input-layer character.
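Layers 1 and 2 of this stack (splitting text into characters and looking each one up in matrix L) can be illustrated with a minimal pure-Python sketch. The toy vocabulary, the dimension m = 4, the `<UNK>` fallback row, and the random initialization are all assumptions made for the example; in the described system L would hold character vectors pre-trained with word2vec, with m typically between 100 and 300.

```python
import random


def build_embedding_matrix(vocab, m, seed=0):
    """Build an n x m matrix with one row per vocabulary entry.

    Random values stand in for pre-trained character vectors (assumption).
    """
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(m)] for _ in vocab]


def embed(sentence, char_to_index, L, unk_index=0):
    """Split the sentence into characters and map each to its row of L."""
    return [L[char_to_index.get(ch, unk_index)] for ch in sentence]


# Toy vocabulary; index 0 is a hypothetical fallback for unseen characters.
vocab = ["<UNK>"] + sorted(set("糖尿病的症状有多饮"))
char_to_index = {ch: i for i, ch in enumerate(vocab)}

m = 4                                    # vector dimension (toy value)
L = build_embedding_matrix(vocab, m)     # n x m matrix, n = len(vocab)
vectors = embed("糖尿病", char_to_index, L)
```

The resulting `vectors` list has one m-dimensional vector per input character, which is the form of input that the forward and backward LSTM layers consume.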
Detailed structure of the long short-term memory network units FL1-FL5 and BL1-BL5 in the model:
In the LSTM unit structure diagram, x_t is the model input at time t and h_t is the model output at time t. Because the LSTM is a recurrent neural network, h_t also becomes an input at the next time step t+1; that is, the unit at time t+1 receives the input [h_t, x_{t+1}]. c_t is the unit state, which stores the long-term state. σ is the sigmoid function and tanh is the hyperbolic tangent function. W_f is the forget-gate weight matrix, W_i is the input-gate weight matrix, W_o is the output-gate weight matrix, and W_c is the weight matrix for the new information added to the current unit state c_t.
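For reference, the symbols above fit the standard LSTM update equations, written out below. The figure itself is not reproduced in this text, so this is the textbook formulation under the assumption that the unit follows it; the bias terms b_f, b_i, b_c, b_o, the gate activations f_t, i_t, o_t, and the candidate state \tilde{c}_t are names introduced here and do not appear in the description.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f\,[h_{t-1}, x_t] + b_f\right) \\
i_t &= \sigma\left(W_i\,[h_{t-1}, x_t] + b_i\right) \\
\tilde{c}_t &= \tanh\left(W_c\,[h_{t-1}, x_t] + b_c\right) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
o_t &= \sigma\left(W_o\,[h_{t-1}, x_t] + b_o\right) \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
```

The forget gate f_t decides how much of the previous unit state c_{t-1} is kept, the input gate i_t decides how much of the new information \tilde{c}_t is written, and the output gate o_t decides how much of the state is exposed as h_t.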
In the embodiments above, a character vector reflects the position of a character in semantic space, and the cosine distance in that space indicates the semantic similarity between the corresponding characters. This scheme uses the word2vec vector training method and trains character vectors on a large body of medical text (medical textbooks, clinical guidelines, electronic health records, etc.), producing high-dimensional character vectors that reflect the relative positions of characters in the semantic vector space. The resulting character-vector matrix L is an n × m matrix, where n is the number of characters in the dictionary and m is the dimension of each character vector; m usually takes a value between 100 and 300.
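The cosine measure mentioned here can be sketched as follows. The three vectors are made-up toy values chosen so the result is easy to check by hand; they are not trained character vectors, and the zero-vector convention is an assumption of this sketch.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors (assumption)
    return dot / (norm_u * norm_v)


# Toy 3-dimensional vectors (illustrative values only).
vec_a = [1.0, 2.0, 0.0]
vec_b = [2.0, 4.0, 0.0]   # same direction as vec_a
vec_c = [0.0, 0.0, 3.0]   # orthogonal to vec_a

sim_ab = cosine_similarity(vec_a, vec_b)
sim_ac = cosine_similarity(vec_a, vec_c)
```

Vectors pointing in the same direction score close to 1 and orthogonal vectors score 0, which is how rows of L would be compared to judge whether two characters are semantically close.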
In a character-labeling task based on the long short-term memory network alone, the output layer gives the highest-probability label category for each character and the labels are simply concatenated. This ignores sentence-level information and often produces erroneous or unreasonable labels. To correct this, the scheme uses a conditional random field to compute the joint probability of the label sequence and uses the Viterbi algorithm to find the state sequence with the maximum joint probability. The conditional random field optimizes the entire sequence and therefore targets the final goal of the labeling task directly. It can correct some of the erroneous or unreasonable labels output by the long short-term memory network and improve the accuracy of automatic term recognition. In the end, the above system achieved 89% accuracy on the validation data set.
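The correction step can be illustrated with a small Viterbi decoder in pure Python. This is a sketch under stated assumptions rather than the patented implementation: scores are treated as additive (log-domain) values, the emission and transition numbers are invented for the example, and the function name `viterbi` is a placeholder. The only structural knowledge encoded is a large penalty on the invalid transition from O to I_dis.

```python
def viterbi(emissions, transitions, labels):
    """Find the label sequence with the highest total score.

    emissions[t][j]: score of label j at position t (e.g. from the BiLSTM).
    transitions[i][j]: score of moving from label i to label j.
    """
    n_labels = len(labels)
    score = list(emissions[0])
    backpointers = []
    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for j in range(n_labels):
            # Best previous label for ending in label j at position t.
            best_i = max(range(n_labels),
                         key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j]
                             + emissions[t][j])
            pointers.append(best_i)
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final label.
    best_last = max(range(n_labels), key=lambda j: score[j])
    path = [best_last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return [labels[j] for j in path], score[best_last]


labels = ["O", "B_dis", "I_dis"]
# Invented per-character scores for a three-character disease term.
emissions = [
    [0.6, 0.4, 0.0],
    [0.1, 0.2, 0.7],
    [0.1, 0.1, 0.8],
]
# Invented transition scores; O -> I_dis is heavily penalized.
transitions = [
    [0.0, 0.0, -10.0],
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]

# Per-character argmax, as the LSTM output layer alone would produce.
greedy = [labels[max(range(len(labels)), key=lambda j: row[j])]
          for row in emissions]
best_path, best_score = viterbi(emissions, transitions, labels)
```

With these numbers the per-character argmax yields O, I_dis, I_dis, which is not a valid BIO sequence, while the Viterbi path is B_dis, I_dis, I_dis. This mirrors how sequence-level decoding can repair locally plausible but globally inconsistent labels.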
The above is only a preferred embodiment of the present invention and is not intended to limit the invention. It should be noted that a person of ordinary skill in the art may make improvements and modifications without departing from the technical principles of the invention, and such improvements and modifications shall also be regarded as falling within the protection scope of the invention.
Claims (4)
1. A method for automatically recognizing medical text terms based on a long short-term memory network, characterized by comprising:
representing each character in a medical text sentence with a pre-trained character vector to obtain training data;
feeding the training data into a bidirectional long short-term memory network to obtain the label category with the highest probability for each character in the medical text sentence;
feeding this output, the highest-probability label category of each character, into a conditional random field, and computing the label sequence with the maximum joint probability using the Viterbi algorithm.
2. The method for automatically recognizing medical text terms based on a long short-term memory network according to claim 1, characterized in that the character vectors are obtained with the word2vec vector training method, and the generated character-vector matrix L is an n × m matrix, where n is the number of characters in the dictionary and m is the dimension of each character vector, m usually taking a value between 100 and 300.
3. A system for automatically recognizing medical text terms based on a long short-term memory network, characterized by comprising:
a character-vector model unit, for representing each character in a medical text sentence with a pre-trained character vector;
a bidirectional long short-term memory network unit, for feeding the training data into the bidirectional long short-term memory network to obtain the label category with the highest probability for each character in the medical text sentence;
a conditional random field model unit, for feeding this output, the highest-probability label category of each character, into a conditional random field and computing the label sequence with the maximum joint probability using the Viterbi algorithm.
4. The system for automatically recognizing medical text terms based on a long short-term memory network according to claim 1, characterized by specifically comprising:
a text input layer, in which text is input split into single characters;
a character-vector embedding layer, which maps each input character through matrix L to its pre-trained character vector;
a bidirectional long short-term memory network layer, which extracts features from the character-vector embedding layer using a forward LSTM layer and a backward LSTM layer respectively;
a conditional random field layer, which takes the integrated information of the bidirectional LSTM as input and outputs a part-of-speech tag for each character of the medical text.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810762297.3A CN109002436A (en) | 2018-07-12 | 2018-07-12 | Method and system for automatic recognition of medical text terms based on a long short-term memory network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109002436A (en) | 2018-12-14 |
Family
ID=64599881
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810762297.3A Pending CN109002436A (en) | Method and system for automatic recognition of medical text terms based on a long short-term memory network | 2018-07-12 | 2018-07-12 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109002436A (en) |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106569998A (en) * | 2016-10-27 | 2017-04-19 | 浙江大学 | Text named entity recognition method based on Bi-LSTM, CNN and CRF |
CN107451115A (en) * | 2017-07-11 | 2017-12-08 | 中国科学院自动化研究所 | The construction method and system of Chinese Prosodic Hierarchy forecast model end to end |
CN107526799A (en) * | 2017-08-18 | 2017-12-29 | 武汉红茶数据技术有限公司 | A kind of knowledge mapping construction method based on deep learning |
CN107608970A (en) * | 2017-09-29 | 2018-01-19 | 百度在线网络技术(北京)有限公司 | part-of-speech tagging model generating method and device |
CN107622050A (en) * | 2017-09-14 | 2018-01-23 | 武汉烽火普天信息技术有限公司 | Text sequence labeling system and method based on Bi LSTM and CRF |
CN107644014A (en) * | 2017-09-25 | 2018-01-30 | 南京安链数据科技有限公司 | A kind of name entity recognition method based on two-way LSTM and CRF |
CN107797992A (en) * | 2017-11-10 | 2018-03-13 | 北京百分点信息科技有限公司 | Name entity recognition method and device |
CN107977361A (en) * | 2017-12-06 | 2018-05-01 | 哈尔滨工业大学深圳研究生院 | The Chinese clinical treatment entity recognition method represented based on deep semantic information |
CN108038103A (en) * | 2017-12-18 | 2018-05-15 | 北京百分点信息科技有限公司 | A kind of method, apparatus segmented to text sequence and electronic equipment |
CN108170675A (en) * | 2017-12-27 | 2018-06-15 | 哈尔滨福满科技有限责任公司 | A kind of name entity recognition method based on deep learning towards medical field |
Non-Patent Citations (2)
Title |
---|
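Yang Hongmei et al.: "A recognition model for named entities in electronic medical records based on a bidirectional LSTM neural network", Chinese Journal of Tissue Engineering Research (《中国组织工程研究》) * |
Chen Wei et al.: "Automatic keyword extraction based on BILSTM-CRF", Computer Science (《计算机科学》) * |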
Cited By (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109885702A (en) * | 2019-01-17 | 2019-06-14 | 哈尔滨工业大学(深圳) | Sequence labelling method, apparatus, equipment and storage medium in natural language processing |
CN110134966A (en) * | 2019-05-21 | 2019-08-16 | 中电健康云科技有限公司 | A kind of sensitive information determines method and device |
CN110232192A (en) * | 2019-06-19 | 2019-09-13 | 中国电力科学研究院有限公司 | Electric power term names entity recognition method and device |
CN110309769A (en) * | 2019-06-28 | 2019-10-08 | 北京邮电大学 | The method that character string in a kind of pair of picture is split |
CN110309769B (en) * | 2019-06-28 | 2021-06-15 | 北京邮电大学 | Method for segmenting character strings in picture |
CN110348021A (en) * | 2019-07-17 | 2019-10-18 | 湖北亿咖通科技有限公司 | Character string identification method, electronic equipment, storage medium based on name physical model |
CN110597994A (en) * | 2019-09-17 | 2019-12-20 | 北京百度网讯科技有限公司 | Event element identification method and device |
CN110867225A (en) * | 2019-11-04 | 2020-03-06 | 山东师范大学 | Character-level clinical concept extraction named entity recognition method and system |
CN112927806A (en) * | 2019-12-05 | 2021-06-08 | 金色熊猫有限公司 | Medical record structured network cross-disease migration training method, device, medium and equipment |
CN112927806B (en) * | 2019-12-05 | 2022-11-25 | 金色熊猫有限公司 | Medical record structured network cross-disease migration training method, device, medium and equipment |
CN111209751A (en) * | 2020-02-14 | 2020-05-29 | 全球能源互联网研究院有限公司 | Chinese word segmentation method, device and storage medium |
CN111783464A (en) * | 2020-06-29 | 2020-10-16 | 中国电力科学研究院有限公司 | Electric power-oriented domain entity identification method, system and storage medium |
CN111930909A (en) * | 2020-08-11 | 2020-11-13 | 付立军 | Geological intelligent question and answer oriented data automatic sequence labeling identification method |
CN111930909B (en) * | 2020-08-11 | 2023-09-12 | 付立军 | Geological intelligent question-answering oriented data automation sequence labeling identification method |
CN111930943A (en) * | 2020-08-12 | 2020-11-13 | 中国科学技术大学 | Method and device for detecting pivot bullet screen |
CN111930943B (en) * | 2020-08-12 | 2022-09-02 | 中国科学技术大学 | Method and device for detecting pivot bullet screen |
CN112541056A (en) * | 2020-12-18 | 2021-03-23 | 卫宁健康科技集团股份有限公司 | Medical term standardization method, device, electronic equipment and storage medium |
CN112541056B (en) * | 2020-12-18 | 2024-05-31 | 卫宁健康科技集团股份有限公司 | Medical term standardization method, device, electronic equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109002436A (en) | Method and system for automatic recognition of medical text terms based on a long short-term memory network | |
CN109522546B (en) | Medical named entity recognition method based on context correlation | |
CN110032648B (en) | Medical record structured analysis method based on medical field entity | |
CN108897989B (en) | Biological event extraction method based on candidate event element attention mechanism | |
CN109669994B (en) | Construction method and system of health knowledge map | |
CN107977361B (en) | Chinese clinical medical entity identification method based on deep semantic information representation | |
CN110223742A (en) | The clinical manifestation information extraction method and equipment of Chinese electronic health record data | |
CN110059185B (en) | Medical document professional vocabulary automatic labeling method | |
Sharnagat | Named entity recognition: A literature survey | |
CN110502749A (en) | A kind of text Relation extraction method based on the double-deck attention mechanism Yu two-way GRU | |
CN112818676B (en) | Medical entity relationship joint extraction method | |
CN111046670B (en) | Entity and relationship combined extraction method based on drug case legal documents | |
CN109800437A (en) | A kind of name entity recognition method based on Fusion Features | |
CN111538845A (en) | Method, model and system for constructing kidney disease specialized medical knowledge map | |
Liu et al. | BB-KBQA: BERT-based knowledge base question answering | |
CN111259897A (en) | Knowledge-aware text recognition method and system | |
CN112151183A (en) | Entity identification method of Chinese electronic medical record based on Lattice LSTM model | |
CN108959566A (en) | A kind of medical text based on Stacking integrated study goes privacy methods and system | |
CN108550065A (en) | comment data processing method, device and equipment | |
CN116719913A (en) | Medical question-answering system based on improved named entity recognition and construction method thereof | |
CN109584006A (en) | A kind of cross-platform goods matching method based on depth Matching Model | |
CN111611780A (en) | Digestive endoscopy report structuring method and system based on deep learning | |
Deng et al. | Self-attention-based BiGRU and capsule network for named entity recognition | |
CN116341557A (en) | Diabetes medical text named entity recognition method | |
CN114239612A (en) | Multi-modal neural machine translation method, computer equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
CB03 | Change of inventor or designer information | Inventor after: Zhao Menghai, Yan Zhihua; Inventor before: Zhao Menghai, Yan Zhihua |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20181214 |