CN110008473A - Medical text named entity recognition and annotation method based on an iterative method - Google Patents
- Publication number
- CN110008473A (application number CN201910257482.1A)
- Authority
- CN
- China
- Prior art keywords
- text
- medical
- dictionary
- word
- corpus
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/237—Lexical tools
- G06F40/242—Dictionaries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/295—Named entity recognition
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/70—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
Abstract
An embodiment of the present invention proposes an iteration-based method for annotating named entities in medical text, and relates to the field of medical information technology. Annotating a large-scale medical corpus with traditional annotation tools consumes substantial manpower and material resources. The proposed method combines a model with automated tools, is suited to large-scale medical text annotation, and shortens the annotation cycle, which helps improve product development efficiency.
Description
Technical field
The present invention relates to the field of medical information technology, and in particular to an iteration-based method for annotating named entities in medical text.
Background
The medical field differs from general domains in that it is highly specialized. Research in the medical field depends on medical corpora, and sequence labeling is a basic and very important task in medical research. However, named entity annotation requires substantial manpower and material resources, and mainstream sequence labeling currently relies on open-source annotation tools, so the annotation cycle is long. The work also requires highly specialized medical knowledge, which makes medical sequence labeling difficult. To improve annotation efficiency, an iteration-based automatic annotation method for medical named entities is proposed.
With the development of the Internet, the mobile Internet, and big data technology, text data resources of all kinds are growing explosively. They mainly comprise unstructured data on social media (such as microblog accounts, official accounts, Facebook, and Twitter) and on news media websites (such as People's Daily, Phoenix News, and Sohu News), and semi-structured data on encyclopedia websites such as Baidu Baike and Wikipedia. Natural language processing (NLP) plays a very important role in extracting information from such text. In text mining, extracting useful information from massive text data is of great value to both enterprises and users. Sequence labeling is one of the most basic and most common NLP methods. Quickly and effectively predicting the label of each word in a Chinese sequence (for example noun, person name, place name, or time) plays an important role in artificial intelligence tasks such as relation mining and knowledge graph construction.
In the prior art, annotated medical corpora are scarce, which hampers basic research on medical text. Moreover, medical text annotation depends on annotation tools, so the annotation cycle is long and consumes substantial manpower and material resources.
Summary of the invention
The purpose of the present invention is to provide an iteration-based method for annotating named entities in medical text that offers high annotation efficiency, accurate annotation, and a simple procedure.
To achieve the above purpose, the embodiments of the present invention adopt the following technical solution:
An iteration-based method for annotating named entities in medical text, the method comprising the following steps:
Step 1: prepare initial seed words according to the named entity categories; the seed words serve as the basis for subsequent iterations;
Step 2: using existing free medical text, label the text with the seed words; in the named entity recognition task, the first and last characters of a seed word are tagged B and E respectively, the characters in between are tagged I, and all remaining characters are tagged O;
Step 3: train a model on the corpus labeled in the first round, use the trained model to run prediction over the medical text corpus, and extract the predicted entity words;
Step 4: look up the newly generated entity words with a search engine tool and parse the returned web pages, filter the words according to whether an encyclopedia entry exists for them, optionally supplement the entity word resources with related terms found in the online resources, and add the processed entity words to the dictionary;
Step 5: repeat Step 2, Step 3, and Step 4 for multiple rounds, and stop iterating when the set number of iterations is reached or no new entries are added; then use automated tools to compare the entity words labeled from the dictionary with the entity words predicted by the model and extract those whose boundaries or categories are inconsistent; correct the extracted inconsistent entity words further by rules, which completes the annotation of medical named entities.
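The tagging scheme of Step 2 can be illustrated with a minimal sketch. It assumes character-level tags, longest-match lookup, and that single-character seed words are skipped because the scheme only defines B and E for the two ends of a word; the function name `tag_with_seed_words` is hypothetical and does not come from the patent.

```python
def tag_with_seed_words(text, seed_words):
    """Tag each character of `text` as B, I, E, or O from a set of seed words."""
    tags = ["O"] * len(text)
    # Assumption: prefer the longest seed word that matches at a position.
    ordered = sorted(seed_words, key=len, reverse=True)
    i = 0
    while i < len(text):
        match = next((w for w in ordered if text.startswith(w, i)), None)
        # Assumption: words shorter than two characters cannot carry both B and E.
        if match is None or len(match) < 2:
            i += 1
            continue
        tags[i] = "B"                        # first character of the seed word
        for j in range(i + 1, i + len(match) - 1):
            tags[j] = "I"                    # characters in between
        tags[i + len(match) - 1] = "E"       # last character of the seed word
        i += len(match)
    return tags


if __name__ == "__main__":
    print(tag_with_seed_words("患者有糖尿病史", {"糖尿病"}))
```

A real implementation would likely also attach the entity category to each tag, since Step 1 organizes seed words by category; that detail is left out here.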
Further, labeling the text with the seed words on the basis of existing free medical text comprises the following steps:
Step S1: obtain the keywords, generate a keyword list for each medical text, and store the lists in a database;
Step S2: read the keyword lists from the database, generate a unique identifier for each medical text from the medical text and its keywords, and build one unique dictionary tree per unique identifier; all the unique dictionary trees together form a basic dictionary tree object pool used by the Chinese word segmentation service;
Step S3: receive the data to be processed, segment it with the dictionary tree in the basic dictionary tree object pool that corresponds to the medical text the data belongs to, and perform keyword filtering on the segmentation result.
Further, in Step S1 the keywords of the different medical texts are maintained in the database by users.
Further, Step S2 further includes:
building the dictionaries from the keyword lists such that each medical text corresponds to one dictionary; each dictionary file is named X.dic, where X is the dictionary name.
Further, Step S3 includes the following sub-steps:
S31: receive the data to be processed, determine the medical text it corresponds to, and go to step S32;
S32: look up, in the basic dictionary tree object pool, the dictionary tree corresponding to that medical text; if it exists, go to step S33; otherwise go to step S34;
S33: segment the data to be processed with that dictionary tree, perform keyword filtering on the segmentation result, and end;
S34: determine whether a dictionary exists for the medical text corresponding to the data to be processed; if it exists, go to step S35; otherwise go to step S36;
S35: dynamically build a dictionary tree from that dictionary, segment the data to be processed with the newly built tree, perform keyword filtering on the segmentation result, and end;
S36: load the preset general dictionary, build a general dictionary tree from it, segment the data to be processed with the general dictionary tree, perform keyword filtering on the segmentation result, and end.
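The three-level fallback in sub-steps S32 to S36 can be sketched as below. This is only an illustration under stated assumptions: the pool is a plain dict keyed by a text identifier, each `X.dic` file holds one word per line in UTF-8, and `GENERAL_WORDS`, `build_trie`, and `get_trie` are hypothetical names. The patent does not specify the file layout or any of these interfaces.

```python
from pathlib import Path

END = ""  # end-of-word marker; never collides with a one-character key

# Hypothetical stand-in for the preset general dictionary of S36.
GENERAL_WORDS = {"患者", "治疗"}


def build_trie(words):
    """Build a nested-dict dictionary tree from an iterable of words."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = True
    return root


def get_trie(text_id, pool, dict_dir):
    """Pick the dictionary tree for `text_id`, following S32 to S36."""
    # S32/S33: reuse the tree from the object pool when it exists.
    if text_id in pool:
        return pool[text_id]
    # S34/S35: otherwise build a tree on demand from the matching X.dic file.
    dic_path = Path(dict_dir) / f"{text_id}.dic"
    if dic_path.is_file():
        lines = dic_path.read_text(encoding="utf-8").splitlines()
        return build_trie(line.strip() for line in lines if line.strip())
    # S36: fall back to a tree built from the general dictionary.
    return build_trie(GENERAL_WORDS)
```

Whether a tree built in S35 is added back to the pool is not stated in the source, so this sketch leaves the pool unchanged.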
Further, training a model on the corpus labeled in the first round, using the trained model to run prediction over the medical text corpus, and extracting the predicted entity words comprises the following steps:
Step A1: preprocess the acquired corpus;
Step A2: feed the corpus preprocessed in Step A1 into a preset learning model, adjust the parameters of the learning model, and save them;
Step A3: add a prediction label to each item of the acquired corpus according to the sequence classification results output by the learning model, and minimize the loss function of the learning model against the manual labels so that the prediction labels fit the manual labels; for an unknown corpus, segment it with a word segmentation algorithm and use the adjusted learning model to give the segmented corpus its initial annotation;
Step A4: tune the unknown corpus that received its initial annotation in Step A3, and give the tuned corpus its final annotation.
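Once the model has produced a tag sequence, the predicted entity words have to be read back out of it. The sketch below assumes character-level B/I/E/O tags as described in Step 2 and simply discards malformed spans; the function name `extract_entities` is hypothetical and the patent does not describe the decoding procedure.

```python
def extract_entities(chars, tags):
    """Return (start, end, surface) for every well-formed B I* E span.

    `end` is exclusive. Spans that lack a B or an E are ignored.
    """
    entities = []
    start = None
    for i, tag in enumerate(tags):
        if tag == "B":
            start = i                      # open a new span
        elif tag == "E" and start is not None:
            entities.append((start, i + 1, "".join(chars[start:i + 1])))
            start = None                   # close the span
        elif tag == "O":
            start = None                   # an O inside a span invalidates it
        # An "I" tag leaves the current span open.
    return entities


if __name__ == "__main__":
    print(extract_entities("患者有糖尿病史", ["O", "O", "O", "B", "I", "E", "O"]))
```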
Further, the preprocessing in Step A1 includes merging large-granularity word segments and unifying the format.
The iteration-based method for annotating named entities in medical text provided by the embodiments of the present invention has the following beneficial effects: annotating a large-scale medical corpus with traditional annotation tools consumes substantial manpower and material resources, whereas the present method combines a model with automated tools, is suited to large-scale medical text annotation, and shortens the annotation cycle, which helps improve product development efficiency.
To make the above objects, features, and advantages of the present invention clearer and easier to understand, preferred embodiments are described in detail below with reference to the accompanying drawing.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawing used in the embodiments is briefly introduced below. It should be understood that the drawing shows only certain embodiments of the present invention and should therefore not be regarded as limiting its scope; those of ordinary skill in the art can derive other related drawings from it without creative effort.
Fig. 1 is a schematic flowchart of the iteration-based method for annotating named entities in medical text provided by an embodiment of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawing. The described embodiments are only some, not all, of the embodiments of the present invention. The components of the embodiments described and illustrated in the drawing can generally be arranged and designed in a variety of configurations. The following detailed description of the embodiments is therefore not intended to limit the scope of the claimed invention, but merely represents selected embodiments. All other embodiments obtained by those skilled in the art on the basis of these embodiments without creative effort fall within the protection scope of the present invention.
It should also be noted that similar reference numerals and letters denote similar items in the drawings, so once an item has been defined in one drawing it need not be defined or explained again in subsequent drawings. In the description of the present invention, terms such as "first" and "second" are used only to distinguish items and are not to be understood as indicating or implying relative importance.
Embodiment 1:
As shown in Fig. 1, an iteration-based method for annotating named entities in medical text comprises the following steps:
Step 1: prepare initial seed words according to the named entity categories; the seed words serve as the basis for subsequent iterations;
Step 2: using existing free medical text, label the text with the seed words; in the named entity recognition task, the first and last characters of a seed word are tagged B and E respectively, the characters in between are tagged I, and all remaining characters are tagged O;
Step 3: train a model on the corpus labeled in the first round, use the trained model to run prediction over the medical text corpus, and extract the predicted entity words;
Step 4: look up the newly generated entity words with a search engine tool and parse the returned web pages, filter the words according to whether an encyclopedia entry exists for them, optionally supplement the entity word resources with related terms found in the online resources, and add the processed entity words to the dictionary;
Step 5: repeat Step 2, Step 3, and Step 4 for multiple rounds, and stop iterating when the set number of iterations is reached or no new entries are added; then use automated tools to compare the entity words labeled from the dictionary with the entity words predicted by the model and extract those whose boundaries or categories are inconsistent; correct the extracted inconsistent entity words further by rules, which completes the annotation of medical named entities.
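The control flow of the iteration in Steps 2 to 5 can be sketched as follows. The tagging, model, and web filtering stages are passed in as callables because the patent does not fix how they are implemented; `tag`, `train_and_predict`, `web_filter`, and `max_rounds` are hypothetical names introduced only for this illustration.

```python
def iterate_annotation(corpus, seed_words, tag, train_and_predict, web_filter,
                       max_rounds=5):
    """Grow the entity dictionary until the round limit or until nothing is added."""
    dictionary = set(seed_words)                                # Step 1
    for _ in range(max_rounds):                                 # first stop condition
        labeled = [tag(text, dictionary) for text in corpus]    # Step 2
        predicted = train_and_predict(corpus, labeled)          # Step 3: set of words
        accepted = web_filter(predicted - dictionary)           # Step 4
        new_words = accepted - dictionary
        if not new_words:                                       # second stop condition
            break
        dictionary |= new_words
    return dictionary
```

In this sketch `web_filter` stands for the search engine lookup and the encyclopedia entry check, so a caller could swap in a cached lookup for testing without touching the loop.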
The technical principle of the above technical solution is: keywords are extracted, and the extracted keywords are then matched to realize the function.
The technical effect of the above technical solution is: it helps improve product development efficiency.
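The comparison in Step 5 between dictionary labels and model predictions can be sketched as below. It assumes that every entity is represented as a `(start, end, category)` tuple with an exclusive end offset, and that a boundary inconsistency means two spans overlap without coinciding; both the representation and the name `find_disagreements` are assumptions made for illustration.

```python
def find_disagreements(dict_spans, model_spans):
    """Split overlapping span pairs into boundary and category mismatches."""
    boundary, category = [], []
    for d in dict_spans:
        for m in model_spans:
            overlaps = d[0] < m[1] and m[0] < d[1]
            if not overlaps:
                continue
            if (d[0], d[1]) != (m[0], m[1]):
                boundary.append((d, m))    # same region, different boundaries
            elif d[2] != m[2]:
                category.append((d, m))    # same boundaries, different category
    return boundary, category
```

The pairs returned here would then be handed to the rule-based correction that the patent mentions, which is not specified further in the source.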
Embodiment 2:
On the basis of the previous embodiment, labeling the text with the seed words on the basis of existing free medical text comprises the following steps:
Step S1: obtain the keywords, generate a keyword list for each medical text, and store the lists in a database;
Step S2: read the keyword lists from the database, generate a unique identifier for each medical text from the medical text and its keywords, and build one unique dictionary tree per unique identifier; all the unique dictionary trees together form a basic dictionary tree object pool used by the Chinese word segmentation service;
Step S3: receive the data to be processed, segment it with the dictionary tree in the basic dictionary tree object pool that corresponds to the medical text the data belongs to, and perform keyword filtering on the segmentation result.
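Segmenting with a dictionary tree, as in Step S3, can be illustrated with forward maximum matching over a trie. The patent does not name the segmentation algorithm, so forward maximum matching is an assumption here, as is the reading of keyword filtering as keeping only the tokens that are keywords. The names `build_trie`, `segment`, and `filter_keywords` are hypothetical.

```python
END = ""  # end-of-word marker; never collides with a one-character key


def build_trie(words):
    """Build a nested-dict dictionary tree from an iterable of words."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = True
    return root


def segment(text, trie):
    """Forward maximum matching: take the longest dictionary word at each position."""
    tokens = []
    i = 0
    while i < len(text):
        node, longest = trie, 0
        for j in range(i, len(text)):
            node = node.get(text[j])
            if node is None:
                break
            if END in node:
                longest = j - i + 1
        step = longest or 1            # unknown characters become single tokens
        tokens.append(text[i:i + step])
        i += step
    return tokens


def filter_keywords(tokens, keywords):
    """Keep only the tokens that appear in the keyword set."""
    return [t for t in tokens if t in keywords]


if __name__ == "__main__":
    trie = build_trie(["糖尿病", "糖尿", "高血压"])
    print(segment("糖尿病合并高血压", trie))
```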
The technical principle of the above technical solution is: the data to be processed is received and segmented with the dictionary tree in the basic dictionary tree object pool that corresponds to its medical text, and keyword filtering is performed on the segmentation result.
The technical effect of the above technical solution is: the accuracy of the system and the method is improved.
Embodiment 3:
On the basis of the previous embodiment, in Step S1 the keywords of the different medical texts are maintained in the database by users.
The technical principle of the above technical solution is: the keywords are inserted into the database, which ensures that they remain valid over the long term.
The technical effect of the above technical solution is: the reliability of the method is ensured.
Embodiment 4:
On the basis of the previous embodiment, Step S2 further includes:
building the dictionaries from the keyword lists such that each medical text corresponds to one dictionary; each dictionary file is named X.dic, where X is the dictionary name.
The technical principle of the above technical solution is: the dictionaries are built such that each medical text corresponds to one dictionary; each dictionary file is named X.dic, where X is the dictionary name.
The technical effect of the above technical solution is: the efficiency of the method is improved.
Embodiment 5:
On the basis of the previous embodiment, Step S3 includes the following sub-steps:
S31: receive the data to be processed, determine the medical text it corresponds to, and go to step S32;
S32: look up, in the basic dictionary tree object pool, the dictionary tree corresponding to that medical text; if it exists, go to step S33; otherwise go to step S34;
S33: segment the data to be processed with that dictionary tree, perform keyword filtering on the segmentation result, and end;
S34: determine whether a dictionary exists for the medical text corresponding to the data to be processed; if it exists, go to step S35; otherwise go to step S36;
S35: dynamically build a dictionary tree from that dictionary, segment the data to be processed with the newly built tree, perform keyword filtering on the segmentation result, and end;
S36: load the preset general dictionary, build a general dictionary tree from it, segment the data to be processed with the general dictionary tree, perform keyword filtering on the segmentation result, and end.
The technical principle of the above technical solution is: the preset general dictionary is loaded, a general dictionary tree is built from it, the data to be processed is segmented with the general dictionary tree, and keyword filtering is performed on the segmentation result.
The technical effect of the above technical solution is: the accuracy of the method is improved.
Embodiment 6:
On the basis of the previous embodiment, training a model on the corpus labeled in the first round, using the trained model to run prediction over the medical text corpus, and extracting the predicted entity words comprises the following steps:
Step A1: preprocess the acquired corpus;
Step A2: feed the corpus preprocessed in Step A1 into a preset learning model, adjust the parameters of the learning model, and save them;
Step A3: add a prediction label to each item of the acquired corpus according to the sequence classification results output by the learning model, and minimize the loss function of the learning model against the manual labels so that the prediction labels fit the manual labels; for an unknown corpus, segment it with a word segmentation algorithm and use the adjusted learning model to give the segmented corpus its initial annotation;
Step A4: tune the unknown corpus that received its initial annotation in Step A3, and give the tuned corpus its final annotation.
The technical principle of the above technical solution is: a prediction label is added to each item of the acquired corpus according to the sequence classification results output by the learning model, the loss function of the learning model is minimized against the manual labels so that the prediction labels fit the manual labels, the unknown corpus is segmented with a word segmentation algorithm, and the adjusted learning model gives the segmented corpus its initial annotation.
The technical effect of the above technical solution is: the method is able to learn and to improve over time.
Embodiment 7:
On the basis of the previous embodiment, the preprocessing in Step A1 includes merging large-granularity word segments and unifying the format.
The technical principle of the above technical solution is: segments are merged and the format is unified, which makes the results more accurate.
The technical effect of the above technical solution is: the accuracy of the method is improved.
In the embodiments provided in this application, it should be understood that the disclosed device and method can also be implemented in other ways. The device embodiments described above are merely illustrative. For example, the flowcharts and block diagrams in the drawings show the possible architecture, functions, and operations of devices, methods, and computer program products according to multiple embodiments of the present invention. In this regard, each block in a flowchart or block diagram may represent a unit, a program segment, or a portion of code that contains one or more executable instructions for implementing the specified logical function. It should also be noted that in some alternative implementations the functions noted in the blocks may occur in an order different from that shown in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should further be noted that each block in the block diagrams and/or flowcharts, and combinations of such blocks, can be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.
In addition, the functional units in the embodiments of the present invention may be integrated into one independent part, may each exist separately, or two or more units may be integrated into one independent part.
If the functions are implemented as software functional units and sold or used as an independent product, they can be stored in a computer-readable storage medium. On this understanding, the technical solution of the present invention in essence, or the part of it that contributes to the prior art, or a part of the technical solution, can be embodied as a software product. The computer software product is stored in a storage medium and includes instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present invention. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.
It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise", and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to that process, method, article, or device. Without further limitation, an element introduced by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.
The above are only preferred embodiments of the present invention and are not intended to limit it; for those skilled in the art, the present invention may be modified and varied in many ways. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its protection scope.
Claims (7)
1. An iteration-based method for annotating named entities in medical text, characterized in that the method comprises the following steps:
Step 1: prepare initial seed words according to the named entity categories; the seed words serve as the basis for subsequent iterations;
Step 2: using existing free medical text, label the text with the seed words; in the named entity recognition task, the first and last characters of a seed word are tagged B and E respectively, the characters in between are tagged I, and all remaining characters are tagged O;
Step 3: train a model on the corpus labeled in the first round, use the trained model to run prediction over the medical text corpus, and extract the predicted entity words;
Step 4: look up the newly generated entity words with a search engine tool and parse the returned web pages, filter the words according to whether an encyclopedia entry exists for them, optionally supplement the entity word resources with related terms found in the online resources, and add the processed entity words to the dictionary;
Step 5: repeat Step 2, Step 3, and Step 4 for multiple rounds, and stop iterating when the set number of iterations is reached or no new entries are added; then use automated tools to compare the entity words labeled from the dictionary with the entity words predicted by the model and extract those whose boundaries or categories are inconsistent; correct the extracted inconsistent entity words further by rules, which completes the annotation of medical named entities.
2. The iteration-based method for annotating named entities in medical text according to claim 1, characterized in that labeling the text with the seed words on the basis of existing free medical text comprises the following steps:
Step S1: obtain the keywords, generate a keyword list for each medical text, and store the lists in a database;
Step S2: read the keyword lists from the database, generate a unique identifier for each medical text from the medical text and its keywords, and build one unique dictionary tree per unique identifier; all the unique dictionary trees together form a basic dictionary tree object pool used by the Chinese word segmentation service;
Step S3: receive the data to be processed, segment it with the dictionary tree in the basic dictionary tree object pool that corresponds to the medical text the data belongs to, and perform keyword filtering on the segmentation result.
3. The iteration-based method for annotating named entities in medical text according to claim 2, characterized in that in Step S1 the keywords of the different medical texts are maintained in the database by users.
4. The iteration-based method for annotating named entities in medical text according to claim 3, characterized in that Step S2 further includes:
building the dictionaries from the keyword lists such that each medical text corresponds to one dictionary; each dictionary file is named X.dic, where X is the dictionary name.
5. The iteration-based method for annotating named entities in medical text according to claim 4, characterized in that Step S3 includes the following sub-steps:
S31: receive the data to be processed, determine the medical text it corresponds to, and go to step S32;
S32: look up, in the basic dictionary tree object pool, the dictionary tree corresponding to that medical text; if it exists, go to step S33; otherwise go to step S34;
S33: segment the data to be processed with that dictionary tree, perform keyword filtering on the segmentation result, and end;
S34: determine whether a dictionary exists for the medical text corresponding to the data to be processed; if it exists, go to step S35; otherwise go to step S36;
S35: dynamically build a dictionary tree from that dictionary, segment the data to be processed with the newly built tree, perform keyword filtering on the segmentation result, and end;
S36: load the preset general dictionary, build a general dictionary tree from it, segment the data to be processed with the general dictionary tree, perform keyword filtering on the segmentation result, and end.
6. The medical text named entity recognition and labeling method based on an iterative method according to claim 5, characterized in that the method of performing model training on the corpus labeled in the first round, predicting on the medical text corpus with the generated model, and extracting the predicted entity words carries out the following steps:
Step A1: preprocess the acquired corpus;
Step A2: input the corpus preprocessed in step A1 into a preset learning model, adjust the parameters of the learning model, and save them;
Step A3: add a corresponding predicted label to each item of the acquired corpus according to the sequence classification results output by the learning model, and use the manual labels to minimize the loss function of the learning model so that the predicted labels fit the manual labels; for unknown corpus, segment it with a word segmentation algorithm, and use the adjusted learning model to perform initial labeling on the segmented unknown corpus;
Step A4: tune the unknown corpus initially labeled in step A3, and perform final labeling on the tuned corpus.
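Steps A2 and A3 can be illustrated with a deliberately small stand-in model. The claim only refers to a "preset learning model", so the per-token softmax classifier below is a swapped-in simplification chosen for the example, not the model used in the patent. The feature set, label names, learning rate, and sample sentences are all assumptions, and the unknown sentence is shown already segmented.

```python
import math
from collections import defaultdict


def features(tokens, i):
    """Hypothetical feature set: the token, its left neighbour, and a bias term."""
    prev = tokens[i - 1] if i > 0 else "<s>"
    return [f"w={tokens[i]}", f"prev={prev}", "bias"]


def softmax(scores):
    top = max(scores.values())
    exps = {label: math.exp(s - top) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


class TokenTagger:
    def __init__(self, labels):
        self.labels = list(labels)
        self.weights = defaultdict(float)  # (feature, label) -> weight

    def probabilities(self, feats):
        scores = {y: sum(self.weights[(f, y)] for f in feats) for y in self.labels}
        return softmax(scores)

    def train(self, corpus, epochs=100, learning_rate=0.2):
        """SGD on cross-entropy between predicted and manual labels.

        Returns the mean loss observed during each epoch.
        """
        history = []
        for _ in range(epochs):
            total, count = 0.0, 0
            for tokens, manual_labels in corpus:
                for i, gold in enumerate(manual_labels):
                    feats = features(tokens, i)
                    probs = self.probabilities(feats)
                    total += -math.log(probs[gold])
                    count += 1
                    for label in self.labels:
                        gradient = probs[label] - (1.0 if label == gold else 0.0)
                        for f in feats:
                            self.weights[(f, label)] -= learning_rate * gradient
            history.append(total / count)
        return history

    def predict(self, tokens):
        predicted = []
        for i in range(len(tokens)):
            probs = self.probabilities(features(tokens, i))
            predicted.append(max(probs, key=probs.get))
        return predicted


# Hypothetical manually labeled corpus (already segmented).
labeled_corpus = [
    (["患者", "有", "高血压"], ["O", "O", "DISEASE"]),
    (["诊断", "为", "糖尿病"], ["O", "O", "DISEASE"]),
    (["患者", "无", "糖尿病"], ["O", "O", "DISEASE"]),
]

tagger = TokenTagger(labels=["O", "DISEASE"])
loss_history = tagger.train(labeled_corpus)           # A2/A3: fit to the manual labels
initial_labels = tagger.predict(["患者", "有", "糖尿病"])  # A3: initial labeling of unknown text
```

The returned loss history makes it possible to check that predicted labels are moving toward the manual labels. The tuning and final labeling of step A4 are outside this sketch.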
7. The medical text named entity recognition and labeling method based on an iterative method according to claim 6, characterized in that the preprocessing in step A1 comprises merging coarse-grained word segments and unifying the format.
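One possible reading of the two preprocessing operations in claim 7 is sketched below. The claim does not define either operation, so both interpretations are assumptions: "unifying the format" is taken to mean Unicode NFKC normalization plus whitespace collapsing, and "merging coarse-grained word segments" is taken to mean joining adjacent fine-grained tokens when their concatenation is a dictionary entry. The function names and sample data are hypothetical.

```python
import unicodedata


def unify_format(text):
    """Fold full-width characters to half-width (NFKC) and collapse whitespace."""
    return " ".join(unicodedata.normalize("NFKC", text).split())


def merge_coarse(tokens, dictionary, max_span=4):
    """Greedily merge up to `max_span` adjacent tokens that form a dictionary entry."""
    merged, i = [], 0
    while i < len(tokens):
        # Try the longest span first so the coarsest available entry wins.
        for span in range(min(max_span, len(tokens) - i), 1, -1):
            candidate = "".join(tokens[i:i + span])
            if candidate in dictionary:
                merged.append(candidate)
                i += span
                break
        else:
            # No multi-token entry starts here; keep the token unchanged.
            merged.append(tokens[i])
            i += 1
    return merged


# Hypothetical inputs.
normalized = unify_format("ＣＴ　１２３")  # full-width letters, space, and digits
merged_tokens = merge_coarse(["2", "型", "糖尿病", "患者"], {"2型糖尿病"})
```

Normalizing before merging keeps dictionary lookups from failing on width variants of the same character.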
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910257482.1A CN110008473B (en) | 2019-04-01 | 2019-04-01 | Medical text named entity identification and labeling method based on iteration method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110008473A (en) | 2019-07-12 |
CN110008473B (en) | 2022-11-25 |
Family
ID=67169242
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910257482.1A Active CN110008473B (en) | 2019-04-01 | 2019-04-01 | Medical text named entity identification and labeling method based on iteration method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110008473B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20140310207A1 (en) * | 2013-04-10 | 2014-10-16 | Lifecom, Inc. | Chronology-centric, case-entity information handling system and methodology |
CN107622050A (en) * | 2017-09-14 | 2018-01-23 | 武汉烽火普天信息技术有限公司 | Text sequence labeling system and method based on Bi LSTM and CRF |
Non-Patent Citations (1)
Title |
---|
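Tian Jiayuan et al.: "Research on medical named entity recognition for Internet resources" (面向互联网资源的医学命名实体识别研究), Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》) * |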
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111178080A (en) * | 2020-01-02 | 2020-05-19 | 杭州涂鸦信息技术有限公司 | Named entity identification method and system based on structured information |
CN111178080B (en) * | 2020-01-02 | 2023-07-18 | 杭州涂鸦信息技术有限公司 | Named entity identification method and system based on structured information |
WO2021114632A1 (en) * | 2020-05-13 | 2021-06-17 | 平安科技(深圳)有限公司 | Disease name standardization method, apparatus, device, and storage medium |
WO2021139257A1 (en) * | 2020-06-24 | 2021-07-15 | 平安科技(深圳)有限公司 | Method and apparatus for selecting annotated data, and computer device and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN110008473B (en) | 2022-11-25 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN104598535B (en) | A kind of event extraction method based on maximum entropy | |
CN107705066A (en) | Information input method and electronic equipment during a kind of commodity storage | |
CN103853834B (en) | Text structure analysis-based Web document abstract generation method | |
CN104199972A (en) | Named entity relation extraction and construction method based on deep learning | |
CN103823824A (en) | Method and system for automatically constructing text classification corpus by aid of internet | |
CN109635288A (en) | A kind of resume abstracting method based on deep neural network | |
CN103077164A (en) | Text analysis method and text analyzer | |
CN103294781A (en) | Method and equipment used for processing page data | |
CN110008473A (en) | A kind of medical text name Entity recognition mask method based on alternative manner | |
CN108287911A (en) | A kind of Relation extraction method based on about fasciculation remote supervisory | |
CN103324700A (en) | Noumenon concept attribute learning method based on Web information | |
KR101724398B1 (en) | A generation system and method of a corpus for named-entity recognition using knowledge bases | |
KR101801257B1 (en) | Text-Mining Application Technique for Productive Construction Document Management | |
CN103530429A (en) | Webpage content extracting method | |
CN104699797A (en) | Webpage data structured analytic method and device | |
Azir et al. | Wrapper approaches for web data extraction: A review | |
CN111143571B (en) | Entity labeling model training method, entity labeling method and device | |
CN109710930A (en) | A kind of Chinese Resume analytic method based on deep neural network | |
CN107436931B (en) | Webpage text extraction method and device | |
Owen et al. | Towards a scientific workflow featuring Natural Language Processing for the digitisation of natural history collections. | |
CN106372232B (en) | Information mining method and device based on artificial intelligence | |
CN115186015A (en) | Network security knowledge graph construction method and system | |
CN104834718A (en) | Recognition method and system for event argument based on maximum entropy model | |
CN105335446A (en) | Short text classification model generation method and classification method based on word vector | |
CN109002561A (en) | Automatic document classification method, system and medium based on sample keyword learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||