CN107977361A - Chinese clinical medical entity recognition method based on deep semantic information representation - Google Patents

Chinese clinical medical entity recognition method based on deep semantic information representation

Info

Publication number
CN107977361A
Authority
CN
China
Prior art keywords
word
sentence
clinical treatment
semantic information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201711278996.2A
Other languages
Chinese (zh)
Other versions
CN107977361B (en
Inventor
汤步洲
石雪
刘增健
陈清财
王晓龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Graduate School Harbin Institute of Technology
Original Assignee
Shenzhen Graduate School Harbin Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Graduate School Harbin Institute of Technology
Priority to CN201711278996.2A
Publication of CN107977361A
Application granted
Publication of CN107977361B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24133 Distances to prototypes
    • G06F18/24137 Distances to cluster centroids
    • G06F18/2414 Smoothing the distance, e.g. radial basis function networks [RBFN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/237 Lexical tools
    • G06F40/242 Dictionaries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Databases & Information Systems (AREA)
  • Machine Translation (AREA)

Abstract

The present invention proposes a Chinese clinical medical entity recognition method based on deep semantic information representation. It comprises two parts: 1) a representation method for Chinese clinical medical entities; 2) a recognition method for Chinese clinical medical entities. The representation method has two forms: single-label representation and multi-label representation. The recognition method incorporates a Chinese character representation based on medical-domain radical information, obtains the local semantic information of medical text with a CNN, obtains the global semantics of medical text with a bidirectional LSTM, and uses an attention mechanism to select among the semantic information of the different characters in a sentence. The invention inherits the advantages of deep learning: it requires less manual feature engineering and achieves higher precision and recall.

Description

Chinese clinical medical entity recognition method based on deep semantic information representation
Technical field
The invention belongs to the technical field of intelligent healthcare, and in particular relates to a Chinese clinical medical entity recognition method based on deep semantic information representation.
Background
With the development of medical information technology, clinical information systems have become increasingly widespread and a large amount of usable clinical data has accumulated. These data can support clinical decision-making and can also be used for research in clinical and translational medicine. A major obstacle to using them is that the useful information contained in medical records cannot be used directly by computer systems. Text analysis techniques that extract structured information from raw text are the key to removing this obstacle. Medical entity recognition is one of the fundamental tasks of clinical text analysis. The related techniques are important for the development of clinical decision support systems and clinical knowledge mining research, and have drawn wide attention from academia and industry.
Over the past twenty to thirty years, researchers have extensively studied clinical medical entity recognition methods for different types of clinical records and have developed a number of application systems. Existing methods mainly target English clinical text; research on clinical entity recognition for electronic health records in other languages is relatively scarce. The methods used are mainly rule matching, statistical machine learning, and combinations of the two. Rule-based matching requires medical experts to write the rules, which is costly and ports poorly. Methods based on statistical machine learning need a large number of reliable, manually extracted features to achieve high recognition performance.
In recent years, with the rise of deep learning in natural language processing, deep learning has been widely applied to lexical analysis, syntax, word sense disambiguation, semantic analysis, information extraction, question answering, and other areas. For named entity recognition, a fundamental task in text analysis, early window-based deep neural network models were already used to recognize named entities in the general domain and outperformed statistical machine learning algorithms. In 2015, a series of named entity recognition studies combining RNNs with CRFs appeared; in the general domain their results exceeded those of the mainstream feature-rich models (such as CRF). In the clinical domain, however, statistical machine learning algorithms remain the mainstream technique for medical entity recognition, and research on entity recognition methods for Chinese clinical text is still limited.
Compared with English, Chinese has its own characteristics: Chinese characters are pictographic, and radicals carry a large amount of semantic information about characters and words. Applying English text analysis techniques directly to Chinese text often fails to achieve the best performance. Therefore, when processing Chinese, suitable methods must be designed around the characteristics of the language itself.
Summary of the invention
To solve the problem of named entity recognition in the Chinese medical domain, the present invention starts from the characteristics of the Chinese language and provides a Chinese clinical medical entity recognition method based on deep semantic information representation. The method has the following features: 1) two new representation methods are designed according to the forms in which clinical medical entities appear; 2) the composition of Chinese characters is taken fully into account, and the radicals that carry character semantics are given a deep representation; 3) a deep neural network is designed that considers both the local context of each character and the global information of the sentence.
The present invention adopts the following technical solution:
A Chinese clinical medical entity recognition method based on deep semantic information representation, characterized in that the method uses a deep neural network model divided into 5 layers overall: (1) an input layer, (2) a CNN layer, (3) a bidirectional LSTM layer, (4) an Attention layer, and (5) an output layer. The method includes the following.
During training, sentences containing Chinese clinical medical entities are first annotated with the single-label or multi-label representation method, and the model is then trained with the following steps:
S1, train a common distributed character/word representation learning algorithm on a large amount of medical-domain text to obtain distributed character vector representations;
S2, automatically extract large-scale medical entities from vertical websites and/or medical business systems to build a clinical-domain dictionary, split the characters in the dictionary into radicals, extract and count them to obtain the common radicals of the clinical-domain dictionary, classify all characters by these common radicals, and randomly initialize a vector for each class;
S3, splice the distributed vectors obtained in steps S1 and S2 to form a deep semantic representation of Chinese characters that fuses Chinese clinical-domain radical information;
S4, choose a context window centered on the current character, obtain the local context semantic representation of the character with a CNN, and use it as the input to the LSTM layer;
S5, use a bidirectional LSTM to obtain the global semantic representation of the sentences in the clinical text;
S6, use the attention mechanism: by computing the similarity between the current character and the other characters in the sentence, obtain the semantic contribution and weight of the other characters with respect to the current character and find the information in the sentence that is significantly related to the current character; then splice the attention vector with the current character vector to obtain a deep representation of the current character's local context semantics and the global semantics of its sentence;
S7, for an input sentence from the Chinese medical domain, steps S1 to S6 yield its deep semantic representation sequence; with this sequence as input and the label sequence representing the clinical medical entities as output, build the model with a sequence labeling algorithm and fine-tune the deep semantic representation of Chinese characters that fuses Chinese clinical-domain radical information.
During testing, the deep semantic representation of each Chinese character from step S3 is obtained by table lookup; steps S4, S5 and S6 are then applied in turn to obtain the deep representation of the current character's local context semantics and the global semantics of its sentence; finally, the deep semantic representation sequence of the sentence is fed into the trained model to predict the label sequence, from which the clinical medical entities are restored.
Further, when the single-label representation is used, entities are restored by the following rules: (1) if a subsequence labeled H exists, every subsequence labeled D in the same clause is merged with the subsequence labeled H; (2) if no subsequence is labeled H, all subsequences labeled D are merged together. When the multi-label representation is used, no such distinction exists and entities are restored directly.
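The two restoration rules can be illustrated with a minimal Python sketch. It is not taken from the patent: the function name `restore_entities`, the (text, label) input format, the pairing of each D subsequence with the H subsequence to give one entity per pair, and the concatenation in text order are all assumptions made for illustration. Subsequences labeled N are treated as complete entities, following the definition of the single-label representation.

```python
from typing import List, Tuple


def restore_entities(clause: List[Tuple[str, str]]) -> List[str]:
    """Restore entities from the labeled subsequences of one clause.

    `clause` lists (text, label) pairs in text order; label is 'H', 'D' or 'N'.
    Assumption: each D subsequence combined with an H subsequence yields one
    entity, and the parts are concatenated in their original text order.
    """
    # N subsequences are already complete entities.
    entities = [text for text, label in clause if label == "N"]
    shared = [(i, t) for i, (t, label) in enumerate(clause) if label == "H"]
    parts = [(i, t) for i, (t, label) in enumerate(clause) if label == "D"]

    if shared:
        # Rule (1): merge every D subsequence with the H subsequence.
        for head in shared:
            for part in parts:
                ordered = sorted([head, part])
                entities.append("".join(text for _, text in ordered))
    elif parts:
        # Rule (2): no H subsequence, so all D subsequences form one entity.
        entities.append("".join(text for _, text in parts))
    return entities


# Hypothetical clause: two body parts sharing one symptom description.
print(restore_entities([("肝", "D"), ("脾", "D"), ("肿大", "H")]))
```

Under these assumptions the shared part is copied into every restored entity, which is the reason the H label is needed in the single-label scheme.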
Further, the method uses a convolutional neural network to represent the local context semantic information of each Chinese character within its medical text sentence. The main steps are:
1) choose a context window of fixed size for each character in the medical text sentence;
2) with a fixed convolution kernel size and number of kernels, apply the convolution operation to the context window of each character; during convolution, each character in the window is assigned a deep representation related to its relative position;
3) down-sample the feature vector produced by each convolution kernel with a pooling operation, which may be max pooling or average pooling; max pooling takes the maximum of the feature vector produced by each kernel, and average pooling takes its mean.
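The windowing, convolution and pooling steps above can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: zero padding at sentence boundaries, the helper names, and the omission of the relative-position representation from step 2) are all simplifications introduced here.

```python
from typing import Callable, List

Vector = List[float]


def context_window(seq: List[Vector], center: int, half: int) -> List[Vector]:
    """Fixed-size window around `center`; positions outside the sentence are
    zero-padded (an assumption, the source does not specify padding)."""
    pad = [0.0] * len(seq[0])
    return [seq[i] if 0 <= i < len(seq) else pad
            for i in range(center - half, center + half + 1)]


def convolve(window: List[Vector], kernel: List[Vector]) -> List[float]:
    """Slide one kernel (width = len(kernel)) over the window."""
    width = len(kernel)
    features = []
    for start in range(len(window) - width + 1):
        total = 0.0
        for k in range(width):
            total += sum(w * x for w, x in zip(kernel[k], window[start + k]))
        features.append(total)
    return features


def max_pool(features: List[float]) -> float:
    return max(features)


def avg_pool(features: List[float]) -> float:
    return sum(features) / len(features)


def local_features(seq: List[Vector], center: int, half: int,
                   kernels: List[List[Vector]],
                   pool: Callable[[List[float]], float] = max_pool) -> Vector:
    """One pooled value per kernel: the local semantic vector of a character."""
    window = context_window(seq, center, half)
    return [pool(convolve(window, kernel)) for kernel in kernels]


sentence = [[1.0], [2.0], [3.0]]      # toy 1-dimensional character vectors
kernel = [[1.0], [1.0]]               # one kernel of width 2
print(local_features(sentence, 1, 1, [kernel]))            # max pooling
print(local_features(sentence, 1, 1, [kernel], avg_pool))  # average pooling
```

The output has one component per convolution kernel, so the number of kernels fixes the dimension of the local semantic vector.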
Further, the method uses a bidirectional LSTM to produce a global semantic representation of the medical text sentence containing the current character. The main steps are:
1) splice the character vector that fuses clinical-domain character structure features with the character-centered local semantic representation obtained by the CNN, and use the result as the input to the bidirectional LSTM; it is denoted x_t and represents the deep semantic representation of the t-th Chinese character in the sentence;
2) feed the sequence of character deep semantic representations x_1 x_2 … x_n through the forward and backward LSTM networks to obtain two state output sequences, h_1 h_2 … h_n and h'_n h'_{n-1} … h'_1; merge the two sequences to obtain [h_1 h'_1] [h_2 h'_2] … [h_n h'_n], so that the global semantic information of the t-th Chinese character is represented as [h_t h'_t].
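The forward pass, backward pass and per-position merge can be sketched as below. This is a hedged illustration only: the recurrent step is passed in as a function, and the `running_sum_step` used in the example is a toy stand-in chosen so the result is easy to follow; it is not an LSTM cell. The names `run`, `bidirectional` and `running_sum_step` are assumptions.

```python
from typing import Callable, List

Vector = List[float]
Step = Callable[[Vector, Vector], Vector]  # (input x_t, previous state) -> new state


def run(step: Step, xs: List[Vector], hidden_size: int) -> List[Vector]:
    """Apply a recurrent step left to right and collect every state."""
    state = [0.0] * hidden_size
    states = []
    for x in xs:
        state = step(x, state)
        states.append(state)
    return states


def bidirectional(fwd: Step, bwd: Step, xs: List[Vector],
                  hidden_size: int) -> List[Vector]:
    """Return [h_t h'_t] for every position t."""
    forward = run(fwd, xs, hidden_size)
    # The backward network reads x_n ... x_1; reverse again to align with t.
    backward = run(bwd, xs[::-1], hidden_size)[::-1]
    return [h + h_rev for h, h_rev in zip(forward, backward)]


def running_sum_step(x: Vector, state: Vector) -> Vector:
    """Toy recurrent step (NOT an LSTM): adds the first input value to the state."""
    return [s + x[0] for s in state]


inputs = [[1.0], [2.0], [3.0]]
print(bidirectional(running_sum_step, running_sum_step, inputs, hidden_size=1))
```

With the toy step, the first half of each output summarizes everything to the left of position t and the second half everything to the right, which is the property the bidirectional layer is used for.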
Further, the method uses the attention mechanism to select the weight that the semantic information of each character carries within the sentence semantics. The main steps are:
1) compute the similarity between the vector s_i of each character i and the vector s_j of every other character j (j ≠ i) in the sentence: e_ij = sim(s_i, s_j) = s_i · s_j;
2) normalize the similarities with the softmax function to obtain the weight factor of each character: a_ij = exp(e_ij) / Σ_{k≠i} exp(e_ik);
3) compute a weighted sum of the LSTM-layer output vectors of the characters using their weight factors, obtaining a vector that fuses the semantic contribution of the other characters in the sentence to the current character: c_i = Σ_{j≠i} a_ij [h_j h'_j];
4) splice the attention vector with the current character vector and pass the result to the output layer.
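The four attention steps (dot-product similarity, softmax normalization, weighted sum of the LSTM outputs, and splicing) can be sketched in plain Python. The function names and the order of the spliced parts (attention vector first, current character vector second) are assumptions; the sentence is assumed to contain at least two characters.

```python
import math
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def softmax(scores: List[float]) -> List[float]:
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(vectors: List[Vector], states: List[Vector], i: int) -> Vector:
    """Attention output for character i.

    vectors: character vectors used for the similarity
    states:  LSTM-layer outputs, one per character
    """
    others = [j for j in range(len(vectors)) if j != i]
    # Steps 1 and 2: similarities to the other characters, then softmax.
    weights = softmax([dot(vectors[i], vectors[j]) for j in others])
    # Step 3: weighted sum of the LSTM outputs of the other characters.
    context = [sum(w * states[j][d] for w, j in zip(weights, others))
               for d in range(len(states[0]))]
    # Step 4: splice the attention vector with the current character vector.
    return context + vectors[i]


vectors = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
states = [[1.0], [2.0], [4.0]]
print(attend(vectors, states, 0))
```

Because the three toy character vectors are identical, the two other characters receive equal weight and the attention part is simply the mean of their LSTM outputs.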
Further, the output layer combines the softmax function with a CRF, specifically:
1) let the input sequence be x = (x_1, x_2, …, x_n), the predicted label sequence be y = (y_1, y_2, …, y_n), and the output probability matrix of the bidirectional LSTM be P, of size n×k, where P_{i,j} is the probability that the i-th character is assigned the j-th label; a score s(x, y) is then defined from P and A, where k is the number of output labels, A is the state transition matrix, and A_{i,j} represents the probability of transferring from the i-th label to the j-th label;
2) solving for the maximum of s(x, y) yields the optimal output label sequence for the input sequence.
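The text above does not give the explicit form of s(x, y). The sketch below assumes the form commonly used for BiLSTM-CRF taggers, namely the sum of the emission scores P[i][y_i] plus the sum of the transition scores A[y_i][y_{i+1}], without separate start and end transitions; that form, and the function names `score` and `viterbi`, are assumptions rather than the patent's definition. Under that assumption, the maximum of s(x, y) can be found with Viterbi dynamic programming.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def score(P: Matrix, A: Matrix, y: List[int]) -> float:
    """Assumed score: emission terms plus transition terms along the path y."""
    emit = sum(P[i][y[i]] for i in range(len(y)))
    trans = sum(A[y[i]][y[i + 1]] for i in range(len(y) - 1))
    return emit + trans


def viterbi(P: Matrix, A: Matrix) -> Tuple[List[int], float]:
    """Return the label sequence maximizing `score`, with its score."""
    n, k = len(P), len(P[0])
    best = list(P[0])               # best score of any path ending in each label
    backpointers: List[List[int]] = []
    for i in range(1, n):
        prev = best
        best, pointers = [], []
        for j in range(k):
            # Best previous label for arriving at label j at position i.
            p = max(range(k), key=lambda q: prev[q] + A[q][j])
            pointers.append(p)
            best.append(prev[p] + A[p][j] + P[i][j])
        backpointers.append(pointers)
    last = max(range(k), key=lambda j: best[j])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[last]


P = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]   # toy emission scores, 3 characters, 2 labels
A = [[0.0, -5.0], [-5.0, 0.0]]             # toy transitions that penalize label changes
print(viterbi(P, A))
```

In the toy example the per-character scores alone would favor switching labels at the second character, but the transition penalty makes the constant path score higher, which illustrates how sentence-level label information changes the decoded sequence.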
The beneficial effects of the invention are: it incorporates a Chinese character representation based on medical-domain radical information; it obtains the local semantic information of medical text with a CNN; it obtains the global semantics of medical text with a bidirectional LSTM; and it uses the attention mechanism to select among the semantic information of the different characters in a sentence.
Brief description of the drawings
Fig. 1 is a schematic diagram of the representation methods for Chinese clinical medical entities according to the invention;
Fig. 2 is a framework diagram of the Chinese clinical medical entity recognition method based on deep semantic information representation;
Fig. 3 is a flow chart of obtaining the local semantic information of medical text with a CNN;
Fig. 4 is a flow chart of obtaining the global semantic information of medical text with a bidirectional LSTM;
Fig. 5 is a flow chart of semantic information selection based on the attention mechanism.
Detailed description of the embodiments
The invention is further described below with reference to the drawings and specific embodiments.
Building on an in-depth study of existing recognition methods for clinical text entities (including diseases, symptoms, body parts, examinations, treatments, and so on), the invention designs a Chinese clinical medical entity recognition method based on deep semantic information representation.
With the continuing spread of clinical information systems, a large amount of clinical data has accumulated, and the demand to analyze and use this data to assist clinical decision-making and to advance clinical research keeps growing. Medical entity recognition is one of the fundamental tasks of clinical text analysis; the related techniques are important for the development of clinical decision support systems and clinical knowledge mining research, and have drawn wide attention from academia and industry. The invention comprises two parts: 1) a representation method for Chinese clinical medical entities; 2) a recognition method for Chinese clinical medical entities.
As shown in Fig. 1, the representation method has two forms: single-label representation and multi-label representation. The former divides subsequences into three classes, H, D and N, according to whether the current subsequence must be combined with other subsequences to form a complete clinical medical entity and whether it is shared by multiple clinical medical entities. H indicates that the subsequence is shared by multiple clinical medical entities; D indicates that the subsequence must be combined with other subsequences but is not shared by multiple clinical medical entities; N indicates that the subsequence is itself a clinical medical entity. Within any subsequence, each character is tagged according to its position (for example "begin" B, "middle" M, "end" E, "single-character entity" S, and "outside any entity" O). In the latter, each entity is tagged individually according to the positions of its characters, and a part shared by multiple clinical medical entities receives multiple tags.
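The multi-label representation can be illustrated with a small Python sketch in which every entity is tagged on its own and a shared character accumulates one tag per entity. The input format (each entity given as an ordered list of character indices), the function name `multi_label_tags`, and the omission of entity types from the tags are assumptions made for illustration.

```python
from typing import List


def multi_label_tags(length: int, entities: List[List[int]]) -> List[List[str]]:
    """Tag each entity separately with B/M/E/S; untouched characters get O.

    entities: one list of character indices (in text order) per entity.
    A character belonging to several entities receives several tags.
    """
    tags: List[List[str]] = [[] for _ in range(length)]
    for indices in entities:
        if len(indices) == 1:
            tags[indices[0]].append("S")   # single-character entity
            continue
        for pos, idx in enumerate(indices):
            if pos == 0:
                tag = "B"                  # first character of the entity
            elif pos == len(indices) - 1:
                tag = "E"                  # last character of the entity
            else:
                tag = "M"                  # inner character
            tags[idx].append(tag)
    return [t if t else ["O"] for t in tags]


# Hypothetical sentence of 4 characters "肝脾肿大" with two entities,
# "肝肿大" (indices 0, 2, 3) and "脾肿大" (indices 1, 2, 3).
print(multi_label_tags(4, [[0, 2, 3], [1, 2, 3]]))
```

In this example the two shared characters each carry two tags, one per entity, so restoring the entities needs no H/D merging rule.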
The recognition method as a whole uses a deep neural network framework divided into 5 main layers: an Embedding layer, a CNN (Convolutional Neural Network) layer, an RNN (Recurrent Neural Network) layer, an Attention layer, and an output layer. The RNN layer may use LSTM (Long Short-Term Memory), GRU (Gated Recurrent Unit), and so on; only LSTM is used as the example below. The RNN layer may also have multiple layers; only one layer is used as the example below. The Embedding layer fuses the compositional features of Chinese characters (radicals carry semantic information) to give each character a deep semantic representation. The CNN layer uses a context window centered on each character to represent the character's local semantic information. The LSTM layer uses a unidirectional or bidirectional LSTM to obtain the global semantic information of the sentence (for ease of description, only the bidirectional LSTM is described in detail below). The Attention layer uses the attention mechanism to select the weight that the semantic information of each character carries within the sentence semantics. The output layer models Chinese clinical medical entity recognition with a sequence labeling algorithm, such as CRF (Conditional Random Field) or SSVM (Structural Support Vector Machine).
Fig. 2 shows the Chinese clinical medical entity recognition method based on deep semantic information representation according to the invention. Its neural network architecture includes:
Input layer: mainly produces the distributed character vector representation of the input sequence. This layer merges the distributed character representation vector 1 with the distributed representation vector 2 of Chinese clinical-domain radicals to obtain a distributed character vector that fuses radical information;
CNN layer: mainly obtains the local semantic information of medical text by convolution. The output of this layer flows into the bidirectional LSTM layer via step 3;
Bidirectional LSTM layer: mainly obtains the global semantic information of medical text with a bidirectional LSTM. The output vector of this layer flows into the Attention layer via step 4;
Attention layer: mainly uses the attention mechanism to assign different weights according to the semantic information of the different characters in the current sentence. The output vector of this layer flows into the output layer via step 5;
Output layer: exploits the strength of the CRF model on sequence labeling problems to obtain the optimal label sequence.
The distributed character vector representation that fuses the radical-based Chinese character representation is obtained with the following steps:
(1) use a distributed representation learning algorithm (such as Continuous Bag-of-Words or Skip-Gram) to learn general-purpose, high-quality distributed character vectors from a large amount of medical text. Such vectors not only avoid the curse of dimensionality caused by one-hot representations but also encode the semantic information of the vocabulary, which lays a good foundation for the later steps;
(2) build a large-scale dictionary of clinical medical entities (such as diseases, symptoms, body parts, examinations, and treatments) from public Internet resources (such as encyclopedias and vertical websites in the health and medical field), split these entities into radicals, obtain by statistics the radicals that carry the semantic information of the different types of clinical medical entities, assign each character a semantic representation vector according to its radical, and initialize that vector randomly;
(3) splice the character vector obtained in step (1) with the randomly initialized vector obtained in step (2) to form a new character vector, which serves as the input to the deep neural network model.
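Steps (2) and (3) amount to a table lookup followed by vector concatenation, which can be sketched as below. The sketch is illustrative only: the uniform initialization range, the fallback bucket `"<other>"` for characters without a listed radical, the toy pretrained vector, and all function names are assumptions that the source does not specify.

```python
import random
from typing import Dict, List

Vector = List[float]


def build_radical_table(radicals: List[str], dim: int, seed: int = 0) -> Dict[str, Vector]:
    """Randomly initialize one vector per common radical, plus a fallback
    bucket "<other>" (hypothetical) for characters with no listed radical."""
    rng = random.Random(seed)
    table = {r: [rng.uniform(-0.1, 0.1) for _ in range(dim)] for r in radicals}
    table["<other>"] = [rng.uniform(-0.1, 0.1) for _ in range(dim)]
    return table


def fused_vector(char: str,
                 char_vectors: Dict[str, Vector],
                 char_to_radical: Dict[str, str],
                 radical_table: Dict[str, Vector]) -> Vector:
    """Splice the pretrained character vector with its radical vector."""
    radical = char_to_radical.get(char, "<other>")
    radical_vec = radical_table.get(radical, radical_table["<other>"])
    return char_vectors[char] + radical_vec


# Toy data: the pretrained vector would come from step (1) in practice.
char_vectors = {"肝": [0.1, 0.2]}
char_to_radical = {"肝": "月"}
radical_table = build_radical_table(["月", "疒"], dim=3)
print(len(fused_vector("肝", char_vectors, char_to_radical, radical_table)))
```

All characters that share a radical share the same radical vector, so the radical part of the fused vector is what carries the category-level signal between characters.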
As shown in Fig. 3, obtaining the local semantic information of medical text by convolution includes the following steps:
Step 31, obtaining the character vectors: belongs to the input layer; a word2vec language model is trained on a large amount of medical text to obtain general-purpose, high-quality distributed character vectors.
Step 32, obtaining the medical-domain radical vectors: belongs to the input layer; a vector is randomly initialized for each radical in the medical-domain radical dictionary.
Step 33, obtaining the character vectors that represent Chinese characters with fused radical information: belongs to the input layer; the vectors obtained in steps 31 and 32 are spliced to give a new character vector, which serves as the input to the CNN layer.
Step 34, convolution: belongs to the CNN layer; the vectors from step 33, arranged by context window, are processed by convolution.
Step 35, pooling: belongs to the CNN layer; the feature map produced by the preceding convolution is pooled.
Step 36, obtaining the character vector that fuses the CNN layer's local semantic information: belongs to the CNN layer; the local semantic information vector from step 35 is spliced with the radical-fused character vector from step 33 to give a new character vector.
As shown in Fig. 4, obtaining the global semantic information of medical text with a bidirectional LSTM includes the following steps:
Step 41, obtaining the input vector: belongs to the CNN layer; the vector that fuses local semantic information is obtained through the convolution and pooling operations and is spliced with the radical-fused Chinese character vector to give a new character vector.
Step 42, forward LSTM: belongs to the bidirectional LSTM layer; the samples are fed into the cell in the order x_1, x_2, …, x_n, yielding a set of state outputs {h_1, h_2, …, h_n}.
Step 43, backward LSTM: belongs to the bidirectional LSTM layer in Fig. 1; the samples are fed into the cell in the order x_n, x_{n-1}, …, x_1, yielding a set of state outputs {h'_n, h'_{n-1}, …, h'_1}.
Step 44, output vector: belongs to the bidirectional LSTM layer in Fig. 1; the two sets of state variables are spliced in the form {[h_1, h'_1], [h_2, h'_2], …, [h_n, h'_n]} to obtain vectors that fuse global semantic information.
As shown in Fig. 5, using the attention mechanism to select the weight that the semantic information of each character carries within the sentence semantics includes the following steps:
Step 51, obtaining the weights: belongs to the CNN layer; the vector that fuses local semantic information is obtained through the convolution and pooling operations and is spliced with the radical-fused Chinese character vector to give a new character vector.
Step 52, normalization: the semantic contributions and weights of the other characters in the sentence with respect to the current character are normalized with the softmax function.
Step 53, weighted sum: a weighted sum over the different character vectors yields a vector that fuses the semantic contribution of the other characters in the sentence to the current character.
Step 54, output vector: the attention vector is spliced with the current character vector to form a new character vector, which flows into the output layer.
In the Chinese clinical medical entity recognition method based on deep semantic information representation, the CRF method is used to output the best label sequence. The final output label sequence is usually produced with a conventional softmax function, but that approach has limited effect when the output labels are strongly interdependent. Because neural network structures depend heavily on data, the amount of data seriously affects how well the model trains, so the LSTM is combined with a CRF. Put simply, the softmax function at the output is combined with a CRF, and the CRF makes effective use of sentence-level label information to obtain the optimal output label sequence.
In summary, the invention proposes a Chinese clinical medical entity recognition method based on deep semantic information. It incorporates a Chinese character representation based on medical-domain radical information, obtains the local semantic information of medical text with a CNN, obtains the global semantics of medical text with a bidirectional LSTM, and uses the attention mechanism to select among the semantic information of the different characters in a sentence. It inherits the advantages of deep learning: less manual feature engineering and higher precision and recall.
The above is a further detailed description of the invention in connection with specific preferred embodiments, and the specific implementation of the invention shall not be regarded as limited to these descriptions. A person of ordinary skill in the art to which the invention belongs may make simple deductions or substitutions without departing from the concept of the invention, and all of these shall be regarded as falling within the scope of protection of the invention.

Claims (6)

  1. A kind of 1. Chinese clinical treatment entity recognition method represented based on deep semantic information, it is characterised in that:The method Using deep neural network model, it is divided into 5 layers on the whole:(1) input layer, CNN layers of (2), (3) are LSTM layers two-way, (4) Attention layers, (5) output layer;The described method includes:
    During training, the sentence comprising Chinese clinical treatment entity is indicated with single label or multi-tag method for expressing first, Then model training is carried out using following steps:
    S1, using common words it is distributed represent learning algorithm training on the relevant text of a large amount of medical fields obtain word to Amount is distributed to be represented;
    S2, extract extensive medical bodies composition clinical field dictionary automatically from Vertical Website and/or medical profession system, Word in dictionary carries out radical fractionation, extracts and counts to obtain the common radical of clinical field dictionary, and will be all Word is classified by the common radical of clinical field dictionary, and is initialized immediately;
    S3, by the obtained distributed vector of step S1 and S2 be spliced to form and merged Chinese clinical treatment field radical information Chinese character deep semantic represent;
    S4, choose contextual window centered on current word, and the context part semantic information that word is obtained with CNN represents, and As LSTM layers of input;
    S5, the global semantic information expression using sentence in two-way LSTM acquisition clinical treatment texts;
    S6, using attention mechanism, by calculating the similarity of current word and other words in sentence, obtain other in sentence Word finds useful information significantly correlated with current word in sentence to the semantic contribution of current word and weight, and by attention It is vectorial to be spliced with current word vector, obtain current word context part semantic information and place sentence overall situation semantic information Depth representing;
    S7: input a Chinese medical-domain sentence and process it through steps S1-S6 to obtain its deep semantic representation sequence; with this sequence as input and the clinical medical entity label sequence as output, build the model using a sequence labeling algorithm, and fine-tune the Chinese character deep semantic representation that incorporates Chinese clinical-domain radical information;
    During testing, the Chinese character deep semantic representation of step S3 is obtained by table lookup; steps S4, S5, and S6 are then applied in turn to obtain the deep representation of each character's local contextual semantic information and the global semantic information of its sentence; finally, the sentence's deep semantic representation sequence is fed into the trained model to predict the label sequence, from which the clinical medical entities are recovered.
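    The concatenation in step S3 can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the vector sizes, the four sample characters and their radical assignments, and the use of random numbers as a stand-in for the pretrained character vectors of step S1. The sketch also assumes that the radical vectors of step S2 are randomly initialized.

```python
import random

random.seed(0)  # fixed seed so the illustration is reproducible

CHAR_DIM, RADICAL_DIM = 4, 2  # hypothetical vector sizes

# Hypothetical character -> radical classification (stand-in for step S2).
char_to_radical = {"肺": "月", "肝": "月", "病": "疒", "痛": "疒"}

# Stand-in for step S1: in practice these would be pretrained character vectors.
char_vectors = {
    c: [random.uniform(-1, 1) for _ in range(CHAR_DIM)] for c in char_to_radical
}

# Step S2 (assumed): one randomly initialized vector per radical class.
radical_vectors = {
    r: [random.uniform(-1, 1) for _ in range(RADICAL_DIM)]
    for r in sorted(set(char_to_radical.values()))
}


def embed(char):
    """Step S3: concatenate a character vector with the vector of its radical."""
    return char_vectors[char] + radical_vectors[char_to_radical[char]]


sentence = "肺病"
embedded = [embed(c) for c in sentence]  # one concatenated vector per character
```

    Characters that share a radical share the radical half of their representation, which is how radical information would reach the later layers under these assumptions.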
  2. The method according to claim 1, characterized in that, when the single-label representation is used, entities are recovered using the following rules:
    (1) if a subsequence labeled H exists, all subsequences labeled D in the same sentence are merged with the subsequence labeled H;
    (2) if no subsequence is labeled H, all subsequences labeled D are merged with one another;
    when the multi-label representation is used, the representation is unambiguous and entities are recovered directly.
  3. The method according to claim 1, characterized in that the method uses a convolutional neural network to represent the local contextual semantic information of each Chinese character within its medical text sentence, the main steps comprising:
    1) choose a fixed-size context window for each character in the medical text sentence;
    2) with the size and number of convolution kernels fixed, apply a convolution operation to each character's context window; during convolution, each character in the window is assigned a deep representation that depends on its relative position;
    3) downsample the feature vector produced by each convolution kernel through a pooling operation, the pooling operation including max pooling and average pooling; max pooling takes the maximum of the feature vector produced by each convolution kernel, and average pooling takes its mean.
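    The window convolution and pooling described above can be sketched as follows. This is an illustrative sketch, not the patented implementation: the function name `conv_pool`, the toy window, and the single kernel are hypothetical, and the relative-position representation mentioned in step 2) is left out for brevity.

```python
def conv_pool(window, kernels, pooling="max"):
    """Convolve each kernel over a context window, then pool to one value per kernel.

    window:  list of character vectors (the fixed-size context window)
    kernels: list of kernels; each kernel is a list of `width` weight vectors
    pooling: "max" for max pooling, anything else for average pooling
    """
    features = []
    for kernel in kernels:
        width = len(kernel)
        activations = []
        # Slide the kernel along the window; one activation per position.
        for start in range(len(window) - width + 1):
            total = 0.0
            for offset in range(width):
                vec, weights = window[start + offset], kernel[offset]
                total += sum(v * w for v, w in zip(vec, weights))
            activations.append(total)
        if pooling == "max":
            features.append(max(activations))
        else:
            features.append(sum(activations) / len(activations))
    return features


window = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 3 characters, 2-dimensional vectors
kernels = [[[1.0, 0.0], [0.0, 1.0]]]            # one kernel spanning 2 characters
max_feat = conv_pool(window, kernels, "max")
avg_feat = conv_pool(window, kernels, "average")
```

    Each kernel contributes one pooled value, so the local representation of a character has as many dimensions as there are kernels.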
  4. The method according to claim 1, characterized in that the method uses a bidirectional LSTM to build a global semantic representation of the medical text sentence containing the current character, the main steps comprising:
    1) concatenate the character representation vector that incorporates clinical-domain character-structure features with the character-centered local semantic representation obtained by the CNN, and use the result as the input to the bidirectional LSTM, denoted $x_t$, the deep semantic representation of the t-th Chinese character in the sentence;
    2) the input sequence of Chinese character deep semantic representations $x_1 x_2 \ldots x_n$ is processed by the forward and backward LSTM networks to give two state output sequences $h_1 h_2 \ldots h_n$ and $h'_n h'_{n-1} \ldots h'_1$; merging the two sequences gives $[h_1 h'_1][h_2 h'_2] \ldots [h_n h'_n]$, so that the global semantic information of the t-th Chinese character is represented as $[h_t h'_t]$.
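    The merging of the two state sequences in step 2) can be sketched as below. The LSTM cells themselves are not implemented; the function name `merge_states` and the toy state values are hypothetical, and the backward sequence is assumed to arrive in the order it was produced, that is, from the last character to the first.

```python
def merge_states(forward, backward):
    """Pair forward and backward LSTM states position by position.

    forward:  [h_1, h_2, ..., h_n], produced left to right
    backward: [h'_n, h'_{n-1}, ..., h'_1], produced right to left
    Returns [[h_1 h'_1], [h_2 h'_2], ..., [h_n h'_n]] as concatenated vectors.
    """
    aligned_backward = backward[::-1]  # reorder so index t matches position t
    return [h + hb for h, hb in zip(forward, aligned_backward)]


forward_states = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]   # h_1, h_2, h_3
backward_states = [[9.0, 9.1], [8.0, 8.1], [7.0, 7.1]]  # h'_3, h'_2, h'_1
global_repr = merge_states(forward_states, backward_states)
```

    After merging, position t holds both the left-to-right summary ending at t and the right-to-left summary ending at t, which is what makes the representation sentence-global.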
  5. The method according to claim 1, characterized in that the method uses the attention mechanism to select the weight that the semantic information of each character carries in the semantics of the sentence, the main steps comprising:
    1) compute the similarity between the vector $s_i$ of each character i and the vector $s_j$ of every other character j (j ≠ i) in the sentence: $e_{ij} = \mathrm{sim}(s_i, s_j) = s_i \cdot s_j$;
    2) normalize the similarities with the softmax function to obtain the weight factor of each character;
    3) compute the weighted sum of the LSTM-layer output vectors of the characters using their weight factors, giving a vector that fuses the semantic contributions of the other characters in the sentence to the current character;
    4) concatenate the attention vector with the current character vector and pass the result to the output layer.
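    The four attention steps can be sketched as follows. The formulas for steps 2) and 3) are not legible in this text, so the sketch assumes the standard forms: a softmax over the dot-product scores of the other characters, followed by a weighted sum of their LSTM outputs. The function name `attend` and the toy vectors are hypothetical.

```python
import math


def attend(s, h):
    """Dot-product attention over the other characters of a sentence.

    s: list of character vectors s_i used for the similarity scores
    h: list of LSTM-layer output vectors, one per character
    Returns, for each character, the attention vector concatenated with s_i.
    """
    n = len(s)
    outputs = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        if not others:  # single-character sentence: no context to attend to
            outputs.append([0.0] * len(h[i]) + s[i])
            continue
        # Step 1: similarity e_ij = s_i . s_j for every other character j.
        scores = [sum(a * b for a, b in zip(s[i], s[j])) for j in others]
        # Step 2 (assumed form): softmax normalization, shifted for stability.
        top = max(scores)
        exps = [math.exp(e - top) for e in scores]
        weights = [e / sum(exps) for e in exps]
        # Step 3 (assumed form): weighted sum of the LSTM outputs.
        context = [
            sum(w * h[j][d] for w, j in zip(weights, others))
            for d in range(len(h[i]))
        ]
        # Step 4: concatenate the attention vector with the current vector.
        outputs.append(context + s[i])
    return outputs


vectors = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
attended = attend(vectors, vectors)  # reusing s as h is a simplification
```

    Passing `s` and `h` separately keeps the sketch close to the claim wording, which scores with the character vectors but sums the LSTM outputs.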
  6. The method according to claim 5, characterized in that the output layer combines the softmax function with a CRF, specifically comprising:
    1) suppose the input sequence is $x = (x_1, x_2, \ldots, x_n)$, the predicted label sequence is $y = (y_1, y_2, \ldots, y_n)$, and the bidirectional LSTM outputs a probability matrix $P_{n \times k}$, where $P_{i,j}$ is the probability that the i-th character is assigned the j-th label; then define:
    $$s(x, y) = \sum_{i=0}^{n} A_{y_i, y_{i+1}} + \sum_{i=0}^{n} P_{i, y_i},$$
    where k is the number of output labels and A is the state-transition matrix, with $A_{i,j}$ the probability of transitioning from the i-th label to the j-th label;
    2) solving for the maximum of s(x, y) yields the optimal output label sequence.
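    Maximizing s(x, y) over all label sequences is usually done with Viterbi dynamic programming; the claim does not name the decoding algorithm, so its use here is an assumption. The sketch below also omits the start and end transitions implied by the summation limits, and the function name `viterbi` and the toy matrices are hypothetical.

```python
def viterbi(P, A):
    """Return the label path maximizing the sum of emission and transition scores.

    P[i][j]: score of label j at position i (n x k)
    A[a][b]: score of moving from label a to label b (k x k)
    """
    n, k = len(P), len(P[0])
    best = list(P[0])  # best score of any path ending in each label so far
    back = []          # back[i-1][j]: best previous label for label j at position i
    for i in range(1, n):
        pointers, scores = [], []
        for j in range(k):
            prev = max(range(k), key=lambda a: best[a] + A[a][j])
            pointers.append(prev)
            scores.append(best[prev] + A[prev][j] + P[i][j])
        back.append(pointers)
        best = scores
    last = max(range(k), key=lambda j: best[j])
    path = [last]
    for pointers in reversed(back):  # follow the back-pointers to the start
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[last]


P = [[3, 0], [0, 1], [0, 1]]  # toy emission scores: 3 positions, 2 labels
A = [[0, -5], [-5, 0]]        # toy transition scores that penalize label changes
best_path, best_score = viterbi(P, A)
```

    The dynamic program costs on the order of n·k² operations, compared with kⁿ for scoring every label sequence separately.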
CN201711278996.2A 2017-12-06 2017-12-06 Chinese clinical medical entity identification method based on deep semantic information representation Active CN107977361B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711278996.2A CN107977361B (en) 2017-12-06 2017-12-06 Chinese clinical medical entity identification method based on deep semantic information representation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711278996.2A CN107977361B (en) 2017-12-06 2017-12-06 Chinese clinical medical entity identification method based on deep semantic information representation

Publications (2)

Publication Number Publication Date
CN107977361A (en) 2018-05-01
CN107977361B (en) 2021-05-18

Family

ID=62009382

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711278996.2A Active CN107977361B (en) 2017-12-06 2017-12-06 Chinese clinical medical entity identification method based on deep semantic information representation

Country Status (1)

Country Link
CN (1) CN107977361B (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105894088A (en) * 2016-03-25 2016-08-24 苏州赫博特医疗信息科技有限公司 Medical information extraction system and method based on depth learning and distributed semantic features
CN106776711A (en) * 2016-11-14 2017-05-31 浙江大学 A kind of Chinese medical knowledge mapping construction method based on deep learning
CN107122416A (en) * 2017-03-31 2017-09-01 北京大学 A kind of Chinese event abstracting method
CN107092596A (en) * 2017-04-24 2017-08-25 重庆邮电大学 Text emotion analysis method based on attention CNNs and CCR

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
YANRAN LI,WENJIE LI,FEI SUN,SUJIAN LI: "《Component-Enhanced Chinese Character Embeddings》", 《PROCEEDINGS OF THE 2015 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING》 *

Cited By (95)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3564964A1 (en) * 2018-05-04 2019-11-06 Avaintec Oy Method for utilising natural language processing technology in decision-making support of abnormal state of object
CN108733837A (en) * 2018-05-28 2018-11-02 杭州依图医疗技术有限公司 A kind of the natural language structural method and device of case history text
CN108804718A (en) * 2018-06-11 2018-11-13 线粒体(北京)科技有限公司 Data push method, device, electronic equipment and computer readable storage medium
CN108875034A (en) * 2018-06-25 2018-11-23 湖南丹尼尔智能科技有限公司 A kind of Chinese Text Categorization based on stratification shot and long term memory network
CN108920586A (en) * 2018-06-26 2018-11-30 北京工业大学 A kind of short text classification method based on depth nerve mapping support vector machines
CN109214407A (en) * 2018-07-06 2019-01-15 阿里巴巴集团控股有限公司 Event detection model, calculates equipment and storage medium at method, apparatus
CN109214407B (en) * 2018-07-06 2022-04-19 创新先进技术有限公司 Event detection model, method and device, computing equipment and storage medium
CN109002436A (en) * 2018-07-12 2018-12-14 上海金仕达卫宁软件科技有限公司 Medical text terms automatic identifying method and system based on shot and long term memory network
CN109189882A (en) * 2018-08-08 2019-01-11 北京百度网讯科技有限公司 Answer type recognition methods, device, server and the storage medium of sequence content
CN109190113A (en) * 2018-08-10 2019-01-11 北京科技大学 A kind of knowledge mapping construction method of theory of traditional Chinese medical science ancient books and records
CN109190113B (en) * 2018-08-10 2021-08-31 北京科技大学 Knowledge graph construction method of traditional Chinese medicine theory book
CN109493956A (en) * 2018-10-15 2019-03-19 海口市人民医院(中南大学湘雅医学院附属海口医院) Diagnosis guiding method
CN109388807A (en) * 2018-10-30 2019-02-26 中山大学 The method, apparatus and storage medium of electronic health record name Entity recognition
CN109388807B (en) * 2018-10-30 2021-09-21 中山大学 Method, device and storage medium for identifying named entities of electronic medical records
CN111180076B (en) * 2018-11-13 2023-09-05 零氪科技(北京)有限公司 Medical information extraction method based on multi-layer semantic analysis
CN111180076A (en) * 2018-11-13 2020-05-19 零氪科技(北京)有限公司 Medical information extraction method based on multilayer semantic analysis
CN109543187A (en) * 2018-11-23 2019-03-29 中山大学 Generation method, device and the storage medium of electronic health record feature
CN109543187B (en) * 2018-11-23 2021-09-17 中山大学 Method and device for generating electronic medical record characteristics and storage medium
CN109800411A (en) * 2018-12-03 2019-05-24 哈尔滨工业大学(深圳) Clinical treatment entity and its attribute extraction method
CN109800411B (en) * 2018-12-03 2023-07-18 哈尔滨工业大学(深圳) Clinical medical entity and attribute extraction method thereof
CN109657239B (en) * 2018-12-12 2020-04-21 电子科技大学 Chinese named entity recognition method based on attention mechanism and language model learning
CN109657239A (en) * 2018-12-12 2019-04-19 电子科技大学 The Chinese name entity recognition method learnt based on attention mechanism and language model
CN109697285A (en) * 2018-12-13 2019-04-30 中南大学 Enhance the hierarchical B iLSTM Chinese electronic health record disease code mask method of semantic expressiveness
CN109710930A (en) * 2018-12-20 2019-05-03 重庆邮电大学 A kind of Chinese Resume analytic method based on deep neural network
CN109670179B (en) * 2018-12-20 2022-11-11 中山大学 Medical record text named entity identification method based on iterative expansion convolutional neural network
CN109670179A (en) * 2018-12-20 2019-04-23 中山大学 Case history text based on iteration expansion convolutional neural networks names entity recognition method
CN109885824B (en) * 2019-01-04 2024-02-20 北京捷通华声科技股份有限公司 Hierarchical Chinese named entity recognition method, hierarchical Chinese named entity recognition device and readable storage medium
CN109885824A (en) * 2019-01-04 2019-06-14 北京捷通华声科技股份有限公司 A kind of Chinese name entity recognition method, device and the readable storage medium storing program for executing of level
CN111538817A (en) * 2019-01-18 2020-08-14 北京京东尚科信息技术有限公司 Man-machine interaction method and device
CN111563380A (en) * 2019-01-25 2020-08-21 浙江大学 Named entity identification method and device
CN109918506A (en) * 2019-03-07 2019-06-21 安徽省泰岳祥升软件有限公司 A kind of file classification method and device
CN109918506B (en) * 2019-03-07 2022-12-16 安徽省泰岳祥升软件有限公司 Text classification method and device
CN109949929A (en) * 2019-03-19 2019-06-28 挂号网(杭州)科技有限公司 A kind of assistant diagnosis system based on the extensive case history of deep learning
CN109871544A (en) * 2019-03-25 2019-06-11 平安科技(深圳)有限公司 Entity recognition method, device, equipment and storage medium based on Chinese case history
CN109992783A (en) * 2019-04-03 2019-07-09 同济大学 Chinese term vector modeling method
CN110310740A (en) * 2019-04-15 2019-10-08 山东大学 Based on see a doctor again information forecasting method and the system for intersecting attention neural network
CN110110324B (en) * 2019-04-15 2022-12-02 大连理工大学 Biomedical entity linking method based on knowledge representation
CN110110324A (en) * 2019-04-15 2019-08-09 大连理工大学 A kind of biomedical entity link method that knowledge based indicates
CN110310740B (en) * 2019-04-15 2020-11-20 山东大学 Re-hospitalizing information prediction method and system based on cross attention neural network
CN110083833A (en) * 2019-04-18 2019-08-02 东华大学 Term vector joint insertion sentiment analysis method in terms of Chinese words vector sum
CN110083833B (en) * 2019-04-18 2022-12-06 东华大学 Method for analyzing emotion by jointly embedding Chinese word vector and aspect word vector
WO2020211275A1 (en) * 2019-04-18 2020-10-22 五邑大学 Pre-trained model and fine-tuning technology-based medical text relationship extraction method
CN110032739A (en) * 2019-04-18 2019-07-19 清华大学 Chinese electronic health record name entity abstracting method and system
CN110069781A (en) * 2019-04-24 2019-07-30 北京奇艺世纪科技有限公司 A kind of recognition methods of entity tag and relevant device
CN110069781B (en) * 2019-04-24 2022-11-18 北京奇艺世纪科技有限公司 Entity label identification method and related equipment
CN109994215A (en) * 2019-04-25 2019-07-09 清华大学 Disease automatic coding system, method, equipment and storage medium
CN110263324A (en) * 2019-05-16 2019-09-20 华为技术有限公司 Text handling method, model training method and device
CN110263324B (en) * 2019-05-16 2021-02-12 华为技术有限公司 Text processing method, model training method and device
CN110287483B (en) * 2019-06-06 2023-12-05 广东技术师范大学 Unregistered word recognition method and system utilizing five-stroke character root deep learning
CN110287483A (en) * 2019-06-06 2019-09-27 广东技术师范大学 A kind of unknown word identification method and system using five-stroke etymon deep learning
CN110209823A (en) * 2019-06-12 2019-09-06 齐鲁工业大学 A kind of multi-tag file classification method and system
CN110209823B (en) * 2019-06-12 2021-04-13 齐鲁工业大学 Multi-label text classification method and system
CN110298040A (en) * 2019-06-20 2019-10-01 翼健(上海)信息科技有限公司 A kind of pair of Chinese corpus is labeled the control method and control device of identification
CN110377711A (en) * 2019-07-01 2019-10-25 浙江大学 A method of open long video question-answering task is solved from attention network using layering convolution
CN110459282B (en) * 2019-07-11 2021-03-09 新华三大数据技术有限公司 Sequence labeling model training method, electronic medical record processing method and related device
CN110457682A (en) * 2019-07-11 2019-11-15 新华三大数据技术有限公司 Electronic health record part-of-speech tagging method, model training method and relevant apparatus
CN110457682B (en) * 2019-07-11 2022-08-09 新华三大数据技术有限公司 Part-of-speech tagging method for electronic medical record, model training method and related device
CN110459282A (en) * 2019-07-11 2019-11-15 新华三大数据技术有限公司 Sequence labelling model training method, electronic health record processing method and relevant apparatus
CN110378318A (en) * 2019-07-30 2019-10-25 腾讯科技(深圳)有限公司 Character recognition method, device, computer equipment and storage medium
CN110569486A (en) * 2019-07-30 2019-12-13 平安科技(深圳)有限公司 sequence labeling method and device based on double architectures and computer equipment
CN110569486B (en) * 2019-07-30 2023-01-03 平安科技(深圳)有限公司 Sequence labeling method and device based on double architectures and computer equipment
CN110378318B (en) * 2019-07-30 2022-07-15 腾讯科技(深圳)有限公司 Character recognition method and device, computer equipment and storage medium
CN110569343B (en) * 2019-08-16 2023-05-09 华东理工大学 Clinical text structuring method based on question and answer
CN110569343A (en) * 2019-08-16 2019-12-13 华东理工大学 question and answer based clinical text structuring method
CN110569506A (en) * 2019-09-05 2019-12-13 清华大学 Medical named entity recognition method based on medical dictionary
CN110598212A (en) * 2019-09-05 2019-12-20 清华大学 Rapid named body identification method
CN110688855A (en) * 2019-09-29 2020-01-14 山东师范大学 Chinese medical entity identification method and system based on machine learning
CN110825875B (en) * 2019-11-01 2022-12-06 科大讯飞股份有限公司 Text entity type identification method and device, electronic equipment and storage medium
CN110825875A (en) * 2019-11-01 2020-02-21 科大讯飞股份有限公司 Text entity type identification method and device, electronic equipment and storage medium
CN111160023A (en) * 2019-12-23 2020-05-15 华南理工大学 Medical text named entity identification method based on multi-way recall
CN111160023B (en) * 2019-12-23 2023-06-20 华南理工大学 Medical text named entity recognition method based on multi-way recall
CN111090987A (en) * 2019-12-27 2020-05-01 北京百度网讯科技有限公司 Method and apparatus for outputting information
US11507748B2 (en) 2019-12-27 2022-11-22 Beijing Baidu Netcom Science And Technology Co., Ltd. Method and apparatus for outputting information
CN111581970B (en) * 2020-05-12 2023-01-24 厦门市美亚柏科信息股份有限公司 Text recognition method, device and storage medium for network context
CN111581970A (en) * 2020-05-12 2020-08-25 厦门市美亚柏科信息股份有限公司 Text recognition method, device and storage medium for network context
CN111597814B (en) * 2020-05-22 2023-05-26 北京慧闻科技(集团)有限公司 Man-machine interaction named entity recognition method, device, equipment and storage medium
CN111597814A (en) * 2020-05-22 2020-08-28 北京慧闻科技(集团)有限公司 Man-machine interaction named entity recognition method, device, equipment and storage medium
CN112101034B (en) * 2020-09-09 2024-02-27 沈阳东软智能医疗科技研究院有限公司 Method and device for judging attribute of medical entity and related product
CN112101034A (en) * 2020-09-09 2020-12-18 沈阳东软智能医疗科技研究院有限公司 Method and device for distinguishing attribute of medical entity and related product
CN112528653B (en) * 2020-12-02 2023-11-28 支付宝(杭州)信息技术有限公司 Short text entity recognition method and system
CN112528653A (en) * 2020-12-02 2021-03-19 支付宝(杭州)信息技术有限公司 Short text entity identification method and system
CN112597774A (en) * 2020-12-14 2021-04-02 山东师范大学 Chinese medical named entity recognition method, system, storage medium and equipment
CN112599211A (en) * 2020-12-25 2021-04-02 中电云脑(天津)科技有限公司 Medical entity relationship extraction method and device
CN112599211B (en) * 2020-12-25 2023-03-21 中电云脑(天津)科技有限公司 Medical entity relationship extraction method and device
CN113035303A (en) * 2021-02-09 2021-06-25 北京工业大学 Method and system for labeling named entity category of Chinese electronic medical record
CN112925995A (en) * 2021-02-22 2021-06-08 北京百度网讯科技有限公司 Method and device for acquiring POI state information
US11977574B2 (en) 2021-02-22 2024-05-07 Beijing Baidu Netcom Science And Technology Co., Ltd. Method and apparatus for acquiring POI state information
CN112925995B (en) * 2021-02-22 2022-01-28 北京百度网讯科技有限公司 Method and device for acquiring POI state information
CN113569575A (en) * 2021-08-10 2021-10-29 云南电网有限责任公司电力科学研究院 Evaluation expert recommendation method based on pictograph-semantic dual-feature space mapping
CN113569575B (en) * 2021-08-10 2024-02-09 云南电网有限责任公司电力科学研究院 Evaluation expert recommendation method based on pictographic-semantic dual-feature space mapping
CN113948217A (en) * 2021-11-23 2022-01-18 重庆邮电大学 Medical nested named entity recognition method based on local feature integration
CN114300081A (en) * 2022-03-09 2022-04-08 四川大学华西医院 Prediction device, system and storage medium based on electronic medical record multi-modal data
CN114300081B (en) * 2022-03-09 2022-05-27 四川大学华西医院 Prediction device, system and storage medium based on electronic medical record multi-modal data
CN114648029A (en) * 2022-03-31 2022-06-21 河海大学 Electric power field named entity identification method based on BiLSTM-CRF model
CN114927177A (en) * 2022-05-27 2022-08-19 浙江工业大学 Medical entity identification method and system fusing Chinese medical field characteristics

Also Published As

Publication number Publication date
CN107977361B (en) 2021-05-18

Similar Documents

Publication Publication Date Title
CN107977361A (en) The Chinese clinical treatment entity recognition method represented based on deep semantic information
CN109800411B (en) Clinical medical entity and attribute extraction method thereof
CN109669994B (en) Construction method and system of health knowledge map
Yin et al. Chinese clinical named entity recognition with radical-level feature and self-attention mechanism
Yu et al. Automatic ICD code assignment of Chinese clinical notes based on multilayer attention BiRNN
Lee et al. Machine learning in relation to emergency medicine clinical and operational scenarios: an overview
CN111834014A (en) Medical field named entity identification method and system
CN110032739A (en) Chinese electronic health record name entity abstracting method and system
CN108875809A (en) The biomedical entity relationship classification method of joint attention mechanism and neural network
CN113553440B (en) Medical entity relationship extraction method based on hierarchical reasoning
CN113901207B (en) Adverse drug reaction detection method based on data enhancement and semi-supervised learning
CN111695354A (en) Text question-answering method and device based on named entity and readable storage medium
CN114864076A (en) Multi-modal breast cancer classification training method and system based on graph attention network
CN111881292B (en) Text classification method and device
CN110444261A (en) Sequence labelling network training method, electronic health record processing method and relevant apparatus
CN114742059A (en) Chinese electronic medical record named entity identification method based on multitask learning
CN111582506A (en) Multi-label learning method based on global and local label relation
CN114781382A (en) Medical named entity recognition system and method based on RWLSTM model fusion
CN106407387A (en) A concept connection method for medical diagnosis texts
Cheng et al. Integration of automatic sentence segmentation and lexical analysis of ancient Chinese based on BiLSTM-CRF model
CN112216379A (en) Disease diagnosis system based on intelligent joint learning
Wu et al. AGNet: Automatic generation network for skin imaging reports
Liang et al. Disease prediction based on multi-type data fusion from Chinese electronic health record
Wang et al. Toxic comment classification based on bidirectional gated recurrent unit and convolutional neural network
CN116881336A (en) Efficient multi-mode contrast depth hash retrieval method for medical big data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant