CN107977361A - Chinese clinical medical entity recognition method based on deep semantic information representation - Google Patents
- Publication number
- CN107977361A (application CN201711278996.2A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2413—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
- G06F18/24133—Distances to prototypes
- G06F18/24137—Distances to cluster centroïds
- G06F18/2414—Smoothing the distance, e.g. radial basis function networks [RBFN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/237—Lexical tools
- G06F40/242—Dictionaries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/295—Named entity recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- General Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Evolutionary Computation (AREA)
- Computing Systems (AREA)
- Molecular Biology (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Databases & Information Systems (AREA)
- Machine Translation (AREA)
Abstract
The present invention proposes a Chinese clinical medical entity recognition method based on deep semantic information representation. It has two parts: 1) a representation method for Chinese clinical medical entities; 2) a recognition method for Chinese clinical medical entities. Two representation methods are provided: single-label representation and multi-label representation. The recognition method incorporates a Chinese character representation based on radical information from the medical domain, obtains the local semantic information of medical text with a CNN, obtains the global semantics of medical text with a bidirectional LSTM, and uses an attention mechanism to select among the semantic information of the different characters in a sentence. The invention inherits the advantages of deep learning: it needs less manual feature engineering and achieves higher precision and recall.
Description
Technical field
The invention belongs to the technical field of intelligent healthcare, and in particular relates to a Chinese clinical medical entity recognition method based on deep semantic information representation.
Background art
With the development of medical information technology, clinical information systems have become increasingly widespread and a large amount of clinical medical data has become available. These data can support clinical decisions and can also be used for research in clinical and translational medicine. One major obstacle to using them is that the useful information contained in medical records cannot be used directly by computer systems. Text analysis techniques that extract structured information from raw text are the key to removing this obstacle. Medical entity recognition is one of the fundamental tasks of clinical text analysis. Its techniques matter greatly for the development of clinical decision support systems and of clinical knowledge mining research, and it has received wide attention from academia and industry.
Over the past two to three decades, researchers have studied clinical medical entity recognition methods for many types of clinical records and have developed a number of application systems. Existing methods mainly target English clinical text; research on clinical medical entity recognition for electronic medical records in other languages is comparatively scarce. The methods used are mainly rule matching, statistical machine learning, and combinations of the two. Rule-based matching requires medical domain experts to write the rules, which is costly and ports poorly to new settings. Statistical machine learning requires manually extracting a large number of reliable features to reach high recognition performance.
In recent years, with the rise of deep learning in natural language processing, deep learning has been widely applied to morphology, syntax, word sense disambiguation, semantic analysis, information extraction, question answering and other areas. For named entity recognition, a fundamental task of text analysis, early window-based deep neural network models were already used to recognize named entities in the general domain, and their performance exceeded that of statistical machine learning algorithms. In 2015 a series of named entity recognition works combining RNNs with CRFs appeared; in the general domain their results exceeded those of the mainstream feature-rich named entity recognition models (such as CRF). In the clinical medical domain, however, statistical machine learning is still the mainstream technique for medical entity recognition, and entity recognition methods for Chinese clinical text remain relatively little studied.
Compared with English, Chinese has characteristics of its own: it is a logographic language, and radicals carry a large amount of the semantic information of characters and words. Applying English text analysis techniques directly to Chinese text often does not give the best performance. When processing Chinese, it is therefore necessary to design suitable methods that start from the characteristics of Chinese itself.
Summary of the invention
To solve the problem of named entity recognition in the Chinese medical domain, the present invention starts from the characteristics of Chinese itself and provides a Chinese clinical medical entity recognition method based on deep semantic information representation. The method has the following features: 1) two new representation methods are designed according to the forms that clinical medical entities take; 2) the compositional structure of Chinese characters is fully considered, and the radicals that carry character semantics are given a deep representation; 3) a deep neural network is designed that considers both the local context information of characters and the global information of the sentence.
The present invention adopts the following technical solution:
A Chinese clinical medical entity recognition method based on deep semantic information representation, characterized in that the method uses a deep neural network model divided into five layers overall: (1) an input layer, (2) a CNN layer, (3) a bidirectional LSTM layer, (4) an Attention layer, and (5) an output layer. The method comprises:
During training, sentences containing Chinese clinical medical entities are first represented with the single-label or multi-label representation method, and the model is then trained with the following steps:
S1. Train a common distributed representation learning algorithm for characters and words on a large amount of medical-domain text to obtain distributed character vectors;
S2. Automatically extract large-scale medical entities from vertical websites and/or medical business systems to build a clinical-domain dictionary, split the characters in the dictionary into radicals, extract and count the radicals to obtain the common radicals of the clinical-domain dictionary, classify all characters by these common radicals, and initialize the corresponding vectors randomly;
S3. Concatenate the distributed vectors obtained in steps S1 and S2 to form a deep semantic representation of Chinese characters that incorporates radical information from the Chinese clinical medical domain;
S4. Select a context window centered on the current character, obtain the local context semantic representation of the character with a CNN, and use it as the input of the LSTM layer;
S5. Use a bidirectional LSTM to obtain the global semantic representation of each sentence in the clinical text;
S6. Use the attention mechanism: compute the similarity between the current character and the other characters in the sentence to obtain the semantic contribution and weight of each other character with respect to the current character, find the information in the sentence that is significantly related to the current character, and concatenate the attention vector with the current character vector to obtain a deep representation of both the local context semantics of the current character and the global semantics of its sentence;
S7. For an input sentence from the Chinese medical domain, steps S1-S6 yield its deep semantic representation sequence. With this sequence as input and the label sequence representing the clinical medical entities as output, build the model with a sequence labeling algorithm, and fine-tune the deep semantic representation of Chinese characters that incorporates radical information from the Chinese clinical medical domain;
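The radical statistics of step S2 can be illustrated with a small sketch. It is not the patent's implementation: the `RADICAL_OF` table, the `min_count` threshold and both function names are hypothetical, and a real system would load a complete character decomposition resource and a dictionary harvested from the sources named in S2.

```python
from collections import Counter

# Hypothetical toy character-to-radical table (assumption for illustration only).
RADICAL_OF = {"病": "疒", "痛": "疒", "肝": "月", "肺": "月", "咳": "口", "嗽": "口"}


def common_radicals(dictionary, min_count=2):
    """Count radicals over every character of every entry; keep the frequent ones."""
    counts = Counter(
        RADICAL_OF[ch] for entry in dictionary for ch in entry if ch in RADICAL_OF
    )
    return {radical for radical, n in counts.items() if n >= min_count}


def group_by_radical(dictionary, radicals):
    """Classify the dictionary's characters by the selected common radicals."""
    groups = {radical: set() for radical in radicals}
    for entry in dictionary:
        for ch in entry:
            radical = RADICAL_OF.get(ch)
            if radical in groups:
                groups[radical].add(ch)
    return groups


entities = ["肝病", "肺病", "头痛", "咳嗽"]
radicals = common_radicals(entities)
print(group_by_radical(entities, radicals))
```

Each radical group would then receive one randomly initialized vector, which S3 concatenates with the pre-trained character vector.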
During testing, the deep semantic representation of each Chinese character from step S3 is obtained by table lookup; steps S4, S5 and S6 are then applied in turn to obtain the deep representation of the local context semantics of the current character and the global semantics of its sentence; finally, the deep semantic representation sequence of the sentence is fed to the trained model to predict the label sequence, from which the clinical medical entities are restored.
Further, when single-label representation is used, entities are restored with the following rules: (1) if there is a subsequence labeled H, every subsequence labeled D in the same phrase or sentence is merged with the subsequence labeled H; (2) if there is no subsequence labeled H, all subsequences labeled D are merged together. When multi-label representation is used, the representation is unambiguous and entities are restored directly.
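A minimal sketch of the two restoration rules for single-label representation follows. The function name, the input format (subsequences listed in textual order with their H/D/N class) and the choice to join merged parts in textual order are assumptions made for illustration; the example phrases are likewise invented.

```python
def restore_entities(subsequences):
    """Restore entities from (text, cls) pairs of one phrase, cls in {"H", "D", "N"}.

    Assumption: merged parts are joined in their original textual order.
    """
    # N subsequences are complete entities on their own.
    entities = [text for text, cls in subsequences if cls == "N"]
    shared = [(i, t) for i, (t, c) in enumerate(subsequences) if c == "H"]
    dependent = [(i, t) for i, (t, c) in enumerate(subsequences) if c == "D"]
    if shared:
        # Rule 1: each D subsequence is merged with the shared H subsequence(s).
        for i, text in dependent:
            parts = sorted(shared + [(i, text)])
            entities.append("".join(t for _, t in parts))
    elif dependent:
        # Rule 2: without any H, all D subsequences form a single entity.
        entities.append("".join(t for _, t in dependent))
    return entities


print(restore_entities([("左", "D"), ("右", "D"), ("肺结节", "H")]))
print(restore_entities([("腹部", "D"), ("疼痛", "D")]))
```

In the first call the shared part is attached to each D part separately; in the second, with no H present, the two D parts collapse into one entity.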
Further, the method uses a convolutional neural network to represent the local context semantic information of each Chinese character within its medical text sentence. The main steps are:
1) For each character in the medical text sentence, select a context window of fixed size;
2) With the size and number of convolution kernels fixed, apply the convolution operation to the context window of each character; during convolution, each character in the window is given a deep representation that depends on its relative position;
3) Down-sample the feature vector produced by each convolution kernel with a pooling operation, which may be max pooling or average pooling. Max pooling takes the maximum of the feature vector produced by each kernel; average pooling takes its mean.
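The convolution and pooling steps above can be sketched in plain Python. This is only an illustration under simplifying assumptions: the window and kernel values are made up, there is no bias or non-linearity, the relative-position representation of step 2) is omitted, and the function names are hypothetical.

```python
def conv_window(window, kernel):
    """Slide one kernel (a list of weight rows, one per covered character)
    over a context window of character vectors; return one value per position."""
    width = len(kernel)
    features = []
    for start in range(len(window) - width + 1):
        total = 0.0
        for offset in range(width):
            weights, vector = kernel[offset], window[start + offset]
            total += sum(w * x for w, x in zip(weights, vector))
        features.append(total)
    return features


def pool(features, mode="max"):
    """Down-sample a kernel's feature vector by max pooling or average pooling."""
    return max(features) if mode == "max" else sum(features) / len(features)


def local_features(window, kernels, mode="max"):
    """One pooled value per kernel: the local semantic vector of the center character."""
    return [pool(conv_window(window, kernel), mode) for kernel in kernels]


window = [[1, 0], [0, 1], [1, 1]]   # three characters, two dimensions each
kernel = [[1, 0], [0, 1]]           # one kernel spanning two characters
print(local_features(window, [kernel], mode="max"))
print(local_features(window, [kernel], mode="mean"))
```

With several kernels, the pooled values together form the local semantic vector that is later concatenated with the character vector.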
Further, the method uses a bidirectional LSTM to build a global semantic representation of the medical text sentence that contains the current character. The main steps are:
1) The character representation vector that incorporates the structural features of characters in the clinical medical domain is concatenated with the character-centered local semantic representation obtained by the CNN, and the result is the input of the bidirectional LSTM. It is denoted $x_t$, the deep semantic representation of the t-th Chinese character in the sentence;
2) The input sequence of deep semantic representations $x_1 x_2 \dots x_n$ is processed by the forward and backward LSTM networks, giving two state output sequences $h_1 h_2 \dots h_n$ and $h'_n h'_{n-1} \dots h'_1$. The two sequences are merged into $[h_1 h'_1][h_2 h'_2] \dots [h_n h'_n]$; for the t-th Chinese character, the global semantic representation is $[h_t h'_t]$.
Further, the method uses the attention mechanism to select the weight of each character's semantic information within the sentence semantics. The main steps are:
1) Compute the similarity between the vector $s_i$ of each character i and the vector $s_j$ of every other character j (j ≠ i) in the sentence: $e_{ij} = \mathrm{sim}(s_i, s_j) = s_i \cdot s_j$;
2) Normalize the similarities with the softmax function to obtain the weight factor of each character: $\alpha_{ij} = \exp(e_{ij}) / \sum_{k \neq i} \exp(e_{ik})$;
3) Compute the weighted sum of the LSTM layer output vectors $h_j$ of the characters, using their weight factors, to obtain the vector that fuses the semantic contribution of the other characters in the sentence to the current character: $c_i = \sum_{j \neq i} \alpha_{ij} h_j$;
4) Concatenate the Attention vector with the current character vector and pass the result to the output layer.
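A compact sketch of the four attention steps is given below. It assumes dot-product similarity, a softmax over the other characters only (j ≠ i), and concatenation of the attention vector with $s_i$; the function name and the toy vectors are illustrative, and the sentence is assumed to contain at least two characters.

```python
import math


def attend(s, h, i):
    """s: character vectors used for similarity; h: LSTM output vectors.

    Returns (weights over the other characters, [attention vector ; s_i])."""
    others = [j for j in range(len(s)) if j != i]
    # Step 1: dot-product similarity e_ij between character i and every other character j.
    scores = [sum(a * b for a, b in zip(s[i], s[j])) for j in others]
    # Step 2: softmax normalization (shifted by the maximum for numerical stability).
    top = max(scores)
    exps = [math.exp(e - top) for e in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Step 3: weighted sum of the LSTM outputs of the other characters.
    context = [
        sum(w * h[j][d] for w, j in zip(weights, others)) for d in range(len(h[0]))
    ]
    # Step 4: concatenate the attention vector with the current character vector.
    return weights, context + list(s[i])


s = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
h = [[0.0], [2.0], [4.0]]
print(attend(s, h, 0))
```

Because all three toy character vectors are identical, the two other characters receive equal weight and the attention vector is the plain average of their LSTM outputs.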
Further, the output layer combines the softmax function with a CRF. Specifically:
1) Let the input sequence be $x = (x_1, x_2, \dots, x_n)$, the predicted label sequence be $y = (y_1, y_2, \dots, y_n)$, and the output probability matrix of the bidirectional LSTM be $P_{n \times k}$, where $P_{i,j}$ is the probability that the i-th character is assigned the j-th label. The score of a label sequence is then defined as:
$$s(x, y) = \sum_{i=1}^{n} P_{i, y_i} + \sum_{i=1}^{n-1} A_{y_i, y_{i+1}}$$
where k is the number of output labels and A is the state transition matrix, with $A_{i,j}$ the probability of moving from the i-th label to the j-th label;
2) Maximizing s(x, y) gives the optimal output label sequence for the sequence.
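Finding the label sequence that maximizes s(x, y) is typically done with Viterbi decoding. The sketch below assumes the score is the sum of the per-position entries $P_{i, y_i}$ plus the transition entries $A_{y_i, y_{i+1}}$ between adjacent labels, without separate start or end labels; the function name and the toy matrices are illustrative.

```python
def viterbi(P, A):
    """Return (best label sequence, best score) for emission matrix P (n x k)
    and transition matrix A (k x k), maximizing sum(P) + sum(A) along the path."""
    n, k = len(P), len(P[0])
    score = list(P[0])          # best score of any path ending in each label
    back = []                   # back-pointers for positions 1 .. n-1
    for i in range(1, n):
        new_score, pointers = [], []
        for j in range(k):
            best_prev = max(range(k), key=lambda p: score[p] + A[p][j])
            new_score.append(score[best_prev] + A[best_prev][j] + P[i][j])
            pointers.append(best_prev)
        score = new_score
        back.append(pointers)
    last = max(range(k), key=lambda j: score[j])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, max(score)


P = [[3, 0], [0, 1], [3, 0]]      # per-position label scores
A = [[0, -5], [-5, 0]]            # switching labels is penalized
print(viterbi(P, A))
```

Choosing the best label independently at each position would give the path 0, 1, 0 here; the transition penalty makes the decoder prefer the consistent path 0, 0, 0, which is the kind of sentence-level label information the CRF contributes.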
The beneficial effects of the invention are: it incorporates a Chinese character representation based on radical information from the medical domain; it obtains the local semantic information of medical text with a CNN; it obtains the global semantics of medical text with a bidirectional LSTM; and it uses the Attention mechanism to select among the semantic information of the different characters in a sentence.
Brief description of the drawings
Fig. 1 is a schematic diagram of the representation methods for Chinese clinical medical entities of the present invention;
Fig. 2 is a framework diagram of the Chinese clinical medical entity recognition method based on deep semantic information representation;
Fig. 3 is a flow chart of obtaining the local semantic information of medical text with a CNN;
Fig. 4 is a flow chart of obtaining the global semantic information of medical text with a bidirectional LSTM;
Fig. 5 is a flow chart of semantic information selection based on the Attention mechanism.
Detailed description of the embodiments
The present invention is further described below with reference to the drawings and specific embodiments.
Building on an in-depth study of existing methods for recognizing entities in clinical medical text (including diseases, symptoms, body parts, examinations, treatments and so on), the present invention designs a Chinese clinical medical entity recognition method based on deep semantic information representation.
As clinical information systems continue to spread, clinical medical data has accumulated in large quantities, and the demand to analyze and use these data to assist clinical decision-making and to advance clinical research keeps growing. Medical entity recognition is one of the fundamental tasks of clinical text analysis; its techniques matter greatly for the development of clinical decision support systems and clinical knowledge mining research, and it has received wide attention from academia and industry. The present invention has two parts: 1) a representation method for Chinese clinical medical entities; 2) a recognition method for Chinese clinical medical entities.
As shown in Fig. 1, there are two representation methods: single-label representation and multi-label representation. The former divides subsequences into three classes, H, D and N, according to whether the current subsequence must be combined with other subsequences to form a complete clinical medical entity and whether it is shared by multiple clinical medical entities. H means the subsequence is shared by multiple clinical medical entities, D means the subsequence is not shared by multiple clinical medical entities, and N means the subsequence is by itself a clinical medical entity. Within any subsequence, each character is marked according to its position, for example "begin" (B), "middle" (M), "end" (E), "single-character entity" (S) and "outside any entity" (O). The latter marks each entity separately according to character position, so a part shared by multiple clinical medical entities is marked multiple times.
The recognition method as a whole uses a deep neural network framework with five main layers: an Embedding layer, a CNN (Convolutional Neural Network) layer, an RNN (Recurrent Neural Network) layer, an Attention layer and an output layer. The RNN layer may use LSTM (Long Short-Term Memory), GRU (Gated Recurrent Unit) and similar units, and may also be stacked in several layers; the description below uses a single LSTM layer as the example. The Embedding layer incorporates the compositional structure of Chinese characters (radicals carry semantic information) to give each character a deep semantic representation. The CNN layer uses a context window centered on a character to represent the character's local semantic information. The LSTM layer uses a unidirectional or bidirectional LSTM to obtain the global semantic information of the sentence (for ease of description, only the bidirectional LSTM is detailed below). The Attention layer uses the attention mechanism to select the weight of each character's semantic information within the sentence semantics. The output layer uses a sequence labeling algorithm, such as CRF (Conditional Random Field) or SSVM (Structural Support Vector Machine), to model Chinese clinical medical entity recognition.
Fig. 2 shows the Chinese clinical medical entity recognition method based on deep semantic information representation of the present invention. Its neural network architecture comprises:
Input layer: produces the distributed vector representation of the input sequence. The layer merges the distributed character vector (1) with the distributed vector of the radical in the Chinese clinical medical domain (2) to obtain a distributed character vector that incorporates radical information;
CNN layer: obtains the local semantic information of the medical text by convolution. The output of this layer flows into the bidirectional LSTM layer through step 3;
Bidirectional LSTM layer: obtains the global semantic information of the medical text with a bidirectional LSTM. The output vector of this layer flows into the Attention layer through step 4;
Attention layer: uses the Attention mechanism to assign different weights according to the semantic information of the different characters in the current sentence. The output vector of this layer flows into the output layer through step 5;
Output layer: exploits the strength of CRF models on sequence labeling problems to obtain the optimal label sequence.
The distributed vector representation that incorporates the radical-based Chinese character representation is obtained with the following steps:
(1) A distributed representation learning algorithm (such as Continuous Bag-of-Words or Skip-Gram) is trained on a large amount of medical text to learn good, general-purpose distributed character vectors. These vectors avoid the curse of dimensionality caused by one-hot representations and also encode the semantic information of the vocabulary, laying a good foundation for the subsequent work;
(2) A large-scale dictionary of clinical medical entities (diseases, symptoms, body parts, examinations, treatments and so on) is built from public Internet resources, such as encyclopedias and vertical websites in the health and medical field. The entities are split into radicals, the radicals that carry the semantic information of the different types of clinical medical entity are identified by statistics, and each character is assigned a semantic representation vector according to its radical, initialized randomly;
(3) The vector obtained in step (1) and the randomly initialized vector obtained in step (2) are concatenated to form a new character vector, which is the input of the deep neural network model.
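The concatenation in step (3) can be sketched as two lookup tables whose entries are joined per character. In this illustration random values stand in for the pre-trained CBOW or Skip-Gram vectors, and the dimensions, the radical table and the function name are all assumptions.

```python
import random


def build_embedder(chars, radical_of, char_dim=4, radical_dim=2, seed=0):
    """Return embed(ch) = [character vector ; radical vector]."""
    rng = random.Random(seed)
    # Stand-in for pre-trained character vectors (assumption: random values).
    char_vec = {c: [rng.uniform(-1, 1) for _ in range(char_dim)] for c in chars}
    # One randomly initialized vector per radical, shared by all characters with it.
    radical_vec = {
        r: [rng.uniform(-0.1, 0.1) for _ in range(radical_dim)]
        for r in sorted(set(radical_of.values()))
    }

    def embed(ch):
        return char_vec[ch] + radical_vec[radical_of[ch]]

    return embed


radical_of = {"肝": "月", "肺": "月", "痛": "疒"}   # hypothetical toy table
embed = build_embedder("肝肺痛", radical_of)
print(len(embed("肝")), embed("肝")[4:] == embed("肺")[4:])
```

Characters that share a radical share the radical part of their vector, which is how radical-level semantics reaches characters that are rare in the training text.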
As shown in Fig. 3, the local semantic information of the medical text is obtained by convolution with the following steps:
Step 31, obtaining the character vectors: belongs to the input layer. A language model is trained with word2vec on a large amount of medical text to obtain good, general-purpose distributed character vectors.
Step 32, obtaining the medical-domain radical vectors: belongs to the input layer. The radicals in the medical-domain radical dictionary are given randomly initialized vectors.
Step 33, obtaining the character vectors that incorporate radical information: belongs to the input layer. The vectors obtained in steps 31 and 32 are concatenated into a new character vector, which is the input of the CNN layer.
Step 34, convolution: belongs to the CNN layer. The vectors from step 33, taken over the context window, are convolved.
Step 35, pooling: belongs to the CNN layer. The feature maps of the preceding convolution are pooled.
Step 36, obtaining the character vectors that incorporate local semantic information from the CNN layer: belongs to the CNN layer. The local semantic vector from step 35 and the radical-aware character vector from step 33 are concatenated into a new character vector.
As shown in Fig. 4, the global semantic information of the medical text is obtained with a bidirectional LSTM in the following steps:
Step 41, obtaining the input vector: belongs to the CNN layer. Convolution and pooling give a vector that carries local semantic information; concatenating it with the radical-aware character vector gives the new character vector.
Step 42, forward LSTM: belongs to the bidirectional LSTM layer. The samples are fed into the cell in the order $x_1, x_2, \dots, x_n$, giving a set of state outputs $\{h_1, h_2, \dots, h_n\}$.
Step 43, backward LSTM: belongs to the bidirectional LSTM layer in Fig. 1. The samples are fed into the cell in the order $x_n, x_{n-1}, \dots, x_1$, giving a set of state outputs $\{h'_n, h'_{n-1}, \dots, h'_1\}$.
Step 44, output vector: belongs to the bidirectional LSTM layer in Fig. 1. The two sets of state variables are concatenated in the form $\{[h_1, h'_1], [h_2, h'_2], \dots, [h_n, h'_n]\}$ to obtain the vectors that incorporate global semantic information.
As shown in Fig. 5, the Attention mechanism is used to select the weight of each character's semantic information within the sentence semantics, in the following steps:
Step 51, obtaining the weights: belongs to the CNN layer. Convolution and pooling give a vector that carries local semantic information; concatenating it with the radical-aware character vector gives the new character vector.
Step 52, normalization: the semantic contributions and weights of the other characters in the sentence with respect to the current character are normalized with the softmax function.
Step 53, weighted sum: a weighted sum over the different character vectors gives the vector that fuses the semantic contribution of the other characters in the sentence to the current character.
Step 54, output vector: the Attention vector is concatenated with the current character vector to form a new character vector, which flows into the output layer.
In the Chinese clinical medical entity recognition method based on deep semantic information representation, the optimal label sequence is output with a CRF. The final label sequence is usually produced with a conventional softmax function, but that approach has limited effect when the output labels of the data are strongly interdependent. Neural network structures also depend heavily on data, and the amount of data can seriously affect how well the model trains, so the LSTM is combined with a CRF. Put simply, the softmax function is combined with a CRF at the output, and the CRF makes effective use of sentence-level label information to obtain the optimal output label sequence.
In summary, the present invention proposes a Chinese clinical medical entity recognition method based on deep semantic information. It incorporates a Chinese character representation based on radical information from the medical domain, obtains the local semantic information of medical text with a CNN, obtains the global semantics of medical text with a bidirectional LSTM, and uses the Attention mechanism to select among the semantic information of the different characters in a sentence. It inherits the advantages of deep learning: less manual feature engineering and higher precision and recall.
The above is a further detailed description of the present invention in connection with specific preferred embodiments, and the specific implementation of the invention should not be regarded as limited to these descriptions. A person of ordinary skill in the technical field of the invention may make simple deductions or substitutions without departing from the inventive concept, and all of these shall be regarded as falling within the protection scope of the invention.
Claims (6)
- 1. A Chinese clinical medical entity recognition method based on deep semantic information representation, characterized in that the method uses a deep neural network model divided into five layers overall: (1) an input layer, (2) a CNN layer, (3) a bidirectional LSTM layer, (4) an Attention layer, and (5) an output layer; the method comprises: during training, sentences containing Chinese clinical medical entities are first represented with the single-label or multi-label representation method, and the model is then trained with the following steps: S1, train a common distributed representation learning algorithm for characters and words on a large amount of medical-domain text to obtain distributed character vectors; S2, automatically extract large-scale medical entities from vertical websites and/or medical business systems to build a clinical-domain dictionary, split the characters in the dictionary into radicals, extract and count the radicals to obtain the common radicals of the clinical-domain dictionary, classify all characters by these common radicals, and initialize the corresponding vectors randomly; S3, concatenate the distributed vectors obtained in steps S1 and S2 to form a deep semantic representation of Chinese characters that incorporates radical information from the Chinese clinical medical domain; S4, select a context window centered on the current character, obtain the local context semantic representation of the character with a CNN, and use it as the input of the LSTM layer; S5, use a bidirectional LSTM to obtain the global semantic representation of each sentence in the clinical text; S6, use the attention mechanism: compute the similarity between the current character and the other characters in the sentence to obtain the semantic contribution and weight of each other character with respect to the current character, find the information in the sentence that is significantly related to the current character, and concatenate the attention vector with the current character vector to obtain a deep representation of both the local context semantics of the current character and the global semantics of its sentence; S7, for an input sentence from the Chinese medical domain, steps S1-S6 yield its deep semantic representation sequence; with this sequence as input and the label sequence representing the clinical medical entities as output, build the model with a sequence labeling algorithm, and fine-tune the deep semantic representation of Chinese characters that incorporates radical information from the Chinese clinical medical domain; during testing, the deep semantic representation of each Chinese character from step S3 is obtained by table lookup, steps S4, S5 and S6 are then applied in turn to obtain the deep representation of the local context semantics of the current character and the global semantics of its sentence, and finally the deep semantic representation sequence of the sentence is fed to the trained model to predict the label sequence, from which the clinical medical entities are restored.
- 2. The method according to claim 1, characterized in that: when single-label representation is used, entities are restored with the following rules: (1) if there is a subsequence labeled H, every subsequence labeled D in the same phrase or sentence is merged with the subsequence labeled H; (2) if there is no subsequence labeled H, all subsequences labeled D are merged together; when multi-label representation is used, the representation is unambiguous and entities are restored directly.
- 3. The method according to claim 1, characterized in that: the method uses a convolutional neural network to represent the local contextual semantic information of a Chinese character within its medical text sentence, the main steps being:
  1) selecting a context window of fixed size for each character in the medical text sentence;
  2) with the size and number of convolution kernels fixed, performing a convolution over the context window of each character, during which each character in the window is assigned a deep representation related to its relative position;
  3) down-sampling the feature vector produced by each convolution kernel with a pooling operation, the pooling operation including max pooling and average pooling; wherein max pooling takes the maximum of the feature vector produced by each convolution kernel, and average pooling takes its mean.
- 4. The method according to claim 1, characterized in that: the method uses a bidirectional LSTM to produce a global semantic representation of the medical text sentence containing the current character, the main steps including:
  1) concatenating the character vector that incorporates clinical-domain character structure features with the character-centered local semantic representation obtained by the CNN, and using the result as the input of the bidirectional LSTM, denoted $x_t$, which is the deep semantic representation of the $t$-th Chinese character in the sentence;
  2) processing the input sequence of Chinese character deep semantic representations $x_1 x_2 \ldots x_n$ with the forward and backward LSTM networks to obtain two state output sequences $h_1 h_2 \ldots h_n$ and $h'_n h'_{n-1} \ldots h'_1$, and merging the two sequences to obtain $[h_1 h'_1][h_2 h'_2] \ldots [h_n h'_n]$; for the $t$-th Chinese character, its global semantic information is represented as $[h_t h'_t]$.
- 5. The method according to claim 1, characterized in that: the method uses an attention mechanism to select the weight of each character's semantic information within the semantics of the sentence, the main steps including:
  1) computing the similarity between the vector $s_i$ of each character $i$ and the vector $s_j$ of every other character $j$ ($j \neq i$) in the sentence: $e_{ij} = \mathrm{sim}(s_i, s_j) = s_i \cdot s_j$;
  2) normalizing the similarities with the softmax function to obtain the weight factor of each character;
  3) computing a weighted sum of the LSTM-layer output vectors of the characters using their weight factors, yielding a vector that fuses the semantic contributions of the other characters in the sentence to the current character;
  4) concatenating the attention vector with the current character vector and passing the result to the output layer.
- 6. The method according to claim 5, characterized in that: the output layer combines the softmax function with a CRF, specifically including:
  1) assuming the input sequence is $x = (x_1, x_2, \ldots, x_n)$, the predicted label sequence is $y = (y_1, y_2, \ldots, y_n)$, and the output probability matrix of the bidirectional LSTM is $P_{n \times k}$, where $P_{i,j}$ is the probability that the $i$-th character is tagged with the $j$-th label, the following score is defined:
  $$s(x, y) = \sum_{i=0}^{n} A_{y_i, y_{i+1}} + \sum_{i=0}^{n} P_{i, y_i},$$
  where $k$ is the number of output labels, $A$ is the state transition matrix, and $A_{i,j}$ is the probability of transitioning from the $i$-th label to the $j$-th label;
  2) solving for the maximum of $s(x, y)$ yields the optimal output label sequence for the input sequence.
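Steps S1 to S3 of claim 1 describe building each character's input vector by concatenating a character vector with a vector for its radical class. A minimal Python sketch of that lookup-and-concatenate step follows. The four-character radical table, the vector sizes, the uniform initialization range, and the names `init_table` and `embed` are illustrative assumptions; in the claimed method the character vectors come from pretraining on medical text (S1), not from random initialization.

```python
import random

# Hypothetical toy radical table: each character maps to its radical class.
CHAR_TO_RADICAL = {"病": "疒", "疼": "疒", "肺": "月", "肝": "月"}

def init_table(keys, dim, rng):
    """Randomly initialise one vector of length `dim` per key."""
    return {key: [rng.uniform(-0.1, 0.1) for _ in range(dim)] for key in keys}

rng = random.Random(0)
# Stand-in for the pretrained character vectors of step S1 (assumption).
char_vectors = init_table(CHAR_TO_RADICAL, 4, rng)
# Randomly initialised radical-class vectors, as in step S2.
radical_vectors = init_table(sorted(set(CHAR_TO_RADICAL.values())), 2, rng)

def embed(ch):
    """Step S3: concatenate the character vector with its radical vector."""
    return char_vectors[ch] + radical_vectors[CHAR_TO_RADICAL[ch]]
```

Characters that share a radical, such as 病 and 疼, share the last two components of their embedding, which is how the radical information reaches the later layers in this sketch.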
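The window, convolution, and pooling steps of claim 3 can be sketched in plain Python as follows. This is a simplified illustration under stated assumptions: a single kernel, zero-style padding at the sentence edges supplied by the caller, and no relative-position representation (which the claim does include). The function names are hypothetical.

```python
def context_window(sentence, center, half_width, pad):
    """Fixed-size window of vectors centred on `center`; positions outside the sentence use `pad`."""
    return [sentence[i] if 0 <= i < len(sentence) else pad
            for i in range(center - half_width, center + half_width + 1)]

def convolve(window, kernel):
    """Slide one kernel (a list of per-position weight vectors) over the window.

    Returns one feature value per valid kernel position.
    """
    width = len(kernel)
    features = []
    for start in range(len(window) - width + 1):
        total = 0.0
        for offset in range(width):
            total += sum(w * x for w, x in zip(kernel[offset], window[start + offset]))
        features.append(total)
    return features

def max_pool(features):
    """Max pooling: keep the largest feature value of one kernel."""
    return max(features)

def avg_pool(features):
    """Average pooling: keep the mean feature value of one kernel."""
    return sum(features) / len(features)
```

For a sentence of one-dimensional vectors `[[1.0], [2.0], [3.0]]`, calling `context_window(sentence, 0, 1, [0.0])` pads the left edge, and pooling the output of `convolve` then reduces each kernel's feature vector to a single number, as described in step 3).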
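The attention steps of claim 5 can be illustrated with a short Python sketch. The symbols $\alpha_{ij}$ and $c_i$ are assumed notation, not taken from the source: the softmax of step 2) is written here as $\alpha_{ij} = \exp(e_{ij}) / \sum_{k \neq i} \exp(e_{ik})$ and the weighted sum of step 3) as $c_i = \sum_{j \neq i} \alpha_{ij} h_j$. The sketch also assumes that the same vectors serve both as the similarity inputs $s$ and as the LSTM-layer outputs $h$, and that the sentence has at least two characters.

```python
import math

def dot(a, b):
    """Dot-product similarity e_ij = s_i . s_j."""
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(vectors, i):
    """Attention vector for position i over all other positions, concatenated with vectors[i]."""
    others = [j for j in range(len(vectors)) if j != i]
    weights = softmax([dot(vectors[i], vectors[j]) for j in others])
    dim = len(vectors[i])
    context = [sum(w * vectors[j][d] for w, j in zip(weights, others))
               for d in range(dim)]
    # Step 4): attention vector first, then the current character vector.
    return context + vectors[i]
```

The returned list has twice the input dimension: the first half is the weighted sum over the other positions, the second half is the current position's own vector.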
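Claim 6 scores a label sequence as the sum of transition and emission terms and selects the sequence with the maximum score. One common way to find that maximum is Viterbi dynamic programming, sketched below in plain Python. The claim does not name the decoding algorithm, so Viterbi is an assumption here, as are the function names. The sketch also omits start and end labels, so both sums run only over positions inside the sentence, which differs from the index range $i = 0, \ldots, n$ in the claim.

```python
def sequence_score(P, A, y):
    """s(x, y): sum of emission scores P[i][y_i] plus transition scores A[y_i][y_{i+1}]."""
    emission = sum(P[i][y[i]] for i in range(len(y)))
    transition = sum(A[y[i]][y[i + 1]] for i in range(len(y) - 1))
    return emission + transition

def viterbi(P, A):
    """Return (best label sequence, its score) under sequence_score."""
    n, k = len(P), len(P[0])
    best = list(P[0])   # best[j]: best score of any path ending in label j
    back = []           # back[i-1][j]: previous label on the best path to label j at position i
    for i in range(1, n):
        new_best, pointers = [], []
        for j in range(k):
            prev = max(range(k), key=lambda p: best[p] + A[p][j])
            new_best.append(best[prev] + A[prev][j] + P[i][j])
            pointers.append(prev)
        best = new_best
        back.append(pointers)
    last = max(range(k), key=lambda j: best[j])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[last]
```

With strongly negative off-diagonal entries in `A`, the decoder prefers a label sequence that stays on one label even when the per-position scores in `P` alternate, which is the effect the transition matrix is meant to provide.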
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711278996.2A CN107977361B (en) | 2017-12-06 | 2017-12-06 | Chinese clinical medical entity identification method based on deep semantic information representation |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711278996.2A CN107977361B (en) | 2017-12-06 | 2017-12-06 | Chinese clinical medical entity identification method based on deep semantic information representation |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107977361A (en) | 2018-05-01 |
CN107977361B (en) | 2021-05-18 |
Family
ID=62009382
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711278996.2A Active CN107977361B (en) | 2017-12-06 | 2017-12-06 | Chinese clinical medical entity identification method based on deep semantic information representation |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107977361B (en) |
Cited By (61)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108733837A (en) * | 2018-05-28 | 2018-11-02 | 杭州依图医疗技术有限公司 | A kind of the natural language structural method and device of case history text |
CN108804718A (en) * | 2018-06-11 | 2018-11-13 | 线粒体(北京)科技有限公司 | Data push method, device, electronic equipment and computer readable storage medium |
CN108875034A (en) * | 2018-06-25 | 2018-11-23 | 湖南丹尼尔智能科技有限公司 | A kind of Chinese Text Categorization based on stratification shot and long term memory network |
CN108920586A (en) * | 2018-06-26 | 2018-11-30 | 北京工业大学 | A kind of short text classification method based on depth nerve mapping support vector machines |
CN109002436A (en) * | 2018-07-12 | 2018-12-14 | 上海金仕达卫宁软件科技有限公司 | Medical text terms automatic identifying method and system based on shot and long term memory network |
CN109190113A (en) * | 2018-08-10 | 2019-01-11 | 北京科技大学 | A kind of knowledge mapping construction method of theory of traditional Chinese medical science ancient books and records |
CN109189882A (en) * | 2018-08-08 | 2019-01-11 | 北京百度网讯科技有限公司 | Answer type recognition methods, device, server and the storage medium of sequence content |
CN109214407A (en) * | 2018-07-06 | 2019-01-15 | 阿里巴巴集团控股有限公司 | Event detection model, calculates equipment and storage medium at method, apparatus |
CN109388807A (en) * | 2018-10-30 | 2019-02-26 | 中山大学 | The method, apparatus and storage medium of electronic health record name Entity recognition |
CN109493956A (en) * | 2018-10-15 | 2019-03-19 | 海口市人民医院(中南大学湘雅医学院附属海口医院) | Diagnosis guiding method |
CN109543187A (en) * | 2018-11-23 | 2019-03-29 | 中山大学 | Generation method, device and the storage medium of electronic health record feature |
CN109657239A (en) * | 2018-12-12 | 2019-04-19 | 电子科技大学 | The Chinese name entity recognition method learnt based on attention mechanism and language model |
CN109670179A (en) * | 2018-12-20 | 2019-04-23 | 中山大学 | Case history text based on iteration expansion convolutional neural networks names entity recognition method |
CN109697285A (en) * | 2018-12-13 | 2019-04-30 | 中南大学 | Enhance the hierarchical B iLSTM Chinese electronic health record disease code mask method of semantic expressiveness |
CN109710930A (en) * | 2018-12-20 | 2019-05-03 | 重庆邮电大学 | A kind of Chinese Resume analytic method based on deep neural network |
CN109800411A (en) * | 2018-12-03 | 2019-05-24 | 哈尔滨工业大学(深圳) | Clinical treatment entity and its attribute extraction method |
CN109871544A (en) * | 2019-03-25 | 2019-06-11 | 平安科技(深圳)有限公司 | Entity recognition method, device, equipment and storage medium based on Chinese case history |
CN109885824A (en) * | 2019-01-04 | 2019-06-14 | 北京捷通华声科技股份有限公司 | A kind of Chinese name entity recognition method, device and the readable storage medium storing program for executing of level |
CN109918506A (en) * | 2019-03-07 | 2019-06-21 | 安徽省泰岳祥升软件有限公司 | A kind of file classification method and device |
CN109949929A (en) * | 2019-03-19 | 2019-06-28 | 挂号网(杭州)科技有限公司 | A kind of assistant diagnosis system based on the extensive case history of deep learning |
CN109992783A (en) * | 2019-04-03 | 2019-07-09 | 同济大学 | Chinese term vector modeling method |
CN109994215A (en) * | 2019-04-25 | 2019-07-09 | 清华大学 | Disease automatic coding system, method, equipment and storage medium |
CN110032739A (en) * | 2019-04-18 | 2019-07-19 | 清华大学 | Chinese electronic health record name entity abstracting method and system |
CN110069781A (en) * | 2019-04-24 | 2019-07-30 | 北京奇艺世纪科技有限公司 | A kind of recognition methods of entity tag and relevant device |
CN110083833A (en) * | 2019-04-18 | 2019-08-02 | 东华大学 | Term vector joint insertion sentiment analysis method in terms of Chinese words vector sum |
CN110110324A (en) * | 2019-04-15 | 2019-08-09 | 大连理工大学 | A kind of biomedical entity link method that knowledge based indicates |
CN110209823A (en) * | 2019-06-12 | 2019-09-06 | 齐鲁工业大学 | A kind of multi-tag file classification method and system |
CN110263324A (en) * | 2019-05-16 | 2019-09-20 | 华为技术有限公司 | Text handling method, model training method and device |
CN110287483A (en) * | 2019-06-06 | 2019-09-27 | 广东技术师范大学 | A kind of unknown word identification method and system using five-stroke etymon deep learning |
CN110298040A (en) * | 2019-06-20 | 2019-10-01 | 翼健(上海)信息科技有限公司 | A kind of pair of Chinese corpus is labeled the control method and control device of identification |
CN110310740A (en) * | 2019-04-15 | 2019-10-08 | 山东大学 | Based on see a doctor again information forecasting method and the system for intersecting attention neural network |
CN110377711A (en) * | 2019-07-01 | 2019-10-25 | 浙江大学 | A method of open long video question-answering task is solved from attention network using layering convolution |
CN110378318A (en) * | 2019-07-30 | 2019-10-25 | 腾讯科技(深圳)有限公司 | Character recognition method, device, computer equipment and storage medium |
EP3564964A1 (en) * | 2018-05-04 | 2019-11-06 | Avaintec Oy | Method for utilising natural language processing technology in decision-making support of abnormal state of object |
CN110459282A (en) * | 2019-07-11 | 2019-11-15 | 新华三大数据技术有限公司 | Sequence labelling model training method, electronic health record processing method and relevant apparatus |
CN110457682A (en) * | 2019-07-11 | 2019-11-15 | 新华三大数据技术有限公司 | Electronic health record part-of-speech tagging method, model training method and relevant apparatus |
CN110569343A (en) * | 2019-08-16 | 2019-12-13 | 华东理工大学 | question and answer based clinical text structuring method |
CN110569506A (en) * | 2019-09-05 | 2019-12-13 | 清华大学 | Medical named entity recognition method based on medical dictionary |
CN110569486A (en) * | 2019-07-30 | 2019-12-13 | 平安科技(深圳)有限公司 | sequence labeling method and device based on double architectures and computer equipment |
CN110598212A (en) * | 2019-09-05 | 2019-12-20 | 清华大学 | Rapid named body identification method |
CN110688855A (en) * | 2019-09-29 | 2020-01-14 | 山东师范大学 | Chinese medical entity identification method and system based on machine learning |
CN110825875A (en) * | 2019-11-01 | 2020-02-21 | 科大讯飞股份有限公司 | Text entity type identification method and device, electronic equipment and storage medium |
CN111090987A (en) * | 2019-12-27 | 2020-05-01 | 北京百度网讯科技有限公司 | Method and apparatus for outputting information |
CN111160023A (en) * | 2019-12-23 | 2020-05-15 | 华南理工大学 | Medical text named entity identification method based on multi-way recall |
CN111180076A (en) * | 2018-11-13 | 2020-05-19 | 零氪科技(北京)有限公司 | Medical information extraction method based on multilayer semantic analysis |
CN111538817A (en) * | 2019-01-18 | 2020-08-14 | 北京京东尚科信息技术有限公司 | Man-machine interaction method and device |
CN111563380A (en) * | 2019-01-25 | 2020-08-21 | 浙江大学 | Named entity identification method and device |
CN111581970A (en) * | 2020-05-12 | 2020-08-25 | 厦门市美亚柏科信息股份有限公司 | Text recognition method, device and storage medium for network context |
CN111597814A (en) * | 2020-05-22 | 2020-08-28 | 北京慧闻科技(集团)有限公司 | Man-machine interaction named entity recognition method, device, equipment and storage medium |
WO2020211275A1 (en) * | 2019-04-18 | 2020-10-22 | 五邑大学 | Pre-trained model and fine-tuning technology-based medical text relationship extraction method |
CN112101034A (en) * | 2020-09-09 | 2020-12-18 | 沈阳东软智能医疗科技研究院有限公司 | Method and device for distinguishing attribute of medical entity and related product |
CN112528653A (en) * | 2020-12-02 | 2021-03-19 | 支付宝(杭州)信息技术有限公司 | Short text entity identification method and system |
CN112599211A (en) * | 2020-12-25 | 2021-04-02 | 中电云脑(天津)科技有限公司 | Medical entity relationship extraction method and device |
CN112597774A (en) * | 2020-12-14 | 2021-04-02 | 山东师范大学 | Chinese medical named entity recognition method, system, storage medium and equipment |
CN112925995A (en) * | 2021-02-22 | 2021-06-08 | 北京百度网讯科技有限公司 | Method and device for acquiring POI state information |
CN113035303A (en) * | 2021-02-09 | 2021-06-25 | 北京工业大学 | Method and system for labeling named entity category of Chinese electronic medical record |
CN113569575A (en) * | 2021-08-10 | 2021-10-29 | 云南电网有限责任公司电力科学研究院 | Evaluation expert recommendation method based on pictograph-semantic dual-feature space mapping |
CN113948217A (en) * | 2021-11-23 | 2022-01-18 | 重庆邮电大学 | Medical nested named entity recognition method based on local feature integration |
CN114300081A (en) * | 2022-03-09 | 2022-04-08 | 四川大学华西医院 | Prediction device, system and storage medium based on electronic medical record multi-modal data |
CN114648029A (en) * | 2022-03-31 | 2022-06-21 | 河海大学 | Electric power field named entity identification method based on BiLSTM-CRF model |
CN114927177A (en) * | 2022-05-27 | 2022-08-19 | 浙江工业大学 | Medical entity identification method and system fusing Chinese medical field characteristics |
- 2017-12-06: CN application CN201711278996.2A, patent CN107977361B (en), status Active
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105894088A (en) * | 2016-03-25 | 2016-08-24 | 苏州赫博特医疗信息科技有限公司 | Medical information extraction system and method based on depth learning and distributed semantic features |
CN106776711A (en) * | 2016-11-14 | 2017-05-31 | 浙江大学 | A kind of Chinese medical knowledge mapping construction method based on deep learning |
CN107122416A (en) * | 2017-03-31 | 2017-09-01 | 北京大学 | A kind of Chinese event abstracting method |
CN107092596A (en) * | 2017-04-24 | 2017-08-25 | 重庆邮电大学 | Text emotion analysis method based on attention CNNs and CCR |
Non-Patent Citations (1)
Title |
---|
Yanran Li, Wenjie Li, Fei Sun, Sujian Li: "Component-Enhanced Chinese Character Embeddings", Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing * |
Cited By (95)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP3564964A1 (en) * | 2018-05-04 | 2019-11-06 | Avaintec Oy | Method for utilising natural language processing technology in decision-making support of abnormal state of object |
CN108733837A (en) * | 2018-05-28 | 2018-11-02 | 杭州依图医疗技术有限公司 | A kind of the natural language structural method and device of case history text |
CN108804718A (en) * | 2018-06-11 | 2018-11-13 | 线粒体(北京)科技有限公司 | Data push method, device, electronic equipment and computer readable storage medium |
CN108875034A (en) * | 2018-06-25 | 2018-11-23 | 湖南丹尼尔智能科技有限公司 | A kind of Chinese Text Categorization based on stratification shot and long term memory network |
CN108920586A (en) * | 2018-06-26 | 2018-11-30 | 北京工业大学 | A kind of short text classification method based on depth nerve mapping support vector machines |
CN109214407A (en) * | 2018-07-06 | 2019-01-15 | 阿里巴巴集团控股有限公司 | Event detection model, calculates equipment and storage medium at method, apparatus |
CN109214407B (en) * | 2018-07-06 | 2022-04-19 | 创新先进技术有限公司 | Event detection model, method and device, computing equipment and storage medium |
CN109002436A (en) * | 2018-07-12 | 2018-12-14 | 上海金仕达卫宁软件科技有限公司 | Medical text terms automatic identifying method and system based on shot and long term memory network |
CN109189882A (en) * | 2018-08-08 | 2019-01-11 | 北京百度网讯科技有限公司 | Answer type recognition methods, device, server and the storage medium of sequence content |
CN109190113A (en) * | 2018-08-10 | 2019-01-11 | 北京科技大学 | A kind of knowledge mapping construction method of theory of traditional Chinese medical science ancient books and records |
CN109190113B (en) * | 2018-08-10 | 2021-08-31 | 北京科技大学 | Knowledge graph construction method of traditional Chinese medicine theory book |
CN109493956A (en) * | 2018-10-15 | 2019-03-19 | 海口市人民医院(中南大学湘雅医学院附属海口医院) | Diagnosis guiding method |
CN109388807A (en) * | 2018-10-30 | 2019-02-26 | 中山大学 | The method, apparatus and storage medium of electronic health record name Entity recognition |
CN109388807B (en) * | 2018-10-30 | 2021-09-21 | 中山大学 | Method, device and storage medium for identifying named entities of electronic medical records |
CN111180076B (en) * | 2018-11-13 | 2023-09-05 | 零氪科技(北京)有限公司 | Medical information extraction method based on multi-layer semantic analysis |
CN111180076A (en) * | 2018-11-13 | 2020-05-19 | 零氪科技(北京)有限公司 | Medical information extraction method based on multilayer semantic analysis |
CN109543187A (en) * | 2018-11-23 | 2019-03-29 | 中山大学 | Generation method, device and the storage medium of electronic health record feature |
CN109543187B (en) * | 2018-11-23 | 2021-09-17 | 中山大学 | Method and device for generating electronic medical record characteristics and storage medium |
CN109800411A (en) * | 2018-12-03 | 2019-05-24 | 哈尔滨工业大学(深圳) | Clinical treatment entity and its attribute extraction method |
CN109800411B (en) * | 2018-12-03 | 2023-07-18 | 哈尔滨工业大学(深圳) | Clinical medical entity and attribute extraction method thereof |
CN109657239B (en) * | 2018-12-12 | 2020-04-21 | 电子科技大学 | Chinese named entity recognition method based on attention mechanism and language model learning |
CN109657239A (en) * | 2018-12-12 | 2019-04-19 | 电子科技大学 | The Chinese name entity recognition method learnt based on attention mechanism and language model |
CN109697285A (en) * | 2018-12-13 | 2019-04-30 | 中南大学 | Enhance the hierarchical B iLSTM Chinese electronic health record disease code mask method of semantic expressiveness |
CN109710930A (en) * | 2018-12-20 | 2019-05-03 | 重庆邮电大学 | A kind of Chinese Resume analytic method based on deep neural network |
CN109670179B (en) * | 2018-12-20 | 2022-11-11 | 中山大学 | Medical record text named entity identification method based on iterative expansion convolutional neural network |
CN109670179A (en) * | 2018-12-20 | 2019-04-23 | 中山大学 | Case history text based on iteration expansion convolutional neural networks names entity recognition method |
CN109885824B (en) * | 2019-01-04 | 2024-02-20 | 北京捷通华声科技股份有限公司 | Hierarchical Chinese named entity recognition method, hierarchical Chinese named entity recognition device and readable storage medium |
CN109885824A (en) * | 2019-01-04 | 2019-06-14 | 北京捷通华声科技股份有限公司 | A kind of Chinese name entity recognition method, device and the readable storage medium storing program for executing of level |
CN111538817A (en) * | 2019-01-18 | 2020-08-14 | 北京京东尚科信息技术有限公司 | Man-machine interaction method and device |
CN111563380A (en) * | 2019-01-25 | 2020-08-21 | 浙江大学 | Named entity identification method and device |
CN109918506A (en) * | 2019-03-07 | 2019-06-21 | 安徽省泰岳祥升软件有限公司 | A kind of file classification method and device |
CN109918506B (en) * | 2019-03-07 | 2022-12-16 | 安徽省泰岳祥升软件有限公司 | Text classification method and device |
CN109949929A (en) * | 2019-03-19 | 2019-06-28 | 挂号网(杭州)科技有限公司 | A kind of assistant diagnosis system based on the extensive case history of deep learning |
CN109871544A (en) * | 2019-03-25 | 2019-06-11 | 平安科技(深圳)有限公司 | Entity recognition method, device, equipment and storage medium based on Chinese case history |
CN109992783A (en) * | 2019-04-03 | 2019-07-09 | 同济大学 | Chinese term vector modeling method |
CN110310740A (en) * | 2019-04-15 | 2019-10-08 | 山东大学 | Based on see a doctor again information forecasting method and the system for intersecting attention neural network |
CN110110324B (en) * | 2019-04-15 | 2022-12-02 | 大连理工大学 | Biomedical entity linking method based on knowledge representation |
CN110110324A (en) * | 2019-04-15 | 2019-08-09 | 大连理工大学 | A kind of biomedical entity link method that knowledge based indicates |
CN110310740B (en) * | 2019-04-15 | 2020-11-20 | 山东大学 | Re-hospitalizing information prediction method and system based on cross attention neural network |
CN110083833A (en) * | 2019-04-18 | 2019-08-02 | 东华大学 | Term vector joint insertion sentiment analysis method in terms of Chinese words vector sum |
CN110083833B (en) * | 2019-04-18 | 2022-12-06 | 东华大学 | Method for analyzing emotion by jointly embedding Chinese word vector and aspect word vector |
WO2020211275A1 (en) * | 2019-04-18 | 2020-10-22 | 五邑大学 | Pre-trained model and fine-tuning technology-based medical text relationship extraction method |
CN110032739A (en) * | 2019-04-18 | 2019-07-19 | 清华大学 | Chinese electronic health record name entity abstracting method and system |
CN110069781A (en) * | 2019-04-24 | 2019-07-30 | 北京奇艺世纪科技有限公司 | A kind of recognition methods of entity tag and relevant device |
CN110069781B (en) * | 2019-04-24 | 2022-11-18 | 北京奇艺世纪科技有限公司 | Entity label identification method and related equipment |
CN109994215A (en) * | 2019-04-25 | 2019-07-09 | 清华大学 | Disease automatic coding system, method, equipment and storage medium |
CN110263324A (en) * | 2019-05-16 | 2019-09-20 | 华为技术有限公司 | Text handling method, model training method and device |
CN110263324B (en) * | 2019-05-16 | 2021-02-12 | 华为技术有限公司 | Text processing method, model training method and device |
CN110287483B (en) * | 2019-06-06 | 2023-12-05 | 广东技术师范大学 | Unregistered word recognition method and system utilizing five-stroke character root deep learning |
CN110287483A (en) * | 2019-06-06 | 2019-09-27 | 广东技术师范大学 | A kind of unknown word identification method and system using five-stroke etymon deep learning |
CN110209823A (en) * | 2019-06-12 | 2019-09-06 | 齐鲁工业大学 | A kind of multi-tag file classification method and system |
CN110209823B (en) * | 2019-06-12 | 2021-04-13 | 齐鲁工业大学 | Multi-label text classification method and system |
CN110298040A (en) * | 2019-06-20 | 2019-10-01 | 翼健(上海)信息科技有限公司 | A kind of pair of Chinese corpus is labeled the control method and control device of identification |
CN110377711A (en) * | 2019-07-01 | 2019-10-25 | 浙江大学 | A method of open long video question-answering task is solved from attention network using layering convolution |
CN110459282B (en) * | 2019-07-11 | 2021-03-09 | 新华三大数据技术有限公司 | Sequence labeling model training method, electronic medical record processing method and related device |
CN110457682A (en) * | 2019-07-11 | 2019-11-15 | 新华三大数据技术有限公司 | Electronic health record part-of-speech tagging method, model training method and relevant apparatus |
CN110457682B (en) * | 2019-07-11 | 2022-08-09 | 新华三大数据技术有限公司 | Part-of-speech tagging method for electronic medical record, model training method and related device |
CN110459282A (en) * | 2019-07-11 | 2019-11-15 | 新华三大数据技术有限公司 | Sequence labelling model training method, electronic health record processing method and relevant apparatus |
CN110378318A (en) * | 2019-07-30 | 2019-10-25 | 腾讯科技(深圳)有限公司 | Character recognition method, device, computer equipment and storage medium |
CN110569486A (en) * | 2019-07-30 | 2019-12-13 | 平安科技(深圳)有限公司 | sequence labeling method and device based on double architectures and computer equipment |
CN110569486B (en) * | 2019-07-30 | 2023-01-03 | 平安科技(深圳)有限公司 | Sequence labeling method and device based on double architectures and computer equipment |
CN110378318B (en) * | 2019-07-30 | 2022-07-15 | 腾讯科技(深圳)有限公司 | Character recognition method and device, computer equipment and storage medium |
CN110569343B (en) * | 2019-08-16 | 2023-05-09 | 华东理工大学 | Clinical text structuring method based on question and answer |
CN110569343A (en) * | 2019-08-16 | 2019-12-13 | 华东理工大学 | question and answer based clinical text structuring method |
CN110569506A (en) * | 2019-09-05 | 2019-12-13 | 清华大学 | Medical named entity recognition method based on medical dictionary |
CN110598212A (en) * | 2019-09-05 | 2019-12-20 | 清华大学 | Rapid named body identification method |
CN110688855A (en) * | 2019-09-29 | 2020-01-14 | 山东师范大学 | Chinese medical entity identification method and system based on machine learning |
CN110825875B (en) * | 2019-11-01 | 2022-12-06 | 科大讯飞股份有限公司 | Text entity type identification method and device, electronic equipment and storage medium |
CN110825875A (en) * | 2019-11-01 | 2020-02-21 | 科大讯飞股份有限公司 | Text entity type identification method and device, electronic equipment and storage medium |
CN111160023A (en) * | 2019-12-23 | 2020-05-15 | 华南理工大学 | Medical text named entity identification method based on multi-way recall |
CN111160023B (en) * | 2019-12-23 | 2023-06-20 | 华南理工大学 | Medical text named entity recognition method based on multi-way recall |
CN111090987A (en) * | 2019-12-27 | 2020-05-01 | 北京百度网讯科技有限公司 | Method and apparatus for outputting information |
US11507748B2 (en) | 2019-12-27 | 2022-11-22 | Beijing Baidu Netcom Science And Technology Co., Ltd. | Method and apparatus for outputting information |
CN111581970B (en) * | 2020-05-12 | 2023-01-24 | 厦门市美亚柏科信息股份有限公司 | Text recognition method, device and storage medium for network context |
CN111581970A (en) * | 2020-05-12 | 2020-08-25 | 厦门市美亚柏科信息股份有限公司 | Text recognition method, device and storage medium for network context |
CN111597814B (en) * | 2020-05-22 | 2023-05-26 | 北京慧闻科技(集团)有限公司 | Man-machine interaction named entity recognition method, device, equipment and storage medium |
CN111597814A (en) * | 2020-05-22 | 2020-08-28 | 北京慧闻科技(集团)有限公司 | Man-machine interaction named entity recognition method, device, equipment and storage medium |
CN112101034B (en) * | 2020-09-09 | 2024-02-27 | 沈阳东软智能医疗科技研究院有限公司 | Method and device for judging attribute of medical entity and related product |
CN112101034A (en) * | 2020-09-09 | 2020-12-18 | 沈阳东软智能医疗科技研究院有限公司 | Method and device for distinguishing attribute of medical entity and related product |
CN112528653B (en) * | 2020-12-02 | 2023-11-28 | 支付宝(杭州)信息技术有限公司 | Short text entity recognition method and system |
CN112528653A (en) * | 2020-12-02 | 2021-03-19 | 支付宝(杭州)信息技术有限公司 | Short text entity identification method and system |
CN112597774A (en) * | 2020-12-14 | 2021-04-02 | 山东师范大学 | Chinese medical named entity recognition method, system, storage medium and equipment |
CN112599211A (en) * | 2020-12-25 | 2021-04-02 | 中电云脑(天津)科技有限公司 | Medical entity relationship extraction method and device |
CN112599211B (en) * | 2020-12-25 | 2023-03-21 | 中电云脑(天津)科技有限公司 | Medical entity relationship extraction method and device |
CN113035303A (en) * | 2021-02-09 | 2021-06-25 | 北京工业大学 | Method and system for labeling named entity category of Chinese electronic medical record |
CN112925995A (en) * | 2021-02-22 | 2021-06-08 | 北京百度网讯科技有限公司 | Method and device for acquiring POI state information |
US11977574B2 (en) | 2021-02-22 | 2024-05-07 | Beijing Baidu Netcom Science And Technology Co., Ltd. | Method and apparatus for acquiring POI state information |
CN112925995B (en) * | 2021-02-22 | 2022-01-28 | 北京百度网讯科技有限公司 | Method and device for acquiring POI state information |
CN113569575A (en) * | 2021-08-10 | 2021-10-29 | 云南电网有限责任公司电力科学研究院 | Evaluation expert recommendation method based on pictograph-semantic dual-feature space mapping |
CN113569575B (en) * | 2021-08-10 | 2024-02-09 | 云南电网有限责任公司电力科学研究院 | Evaluation expert recommendation method based on pictographic-semantic dual-feature space mapping |
CN113948217A (en) * | 2021-11-23 | 2022-01-18 | 重庆邮电大学 | Medical nested named entity recognition method based on local feature integration |
CN114300081A (en) * | 2022-03-09 | 2022-04-08 | 四川大学华西医院 | Prediction device, system and storage medium based on electronic medical record multi-modal data |
CN114300081B (en) * | 2022-03-09 | 2022-05-27 | 四川大学华西医院 | Prediction device, system and storage medium based on electronic medical record multi-modal data |
CN114648029A (en) * | 2022-03-31 | 2022-06-21 | 河海大学 | Electric power field named entity identification method based on BiLSTM-CRF model |
CN114927177A (en) * | 2022-05-27 | 2022-08-19 | 浙江工业大学 | Medical entity identification method and system fusing Chinese medical field characteristics |
Also Published As
Publication number | Publication date |
---|---|
CN107977361B (en) | 2021-05-18 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN107977361A (en) | The Chinese clinical treatment entity recognition method represented based on deep semantic information | |
CN109800411B (en) | Clinical medical entity and attribute extraction method thereof | |
CN109669994B (en) | Construction method and system of health knowledge map | |
Yin et al. | Chinese clinical named entity recognition with radical-level feature and self-attention mechanism | |
Yu et al. | Automatic ICD code assignment of Chinese clinical notes based on multilayer attention BiRNN | |
Lee et al. | Machine learning in relation to emergency medicine clinical and operational scenarios: an overview | |
CN111834014A (en) | Medical field named entity identification method and system | |
CN110032739A (en) | Chinese electronic health record name entity abstracting method and system | |
CN108875809A (en) | The biomedical entity relationship classification method of joint attention mechanism and neural network | |
CN113553440B (en) | Medical entity relationship extraction method based on hierarchical reasoning | |
CN113901207B (en) | Adverse drug reaction detection method based on data enhancement and semi-supervised learning | |
CN111695354A (en) | Text question-answering method and device based on named entity and readable storage medium | |
CN114864076A (en) | Multi-modal breast cancer classification training method and system based on graph attention network | |
CN111881292B (en) | Text classification method and device | |
CN110444261A (en) | Sequence labelling network training method, electronic health record processing method and relevant apparatus | |
CN114742059A (en) | Chinese electronic medical record named entity identification method based on multitask learning | |
CN111582506A (en) | Multi-label learning method based on global and local label relation | |
CN114781382A (en) | Medical named entity recognition system and method based on RWLSTM model fusion | |
CN106407387A (en) | A concept connection method for medical diagnosis texts | |
Cheng et al. | Integration of automatic sentence segmentation and lexical analysis of ancient Chinese based on BiLSTM-CRF model | |
CN112216379A (en) | Disease diagnosis system based on intelligent joint learning | |
Wu et al. | AGNet: Automatic generation network for skin imaging reports | |
Liang et al. | Disease prediction based on multi-type data fusion from Chinese electronic health record | |
Wang et al. | Toxic comment classification based on bidirectional gated recurrent unit and convolutional neural network | |
CN116881336A (en) | Efficient multi-mode contrast depth hash retrieval method for medical big data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||