CN108170675A - A named entity recognition method for the medical field based on deep learning - Google Patents
A named entity recognition method for the medical field based on deep learning
- Publication number
- CN108170675A CN108170675A CN201711446980.8A CN201711446980A CN108170675A CN 108170675 A CN108170675 A CN 108170675A CN 201711446980 A CN201711446980 A CN 201711446980A CN 108170675 A CN108170675 A CN 108170675A
- Authority
- CN
- China
- Prior art keywords
- parameter
- lstm
- hidden layer
- value
- word
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/295—Named entity recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The present invention proposes a named entity recognition method for the medical field based on deep learning. The method comprises: (1) training a long short-term memory network (LSTM) on the training portion of an annotated medical-domain corpus; (2) using the neural network parameters θ updated in (1) to search for the best label path, obtaining the labeling of the annotated corpus, and evaluating the labeling of the test portion with the named entity recognition F-measure; (3) during the training of (1), first training the LSTM on an annotated news-domain corpus, then, starting from that trained model and the annotated medical-domain corpus, training the medical-domain model, and again evaluating the labeling of the test portion with the named entity recognition F-measure. The present invention is applicable to the field of named entity recognition.
Description
Technical field
The present invention relates to named entity recognition methods, and more particularly to a named entity
recognition method for the medical field based on deep learning.
Background technology
Named entity recognition, one of the basic tasks of information extraction, has important applications in fields such as question answering, syntactic analysis, and machine translation. Medical entities differ considerably from common entities, so annotated open-domain corpora contribute little to labeling medical entities. At the same time, named entity recognition in the medical field lacks annotated corpora, mainly because judging medical entities requires professionals, which greatly increases the cost of medical-domain entity annotation. Therefore, how the medical field can label well with only a small amount of annotated data is highly important.
Deep learning has made major progress in recent years, and it has been shown to be able to learn the complex structure hidden in high-dimensional data. In natural language processing, a new word representation, the word vector (word embedding), has achieved immense success.
Word vectors (word embeddings) have in recent years become the usual replacement for the traditional bag-of-words representation, solving the dimensionality disaster that bag-of-words representations bring. Researchers have also found that word vectors obtained by training a language model encode lexical semantics, and quantities such as word similarity can to some extent be derived from them by simple algorithms. Moreover, because training word vectors requires no annotation work, much of the effort around learning them disappears, and they can be trained as needed: one can train general-purpose word vectors on large open corpora, train domain-specific word vectors on a corpus from the target domain, or train directly for the task at hand.
Word vectors are generally trained with deep neural networks, and in natural language processing the recurrent neural network (RNN) is among the most widely used models. In natural language processing, the influence of preceding context on what follows is usually captured with a language model; the RNN exploits preceding context naturally through a recurrently fed-back hidden layer and can in theory use the entire preceding context, which conventional language models cannot. However, RNNs suffer from vanishing gradients in practice; the long short-term memory unit (Long Short-Term Memory, LSTM) is an effective improvement on the RNN. Addressing the RNN's inability to retain needed information, the LSTM records information in a memory cell (Memory Cell) and introduces several gates that control how the memory cell is updated and used, so that the needed information can be preserved effectively. LSTMs are now widely used in natural language processing tasks ranging from word segmentation, part-of-speech tagging, and named entity recognition to machine translation.
Pre-training is a common technique for deep neural networks. Multiple studies have shown that initializing network parameters with word vectors obtained by unsupervised training on a large corpus yields better models than training from random initialization. This is mainly because pre-trained word vectors can exploit large unlabeled corpora and thus carry information absent from the training data, and because they can to some extent keep randomly initialized vectors from falling into local extrema during optimization. For the medical field, where training data is scarce, being able to use a large unlabeled corpus for supplementary training is very meaningful.
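The initialization described above can be sketched as follows (an illustrative numpy sketch; the function name, the toy vocabulary, and the uniform-random range for words without a pre-trained vector are assumptions of this sketch, not part of the patent):

```python
import numpy as np

def build_embedding_matrix(vocab, pretrained, dim, seed=0):
    """Initialize an embedding matrix: rows for words that have a
    pre-trained vector are copied in; the remaining rows are drawn
    randomly. `vocab` is a list of words, `pretrained` a dict
    mapping word -> np.ndarray of length `dim`."""
    rng = np.random.default_rng(seed)
    emb = rng.uniform(-0.1, 0.1, size=(len(vocab), dim))
    hits = 0
    for i, w in enumerate(vocab):
        if w in pretrained:
            emb[i] = pretrained[w]  # reuse unsupervised pre-training
            hits += 1
    return emb, hits

# Toy usage: two of three vocabulary words have pre-trained vectors.
pre = {"fever": np.ones(4), "cough": np.full(4, 2.0)}
emb, hits = build_embedding_matrix(["fever", "cough", "xyzzy"], pre, dim=4)
```

Words unseen in the pre-training corpus keep their random initialization and are refined during supervised training.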
The models currently used for the named entity recognition task fall into two classes: conventional models, represented by the CRF, and deep neural network models; in the medical field, conventional CRF models are still generally used. Because it does not consider semantic information, a CRF model produces large numbers of meaningless labels when the training corpus is extremely scarce, while the semantic information an LSTM model captures can prevent this from happening.
Invention content
The purpose of the present invention is to solve the problem that a CRF model, because it does not consider semantic information, produces large numbers of meaningless labels when the training corpus is extremely scarce. To this end the invention draws on a large news-domain corpus and proposes a named entity recognition method for the medical field based on deep learning.
The above object of the invention is achieved through the following technical solution:
A named entity recognition method for the medical field based on deep learning, characterized in that the method comprises the following specific steps:
Step 1: Train word vectors vec_i on an unannotated medical corpus, obtaining the vocabulary voc of the supplementary medical-domain corpus and the word vectors vec corresponding to voc; vec = [vec_1, vec_2, …, vec_n]; voc = [voc_1, voc_2, …, voc_n]; i = 1, 2, …, n, where n is the total number of word types in the unannotated corpus.
Step 2: Use the training portion of the annotated news-domain corpus to train the long short-term memory network LSTM. The word vectors vec of step 1 serve as the pre-training vectors of the LSTM training; with the LSTM method, compute the optimization objective from the pre-training vectors and x_k, y_k, and optimize it with gradient descent to update the LSTM parameters θ_C. The annotated corpus comprises a training portion and a test portion. The training finally yields the LSTM parameters, i.e. the values of θ_C when the final, N-th, iteration converges, specifically comprising: Wx_in, Wh_in, Wc_in, Wx_o, Wh_o, Wc_o, Wx_f, Wh_f, Wc_f, b_in, b_o, and b_f, where: Wx_in is the hidden-layer input-gate input weight; Wh_in the input-gate state weight; Wc_in the input-gate memory-cell weight; Wx_o the output-gate input weight; Wh_o the output-gate state weight; Wc_o the output-gate memory-cell weight; Wx_f the forget-gate input weight; Wh_f the forget-gate state weight; Wc_f the forget-gate memory-cell weight; b_in the input-gate bias; b_o the output-gate bias; and b_f the forget-gate bias.
Here x_k is the word sequence fed to the LSTM for the k-th sample of the training portion of the annotated corpus, and y_k is the corresponding vector of gold labels for the k-th sample.
Step 3: Use the training portion of the annotated medical-domain corpus to train the LSTM in the same way. The word vectors vec of step 1 serve as the pre-training vectors of the LSTM training; with the LSTM method, compute the optimization objective from the pre-training vectors and x_k, y_k, and optimize it with gradient descent to update the LSTM parameters θ. The annotated corpus comprises a training portion and a test portion; x_k and y_k are defined as in step 2.
Step 4: Test the LSTM with the updated parameters. The test proceeds as follows: input the annotated corpora of steps 2 and 3, and with the neural network parameters θ_C updated in step 2 search for the best label path, obtaining the labeling of the annotated corpus. Evaluate the labeling of the test portion with the named entity recognition F-measure, computed as follows:
precision = correctly labeled entity words / all labeled entity words
recall = correctly labeled entity words / all entity words
F = 2 × precision × recall / (precision + recall)
Step 5: Repeat steps 2 through 4 on the annotated corpus until the named entity recognition F-measure of step 4 no longer increases or the number of repetitions of steps 2 through 4 reaches a maximum of 50 to 100.
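The evaluation of step 4 can be sketched in a few lines (illustrative only; representing entities as (start, end, type) spans is an assumption of this sketch, not a format prescribed by the patent):

```python
def prf(pred_entities, gold_entities):
    """Precision, recall, and F-measure over sets of predicted and
    gold entity spans, as in the formulas of step 4."""
    pred, gold = set(pred_entities), set(gold_entities)
    correct = len(pred & gold)                      # correctly labeled entities
    p = correct / len(pred) if pred else 0.0        # precision
    r = correct / len(gold) if gold else 0.0        # recall
    f = 2 * p * r / (p + r) if p + r else 0.0      # F-measure
    return p, r, f

# Toy usage: one of two predicted entities matches the gold standard.
p, r, f = prf({(0, 2, "DISEASE"), (5, 7, "DRUG")},
              {(0, 2, "DISEASE"), (9, 11, "DRUG")})
# p = 0.5, r = 0.5, f = 0.5
```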
Further, the update of the LSTM parameters θ in step 2 proceeds as follows:
Step 2.1: Pre-train the vocabulary voc and its corresponding word vectors vec; using x_k and the word vectors vec obtained in step 1, compute the input sequence X of the LSTM network, where X = X_1, X_2, …, X_t, …, X_T.
Step 2.2: Using the input X_t, the hidden layer h_{t-1} computed at step t-1, and the memory cell c_{t-1} computed at step t-1, compute the input gate in_t, the output gate o_t, and the forget gate f_t of the LSTM model at step t; from in_t, o_t, and f_t compute the memory cell value c_t and the hidden layer value h_t, where the model of the hidden layer value is h_t = o_t ⊙ tanh(c_t).
Step 2.3: Feed the elements of the input sequence X = X_1, X_2, …, X_t, …, X_T of step 2.1 into the hidden-layer model of step 2.2 in order from X_1 to X_T, obtaining the forward hidden-layer output h_f; then feed them in order from X_T to X_1, obtaining the backward hidden-layer output h_b.
Step 2.4: With the whole-sequence cost computation based on transition costs, compute the sequence cost from the hidden-layer results h_f and h_b of step 2.3 to obtain the optimization objective, and optimize it with gradient descent to update the LSTM parameters θ_C, where θ_C is word_emb, Wx_in, Wh_in, Wc_in, Wx_o, Wh_o, Wc_o, Wx_f, Wh_f, Wc_f, b_in, b_o, or b_f, and word_emb is the pre-trained word-vector weight matrix.
Further, the update of the LSTM parameters θ in step 3 proceeds as follows:
Step 3.1: Pre-train the vocabulary voc and its corresponding word vectors vec; using x_k and the word vectors vec obtained in step 1, compute the input sequence X of the LSTM network, where X = X_1, X_2, …, X_t, …, X_T.
Step 3.2: Load the model parameters θ_n obtained by the news-domain LSTM training; starting from θ_n, use the input X_t, the hidden layer h_{t-1} computed at step t-1, and the memory cell c_{t-1} computed at step t-1 to compute the input gate in_t, the output gate o_t, and the forget gate f_t of the LSTM model at step t; from in_t, o_t, and f_t compute the memory cell value c_t and the hidden layer value h_t, where the model of the hidden layer value is h_t = o_t ⊙ tanh(c_t).
Step 3.3: Feed the elements of the input sequence X = X_1, X_2, …, X_t, …, X_T of step 3.1 into the hidden-layer model of step 3.2 in order from X_1 to X_T, obtaining the forward hidden-layer output h_f; then feed them in order from X_T to X_1, obtaining the backward hidden-layer output h_b.
Step 3.4: With the whole-sequence cost computation based on transition costs, compute the sequence cost from the hidden-layer results h_f and h_b of step 3.3 to obtain the optimization objective, and optimize it with gradient descent to update the LSTM parameters θ, where θ is word_emb, Wx_in, Wh_in, Wc_in, Wx_o, Wh_o, Wc_o, Wx_f, Wh_f, Wc_f, b_in, b_o, or b_f.
Further, the input X of the LSTM network in steps 2.1 and 3.1 is obtained as follows:
Build the vocabulary voc′ of the training portion of the annotated corpus and merge voc′ and voc into the vocabulary VOC; VOC = VOC_1, VOC_2, VOC_3, …, VOC_N.
Randomly initialize the vector matrix word_emb corresponding to the vocabulary VOC, with the same dimension as the word vectors vec, and assign values by formula (1);
word_emb_i is the i-th word vector in word_emb.
Finally multiply x_k[k1,k2] with word_emb to obtain the input X of the LSTM network:
X = x_k[k1,k2] · word_emb   (2)
where x_k[k1,k2] is the subsequence of the word sequence x_k between positions k1 and k2.
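Formula (2) is an ordinary embedding lookup: multiplying a one-hot encoded word sequence by word_emb selects the corresponding rows. A minimal numpy sketch of this equivalence (vocabulary size, dimensions, and values are arbitrary, invented for illustration):

```python
import numpy as np

vocab_size, dim = 5, 3
rng = np.random.default_rng(0)
word_emb = rng.normal(size=(vocab_size, dim))  # the embedding matrix

# A short sentence as word indices, then as one-hot rows.
idx = np.array([2, 0, 4])
one_hot = np.eye(vocab_size)[idx]              # shape (T, vocab_size)

X_matmul = one_hot @ word_emb                  # formula (2): x_k · word_emb
X_lookup = word_emb[idx]                       # equivalent row lookup
assert np.allclose(X_matmul, X_lookup)
```

In practice frameworks implement the lookup form directly, since the one-hot multiplication is wasteful at real vocabulary sizes.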
Further, the input X of the LSTM network in steps 2.1 and 3.1 may instead be obtained as follows:
Randomly initialize the vector matrix word_emb corresponding to the vocabulary VOC and, after assigning values by formula (1), keep the vectors word_emb_i fixed, i.e. do not update them as parameters;
then randomly initialize a second vector matrix word_emb_para for the vocabulary VOC, and compute the input X of the LSTM network by the model of formula (3):
X = (x_k[k1,k2] · word_emb) ⊕ (x_k[k1,k2] · word_emb_para)   (3)
Further, the input gate in_t of the LSTM model at step t in steps 2.2 and 3.2 is obtained by model (4), which is as follows:
in_t = σ(Wx_in X_t + Wh_in h_{t-1} + Wc_in c_{t-1} + b_in)   (4)
where σ is the sigmoid function; Wx_in is the input-gate parameter matrix multiplying X_t; Wh_in the input-gate parameter matrix multiplying h_{t-1}; Wc_in the input-gate parameter matrix multiplying c_{t-1}; and b_in the bias for computing the input gate.
Further, the output gate o_t of the LSTM model at step t in steps 2.2 and 3.2 is obtained by model (5), which is as follows:
o_t = σ(Wx_o X_t + Wh_o h_{t-1} + Wc_o c_{t-1} + b_o)   (5)
where Wx_o is the output-gate parameter matrix multiplying X_t; Wh_o the output-gate parameter matrix multiplying h_{t-1}; Wc_o the output-gate parameter matrix multiplying c_{t-1}; and b_o the bias for computing the output gate.
Further, the forget gate f_t of the LSTM model at step t in steps 2.2 and 3.2 is obtained by model (6), which is as follows:
f_t = σ(Wx_f X_t + Wh_f h_{t-1} + Wc_f c_{t-1} + b_f)   (6)
where Wx_f is the forget-gate parameter matrix multiplying X_t; Wh_f the forget-gate parameter matrix multiplying h_{t-1}; Wc_f the forget-gate parameter matrix multiplying c_{t-1}; and b_f the bias for computing the forget gate.
Further, the computation of the memory cell value c_t and the hidden layer value h_t from in_t, o_t, and f_t in steps 2.2 and 3.2 proceeds as follows:
step 1: First compute the ungated memory cell candidate at step t:
c̃_t = tanh(Wx_c X_t + Wh_c h_{t-1} + b_c)   (7)
where Wx_c is the memory-cell parameter matrix multiplying X_t; Wh_c the memory-cell parameter matrix multiplying h_{t-1}; and b_c the memory-cell bias.
step 2: From the input gate value in_t computed by model (4), the forget gate value f_t computed by model (6), the ungated candidate c̃_t, and c_{t-1}, compute the memory cell value c_t at step t:
c_t = f_t ⊙ c_{t-1} + in_t ⊙ c̃_t   (8)
Finally, use the memory cell value c_t and the output gate o_t computed by formula (5) to compute the hidden layer value h_t, whose model is as follows:
h_t = o_t ⊙ tanh(c_t)   (9)
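Models (4) through (9) together define one step of the LSTM. A minimal numpy sketch of a single step (the peephole-style use of c_{t-1} in the gates follows formulas (4)-(6); the parameter shapes, toy dimensions, and random values are assumptions of this sketch):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, P):
    """One LSTM step following equations (4)-(9): gates that also
    see the previous memory cell c_{t-1}."""
    i = sigmoid(P["Wx_in"] @ x_t + P["Wh_in"] @ h_prev + P["Wc_in"] @ c_prev + P["b_in"])  # (4)
    o = sigmoid(P["Wx_o"] @ x_t + P["Wh_o"] @ h_prev + P["Wc_o"] @ c_prev + P["b_o"])      # (5)
    f = sigmoid(P["Wx_f"] @ x_t + P["Wh_f"] @ h_prev + P["Wc_f"] @ c_prev + P["b_f"])      # (6)
    c_tilde = np.tanh(P["Wx_c"] @ x_t + P["Wh_c"] @ h_prev + P["b_c"])                     # (7)
    c_t = f * c_prev + i * c_tilde                                                          # (8)
    h_t = o * np.tanh(c_t)                                                                  # (9)
    return h_t, c_t

# Toy parameters: input dim 3, hidden dim 4.
d_in, d_h = 3, 4
rng = np.random.default_rng(1)
P = {k: rng.normal(scale=0.1, size=(d_h, d_in if k.startswith("Wx") else d_h))
     for k in ["Wx_in", "Wh_in", "Wc_in", "Wx_o", "Wh_o", "Wc_o",
               "Wx_f", "Wh_f", "Wc_f", "Wx_c", "Wh_c"]}
P.update({b: np.zeros(d_h) for b in ["b_in", "b_o", "b_f", "b_c"]})

h, c = lstm_step(rng.normal(size=d_in), np.zeros(d_h), np.zeros(d_h), P)
```

Because o lies in (0, 1) and tanh is bounded, every component of h stays strictly inside (-1, 1).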
Further, the whole-sequence cost computation based on transition costs in steps 2.4 and 3.4, which computes the sequence cost from the hidden-layer results h_f and h_b obtained in steps 2.3 and 3.3 to obtain the optimization objective and updates the LSTM parameters θ by gradient descent, proceeds as follows:
First: From the hidden layers h_f and h_b, compute the cost Q_t of each label for the sequence x_k:
Q_t = h_f(t) · W_f + h_b(t) · W_b + b   (10)
where W_f is the parameter matrix multiplying h_f(t); W_b the parameter matrix multiplying h_b(t); and b the final output bias.
Second: Describe the cost of label transitions with a transition cost matrix A, where the transition cost A_{i,j} is the cost of moving from label i to label j; the whole cost of the input sequence X, i.e. the optimization objective, is then the sum of the per-step label costs Q_t and the transition costs along the label path.
Third: With maximum likelihood estimation, compute and maximize the probability p of the correct path,
p = exp(cost_right) / Σ_path exp(cost_path)
where cost_right is the cost of the correct path.
Fourth: Obtain the neural network parameters θ by gradient descent on the probability p of the correct path.
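The whole-sequence cost of the second and third sub-steps can be illustrated with a toy example: per-step label costs Q, a transition matrix A, the score of one path, and the normalized probability of that path over all paths (the label set, costs, and sequence length are invented for illustration):

```python
import numpy as np
from itertools import product

def path_score(Q, A, tags):
    """Score of one label path: the per-step costs Q[t, tag] plus the
    transition costs A[prev_tag, tag], as in formula (10) and the
    transition matrix of the second sub-step."""
    s = Q[0, tags[0]]
    for t in range(1, len(tags)):
        s += A[tags[t - 1], tags[t]] + Q[t, tags[t]]
    return s

Q = np.array([[1.0, 0.0],
              [0.0, 2.0],
              [3.0, 0.0]])            # 3 time steps, 2 labels
A = np.array([[0.5, -0.5],
              [-0.5, 0.5]])           # transition costs

s = path_score(Q, A, [0, 1, 0])       # 1 + (-0.5 + 2) + (-0.5 + 3) = 5.0

# Probability of this path under the softmax over all 2^3 paths,
# as maximized in the third sub-step.
Z = sum(np.exp(path_score(Q, A, p)) for p in product(range(2), repeat=3))
p_right = np.exp(s) / Z
```

Real implementations compute the normalizer Z with the forward algorithm rather than enumeration, which is exponential in sequence length.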
Further, the label-path search performed in steps 2.4 and 3.4 with the neural network parameters θ updated in steps 2 and 3, which obtains the labeling of the corpus, proceeds as follows: arrange the costs of the input sequence X into a matrix C, and obtain the labeling of the test portion of the annotated corpus by running the Viterbi algorithm on the matrix C.
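The decoding over the cost matrix C can be sketched with a minimal numpy Viterbi (illustrative only; the toy costs and the zero transition matrix are assumptions of this sketch):

```python
import numpy as np

def viterbi(C, A):
    """Best-scoring label path through cost matrix C (T x L) with
    transition scores A (L x L); higher scores are better."""
    T, L = C.shape
    score = C[0].copy()
    back = np.zeros((T, L), dtype=int)
    for t in range(1, T):
        cand = score[:, None] + A + C[t][None, :]   # (prev label, next label)
        back[t] = cand.argmax(axis=0)               # best predecessor per label
        score = cand.max(axis=0)
    # Trace back from the best final label.
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1], float(score.max())

C = np.array([[1.0, 0.0],
              [0.0, 2.0],
              [3.0, 0.0]])            # 3 time steps, 2 labels
A = np.zeros((2, 2))                  # neutral transitions for the toy case
path, best = viterbi(C, A)            # path [0, 1, 0], score 6.0
```

With neutral transitions the best path simply follows the per-step maxima; a trained transition matrix A would penalize label sequences that never occur in the training data.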
Further, in step 5 the number of repetitions of steps 2, 3, and 4 reaches a maximum of 60 to 90.
Advantageous effects of the invention:
The invention, a named entity recognition method for the medical field based on deep learning, concerns named entity recognition methods in the field of information extraction, and the related research furthers named entity recognition research. The invention aims to alleviate the lack of annotated corpora for medical-domain entity recognition by studying how the medical field can label better using a large annotated news-domain corpus together with a small annotated medical-domain corpus. By using deep learning methods, the invention further mines the information the corpus contains; at the same time, it introduces large-scale corpus information to keep the model's effectiveness from degrading at test time when too many open-domain common words unseen in training appear. Experiments show that this deep-learning-based named entity recognition method for the medical field is better suited to medical-domain named entity recognition than conventional medical-domain methods.
The present invention relates to named entity recognition methods, and more particularly to a named entity recognition method for the medical field based on deep learning. It belongs to the field of information extraction and furthers named entity recognition research.
The purpose of the invention is to make full use of the existing annotated corpus for medical-domain named entity recognition and to improve the performance of deep neural networks on the medical-domain named entity recognition task. To address the scarcity of annotated corpora for medical named entity recognition, a large unannotated corpus and a large news-domain corpus participate in model training, and a named entity recognition method for the medical field based on deep learning is proposed.
The related research of the invention improves the performance of medical-domain named entity recognition; it not only provides evidence for theories in informatics and linguistics, but also furthers natural language understanding. To improve recognition performance, the invention makes full use of the small existing annotated corpus for medical-domain named entity recognition by modeling with an LSTM deep neural network and adding the information of a large raw corpus through the pre-training technique of deep neural networks. Compared with conventional methods, this approach requires no additional manual entity annotation, reducing the expenditure of manpower and material resources, while improving the performance of medical-domain named entity recognition.
The present invention does not constrain the granularity of corpus preprocessing: labeling can be done by word or by character, depending mainly on the training corpus used. Considering that many medical-entity words rarely appear in the open domain, training at word granularity would require segmenting the pre-training corpus, which may bring some difficulty. To minimize the expenditure of manpower and material resources, processing by character is recommended.
In summary, this method proposes a named entity recognition method for the medical field based on deep learning. A model was trained on a small medical corpus and used to label a large body of text crawled from online medical question-and-answer websites, and the high-frequency words of the two models' labelings were tallied, compared in the table below:
Table 1 compares the high-frequency words of the CRF model and the LSTM model on the online question-and-answer corpus test.
Bold entries in the table are labelings clearly without medical meaning; it can be seen that the LSTM performs much better than the CRF model.
Description of the drawings
Fig. 1 is the flow chart of the named entity recognition method for the medical field based on deep learning proposed in embodiment 1; Fig. 2 is the computation flow chart of the LSTM proposed in embodiment 1.
Specific embodiments
The present invention is further described with reference to specific embodiments, but the invention is not limited by these embodiments.
Embodiment 1: With reference to Fig. 1, a named entity recognition method for the medical field based on deep learning according to this embodiment is specifically prepared according to the following steps:
Step 1: Train word vectors vec_i on an unannotated medical corpus, obtaining the vocabulary voc of the supplementary medical-domain corpus and the word vectors vec corresponding to voc; vec = [vec_1, vec_2, …, vec_n]; voc = [voc_1, voc_2, …, voc_n]; i = 1, 2, …, n, where n is the total number of word types in the unannotated corpus.
Step 2: Use the training portion of the annotated news-domain corpus to train the long short-term memory network LSTM. The word vectors vec of step 1 serve as the pre-training vectors of the LSTM training; with the LSTM method, compute the optimization objective from the pre-training vectors and x_k, y_k, and optimize it with gradient descent to update the LSTM parameters θ_C. The annotated corpus comprises a training portion and a test portion. The training finally yields the LSTM parameters, i.e. the values of θ_C when the final, N-th, iteration converges, specifically comprising: Wx_in, Wh_in, Wc_in, Wx_o, Wh_o, Wc_o, Wx_f, Wh_f, Wc_f, b_in, b_o, and b_f, with the meanings given in step 2 above. Here x_k is the word sequence fed to the LSTM for the k-th sample of the training portion of the annotated corpus, and y_k is the corresponding vector of gold labels.
Step 3: Use the training portion of the annotated medical-domain corpus to train the LSTM in the same way. The word vectors vec of step 1 serve as the pre-training vectors, the optimization objective is computed from them and x_k, y_k, and gradient descent updates the LSTM parameters θ. The annotated corpus comprises a training portion and a test portion; x_k and y_k are defined as in step 2.
Step 4: Test the LSTM. Input the annotated corpus, search for the best label path with the neural network parameters θ_C updated in step 2, and obtain the labeling of the annotated corpus; evaluate the labeling of the test portion with the named entity recognition F-measure, computed as follows:
precision = correctly labeled entity words / all labeled entity words
recall = correctly labeled entity words / all entity words
F = 2 × precision × recall / (precision + recall)   (14)
Step 5: Repeat steps 2, 3, and 4 on the annotated corpus until the named entity recognition F-measure of step 4 no longer increases or the number of repetitions reaches a maximum of 50 to 100.
Effects of this embodiment:
This embodiment, a named entity recognition method for the medical field based on deep learning, concerns named entity recognition methods in the field of information extraction, and the related research furthers named entity recognition research. The embodiment aims to alleviate the lack of annotated corpora for medical-domain entity recognition by studying how the medical field can label better using a small annotated medical-domain corpus together with a large annotated news-domain corpus. By using deep learning methods, this embodiment further mines the information the corpus contains and learns large-scale linguistic features from the annotated news-domain corpus; at the same time, it introduces large-scale corpus information to keep the model's effectiveness from degrading at test time when too many open-domain common words unseen in training appear. Experiments show that this deep-learning-based named entity recognition method for the medical field is better suited to medical-domain named entity recognition than conventional medical-domain methods.
This embodiment relates to named entity recognition methods, and more particularly to a named entity recognition method for the medical field based on deep learning. It belongs to the field of information extraction and furthers named entity recognition research.
The purpose of this embodiment is to make full use of the existing annotated corpus for medical-domain named entity recognition and, with the help of a large annotated news-domain corpus, to improve the performance of deep neural networks on the medical-domain named entity recognition task. At the same time, to address the scarcity of annotated corpora for medical named entity recognition, a large unannotated medical-domain corpus participates in model training, and a named entity recognition method for the medical field based on deep learning is proposed.
The related research of this embodiment improves the performance of medical-domain named entity recognition; it not only provides evidence for theories in informatics and linguistics, but also furthers natural language understanding. To improve recognition performance, this embodiment makes full use of the small existing annotated corpus for medical-domain named entity recognition by modeling with an LSTM deep neural network, adding the information of a large unannotated corpus through the pre-training technique of deep neural networks, and merging the model parameters of the news domain into the medical-domain LSTM deep neural network model. Compared with conventional methods, this approach requires no additional manual entity annotation, reducing the expenditure of manpower and material resources, while improving the performance of medical-domain named entity recognition.
Present embodiment does not require the granularity that language material pre-processes, and can be both labeled by word, can also by word into
Row, this depends primarily on used training and expects.All seldom go out in view of many words of entity of medical field in Opening field
Existing, word granularity, which is trained, may require that and segmented for pre-training language material, may bring some difficulties.In order to subtract to greatest extent
The consumption of few human and material resources, compares to recommend and is handled by word.
Generally speaking, this method propose a kind of name entity recognition methods based on deep learning towards medical field.
Using a small amount of medical language material training pattern, and the text largely crawled in online medical question and answer website is marked, it is right
Two kinds of model annotation results have carried out the statistics of high frequency words, comparison such as following table:
The high frequency words of table 2CRF models question and answer language material test online with LSTM models compare
Runic is apparent meaningless annotation results in table, it can be seen that LSTM performances are much better than CRF models.
Specific embodiment two: This embodiment differs from specific embodiment one in that:
Step 2-1: pre-train the vocabulary voc and its corresponding word vectors vec; compute the input X of the LSTM neural network from x_k and the word vectors vec obtained in step 1, where the input X can be computed by one of two methods, specifically: method one uses the word vectors vec as the initial values of the LSTM model; method two uses the word vectors vec directly as the input of the LSTM neural network;
Step 2-2: using the input X_t, the hidden layer h_{t-1} computed at step t-1 and the memory cell c_{t-1} computed at step t-1, compute the input gate in_t of the LSTM model at step t, the output gate o_t of the LSTM model and the forget gate f_t of the LSTM model; from in_t, o_t and f_t compute the memory cell value c_t and the hidden layer value h_t; where X = X_1, X_2, ..., X_t, ..., X_T;
Step 2-3: for the input sequence X, feed the elements in order from X_1 to X_T into step 2-2 and substitute into formula (9) to obtain the forward hidden layer output h_f; feed them in order from X_T to X_1 into step 2-2 and substitute into formula (9) to obtain the backward hidden layer output h_b;
Step 2-4: using the cost computation method with transfer values over the entire sequence, compute the sequence cost from the hidden layer results h_f and h_b obtained in step 2-3 to obtain the optimisation target; optimise it by gradient descent to update the LSTM parameters θ_C; where θ_C is word_emb, W_X_in, W_h_in, W_c_in, W_X_o, W_h_o, W_c_o, W_X_f, W_h_f, W_c_f, b_in, b_o or b_f. Other steps and parameters are the same as specific embodiment one.
Specific embodiment three: This embodiment differs from specific embodiment one in that:
Step 3-1: pre-train the vocabulary voc and its corresponding word vectors vec; compute the input X of the LSTM neural network from x_k and the word vectors vec obtained in step 1, where the input X can be computed by one of two methods, specifically: method one uses the word vectors vec as the initial values of the LSTM model; method two uses the word vectors vec directly as the input of the LSTM neural network;
Step 3-2: load the model parameters θ_n obtained by training the news-domain LSTM; on the basis of the parameters θ_n, using the input X_t, the hidden layer h_{t-1} computed at step t-1 and the memory cell c_{t-1} computed at step t-1, compute the input gate in_t of the LSTM model at step t, the output gate o_t of the LSTM model and the forget gate f_t of the LSTM model; from in_t, o_t and f_t compute the memory cell value c_t and the hidden layer value h_t; where X = X_1, X_2, ..., X_t, ..., X_T;
Step 3-3: for the input sequence X, feed the elements in order from X_1 to X_T into step 2-2 and substitute into formula (9) to obtain the forward hidden layer output h_f; feed them in order from X_T to X_1 into step 2-2 and substitute into formula (9) to obtain the backward hidden layer output h_b;
Step 3-4: using the cost computation method with transfer values over the entire sequence, compute the sequence cost from the hidden layer results h_f and h_b obtained in step 3-3 to obtain the optimisation target; optimise it by gradient descent to update the LSTM parameters θ; where θ is word_emb, W_X_in, W_h_in, W_c_in, W_X_o, W_h_o, W_c_o, W_X_f, W_h_f, W_c_f, b_in, b_o or b_f. Other steps and parameters are the same as specific embodiment one.
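The two-stage training of this embodiment (train the LSTM on the news domain first, then continue training the same parameters θ_n on the medical domain) can be sketched as follows; `toy_train` is a hypothetical stand-in for the gradient-descent LSTM training, used only to make the parameter hand-over explicit:

```python
import copy

def pretrain_then_finetune(train_fn, init_params, news_corpus, medical_corpus):
    """Two-stage scheme (sketch): the medical-domain model starts from the
    news-domain parameters theta_n instead of a random initialisation."""
    # Stage 1: learn large-scale linguistic features from the news corpus.
    theta_n = train_fn(copy.deepcopy(init_params), news_corpus)
    # Stage 2: load theta_n and continue updating it on the medical corpus.
    theta_med = train_fn(copy.deepcopy(theta_n), medical_corpus)
    return theta_med

def toy_train(params, corpus):
    # Hypothetical stand-in for LSTM gradient-descent training:
    # one "update" per sentence in the corpus.
    for _ in corpus:
        params["w"] += 0.1
    return params

theta = pretrain_then_finetune(toy_train, {"w": 0.0},
                               news_corpus=["s1", "s2", "s3"],
                               medical_corpus=["m1"])
# three news-domain updates plus one medical-domain update accumulate in theta["w"]
```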
Specific embodiment four: This embodiment differs from one of specific embodiments two to three in that: the detailed process of computing the input X of the LSTM neural network by method one in step 2-1 and step 3-1 is:
Establish the vocabulary voc′ of the training corpus in the labelled corpus, and merge voc′ and voc into the vocabulary VOC; VOC = VOC_1, VOC_2, VOC_3, ..., VOC_N;
Randomly initialise the vector matrix word_emb corresponding to the vocabulary VOC so that the dimension of word_emb is the same as that of the word vectors vec, and perform the assignment by formula (1):
word_emb_i = vec_j, if VOC_i = voc_j  (1)
where word_emb_i is the i-th word vector in word_emb;
finally multiply x_k[k1,k2] by word_emb to obtain the input X of the LSTM neural network:
X = x_k[k1,k2]·word_emb  (2)
where x_k[k1,k2] is the subsequence of the word sequence x_k between k1 and k2. Other steps and parameters are the same as one of specific embodiments two or three.
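Multiplying the word sequence x_k[k1,k2], written as one-hot rows, by the matrix word_emb as in formula (2) amounts to looking up rows of the embedding matrix. A minimal NumPy sketch with assumed toy sizes (vocabulary of 5 entries, word-vector dimension 4):

```python
import numpy as np

# Assumed toy sizes: vocabulary VOC of 5 entries, word-vector dimension 4.
rng = np.random.default_rng(0)
word_emb = rng.standard_normal((5, 4))  # random initialisation
word_emb[1] = 0.5                       # rows whose word appears in voc are
                                        # overwritten with the pre-trained vec

# x_k[k1, k2] as one-hot rows: multiplying by word_emb (formula (2)) is
# exactly a row lookup in the embedding matrix.
indices = [1, 3, 1]                     # word indices of the subsequence
x_k = np.eye(5)[indices]
X = x_k @ word_emb
assert np.allclose(X, word_emb[indices])
```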
Specific embodiment five: This embodiment differs from one of specific embodiments two to three in that:
the detailed process of computing the input X of the LSTM neural network by method two in step 2-1 and step 3-1 is:
Randomly initialise the vector matrix word_emb corresponding to the vocabulary VOC, perform the assignment by formula (1), and then keep the vectors word_emb_i unchanged, i.e. do not update them as parameters; then randomly initialise another vector matrix word_emb_para corresponding to the vocabulary VOC, and compute the input X of the LSTM neural network:
With the word_emb parameters fixed, word_emb_para is updated fully as a standard parameter. Other steps and parameters are the same as one of specific embodiments two or three.
Specific embodiment six: This embodiment differs from one of specific embodiments two to three in that: the input gate in_t of the LSTM model (or memory cell) computed at step t in step 2-2 and step 3-2 is specifically:
in_t = σ(W_X_in·X_t + W_h_in·h_{t-1} + W_c_in·c_{t-1} + b_in)  (4)
where σ is the sigmoid function; W_X_in is the input gate parameter matrix multiplied with X_t; W_h_in is the input gate parameter matrix multiplied with h_{t-1}; W_c_in is the input gate parameter matrix multiplied with c_{t-1}; b_in is the bias for computing the input gate. Other steps and parameters are the same as one of specific embodiments two to three.
Specific embodiment seven: This embodiment differs from one of specific embodiments two to three in that: the specific process of computing, at step t, the output gate o_t (output gate) of the LSTM model (or memory cell) in step 2-2 and step 3-2 is:
o_t = σ(W_X_o·X_t + W_h_o·h_{t-1} + W_c_o·c_{t-1} + b_o)  (5)
where W_X_o is the output gate parameter matrix multiplied with X_t; W_h_o is the output gate parameter matrix multiplied with h_{t-1}; W_c_o is the output gate parameter matrix multiplied with c_{t-1}; b_o is the bias for computing the output gate. Other steps and parameters are the same as one of specific embodiments two to three.
Specific embodiment eight: This embodiment differs from one of specific embodiments two to three in that: the specific process of computing, at step t, the forget gate (forget gate) f_t of the LSTM model (or memory cell) in step 2-2 and step 3-2 is:
f_t = σ(W_X_f·X_t + W_h_f·h_{t-1} + W_c_f·c_{t-1} + b_f)  (6)
where W_X_f is the forget gate parameter matrix multiplied with X_t; W_h_f is the forget gate parameter matrix multiplied with h_{t-1}; W_c_f is the forget gate parameter matrix multiplied with c_{t-1}; b_f is the bias for computing the forget gate. Other steps and parameters are the same as one of specific embodiments two to three.
Specific embodiment nine: This embodiment differs from one of specific embodiments two to three in that: computing the memory cell value c_t and the hidden layer value h_t from in_t, o_t and f_t in step 2-2 and step 3-2 is specifically:
(1) first compute the candidate memory cell value at step t, before the gates are applied:
c̃_t = tanh(W_X_c·X_t + W_h_c·h_{t-1} + b_c)  (7)
where W_X_c is the memory cell parameter matrix multiplied with X_t; W_h_c is the memory cell parameter matrix multiplied with h_{t-1}; b_c is the memory cell bias;
(2) from the input gate value in_t and forget gate value f_t computed by (4) and (6), the candidate memory cell value c̃_t and c_{t-1}, compute the memory cell value c_t at step t:
c_t = f_t ⊙ c_{t-1} + in_t ⊙ c̃_t  (8)
Finally, compute the hidden layer value h_t from the memory cell value c_t and the output gate o_t computed by formula (5):
h_t = o_t ⊙ tanh(c_t)  (9).
Other steps and parameters are the same as one of specific embodiments one to six.
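Formulas (4) to (9) together define one step of the LSTM cell, including the peephole terms W_c_*·c_{t-1} that the text places in every gate. A self-contained NumPy sketch; the parameter dictionary keys mirror the matrix names used in the text, and the sizes (input dimension 3, hidden dimension 2) are assumptions for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(p, X_t, h_prev, c_prev):
    # Input, output and forget gates, formulas (4)-(6), with the peephole
    # terms Wc_* @ c_prev included in every gate as in the text.
    in_t = sigmoid(p["WX_in"] @ X_t + p["Wh_in"] @ h_prev + p["Wc_in"] @ c_prev + p["b_in"])
    o_t  = sigmoid(p["WX_o"]  @ X_t + p["Wh_o"]  @ h_prev + p["Wc_o"]  @ c_prev + p["b_o"])
    f_t  = sigmoid(p["WX_f"]  @ X_t + p["Wh_f"]  @ h_prev + p["Wc_f"]  @ c_prev + p["b_f"])
    c_tilde = np.tanh(p["WX_c"] @ X_t + p["Wh_c"] @ h_prev + p["b_c"])  # (7)
    c_t = f_t * c_prev + in_t * c_tilde                                 # (8)
    h_t = o_t * np.tanh(c_t)                                            # (9)
    return h_t, c_t

# Assumed toy sizes: input dimension 3, hidden dimension 2.
rng = np.random.default_rng(0)
p = {name: rng.standard_normal((2, 3 if name.startswith("WX") else 2))
     for name in ["WX_in", "Wh_in", "Wc_in", "WX_o", "Wh_o", "Wc_o",
                  "WX_f", "Wh_f", "Wc_f", "WX_c", "Wh_c"]}
p.update(b_in=np.zeros(2), b_o=np.zeros(2), b_f=np.zeros(2), b_c=np.zeros(2))
h, c = lstm_step(p, rng.standard_normal(3), np.zeros(2), np.zeros(2))
```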
Specific embodiment ten: This embodiment differs from one of specific embodiments two to three in that: in step 2-4 and step 3-4, the detailed process of computing the sequence cost from the hidden layer results h_f and h_b obtained in step 2-3 and step 3-3 using the cost computation method with transfer values over the entire sequence, obtaining the optimisation target, optimising it by gradient descent and updating the LSTM parameters θ is:
(1) first compute from the hidden layers h_f and h_b the cost Q_t of the sequence x_k being labelled with each label:
Q_t = h_f(t)·W_f + h_b(t)·W_b + b  (10)
where W_f is the parameter matrix multiplied with h_f(t); W_b is the parameter matrix multiplied with h_b(t); b is the final output bias;
(2) the transfer value matrix A describes the cost of label transfer; if the transfer value A_{i,j} denotes the transfer cost from label i to label j, then the whole cost, i.e. the optimisation target, of the input sequence X labelled with the path y is:
cost(X, y) = Σ_{t=1}^{T} (A_{y_{t-1}, y_t} + Q_t[y_t])  (11)
(3) using maximum likelihood estimation, compute the probability p that maximises the correct path:
p = exp(cost_right) / Σ_{y′} exp(cost(X, y′))  (12)
where cost_right is the cost of the correct path;
although the number of all paths is exponentially large, the sum over all path costs in formula (12) does not require traversing all paths: it can be obtained in linear time by a dynamic programming algorithm;
(4) update the neural network parameters θ by gradient descent according to the probability p of the correct path; where θ comprises, as neural network parameters, all the variables mentioned in steps 2-1 and 2-2; the sequence cost must be computed to obtain the optimisation target of the system. Other steps and parameters are the same as one of specific embodiments two to three.
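The linear-time dynamic programming mentioned for the sum over all path costs is the standard forward algorithm computed in log space. A NumPy sketch, with label costs Q[t, j] as in formula (10) and transfer values A[i, j]; the toy values are assumptions, and the result is checked against brute-force enumeration of all paths:

```python
import numpy as np

def log_sum_all_paths(Q, A):
    """Log of the sum of exp(path cost) over all label paths, computed in
    linear time by dynamic programming.
    Q[t, j]: cost of label j at position t; A[i, j]: transfer cost i -> j."""
    alpha = Q[0]                       # log-costs of all length-1 paths
    for t in range(1, Q.shape[0]):
        # alpha[j] = log sum_i exp(alpha[i] + A[i, j] + Q[t, j])
        alpha = np.logaddexp.reduce(alpha[:, None] + A, axis=0) + Q[t]
    return np.logaddexp.reduce(alpha)

# Brute-force check over all 2^3 paths for 3 positions and 2 labels.
Q = np.array([[0.1, 0.4], [0.2, 0.3], [0.5, 0.1]])
A = np.array([[0.0, 0.2], [0.3, 0.0]])
brute = np.log(sum(
    np.exp(Q[0, a] + A[a, b] + Q[1, b] + A[b, c] + Q[2, c])
    for a in (0, 1) for b in (0, 1) for c in (0, 1)))
assert np.isclose(log_sum_all_paths(Q, A), brute)
```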
Specific embodiment eleven: This embodiment differs from one of specific embodiments two or three in that: the specific method by which, in step 2 and step 3, step 3-4 performs a path search over the labelling results with the neural network parameters updated in step 2-4 to obtain the labelling results of the corpus is:
Arrange the costs of the input sequence X into a matrix C, and compute from the matrix C, using the Viterbi algorithm, the labelling results of the test corpus in the labelled corpus. Other steps and parameters are the same as one of specific embodiments two or three.
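The Viterbi search of this embodiment, over the label costs Q_t of formula (10) and the transfer value matrix A, can be sketched as follows; the toy cost values are assumptions for illustration:

```python
import numpy as np

def viterbi(Q, A):
    """Best label path given Q[t, j] (cost of label j at position t)
    and A[i, j] (transfer cost from label i to label j)."""
    T, L = Q.shape
    score = Q[0].copy()                   # best path cost ending in each label
    back = np.zeros((T, L), dtype=int)    # best predecessor label
    for t in range(1, T):
        cand = score[:, None] + A + Q[t]  # cand[i, j]: best cost via i to j
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0)
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):         # backtrack through predecessors
        path.append(int(back[t, path[-1]]))
    return path[::-1]

Q = np.array([[0.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
A = np.array([[0.0, -2.0], [-2.0, 0.0]])  # strongly discourage label changes
# with this transfer cost the best path keeps a single label throughout
```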
Specific embodiment twelve: This embodiment differs from one of specific embodiments two or three in that: in step 5, the number of repetitions of step 2, step 3 and step 4 reaches a maximum of 60 to 90. Other steps and parameters are the same as one of specific embodiments two or three.
Although the present invention is disclosed above with preferred embodiments, they are not intended to limit the present invention. Anyone familiar with this technology may make various changes and modifications without departing from the spirit and scope of the present invention; therefore the protection scope of the present invention shall be defined by the claims.
Claims (10)
1. A named entity recognition method for the medical field based on deep learning, characterised in that the specific steps of the method are as follows:
Step 1: train word vectors vec_i on the unlabelled medical corpus to obtain the vocabulary voc of the supplementary medical-field corpus and the word vectors vec corresponding to the vocabulary voc; vec = [vec_1, vec_2, ..., vec_n]; voc = [voc_1, voc_2, ..., voc_n]; where i = 1, 2, ..., n, and n is the total number of word types in the unlabelled corpus;
Step 2: train a long short-term memory network LSTM with the training corpus in the labelled corpus of the news domain; use the word vectors vec of step 1 as the pre-training vectors for training the long short-term memory network LSTM; with the LSTM method, compute the optimisation target from the pre-training vectors, x_k and y_k, optimise it by gradient descent and update the LSTM parameters θ_C; the labelled corpus comprises a training corpus and a test corpus; finally obtain the LSTM parameters, namely the values of the LSTM model parameters θ_C at convergence after the final N-th iteration, specifically comprising: W_X_in, W_h_in, W_c_in, W_X_o, W_h_o, W_c_o, W_X_f, W_h_f, W_c_f, b_in, b_o or b_f; where: W_X_in: hidden layer input gate input weight parameter; W_h_in: hidden layer input gate state input weight parameter; W_c_in: hidden layer memory cell input weight parameter; W_X_o: hidden layer output gate input weight parameter; W_h_o: hidden layer output gate state input weight parameter; W_c_o: hidden layer memory cell output layer weight parameter; W_X_f: hidden layer forget gate input weight parameter; W_h_f: hidden layer forget gate state input weight parameter; W_c_f: hidden layer forget gate memory cell input weight parameter; b_in: hidden layer input gate bias parameter; b_o: hidden layer output gate bias parameter; b_f: hidden layer forget gate bias parameter;
where x_k is the word sequence of the LSTM input corresponding to the training corpus in the labelled corpus for the k-th sample, and y_k is the labelling result vector corresponding to the training corpus in the labelled corpus for the k-th sample;
Step 3: train the long short-term memory network LSTM with the training corpus in the labelled corpus of the medical domain; use the word vectors vec obtained in step 1 as the pre-training vectors for training the long short-term memory network LSTM; with the LSTM method, compute the optimisation target from the pre-training vectors, x_k and y_k, optimise it by gradient descent and update the LSTM parameters θ; the labelled corpus comprises a training corpus and a test corpus;
where x_k is the word sequence of the LSTM input corresponding to the training corpus in the labelled corpus for the k-th sample, and y_k is the labelling result vector corresponding to the training corpus in the labelled corpus for the k-th sample;
Step 4: test the LSTM with the updated parameters; the test process is: input the labelled corpus of step 2 and step 3, perform a path search over the labelling results with the neural network parameters θ_C updated in step 2, and obtain the labelling results of the labelled corpus; assess the labelling results of the test corpus in the labelled corpus using the named entity recognition evaluation criterion F value, and obtain the assessed labelling results; the specific assessment computation method is as follows:
Precision = number of correctly labelled entity words / total number of labelled entity words
Recall = number of correctly labelled entity words / total number of entity words
F value = 2 × precision × recall / (precision + recall)
Step 5: repeat step 2 to step 4 on the labelled corpus until the named entity recognition evaluation criterion F value of step 4 no longer increases, or until the number of repetitions of step 2 to step 4 reaches a maximum of 50 to 100.
2. The named entity recognition method according to claim 1, characterised in that the update of the LSTM parameters θ_C in step 2 is as follows:
Step 2-1: pre-train the vocabulary voc and its corresponding word vectors vec; compute the input sequence X of the LSTM neural network from x_k and the word vectors vec obtained in step 1, where X = X_1, X_2, ..., X_t, ..., X_T;
Step 2-2: using the input X_t, the hidden layer h_{t-1} computed at step t-1 and the memory cell c_{t-1} computed at step t-1, compute the input gate in_t of the LSTM model at step t, the output gate o_t of the LSTM model and the forget gate f_t of the LSTM model; from in_t, o_t and f_t compute the memory cell value c_t and the hidden layer value h_t, where the concrete model of the hidden layer value h_t is: h_t = o_t ⊙ tanh(c_t);
Step 2-3: feed the elements of the input sequence X = X_1, X_2, ..., X_t, ..., X_T of step 2-1 in order from X_1 to X_T into the concrete model of the hidden layer value h_t of step 2-2 to obtain the forward hidden layer output h_f; then feed the elements of the input sequence X = X_1, X_2, ..., X_t, ..., X_T of step 2-1 in order from X_T to X_1 into the concrete model of the hidden layer value h_t of step 2-2 to obtain the backward hidden layer output h_b;
Step 2-4: using the cost computation method with transfer values over the entire sequence, compute the sequence cost from the hidden layer results h_f and h_b obtained in step 2-3 to obtain the optimisation target; optimise it by gradient descent to update the LSTM parameters θ_C; where θ_C is word_emb, W_X_in, W_h_in, W_c_in, W_X_o, W_h_o, W_c_o, W_X_f, W_h_f, W_c_f, b_in, b_o or b_f, and word_emb is the pre-training word vector weight parameter.
3. The named entity recognition method according to claim 1, characterised in that the update of the LSTM parameters θ in step 3 is as follows:
Step 3-1: pre-train the vocabulary voc and its corresponding word vectors vec; compute the input sequence X of the LSTM neural network from x_k and the word vectors vec obtained in step 1, where X = X_1, X_2, ..., X_t, ..., X_T;
Step 3-2: load the model parameters θ_n obtained by training the news-domain LSTM; on the basis of the parameters θ_n, using the input X_t, the hidden layer h_{t-1} computed at step t-1 and the memory cell c_{t-1} computed at step t-1, compute the input gate in_t of the LSTM model at step t, the output gate o_t of the LSTM model and the forget gate f_t of the LSTM model; from in_t, o_t and f_t compute the memory cell value c_t and the hidden layer value h_t, where the concrete model of the hidden layer value h_t is: h_t = o_t ⊙ tanh(c_t);
Step 3-3: feed the elements of the input sequence X = X_1, X_2, ..., X_t, ..., X_T of step 3-1 in order from X_1 to X_T into the concrete model of the hidden layer value h_t of step 3-2 to obtain the forward hidden layer output h_f; then feed the elements of the input sequence X = X_1, X_2, ..., X_t, ..., X_T in order from X_T to X_1 into the concrete model of the hidden layer value h_t of step 3-2 to obtain the backward hidden layer output h_b;
Step 3-4: using the cost computation method with transfer values over the entire sequence, compute the sequence cost from the hidden layer results h_f and h_b obtained in step 3-3 to obtain the optimisation target; optimise it by gradient descent to update the LSTM parameters θ; where θ is word_emb, W_X_in, W_h_in, W_c_in, W_X_o, W_h_o, W_c_o, W_X_f, W_h_f, W_c_f, b_in, b_o or b_f.
4. The named entity recognition method according to claim 2 or 3, characterised in that the specific process of obtaining the input X of the LSTM neural network in step 2-1 and step 3-1 is:
Establish the vocabulary voc′ of the training corpus in the labelled corpus, and merge voc′ and voc into the vocabulary VOC;
VOC = VOC_1, VOC_2, VOC_3, ..., VOC_N;
Randomly initialise the vector matrix word_emb corresponding to the vocabulary VOC so that the dimension of word_emb is the same as that of the word vectors vec, and perform the assignment by formula (1):
word_emb_i = vec_j, if VOC_i = voc_j  (1)
where word_emb_i is the i-th word vector in word_emb;
finally multiply x_k[k1,k2] by word_emb to obtain the input X of the LSTM neural network:
X = x_k[k1,k2]·word_emb  (2)
where x_k[k1,k2] is the subsequence of the word sequence x_k between k1 and k2.
5. The named entity recognition method according to claim 2 or 3, characterised in that the specific process of obtaining the input X of the LSTM neural network in step 2-1 and step 3-1 is:
Randomly initialise the vector matrix word_emb corresponding to the vocabulary VOC, perform the assignment by formula (1), and then keep the vectors word_emb_i unchanged, i.e. do not update these vectors as parameters;
then randomly initialise another vector matrix word_emb_para corresponding to the vocabulary VOC, and compute the input X of the LSTM neural network according to the model of formula (3):
6. The named entity recognition method according to claim 2 or 3, characterised in that: the input gate in_t of the LSTM model computed at step t in step 2-2 and step 3-2 is obtained according to model (4), which is as follows:
in_t = σ(W_X_in·X_t + W_h_in·h_{t-1} + W_c_in·c_{t-1} + b_in)  (4)
where σ is the sigmoid function; W_X_in is the input gate parameter matrix multiplied with X_t; W_h_in is the input gate parameter matrix multiplied with h_{t-1}; W_c_in is the input gate parameter matrix multiplied with c_{t-1}; b_in is the bias for computing the input gate.
7. The named entity recognition method according to claim 2 or 3, characterised in that the output gate o_t of the LSTM model computed at step t in step 2-2 and step 3-2 is obtained according to model (5), which is as follows:
o_t = σ(W_X_o·X_t + W_h_o·h_{t-1} + W_c_o·c_{t-1} + b_o)  (5)
where W_X_o is the output gate parameter matrix multiplied with X_t; W_h_o is the output gate parameter matrix multiplied with h_{t-1}; W_c_o is the output gate parameter matrix multiplied with c_{t-1}; b_o is the bias for computing the output gate.
8. The named entity recognition method according to claim 2 or 3, characterised in that the forget gate f_t of the LSTM model computed at step t in step 2-2 and step 3-2 is obtained according to model (6), which is as follows:
f_t = σ(W_X_f·X_t + W_h_f·h_{t-1} + W_c_f·c_{t-1} + b_f)  (6)
where W_X_f is the forget gate parameter matrix multiplied with X_t; W_h_f is the forget gate parameter matrix multiplied with h_{t-1}; W_c_f is the forget gate parameter matrix multiplied with c_{t-1}; b_f is the bias for computing the forget gate.
9. The named entity recognition method according to claim 2 or 3, characterised in that the specific process of computing the memory cell value c_t and the hidden layer value h_t from in_t, o_t and f_t in step 2-2 and step 3-2 is:
step1: first compute the candidate memory cell value at step t, before the gates are applied:
c̃_t = tanh(W_X_c·X_t + W_h_c·h_{t-1} + b_c)  (7)
where W_X_c is the memory cell parameter matrix multiplied with X_t; W_h_c is the memory cell parameter matrix multiplied with h_{t-1}; b_c is the memory cell bias;
step2: from the input gate value in_t computed by model (4), the forget gate value f_t computed by model (6), the candidate memory cell value c̃_t and c_{t-1}, compute the memory cell value c_t at step t:
c_t = f_t ⊙ c_{t-1} + in_t ⊙ c̃_t  (8)
Finally, compute the hidden layer value h_t from the memory cell value c_t and the output gate o_t computed by formula (5); the concrete model of h_t is as follows:
h_t = o_t ⊙ tanh(c_t)  (9).
10. The named entity recognition method according to claim 2 or 3, characterised in that in step 2-4 and step 3-4 the detailed process of computing the sequence cost from the hidden layer results h_f and h_b obtained in step 2-3 and step 3-3 using the cost computation method with transfer values over the entire sequence, obtaining the optimisation target, optimising it by gradient descent and updating the LSTM parameters is:
First step: compute from the hidden layers h_f and h_b the cost Q_t of the sequence x_k being labelled with each label:
Q_t = h_f(t)·W_f + h_b(t)·W_b + b  (10)
where W_f is the parameter matrix multiplied with h_f(t); W_b is the parameter matrix multiplied with h_b(t); b is the final output bias;
Second step: describe the cost of label transfer by the transfer value matrix A, where the transfer value A_{i,j} denotes the transfer cost from label i to label j; the whole cost, i.e. the optimisation target, of the input sequence X labelled with the path y is then:
cost(X, y) = Σ_{t=1}^{T} (A_{y_{t-1}, y_t} + Q_t[y_t])
Third step: using maximum likelihood estimation, compute the probability p that maximises the correct path:
p = exp(cost_right) / Σ_{y′} exp(cost(X, y′))
where cost_right is the cost of the correct path;
Fourth step: update the neural network parameters θ by gradient descent according to the probability p of the correct path.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711446980.8A CN108170675A (en) | 2017-12-27 | 2017-12-27 | A kind of name entity recognition method based on deep learning towards medical field |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711446980.8A CN108170675A (en) | 2017-12-27 | 2017-12-27 | A kind of name entity recognition method based on deep learning towards medical field |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108170675A true CN108170675A (en) | 2018-06-15 |
Family
ID=62518135
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711446980.8A Pending CN108170675A (en) | 2017-12-27 | 2017-12-27 | A kind of name entity recognition method based on deep learning towards medical field |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108170675A (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109002436A (en) * | 2018-07-12 | 2018-12-14 | 上海金仕达卫宁软件科技有限公司 | Medical text terms automatic identifying method and system based on shot and long term memory network |
CN109284400A (en) * | 2018-11-28 | 2019-01-29 | 电子科技大学 | A kind of name entity recognition method based on Lattice LSTM and language model |
CN109325225A (en) * | 2018-08-28 | 2019-02-12 | 昆明理工大学 | It is a kind of general based on associated part-of-speech tagging method |
CN110598206A (en) * | 2019-08-13 | 2019-12-20 | 平安国际智慧城市科技股份有限公司 | Text semantic recognition method and device, computer equipment and storage medium |
CN111444720A (en) * | 2020-03-30 | 2020-07-24 | 华南理工大学 | Named entity recognition method for English text |
US20220067486A1 (en) * | 2020-09-02 | 2022-03-03 | Sap Se | Collaborative learning of question generation and question answering |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106202054A (en) * | 2016-07-25 | 2016-12-07 | 哈尔滨工业大学 | A kind of name entity recognition method learnt based on the degree of depth towards medical field |
US20170024645A1 (en) * | 2015-06-01 | 2017-01-26 | Salesforce.Com, Inc. | Dynamic Memory Network |
CN106557462A (en) * | 2016-11-02 | 2017-04-05 | 数库(上海)科技有限公司 | Name entity recognition method and system |
CN106569998A (en) * | 2016-10-27 | 2017-04-19 | 浙江大学 | Text named entity recognition method based on Bi-LSTM, CNN and CRF |
Non-Patent Citations (1)
Title |
---|
李剑风 (Li Jianfeng): "Research on Chinese Named Entity Recognition Fusing External Knowledge and Its Application in the Medical Field", China Masters' Theses Full-text Database, Information Science and Technology series * 
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109002436A (en) * | 2018-07-12 | 2018-12-14 | 上海金仕达卫宁软件科技有限公司 | Medical text terms automatic identifying method and system based on shot and long term memory network |
CN109325225A (en) * | 2018-08-28 | 2019-02-12 | 昆明理工大学 | It is a kind of general based on associated part-of-speech tagging method |
CN109325225B (en) * | 2018-08-28 | 2022-04-12 | 昆明理工大学 | Universal relevance-based part-of-speech tagging method |
CN109284400A (en) * | 2018-11-28 | 2019-01-29 | 电子科技大学 | A kind of name entity recognition method based on Lattice LSTM and language model |
CN109284400B (en) * | 2018-11-28 | 2020-10-23 | 电子科技大学 | Named entity identification method based on Lattice LSTM and language model |
CN110598206A (en) * | 2019-08-13 | 2019-12-20 | 平安国际智慧城市科技股份有限公司 | Text semantic recognition method and device, computer equipment and storage medium |
CN111444720A (en) * | 2020-03-30 | 2020-07-24 | 华南理工大学 | Named entity recognition method for English text |
US20220067486A1 (en) * | 2020-09-02 | 2022-03-03 | Sap Se | Collaborative learning of question generation and question answering |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106202054B (en) | A kind of name entity recognition method towards medical field based on deep learning | |
CN108170675A (en) | A kind of name entity recognition method based on deep learning towards medical field | |
CN109948165B (en) | Fine granularity emotion polarity prediction method based on mixed attention network | |
CN107239446B (en) | An intelligent relation extraction method based on a neural network and an attention mechanism | |
CN108874782B (en) | A multi-turn dialogue management method combining hierarchical attention LSTM and a knowledge graph | |
CN107168945B (en) | Bidirectional cyclic neural network fine-grained opinion mining method integrating multiple features | |
CN106156003B (en) | A question understanding method for question answering systems | |
CN106886543B (en) | Knowledge graph representation learning method and system combined with entity description | |
CN105894088B (en) | Based on deep learning and distributed semantic feature medical information extraction system and method | |
CN108804654A (en) | A collaborative virtual learning environment construction method based on intelligent question answering | |
CN111414461B (en) | Intelligent question-answering method and system fusing knowledge base and user modeling | |
CN107562792A (en) | A question-answer matching method based on deep learning | |
CN108229582A (en) | A multi-task named entity recognition dual training method for the medical domain | |
CN109285562A (en) | Speech-emotion recognition method based on attention mechanism | |
CN109977234A (en) | A knowledge graph completion method based on topic keyword filtering | |
CN110232122A (en) | A Chinese question classification method based on text error correction and a neural network | |
CN108197294A (en) | An automatic text generation method based on deep learning | |
CN108804677A (en) | A deep learning question classification method and system combining a multi-layer attention mechanism | |
CN107644062A (en) | A knowledge-graph-based system and method for analyzing the weight of knowledge content | |
CN111428481A (en) | Entity relation extraction method based on deep learning | |
CN112364623A (en) | A Bi-LSTM-CRF-based Chinese lexical analysis method using three-in-one character tagging | |
CN114398976A (en) | Machine reading understanding method based on BERT and gate control type attention enhancement network | |
He et al. | Analysis of the communication method of national traditional sports culture based on deep learning | |
Etchells et al. | Learning what is important: feature selection and rule extraction in a virtual course. | |
CN111400445B (en) | A method for triaging cases by complexity based on text similarity | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
AD01 | Patent right deemed abandoned | ||
Effective date of abandoning: 20210924 |