CN108170675A - A deep-learning-based named entity recognition method for the medical field - Google Patents

A deep-learning-based named entity recognition method for the medical field

Info

Publication number
CN108170675A
Authority
CN
China
Prior art keywords
parameter
lstm
hidden layer
value
word
Prior art date
Legal status
Pending
Application number
CN201711446980.8A
Other languages
Chinese (zh)
Inventor
朱聪慧
赵铁军
关毅
李岳
Current Assignee
Harbin Fuman Science And Technology Co Ltd
Original Assignee
Harbin Fuman Science And Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Harbin Fuman Science And Technology Co Ltd
Priority to CN201711446980.8A
Publication of CN108170675A


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00: Handling natural language data
    • G06F 40/20: Natural language analysis
    • G06F 40/279: Recognition of textual entities
    • G06F 40/289: Phrasal analysis, e.g. finite state techniques or chunking
    • G06F 40/295: Named entity recognition
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods

Abstract

The present invention proposes a deep-learning-based named entity recognition method for the medical field. The method comprises: (1) training a long short-term memory network LSTM on the training portion of an annotated medical-field corpus; (2) performing, with the neural network parameters θ updated in (1), a path search over candidate labelings to obtain the annotation results of the annotated corpus, and assessing the annotation results on the test portion of the annotated corpus with the named entity recognition evaluation F value; (3) in the training of (1), first training the long short-term memory network LSTM on an annotated news-domain corpus, then training the medical-field model from the resulting model together with the annotated medical-field corpus, again assessing the annotation results on the test portion of the annotated corpus with the named entity recognition evaluation F value. The present invention applies to the field of named entity recognition.

Description

A deep-learning-based named entity recognition method for the medical field
Technical field
The present invention relates to named entity recognition methods, and more particularly to a deep-learning-based named entity recognition method for the medical field.
Background technology
Named entity recognition, one of the basic tasks of information extraction, has important applications in question answering systems, syntactic analysis, machine translation, and other fields. Medical entities differ considerably from common entities, so open-domain entity-annotated corpus information contributes little to the annotation of medical entities; at the same time, named entity recognition in the medical field lacks annotated corpora, mainly because judging medical entities requires professionals, which greatly increases the cost of medical entity annotation. How to label well in the medical field using only a small amount of annotated data is therefore highly important.
Deep learning has made major progress in recent years and has been shown to discover and learn the complex structure present in high-dimensional data. In natural language processing, a new word representation method, the word vector (word embedding), has achieved immense success.
The word vector (word embedding) is a word representation that has in recent years commonly replaced the traditional bag-of-words representation, solving the dimensionality disaster that bag-of-words representations bring. Researchers have also found that word vectors obtained by training language models contain lexical semantic information, and that data such as lexical similarity can to some extent be computed from them by simple algorithms. Furthermore, since training word vectors requires no annotation work, much of the workload around word-vector learning is avoided, and training can be arranged as needed: large open corpora can be used to obtain good general-purpose word vectors, corpora from a single field can be selected to obtain word vectors specific to that field, or training can be carried out directly for a given task.
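For illustration, this kind of unannotated pre-training can be sketched as follows; the gensim toolkit, the file name, and the whitespace tokenization are assumptions of the example, not part of the method:

```python
# Minimal sketch of unsupervised word-vector pre-training on a raw,
# unannotated corpus; gensim and the input format are illustrative choices.
from gensim.models import Word2Vec

# One tokenized sentence per line; character-level tokens also work,
# matching the by-character processing recommended later in this document.
with open("medical_raw.txt", encoding="utf-8") as f:
    sentences = [line.split() for line in f]

model = Word2Vec(sentences, vector_size=100, window=5, min_count=1, sg=1)
voc = model.wv.index_to_key   # vocabulary voc = [voc_1, ..., voc_n]
vec = model.wv.vectors        # word vectors vec = [vec_1, ..., vec_n]
```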
Word vectors are generally trained with deep neural networks, and in natural language processing the recurrent neural network (RNN) is one of the most widely used network models. In natural language processing the influence of preceding context on what follows is usually captured with a language model; the RNN exploits preceding information naturally through a recurrent feedback hidden layer and can in theory use the entire preceding context, which conventional language models cannot do. In practice, however, RNN models suffer from the vanishing gradient problem, and the long short-term memory unit (Long Short-Term Memory, LSTM) is an effective improvement of the RNN. The LSTM addresses the RNN's inability to retain needed information effectively: it records information in a memory cell and introduces several gates that control how the memory cell is updated and used, so that the required information is preserved effectively. LSTMs are now widely used in natural language processing tasks ranging from word segmentation, part-of-speech tagging, and named entity recognition to machine translation.
Pre-training is a common technique for deep neural networks. Several research results have shown that initializing the parameters of a neural network with word vectors obtained by unsupervised training on large corpora yields better models than training from random initialization, mainly because pre-trained word vectors can exploit large unannotated corpora containing information absent from the training data, and can to some extent keep randomly initialized word vectors from falling into local extrema during optimization. For the medical field, where training data are rare, supplementing training with large unannotated corpora is therefore very meaningful.
The models currently used for the named entity recognition task fall into two classes: conventional models, with the CRF as their representative, and deep neural network models; in the medical field the traditional CRF model is still generally used.
Because CRF models do not consider semantic information, when training corpora are extremely scarce their annotation results contain many meaningless labelings, whereas the semantic information contained in LSTM models can prevent this from happening.
Summary of the invention
The purpose of the present invention is to solve the problem that CRF models, which do not consider semantic information, produce many meaningless annotation results when training corpora are extremely scarce; to this end, with the help of a large news-domain corpus, a deep-learning-based named entity recognition method for the medical field is proposed.
The above object of the invention is achieved through the following technical solution:
A deep-learning-based named entity recognition method for the medical field, characterized in that the specific steps of the method are as follows:
Step 1: train word vectors $vec_i$ on the unannotated medical corpus, obtaining the vocabulary $voc$ of the supplementary medical-field corpus and the word vectors $vec$ corresponding to $voc$; $vec = [vec_1, vec_2, \ldots, vec_n]$; $voc = [voc_1, voc_2, \ldots, voc_n]$; where $i = 1, 2, \ldots, n$ and $n$ is the total number of distinct words in the unannotated corpus;
Step 2: train the long short-term memory network LSTM on the training portion of the annotated news-domain corpus. The word vectors $vec$ of step 1 serve as the pre-training vectors for the training of the LSTM; with the LSTM method, the optimization objective is computed from the pre-training vectors and $x_k$, $y_k$, and gradient descent on this objective updates the LSTM parameters $\theta_C$. The annotated corpus comprises a training portion and a test portion. The training finally yields the converged LSTM parameters, i.e., the values of the model parameters $\theta_C$ at the final, $n$-th iteration, specifically: $W_{x,in}$, $W_{h,in}$, $W_{c,in}$, $W_{x,o}$, $W_{h,o}$, $W_{c,o}$, $W_{x,f}$, $W_{h,f}$, $W_{c,f}$, $b_{in}$, $b_o$, and $b_f$, where $W_{x,in}$ is the input weight of the hidden-layer input gate; $W_{h,in}$ the state-input weight of the hidden-layer input gate; $W_{c,in}$ the memory-cell input weight of the hidden layer; $W_{x,o}$ the input weight of the hidden-layer output gate; $W_{h,o}$ the state-input weight of the hidden-layer output gate; $W_{c,o}$ the memory-cell output weight of the hidden layer; $W_{x,f}$ the input weight of the hidden-layer forget gate; $W_{h,f}$ the state-input weight of the hidden-layer forget gate; $W_{c,f}$ the memory-cell input weight of the hidden-layer forget gate; $b_{in}$ the hidden-layer input-gate bias; $b_o$ the hidden-layer output-gate bias; and $b_f$ the hidden-layer forget-gate bias;
where $x_k$ is the word sequence of LSTM inputs corresponding to the training portion of the annotated corpus for the $k$-th sample, and $y_k$ is the annotation-result vector corresponding to the training portion of the annotated corpus for the $k$-th sample;
Step 3: train the long short-term memory network LSTM on the training portion of the annotated medical-domain corpus. The word vectors $vec$ obtained in step 1 serve as the pre-training vectors for the training of the LSTM; with the LSTM method, the optimization objective is computed from the pre-training vectors and $x_k$, $y_k$, and gradient descent on this objective updates the LSTM parameters $\theta$. The annotated corpus comprises a training portion and a test portion;
where $x_k$ is the word sequence of LSTM inputs corresponding to the training portion of the annotated corpus for the $k$-th sample, and $y_k$ is the annotation-result vector corresponding to the training portion of the annotated corpus for the $k$-th sample;
Step 4: test the parameter-updated LSTM. The test process is: input the annotated corpora of steps 2 and 3, carry out the path search over labelings with the neural network parameters $\theta_C$ updated in step 2, and obtain the annotation results of the annotated corpus; assess the annotation results on the test portion of the annotated corpus with the named entity recognition evaluation F value, obtaining annotation results that meet expectations after assessment. The specific assessment is computed as follows:
precision = number of correctly labeled entity words / total number of labeled entity words
recall = number of correctly labeled entity words / total number of entity words
F value = 2 × precision × recall / (precision + recall)
Step 5: repeat steps 2 to 4 on the annotated corpora until the named entity recognition evaluation F value of step 4 no longer increases or the number of repetitions of steps 2 to 4 reaches a maximum of 50 to 100.
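As an illustration of the assessment in step 4, treating annotation results as sets of (mention, type) pairs; the toy entity sets below are invented for the example:

```python
# Sketch of the F-value assessment of step 4 over sets of (mention, type)
# pairs; the entity sets below are toy data for illustration only.
def f_value(predicted, gold):
    correct = len(predicted & gold)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0, 0.0, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)

predicted = {("阿司匹林", "DRUG"), ("头痛", "SYMPTOM"), ("病历", "DISEASE")}
gold = {("阿司匹林", "DRUG"), ("头痛", "SYMPTOM"),
        ("糖尿病", "DISEASE"), ("发热", "SYMPTOM")}
p, r, f = f_value(predicted, gold)   # p = 2/3, r = 1/2, F ≈ 0.571
```

The loop of step 5 then repeats steps 2 to 4, keeping the best F value and stopping once it no longer increases or the repetition limit is reached.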
Further, the update of the LSTM parameters $\theta_C$ in step 2 proceeds as follows:
Step 2-1: pre-train the vocabulary $voc$ and the word vectors $vec$ corresponding to $voc$; using $x_k$ and the word vectors $vec$ obtained in step 1, compute the input sequence $X$ of the LSTM network, where $X = X_1, X_2, \ldots, X_t, \ldots, X_T$;
Step 2-2: using the input $X_t$, the hidden layer $h_{t-1}$ computed at step $t-1$, and the memory cell $c_{t-1}$ computed at step $t-1$, compute the input gate $in_t$ of the LSTM model at step $t$, the output gate $o_t$ of the LSTM model, and the forget gate $f_t$ of the LSTM model; from $in_t$, $o_t$, and $f_t$ compute the memory cell value $c_t$ and the hidden layer value $h_t$, where the specific model of the hidden layer value $h_t$ is $h_t = o_t \circ \tanh(c_t)$;
Step 2-3: input the elements of the sequence $X = X_1, X_2, \ldots, X_t, \ldots, X_T$ of step 2-1 in order from $X_1$ to $X_T$ into the model of the hidden layer value $h_t$ of step 2-2 to obtain the forward hidden-layer output $h_f$; then input the elements of the same sequence in order from $X_T$ to $X_1$ to obtain the backward hidden-layer output $h_b$;
Step 2-4: compute the sequence cost of the hidden-layer results $h_f$ and $h_b$ obtained in step 2-3 with the transition-cost method over the entire sequence, obtaining the optimization objective; optimize it with gradient descent to update the LSTM parameters $\theta_C$, where $\theta_C$ comprises word_emb, $W_{x,in}$, $W_{h,in}$, $W_{c,in}$, $W_{x,o}$, $W_{h,o}$, $W_{c,o}$, $W_{x,f}$, $W_{h,f}$, $W_{c,f}$, $b_{in}$, $b_o$, and $b_f$, and word_emb is the pre-trained word-vector weight parameter.
Further, the update of the LSTM parameters $\theta$ in step 3 proceeds as follows:
Step 3-1: pre-train the vocabulary $voc$ and the word vectors $vec$ corresponding to $voc$; using $x_k$ and the word vectors $vec$ obtained in step 1, compute the input sequence $X$ of the LSTM network, where $X = X_1, X_2, \ldots, X_t, \ldots, X_T$;
Step 3-2: load the model parameters $\theta_n$ obtained by training the news-domain LSTM; on the basis of the parameters $\theta_n$, using the input $X_t$, the hidden layer $h_{t-1}$ computed at step $t-1$, and the memory cell $c_{t-1}$ computed at step $t-1$, compute the input gate $in_t$ of the LSTM model at step $t$, the output gate $o_t$ of the LSTM model, and the forget gate $f_t$ of the LSTM model; from $in_t$, $o_t$, and $f_t$ compute the memory cell value $c_t$ and the hidden layer value $h_t$, where the specific model of the hidden layer value $h_t$ is $h_t = o_t \circ \tanh(c_t)$;
Step 3-3: input the elements of the sequence $X = X_1, X_2, \ldots, X_t, \ldots, X_T$ of step 3-1 in order from $X_1$ to $X_T$ into the model of the hidden layer value $h_t$ of step 3-2 to obtain the forward hidden-layer output $h_f$; then input the elements of the same sequence in order from $X_T$ to $X_1$ to obtain the backward hidden-layer output $h_b$;
Step 3-4: compute the sequence cost of the hidden-layer results $h_f$ and $h_b$ obtained in step 3-3 with the transition-cost method over the entire sequence, obtaining the optimization objective; optimize it with gradient descent to update the LSTM parameters $\theta$, where $\theta$ comprises word_emb, $W_{x,in}$, $W_{h,in}$, $W_{c,in}$, $W_{x,o}$, $W_{h,o}$, $W_{c,o}$, $W_{x,f}$, $W_{h,f}$, $W_{c,f}$, $b_{in}$, $b_o$, and $b_f$.
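The loading in step 3-2 amounts to starting the medical-domain gradient descent from the converged news-domain parameters $\theta_n$ rather than from random values. A minimal sketch, with illustrative shapes and a name-to-array dictionary layout that is an assumption of the example:

```python
import numpy as np

# Illustrative shapes: embedding dimension d, hidden size h.
d, h = 100, 128
rng = np.random.default_rng(0)

# theta_n: converged news-domain parameters from step 2 (random stand-ins here).
names = ["W_x_in", "W_h_in", "W_c_in", "W_x_o", "W_h_o", "W_c_o",
         "W_x_f", "W_h_f", "W_c_f"]
theta_news = {n: rng.normal(0.0, 0.1, (h, d) if n.startswith("W_x") else (h, h))
              for n in names}
theta_news.update({b: np.zeros(h) for b in ["b_in", "b_o", "b_f"]})

# Step 3-2: initialize the medical-domain model with theta_n and continue
# gradient descent on the annotated medical corpus from these values.
theta_medical = {name: value.copy() for name, value in theta_news.items()}
```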
Further, the specific procedure for obtaining the input $X$ of the LSTM network in step 2-1 and step 3-1 is:
Build the vocabulary $voc'$ of the training portion of the annotated corpus, and merge $voc'$ and $voc$ into the vocabulary $VOC$; $VOC = VOC_1, VOC_2, VOC_3, \ldots, VOC_N$;
Randomly initialize the vector matrix word_emb corresponding to the vocabulary $VOC$, such that the dimension of word_emb is identical to that of the word vectors $vec$, and carry out the assignment of formula (1):
where word_emb$_i$ is the $i$-th word vector in word_emb;
Finally, multiply $x_{k[k1,k2]}$ by word_emb to obtain the input $X$ of the LSTM network:
$X = x_{k[k1,k2]} \cdot \mathrm{word\_emb}$ (2)
where $x_{k[k1,k2]}$ is the subsequence of the word sequence $x_k$ between positions $k1$ and $k2$.
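A sketch of this construction under one reading of formula (1), whose image is not reproduced above: rows of word_emb whose word also occurs in $voc$ receive the pre-trained vector, and all other rows keep their random initialization. The toy vocabulary and dimensions are invented for the example:

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 4                                      # toy embedding dimension
voc = ["头痛", "发热", "阿司匹林"]             # pre-trained vocabulary from step 1
vec = rng.normal(0.0, 0.1, (len(voc), dim))  # pre-trained vectors
VOC = ["头痛", "发热", "阿司匹林", "病历"]     # merged vocabulary voc' + voc

# Random initialization of word_emb, then the assignment of formula (1):
# rows whose word exists in voc are overwritten with the pre-trained vector.
word_emb = rng.normal(0.0, 0.1, (len(VOC), dim))
index = {w: j for j, w in enumerate(voc)}
for i, w in enumerate(VOC):
    if w in index:
        word_emb[i] = vec[index[w]]

# Formula (2): with x_k[k1,k2] as one-hot rows, the product is a row lookup.
x_k = np.array([0, 3, 1])   # word indices of one input sentence
X = word_emb[x_k]           # LSTM input sequence X_1 ... X_T
```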
Further, a second specific procedure for obtaining the input $X$ of the LSTM network in step 2-1 and step 3-1 is:
Randomly initialize the vector matrix word_emb corresponding to the vocabulary $VOC$ and, after carrying out the assignment of formula (1), keep the vectors word_emb$_i$ unchanged, i.e., they are not updated as parameters;
then randomly initialize a further vector matrix word_emb_para corresponding to the vocabulary $VOC$, and compute the input $X$ of the LSTM network according to the model of formula (3):
$X = (x_{k[k1,k2]} \cdot \mathrm{word\_emb}) \oplus (x_{k[k1,k2]} \cdot \mathrm{word\_emb\_para})$ (3).
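A minimal sketch of this second variant, reading $\oplus$ in formula (3) as vector concatenation (an interpretation); it reuses word_emb, x_k, np, and rng from the previous sketch, keeping word_emb frozen while word_emb_para is trained:

```python
# Method 2 (formula (3)): concatenate the frozen pre-trained lookup with a
# trainable, randomly initialized lookup of the same shape. Only
# word_emb_para is updated by gradient descent; word_emb stays fixed.
word_emb_para = rng.normal(0.0, 0.1, word_emb.shape)
X = np.concatenate([word_emb[x_k], word_emb_para[x_k]], axis=1)  # (T, 2*dim)
```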
Further, the input gate $in_t$ of the LSTM model computed at step $t$ in step 2-2 and step 3-2 is obtained according to model (4), which is as follows:
$in_t = \sigma(W_{x,in} X_t + W_{h,in} h_{t-1} + W_{c,in} c_{t-1} + b_{in})$ (4)
where $\sigma$ is the sigmoid function; $W_{x,in}$ is the input-gate parameter matrix multiplied with $X_t$; $W_{h,in}$ the input-gate parameter matrix multiplied with $h_{t-1}$; $W_{c,in}$ the input-gate parameter matrix multiplied with $c_{t-1}$; and $b_{in}$ the bias for computing the input gate.
Further, the output gate $o_t$ of the LSTM model computed at step $t$ in step 2-2 and step 3-2 is obtained according to model (5), which is as follows:
$o_t = \sigma(W_{x,o} X_t + W_{h,o} h_{t-1} + W_{c,o} c_{t-1} + b_o)$ (5)
where $W_{x,o}$ is the output-gate parameter matrix multiplied with $X_t$; $W_{h,o}$ the output-gate parameter matrix multiplied with $h_{t-1}$; $W_{c,o}$ the output-gate parameter matrix multiplied with $c_{t-1}$; and $b_o$ the bias for computing the output gate.
Further, the forget gate $f_t$ of the LSTM model computed at step $t$ in step 2-2 and step 3-2 is obtained according to model (6), which is as follows:
$f_t = \sigma(W_{x,f} X_t + W_{h,f} h_{t-1} + W_{c,f} c_{t-1} + b_f)$ (6)
where $W_{x,f}$ is the forget-gate parameter matrix multiplied with $X_t$; $W_{h,f}$ the forget-gate parameter matrix multiplied with $h_{t-1}$; $W_{c,f}$ the forget-gate parameter matrix multiplied with $c_{t-1}$; and $b_f$ the bias for computing the forget gate.
Further, the specific procedure in step 2-2 and step 3-2 for computing the memory cell value $c_t$ and the hidden layer value $h_t$ from $in_t$, $o_t$, and $f_t$ is:
Step 1: first compute the ungated memory cell value at step $t$, $\tilde{c}_t = \tanh(W_{x,c} X_t + W_{h,c} h_{t-1} + b_c)$ (7);
where $W_{x,c}$ is the memory-cell parameter matrix multiplied with $X_t$; $W_{h,c}$ the memory-cell parameter matrix multiplied with $h_{t-1}$; and $b_c$ the memory-cell bias;
Step 2: from the input gate value $in_t$ and the forget gate value $f_t$ computed according to models (4) and (6), the ungated memory cell value $\tilde{c}_t$, and $c_{t-1}$, compute the memory cell value at step $t$, $c_t = f_t \circ c_{t-1} + in_t \circ \tilde{c}_t$ (8);
Finally, from the memory cell value $c_t$ and the output gate $o_t$ computed by formula (5), compute the hidden layer value $h_t$, whose specific model is as follows:
$h_t = o_t \circ \tanh(c_t)$ (9).
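Formulas (4) to (9) together define one step of a peephole-style LSTM cell, and steps 2-3 and 3-3 run that cell over the sequence in both directions. The following is a minimal sketch; the parameter dictionary layout and shapes are assumptions of the example, and formula (8) uses the standard LSTM combination assumed above:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(X_t, h_prev, c_prev, p):
    """One step of the cell of formulas (4)-(9); p maps the parameter
    names used in the text to numpy arrays."""
    in_t = sigmoid(p["W_x_in"] @ X_t + p["W_h_in"] @ h_prev
                   + p["W_c_in"] @ c_prev + p["b_in"])        # formula (4)
    o_t = sigmoid(p["W_x_o"] @ X_t + p["W_h_o"] @ h_prev
                  + p["W_c_o"] @ c_prev + p["b_o"])           # formula (5)
    f_t = sigmoid(p["W_x_f"] @ X_t + p["W_h_f"] @ h_prev
                  + p["W_c_f"] @ c_prev + p["b_f"])           # formula (6)
    c_tilde = np.tanh(p["W_x_c"] @ X_t + p["W_h_c"] @ h_prev
                      + p["b_c"])                             # formula (7)
    c_t = f_t * c_prev + in_t * c_tilde                       # formula (8), assumed
    h_t = o_t * np.tanh(c_t)                                  # formula (9)
    return h_t, c_t

def bilstm(X, p, hidden):
    """Steps 2-3/3-3: run the cell from X_1 to X_T for h_f and from
    X_T back to X_1 for h_b (one parameter set here for brevity)."""
    T = len(X)
    h, c, h_f = np.zeros(hidden), np.zeros(hidden), []
    for t in range(T):
        h, c = lstm_step(X[t], h, c, p)
        h_f.append(h)
    h, c, h_b = np.zeros(hidden), np.zeros(hidden), [None] * T
    for t in reversed(range(T)):
        h, c = lstm_step(X[t], h, c, p)
        h_b[t] = h
    return np.array(h_f), np.array(h_b)

# Toy usage with random parameters (W_x_* has shape (h, d), the rest (h, h)).
d, hsz = 4, 8
rng = np.random.default_rng(1)
p = {n: rng.normal(0.0, 0.1, (hsz, d) if n.startswith("W_x") else (hsz, hsz))
     for n in ["W_x_in", "W_h_in", "W_c_in", "W_x_o", "W_h_o", "W_c_o",
               "W_x_f", "W_h_f", "W_c_f", "W_x_c", "W_h_c"]}
p.update({b: np.zeros(hsz) for b in ["b_in", "b_o", "b_f", "b_c"]})
h_f, h_b = bilstm(rng.normal(0.0, 0.1, (5, d)), p, hsz)
```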
Further, the specific procedure in step 2-4 and step 3-4 for computing the sequence cost of the hidden-layer results $h_f$ and $h_b$ obtained in step 2-3 and step 3-3 with the transition-cost method over the entire sequence, obtaining the optimization objective, and optimizing it with gradient descent to update the LSTM parameters $\theta$ is:
First step: using the hidden layers $h_f$ and $h_b$, compute the cost $Q_t$ of labeling the sequence $x_k$ with each label:
$Q_t = h_f(t) \cdot W_f + h_b(t) \cdot W_b + b$ (10)
where $W_f$ is the parameter matrix multiplied with $h_f(t)$; $W_b$ the parameter matrix multiplied with $h_b(t)$; and $b$ the final output bias;
Second step: describe the cost of label transitions with the transition-cost matrix $A$, where the transition cost $A_{i,j}$ represents the cost of moving from label $i$ to label $j$; the whole cost of the input sequence $X$, i.e., the optimization objective, is then $\mathrm{cost}(X, y) = \sum_{t=1}^{T} Q_t[y_t] + \sum_{t=2}^{T} A_{y_{t-1}, y_t}$ (11);
Third step: using maximum likelihood estimation, compute the probability $p$ of the correct path to be maximized, normalizing the cost of the correct path against the costs of all possible paths (formula (12)),
where $\mathrm{cost}_{right}$ is the cost of the correct path;
Fourth step: with the gradient descent algorithm, obtain the neural network parameters $\theta$ according to the maximized probability $p$ of the correct path.
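The following is a minimal sketch of this sequence cost, formulas (10) to (12). Reading the probability of formula (12) as the standard linear-chain normalization of exponentiated path costs is an interpretation, since the formula image is not reproduced above:

```python
import numpy as np

def logsumexp(v, axis=None):
    m = np.max(v, axis=axis, keepdims=True)
    return np.squeeze(m + np.log(np.sum(np.exp(v - m), axis=axis,
                                        keepdims=True)), axis=axis)

def path_cost(Q, A, y):
    """Formula (11): per-position label costs Q_t[y_t] plus transition
    costs A[y_{t-1}, y_t] along the label path y."""
    s = Q[0, y[0]]
    for t in range(1, len(y)):
        s += A[y[t - 1], y[t]] + Q[t, y[t]]
    return s

def log_sum_over_paths(Q, A):
    """Normalizer of formula (12): obtained in linear time by dynamic
    programming instead of enumerating the exponentially many paths."""
    alpha = Q[0]
    for t in range(1, Q.shape[0]):
        alpha = Q[t] + logsumexp(alpha[:, None] + A, axis=0)
    return logsumexp(alpha)

def neg_log_p(h_f, h_b, W_f, W_b, b, A, y):
    """-log p of the correct path; gradient descent on this updates theta."""
    Q = h_f @ W_f + h_b @ W_b + b              # formula (10)
    return log_sum_over_paths(Q, A) - path_cost(Q, A, y)
```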
Further, the specific method by which the neural network parameters $\theta$ updated in steps 2 and 3 are used in step 2-4 and step 3-4 to carry out the path search over labelings and obtain the annotation results of the corpus is: arrange the costs of the input sequence $X$ into a matrix $C$, and compute the annotation results of the test portion of the annotated corpus from the matrix $C$ with the Viterbi algorithm.
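A minimal sketch of this path search with the Viterbi algorithm, with the cost matrix $C$ realized directly by the per-position costs $Q$ of formula (10) and the transition costs $A$:

```python
import numpy as np

def viterbi(Q, A):
    """Step 4 path search: the highest-cost label path under the per-position
    costs Q (T x L, from formula (10)) and transition costs A (L x L)."""
    T, L = Q.shape
    score = Q[0].copy()
    back = np.zeros((T, L), dtype=int)
    for t in range(1, T):
        cand = score[:, None] + A + Q[t]   # cand[i, j]: best path ending at i, then j
        back[t] = np.argmax(cand, axis=0)
        score = np.max(cand, axis=0)
    path = [int(np.argmax(score))]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]                      # annotation result as label indices

# Toy usage: 4 positions, 3 labels.
rng = np.random.default_rng(2)
labels = viterbi(rng.normal(size=(4, 3)), rng.normal(size=(3, 3)))
```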
Further, in step 5 the number of repetitions of steps 2, 3, and 4 reaches a maximum of 60 to 90.
Advantageous effects of the present invention:
The present invention, a deep-learning-based named entity recognition method for the medical field, relates to named entity recognition methods in the field of information extraction, and the related research promotes research on named entity recognition. The present invention aims to alleviate the lack of annotated corpora for entity recognition in the medical field, studying how the medical field can label better using a large annotated news-domain corpus and a small annotated medical-field corpus. By using deep learning methods, the present invention further mines the information contained in the corpora; at the same time, large-scale corpus information is introduced to keep the model from losing accuracy during testing when too many common open-domain words absent from training appear. Experimental results show that this deep-learning-based named entity recognition method for the medical field is better suited to named entity recognition in the medical field than traditional medical-field named entity recognition methods.
The present invention relates to named entity recognition methods, and more particularly to a deep-learning-based named entity recognition method for the medical field. The information extraction field to which the present invention belongs benefits from the research, which promotes research on named entity recognition.
The purpose of the present invention is to make full use of the existing annotated corpora for medical named entity recognition and to improve the performance of deep neural networks on the medical named entity recognition task. At the same time, to address the scarcity of annotated corpora for medical named entity recognition, a large unannotated corpus and a large news-domain corpus participate in model training, and a deep-learning-based named entity recognition method for the medical field is proposed.
The related research of the present invention improves the performance of medical named entity recognition; it not only provides evidence for related theories in informatics and linguistics but also promotes natural language understanding. To improve the performance of named entity recognition, the present invention makes full use of the small existing annotated corpus for medical named entity recognition, modeling with an LSTM deep neural network and adding the information of a large raw corpus through the pre-training technique of deep neural networks. Compared with traditional methods, this method requires no additional manual annotation of entity recognition corpora, reducing the consumption of manpower and material resources, while improving the performance of named entity recognition in the medical field.
The present invention places no requirement on the granularity of corpus preprocessing: annotation can be carried out either by word or by character, depending mainly on the training corpus used. Considering that many medical entity words seldom appear in the open domain, training at word granularity would require segmenting the pre-training corpus, which may bring some difficulties. To reduce the consumption of manpower and material resources to the greatest extent, processing by character is recommended.
Generally speaking, this method proposes a deep-learning-based named entity recognition method for the medical field.
A model was trained with a small medical corpus and used to label a large amount of text crawled from online medical question-answering websites; the high-frequency words in the annotation results of the two models were counted and compared, as in the following table:
Table 1: comparison of high-frequency words of the CRF model and the LSTM model on the online question-answering corpus
Bold entries in the table are annotation results clearly without medical meaning; it can be seen that the LSTM performs much better than the CRF model.
Description of the drawings
Fig. 1 is the flow chart of the deep-learning-based named entity recognition method for the medical field proposed in specific embodiment one; Fig. 2 is the computation flow chart of the LSTM proposed in specific embodiment one.
Specific embodiment
The present invention is further described below with reference to specific embodiments, but the present invention is not limited by these examples.
Specific embodiment one: with reference to Fig. 1, the deep-learning-based named entity recognition method for the medical field of this embodiment is specifically prepared according to the following steps:
Step 1: train word vectors $vec_i$ on the unannotated medical corpus, obtaining the vocabulary $voc$ of the supplementary medical-field corpus and the word vectors $vec$ corresponding to $voc$; where $vec = [vec_1, vec_2, \ldots, vec_n]$; $voc = [voc_1, voc_2, \ldots, voc_n]$; $i = 1, 2, \ldots, n$; and $n$ is the total number of distinct words in the unannotated corpus;
Step 2: train the long short-term memory network LSTM on the training portion of the annotated news-domain corpus. The word vectors $vec$ of step 1 serve as the pre-training vectors for the training of the LSTM; with the LSTM method, the optimization objective is computed from the pre-training vectors and $x_k$, $y_k$, and gradient descent on this objective updates the LSTM parameters $\theta_C$. The annotated corpus comprises a training portion and a test portion. The training finally yields the converged LSTM parameters, i.e., the values of the model parameters $\theta_C$ at the final, $n$-th iteration, specifically: $W_{x,in}$, $W_{h,in}$, $W_{c,in}$, $W_{x,o}$, $W_{h,o}$, $W_{c,o}$, $W_{x,f}$, $W_{h,f}$, $W_{c,f}$, $b_{in}$, $b_o$, and $b_f$, with the parameter meanings given in the summary above.
where $x_k$ is the word sequence of LSTM inputs corresponding to the training portion of the annotated corpus for the $k$-th sample, and $y_k$ is the annotation-result vector corresponding to the training portion of the annotated corpus for the $k$-th sample;
Step 3: train the long short-term memory network LSTM on the training portion of the annotated medical-domain corpus. The word vectors $vec$ obtained in step 1 serve as the pre-training vectors for the training of the LSTM; with the LSTM method, the optimization objective is computed from the pre-training vectors and $x_k$, $y_k$, and gradient descent on this objective updates the LSTM parameters $\theta$. The annotated corpus comprises a training portion and a test portion;
where $x_k$ is the word sequence of LSTM inputs corresponding to the training portion of the annotated corpus for the $k$-th sample, and $y_k$ is the annotation-result vector corresponding to the training portion of the annotated corpus for the $k$-th sample;
Step 4: test of the LSTM. Input the annotated corpus, carry out the path search over labelings with the neural network parameters $\theta_C$ updated in step 2, and obtain the annotation results of the annotated corpus; assess the annotation results on the test portion of the annotated corpus with the named entity recognition evaluation F value; the specific assessment is computed as follows:
precision = number of correctly labeled entity words / total number of labeled entity words
recall = number of correctly labeled entity words / total number of entity words
F value = 2 × precision × recall / (precision + recall) (14);
Step 5: repeat steps 2, 3, and 4 on the annotated corpora until the named entity recognition evaluation F value of step 4 no longer increases or the number of repetitions of steps 2 and 3 reaches a maximum of 50 to 100.
Effects of this embodiment:
This embodiment, a deep-learning-based named entity recognition method for the medical field, relates to named entity recognition methods in the field of information extraction, and the related research promotes research on named entity recognition. This embodiment aims to alleviate the lack of annotated corpora for entity recognition in the medical field, studying how the medical field can label better using a small annotated medical-domain corpus and a large annotated news-domain corpus. By using deep learning methods, this embodiment further mines the information contained in the corpora and learns large-scale linguistic features from the annotated news-domain corpus; at the same time, large-scale corpus information is introduced to keep the model from losing accuracy during testing when too many common open-domain words absent from training appear. Experimental results show that this deep-learning-based named entity recognition method for the medical field is better suited to named entity recognition in the medical field than traditional medical-field named entity recognition methods.
This embodiment relates to named entity recognition methods, and more particularly to a deep-learning-based named entity recognition method for the medical field. The information extraction field to which this embodiment belongs benefits from the research, which promotes research on named entity recognition.
The purpose of this embodiment is to make full use of the existing annotated corpora for medical named entity recognition and, with the help of a large annotated news-domain corpus, to improve the performance of deep neural networks on the medical named entity recognition task. At the same time, to address the scarcity of annotated corpora for medical named entity recognition, a large unannotated medical-field corpus participates in model training, and a deep-learning-based named entity recognition method for the medical field is proposed.
The related research of this embodiment improves the performance of medical named entity recognition; it not only provides evidence for related theories in informatics and linguistics but also promotes natural language understanding. To improve the performance of named entity recognition, this embodiment makes full use of the small existing annotated corpus for medical named entity recognition, modeling with an LSTM deep neural network, adding the information of a large unannotated corpus through the pre-training technique of deep neural networks, and incorporating the model parameters of the news domain into the LSTM deep neural network model of the medical field. Compared with traditional methods, this method requires no additional manual annotation of entity recognition corpora, reducing the consumption of manpower and material resources, while improving the performance of named entity recognition in the medical field.
This embodiment places no requirement on the granularity of corpus preprocessing: annotation can be carried out either by word or by character, depending mainly on the training corpus used. Considering that many medical entity words seldom appear in the open domain, training at word granularity would require segmenting the pre-training corpus, which may bring some difficulties. To reduce the consumption of manpower and material resources to the greatest extent, processing by character is recommended.
Generally speaking, this method proposes a deep-learning-based named entity recognition method for the medical field.
A model was trained with a small medical corpus and used to label a large amount of text crawled from online medical question-answering websites; the high-frequency words in the annotation results of the two models were counted and compared, as in the following table:
Table 2: comparison of high-frequency words of the CRF model and the LSTM model on the online question-answering corpus
Bold entries in the table are clearly meaningless annotation results; it can be seen that the LSTM performs much better than the CRF model.
Specific embodiment two: this embodiment differs from specific embodiment one in that:
Step 2-1: pre-train the vocabulary $voc$ and the word vectors $vec$ corresponding to $voc$; using $x_k$ and the word vectors $vec$ obtained in step 1, compute the input $X$ of the LSTM network, where the input $X$ of the LSTM network can be computed by two methods, namely: method one, which uses the word vectors $vec$ as the initial values of the LSTM model; and method two, which uses the word vectors $vec$ as the input of the LSTM network;
Step 2-2: using the input $X_t$, the hidden layer $h_{t-1}$ computed at step $t-1$, and the memory cell $c_{t-1}$ computed at step $t-1$, compute the input gate $in_t$ of the LSTM model at step $t$, the output gate $o_t$ of the LSTM model, and the forget gate $f_t$ of the LSTM model; from $in_t$, $o_t$, and $f_t$ compute the memory cell value $c_t$ and the hidden layer value $h_t$; where $X = X_1, X_2, \ldots, X_t, \ldots, X_T$;
Step 2-3: for the input sequence $X$, input the elements in order from $X_1$ to $X_T$ into formula (9) of step 2-2 to obtain the forward hidden-layer output $h_f$, and in order from $X_T$ to $X_1$ into formula (9) of step 2-2 to obtain the backward hidden-layer output $h_b$;
Step 2-4: compute the sequence cost of the hidden-layer results $h_f$ and $h_b$ obtained in step 2-3 with the transition-cost method over the entire sequence, obtaining the optimization objective; optimize it with gradient descent to update the LSTM parameters $\theta_C$, where $\theta_C$ comprises word_emb, $W_{x,in}$, $W_{h,in}$, $W_{c,in}$, $W_{x,o}$, $W_{h,o}$, $W_{c,o}$, $W_{x,f}$, $W_{h,f}$, $W_{c,f}$, $b_{in}$, $b_o$, and $b_f$. Other steps and parameters are the same as in specific embodiment one.
Specific embodiment three: this embodiment differs from specific embodiment one in that:
Step 3-1: pre-train the vocabulary $voc$ and the word vectors $vec$ corresponding to $voc$; using $x_k$ and the word vectors $vec$ obtained in step 1, compute the input $X$ of the LSTM network, where the input $X$ of the LSTM network can be computed by two methods, namely: method one, which uses the word vectors $vec$ as the initial values of the LSTM model; and method two, which uses the word vectors $vec$ as the input of the LSTM network;
Step 3-2: load the model parameters $\theta_n$ obtained by training the news-domain LSTM; on the basis of the parameters $\theta_n$, using the input $X_t$, the hidden layer $h_{t-1}$ computed at step $t-1$, and the memory cell $c_{t-1}$ computed at step $t-1$, compute the input gate $in_t$ of the LSTM model at step $t$, the output gate $o_t$ of the LSTM model, and the forget gate $f_t$ of the LSTM model; from $in_t$, $o_t$, and $f_t$ compute the memory cell value $c_t$ and the hidden layer value $h_t$; where $X = X_1, X_2, \ldots, X_t, \ldots, X_T$;
Step 3-3: for the input sequence $X$, input the elements in order from $X_1$ to $X_T$ into formula (9) of step 3-2 to obtain the forward hidden-layer output $h_f$, and in order from $X_T$ to $X_1$ into formula (9) of step 3-2 to obtain the backward hidden-layer output $h_b$;
Step 3-4: compute the sequence cost of the hidden-layer results $h_f$ and $h_b$ obtained in step 3-3 with the transition-cost method over the entire sequence, obtaining the optimization objective; optimize it with gradient descent to update the LSTM parameters $\theta$, where $\theta$ comprises word_emb, $W_{x,in}$, $W_{h,in}$, $W_{c,in}$, $W_{x,o}$, $W_{h,o}$, $W_{c,o}$, $W_{x,f}$, $W_{h,f}$, $W_{c,f}$, $b_{in}$, $b_o$, and $b_f$. Other steps and parameters are the same as in specific embodiment one.
Specific embodiment four: this embodiment differs from one of specific embodiments two to three in that the specific process of computing the input $X$ of the LSTM network with method one in step 2-1 and step 3-1 is:
Build the vocabulary $voc'$ of the training portion of the annotated corpus, and merge $voc'$ and $voc$ into the vocabulary $VOC$; $VOC = VOC_1, VOC_2, VOC_3, \ldots, VOC_N$;
Randomly initialize the vector matrix word_emb corresponding to the vocabulary $VOC$, such that the dimension of word_emb is identical to that of the word vectors $vec$, and carry out the assignment of formula (1):
where word_emb$_i$ is the $i$-th word vector in word_emb;
Finally, multiply $x_{k[k1,k2]}$ by word_emb to obtain the input $X$ of the LSTM network:
$X = x_{k[k1,k2]} \cdot \mathrm{word\_emb}$ (2)
where $x_{k[k1,k2]}$ is the subsequence of the word sequence $x_k$ between positions $k1$ and $k2$. Other steps and parameters are the same as in one of specific embodiments two to three.
Specific embodiment five: this embodiment differs from one of specific embodiments two to three in that:
the specific process of computing the input $X$ of the LSTM network with method two in step 2-1 and step 3-1 is:
Randomly initialize the vector matrix word_emb corresponding to the vocabulary $VOC$ and, after carrying out the assignment of formula (1), keep the vectors word_emb$_i$ unchanged, i.e., they are not updated as parameters; then randomly initialize a further vector matrix word_emb_para corresponding to the vocabulary $VOC$, and compute the input $X$ of the LSTM network according to formula (3).
With the word_emb parameters fixed, word_emb_para is updated entirely according to the standard parameter update. Other steps and parameters are the same as in one of specific embodiments two to three.
Specific embodiment six: this embodiment differs from one of specific embodiments two to three in that the input gate $in_t$ of the LSTM model (or memory cell) computed at step $t$ in step 2-2 and step 3-2 is specifically:
$in_t = \sigma(W_{x,in} X_t + W_{h,in} h_{t-1} + W_{c,in} c_{t-1} + b_{in})$ (4)
where $\sigma$ is the sigmoid function; $W_{x,in}$ is the input-gate parameter matrix multiplied with $X_t$; $W_{h,in}$ the input-gate parameter matrix multiplied with $h_{t-1}$; $W_{c,in}$ the input-gate parameter matrix multiplied with $c_{t-1}$; and $b_{in}$ the bias for computing the input gate. Other steps and parameters are the same as in one of specific embodiments two to three.
Specific embodiment seven: this embodiment differs from one of specific embodiments two to three in that the specific process of computing the output gate $o_t$ of the LSTM model (or memory cell) at step $t$ in step 2-2 and step 3-2 is:
$o_t = \sigma(W_{x,o} X_t + W_{h,o} h_{t-1} + W_{c,o} c_{t-1} + b_o)$ (5)
where $W_{x,o}$ is the output-gate parameter matrix multiplied with $X_t$; $W_{h,o}$ the output-gate parameter matrix multiplied with $h_{t-1}$; $W_{c,o}$ the output-gate parameter matrix multiplied with $c_{t-1}$; and $b_o$ the bias for computing the output gate. Other steps and parameters are the same as in one of specific embodiments two to three.
Specific embodiment eight: this embodiment differs from one of specific embodiments two to three in that the specific process of computing the forget gate $f_t$ of the LSTM model (or memory cell) at step $t$ in step 2-2 and step 3-2 is:
$f_t = \sigma(W_{x,f} X_t + W_{h,f} h_{t-1} + W_{c,f} c_{t-1} + b_f)$ (6)
where $W_{x,f}$ is the forget-gate parameter matrix multiplied with $X_t$; $W_{h,f}$ the forget-gate parameter matrix multiplied with $h_{t-1}$; $W_{c,f}$ the forget-gate parameter matrix multiplied with $c_{t-1}$; and $b_f$ the bias for computing the forget gate. Other steps and parameters are the same as in one of specific embodiments two to three.
Specific embodiment nine: this embodiment differs from one of specific embodiments two to three in that computing the memory cell value $c_t$ and the hidden layer value $h_t$ from $in_t$, $o_t$, and $f_t$ in step 2-2 and step 3-2 is specifically:
(1) first compute the ungated memory cell value at step $t$, $\tilde{c}_t = \tanh(W_{x,c} X_t + W_{h,c} h_{t-1} + b_c)$ (7);
where $W_{x,c}$ is the memory-cell parameter matrix multiplied with $X_t$; $W_{h,c}$ the memory-cell parameter matrix multiplied with $h_{t-1}$; and $b_c$ the memory-cell bias;
(2) from the input gate value $in_t$ computed according to (4) and the forget gate value $f_t$ computed according to (6), the ungated memory cell value $\tilde{c}_t$, and $c_{t-1}$, compute the memory cell value at step $t$, $c_t = f_t \circ c_{t-1} + in_t \circ \tilde{c}_t$ (8);
Finally, from the memory cell value $c_t$ and the output gate $o_t$ computed by formula (5), compute the hidden layer value $h_t$:
$h_t = o_t \circ \tanh(c_t)$ (9).
Other steps and parameters are the same as in one of specific embodiments one to six.
Specific embodiment ten: this embodiment differs from one of specific embodiments two to three in that the detailed process in step 2-4 and step 3-4 of computing the sequence cost of the hidden-layer results $h_f$ and $h_b$ obtained in step 2-3 and step 3-3 with the transition-cost method over the entire sequence, obtaining the optimization objective, and optimizing it with gradient descent to update the LSTM parameters $\theta$ is:
(1) first, using the hidden layers $h_f$ and $h_b$, compute the cost $Q_t$ of labeling the sequence $x_k$ with each label:
$Q_t = h_f(t) \cdot W_f + h_b(t) \cdot W_b + b$ (10)
where $W_f$ is the parameter matrix multiplied with $h_f(t)$; $W_b$ the parameter matrix multiplied with $h_b(t)$; and $b$ the final output bias;
(2) the transition-cost matrix $A$ describes the cost of label transitions, where the transition cost $A_{i,j}$ represents the cost of moving from label $i$ to label $j$; the whole cost of the input sequence $X$, i.e., the optimization objective, is then $\mathrm{cost}(X, y) = \sum_{t=1}^{T} Q_t[y_t] + \sum_{t=2}^{T} A_{y_{t-1}, y_t}$ (11);
(3) using maximum likelihood estimation, compute the probability $p$ of the correct path to be maximized (formula (12)),
where $\mathrm{cost}_{right}$ is the cost of the correct path;
although the number of paths grows exponentially, the sum over all paths in formula (12) need not be computed by traversing every path; it can be obtained in linear time with a dynamic programming algorithm;
(4) with the gradient descent algorithm, update the neural network parameters $\theta$ according to the probability $p$ of the correct path, where the updated $\theta$ comprises, as neural network parameters, all the variables mentioned in steps 2-1 and 2-2; the sequence cost must be computed to obtain the optimization objective of the system. Other steps and parameters are the same as in one of specific embodiments two to three.
Specific embodiment eleven: this embodiment differs from one of specific embodiments two or three in that the specific method by which the neural network parameters updated in steps 2 and 3 are used in step 2-4 and step 3-4 to carry out the path search over labelings and obtain the annotation results of the corpus is:
arrange the costs of the input sequence $X$ into a matrix $C$, and compute the annotation results of the test portion of the annotated corpus from the matrix $C$ with the Viterbi algorithm. Other steps and parameters are the same as in one of specific embodiments two or three.
Specific embodiment twelve: this embodiment differs from one of specific embodiments two or three in that in step 5 the number of repetitions of steps 2, 3, and 4 reaches a maximum of 60 to 90. Other steps and parameters are the same as in one of specific embodiments two or three.
Although the present invention has been disclosed above with preferred embodiments, they are not intended to limit the present invention. Anyone familiar with this technology may make various changes and modifications without departing from the spirit and scope of the present invention; the protection scope of the present invention shall therefore be subject to what the claims define.

Claims (10)

1. A deep-learning-based named entity recognition method for the medical field, characterized in that the specific steps of the method are as follows:
Step 1: train word vectors $vec_i$ on the unannotated medical corpus, obtaining the vocabulary $voc$ of the supplementary medical-field corpus and the word vectors $vec$ corresponding to $voc$; $vec = [vec_1, vec_2, \ldots, vec_n]$; $voc = [voc_1, voc_2, \ldots, voc_n]$; where $i = 1, 2, \ldots, n$ and $n$ is the total number of distinct words in the unannotated corpus;
Step 2: train the long short-term memory network LSTM on the training portion of the annotated news-domain corpus. The word vectors $vec$ of step 1 serve as the pre-training vectors for the training of the LSTM; with the LSTM method, the optimization objective is computed from the pre-training vectors and $x_k$, $y_k$, and gradient descent on this objective updates the LSTM parameters $\theta_C$. The annotated corpus comprises a training portion and a test portion. The training finally yields the converged LSTM parameters, i.e., the values of the model parameters $\theta_C$ at the final, $n$-th iteration, specifically: $W_{x,in}$, $W_{h,in}$, $W_{c,in}$, $W_{x,o}$, $W_{h,o}$, $W_{c,o}$, $W_{x,f}$, $W_{h,f}$, $W_{c,f}$, $b_{in}$, $b_o$, and $b_f$, where $W_{x,in}$ is the input weight of the hidden-layer input gate; $W_{h,in}$ the state-input weight of the hidden-layer input gate; $W_{c,in}$ the memory-cell input weight of the hidden layer; $W_{x,o}$ the input weight of the hidden-layer output gate; $W_{h,o}$ the state-input weight of the hidden-layer output gate; $W_{c,o}$ the memory-cell output weight of the hidden layer; $W_{x,f}$ the input weight of the hidden-layer forget gate; $W_{h,f}$ the state-input weight of the hidden-layer forget gate; $W_{c,f}$ the memory-cell input weight of the hidden-layer forget gate; $b_{in}$ the hidden-layer input-gate bias; $b_o$ the hidden-layer output-gate bias; and $b_f$ the hidden-layer forget-gate bias;
where $x_k$ is the word sequence of LSTM inputs corresponding to the training portion of the annotated corpus for the $k$-th sample, and $y_k$ is the annotation-result vector corresponding to the training portion of the annotated corpus for the $k$-th sample;
Step 3: train the long short-term memory network LSTM on the training portion of the annotated medical-domain corpus. The word vectors $vec$ obtained in step 1 serve as the pre-training vectors for the training of the LSTM; with the LSTM method, the optimization objective is computed from the pre-training vectors and $x_k$, $y_k$, and gradient descent on this objective updates the LSTM parameters $\theta$. The annotated corpus comprises a training portion and a test portion;
where $x_k$ is the word sequence of LSTM inputs corresponding to the training portion of the annotated corpus for the $k$-th sample, and $y_k$ is the annotation-result vector corresponding to the training portion of the annotated corpus for the $k$-th sample;
Step 4: test the parameter-updated LSTM. The test process is: input the annotated corpora of steps 2 and 3, carry out the path search over labelings with the neural network parameters $\theta_C$ updated in step 2, and obtain the annotation results of the annotated corpus; assess the annotation results on the test portion of the annotated corpus with the named entity recognition evaluation F value, obtaining annotation results that meet expectations after assessment. The specific assessment is computed as follows:
precision = number of correctly labeled entity words / total number of labeled entity words
recall = number of correctly labeled entity words / total number of entity words
F value = 2 × precision × recall / (precision + recall)
Step 5: repeat steps 2 to 4 on the annotated corpora until the named entity recognition evaluation F value of step 4 no longer increases or the number of repetitions of steps 2 to 4 reaches a maximum of 50 to 100.
2. The named entity recognition method according to claim 1, characterized in that the update of the LSTM parameters $\theta_C$ in step 2 proceeds as follows:
Step 2-1: pre-train the vocabulary $voc$ and the word vectors $vec$ corresponding to $voc$; using $x_k$ and the word vectors $vec$ obtained in step 1, compute the input sequence $X$ of the LSTM network, where $X = X_1, X_2, \ldots, X_t, \ldots, X_T$;
Step 2-2: using the input $X_t$, the hidden layer $h_{t-1}$ computed at step $t-1$, and the memory cell $c_{t-1}$ computed at step $t-1$, compute the input gate $in_t$ of the LSTM model at step $t$, the output gate $o_t$ of the LSTM model, and the forget gate $f_t$ of the LSTM model; from $in_t$, $o_t$, and $f_t$ compute the memory cell value $c_t$ and the hidden layer value $h_t$, where the specific model of the hidden layer value $h_t$ is $h_t = o_t \circ \tanh(c_t)$;
Step 2-3: input the elements of the sequence $X = X_1, X_2, \ldots, X_t, \ldots, X_T$ of step 2-1 in order from $X_1$ to $X_T$ into the model of the hidden layer value $h_t$ of step 2-2 to obtain the forward hidden-layer output $h_f$; then input the elements of the same sequence in order from $X_T$ to $X_1$ to obtain the backward hidden-layer output $h_b$;
Step 2-4: compute the sequence cost of the hidden-layer results $h_f$ and $h_b$ obtained in step 2-3 with the transition-cost method over the entire sequence, obtaining the optimization objective; optimize it with gradient descent to update the LSTM parameters $\theta_C$, where $\theta_C$ comprises word_emb, $W_{x,in}$, $W_{h,in}$, $W_{c,in}$, $W_{x,o}$, $W_{h,o}$, $W_{c,o}$, $W_{x,f}$, $W_{h,f}$, $W_{c,f}$, $b_{in}$, $b_o$, and $b_f$, and word_emb is the pre-trained word-vector weight parameter.
3. The named entity recognition method according to claim 1, characterized in that the update of the LSTM parameters $\theta$ in step 3 proceeds as follows:
Step 3-1: pre-train the vocabulary $voc$ and the word vectors $vec$ corresponding to $voc$; using $x_k$ and the word vectors $vec$ obtained in step 1, compute the input sequence $X$ of the LSTM network, where $X = X_1, X_2, \ldots, X_t, \ldots, X_T$;
Step 3-2: load the model parameters $\theta_n$ obtained by training the news-domain LSTM; on the basis of the parameters $\theta_n$, using the input $X_t$, the hidden layer $h_{t-1}$ computed at step $t-1$, and the memory cell $c_{t-1}$ computed at step $t-1$, compute the input gate $in_t$ of the LSTM model at step $t$, the output gate $o_t$ of the LSTM model, and the forget gate $f_t$ of the LSTM model; from $in_t$, $o_t$, and $f_t$ compute the memory cell value $c_t$ and the hidden layer value $h_t$, where the specific model of the hidden layer value $h_t$ is $h_t = o_t \circ \tanh(c_t)$;
Step 3-3: input the elements of the sequence $X = X_1, X_2, \ldots, X_t, \ldots, X_T$ of step 3-1 in order from $X_1$ to $X_T$ into the model of the hidden layer value $h_t$ of step 3-2 to obtain the forward hidden-layer output $h_f$; then input the elements of the same sequence in order from $X_T$ to $X_1$ to obtain the backward hidden-layer output $h_b$;
Step 3-4: compute the sequence cost of the hidden-layer results $h_f$ and $h_b$ obtained in step 3-3 with the transition-cost method over the entire sequence, obtaining the optimization objective; optimize it with gradient descent to update the LSTM parameters $\theta$, where $\theta$ comprises word_emb, $W_{x,in}$, $W_{h,in}$, $W_{c,in}$, $W_{x,o}$, $W_{h,o}$, $W_{c,o}$, $W_{x,f}$, $W_{h,f}$, $W_{c,f}$, $b_{in}$, $b_o$, and $b_f$.
4. The named entity recognition method according to claim 2 or 3, characterized in that the specific process of obtaining the input $X$ of the LSTM network in step 2-1 and step 3-1 is:
Build the vocabulary $voc'$ of the training portion of the annotated corpus, and merge $voc'$ and $voc$ into the vocabulary $VOC$;
$VOC = VOC_1, VOC_2, VOC_3, \ldots, VOC_N$;
Randomly initialize the vector matrix word_emb corresponding to the vocabulary $VOC$, such that the dimension of word_emb is identical to that of the word vectors $vec$, and carry out the assignment of formula (1):
where word_emb$_i$ is the $i$-th word vector in word_emb;
Finally, multiply $x_{k[k1,k2]}$ by word_emb to obtain the input $X$ of the LSTM network:
$X = x_{k[k1,k2]} \cdot \mathrm{word\_emb}$ (2)
where $x_{k[k1,k2]}$ is the subsequence of the word sequence $x_k$ between positions $k1$ and $k2$.
5. The named entity recognition method according to claim 2 or 3, characterized in that the specific process of obtaining the input $X$ of the LSTM network in step 2-1 and step 3-1 is:
Randomly initialize the vector matrix word_emb corresponding to the vocabulary $VOC$ and, after carrying out the assignment of formula (1), keep the vectors word_emb$_i$ unchanged, i.e., they are not updated as parameters;
then randomly initialize a further vector matrix word_emb_para corresponding to the vocabulary $VOC$, and compute the input $X$ of the LSTM network according to the model of formula (3).
6. The named entity recognition method according to claim 2 or 3, characterized in that the input gate $in_t$ of the LSTM model computed at step $t$ in step 2-2 and step 3-2 is obtained according to model (4), which is as follows:
$in_t = \sigma(W_{x,in} X_t + W_{h,in} h_{t-1} + W_{c,in} c_{t-1} + b_{in})$ (4)
where $\sigma$ is the sigmoid function; $W_{x,in}$ is the input-gate parameter matrix multiplied with $X_t$; $W_{h,in}$ the input-gate parameter matrix multiplied with $h_{t-1}$; $W_{c,in}$ the input-gate parameter matrix multiplied with $c_{t-1}$; and $b_{in}$ the bias for computing the input gate.
7. entity recognition method is named according to Claims 2 or 3, which is characterized in that rapid 22 with step 3 two described in The out gate o of the t times calculating LSTM modeltIt is to be obtained according to model (5), model (5) is as follows:
o_t = σ(W_X_o X_t + W_h_o h_{t-1} + W_c_o c_{t-1} + b_o) (5)
where W_X_o is the output gate parameter matrix multiplied with X_t; W_h_o is the output gate parameter matrix multiplied with h_{t-1}; W_c_o is the output gate parameter matrix multiplied with c_{t-1}; and b_o is the bias for computing the output gate.
8. The named entity recognition method according to claim 2 or 3, wherein the forget gate f_t of the LSTM model at the t-th computation described in step 2-2 and step 3-2 is obtained according to model (6), which is as follows:
f_t = σ(W_X_f X_t + W_h_f h_{t-1} + W_c_f c_{t-1} + b_f) (6)
where W_X_f is the forget gate parameter matrix multiplied with X_t; W_h_f is the forget gate parameter matrix multiplied with h_{t-1}; W_c_f is the forget gate parameter matrix multiplied with c_{t-1}; and b_f is the bias for computing the forget gate.
9. The named entity recognition method according to claim 2 or 3, wherein the specific process in step 2-2 and step 3-2 of computing the memory cell value c_t and the hidden layer value h_t from in_t, o_t, and f_t is:
step 1: first compute the candidate memory cell value of the t-th computation, i.e., the value before any gate is applied:
c̃_t = tanh(W_X_c X_t + W_h_c h_{t-1} + b_c) (7)
where W_X_c is the memory cell parameter matrix multiplied with X_t; W_h_c is the memory cell parameter matrix multiplied with h_{t-1}; and b_c is the memory cell bias;
step 2: from the input gate value in_t computed by model (4), the forget gate value f_t computed by model (6), the ungated candidate value c̃_t, and c_{t-1}, compute the memory cell value c_t of the t-th computation:
c_t = f_t ⊙ c_{t-1} + in_t ⊙ c̃_t (8)
Finally, compute the hidden layer value h_t from the memory cell value c_t and the output gate o_t obtained by formula (5); the concrete model of h_t is as follows:
h_t = o_t ⊙ tanh(c_t) (9).
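Putting models (4) through (9) together, one step of the memory cell computation can be sketched in Python as follows; this is a sketch under the shapes implied by the claims, not a definitive implementation, with σ realized as the logistic sigmoid and * as element-wise multiplication:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(X_t, h_prev, c_prev, p):
    """One application of models (4)-(9).

    p is a dict holding the parameter matrices and biases named in the
    claims: W_X_in, W_h_in, W_c_in, b_in; W_X_o, W_h_o, W_c_o, b_o;
    W_X_f, W_h_f, W_c_f, b_f; W_X_c, W_h_c, b_c.
    """
    # Model (4): input gate.
    in_t = sigmoid(p["W_X_in"] @ X_t + p["W_h_in"] @ h_prev
                   + p["W_c_in"] @ c_prev + p["b_in"])
    # Model (6): forget gate.
    f_t = sigmoid(p["W_X_f"] @ X_t + p["W_h_f"] @ h_prev
                  + p["W_c_f"] @ c_prev + p["b_f"])
    # Model (7): candidate memory cell value, before gating.
    c_tilde = np.tanh(p["W_X_c"] @ X_t + p["W_h_c"] @ h_prev + p["b_c"])
    # Model (8): gated memory cell update.
    c_t = f_t * c_prev + in_t * c_tilde
    # Model (5): output gate (uses c_{t-1}, as the claim specifies).
    o_t = sigmoid(p["W_X_o"] @ X_t + p["W_h_o"] @ h_prev
                  + p["W_c_o"] @ c_prev + p["b_o"])
    # Model (9): hidden layer value.
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t
```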
10. The named entity recognition method according to claim 2 or 3, wherein the specific process in step 2-4 and step 3-4 of computing the sequence cost from the hidden layer results h_f and h_b obtained in step 2-3 and step 3-3 using the whole-sequence cost computation with transfer costs, obtaining the optimization objective, and optimizing it with the gradient descent algorithm to update the LSTM parameters is:
First step: use the hidden layers h_f and h_b to compute the cost Q_t of labeling the sequence x_k with each label:
Q_t = h_f(t) · W_f + h_b(t) · W_b + b (10)
where W_f is the parameter matrix multiplied with h_f(t); W_b is the parameter matrix multiplied with h_b(t); and b is the final output bias;
Second step: describe the cost of label transitions with the transfer cost matrix A, where the transfer cost A_{i,j} denotes the transfer cost from label i to label j; the whole cost of the input sequence X along a label path y_1, …, y_T, which is the optimization objective, is then the sum of the per-position costs and the transfer costs along that path:
cost(X, y) = Σ_{t=1..T} (A_{y_{t-1}, y_t} + Q_t(y_t));
Third step: using maximum likelihood estimation, compute the probability p of the correct path, which is to be maximized:
p = exp(cost_right) / Σ_{y′} exp(cost(X, y′)), with the sum running over all possible label paths y′,
where cost_right is the cost of the correct path;
Fourth step: update the neural network parameters θ by maximizing the probability p of the correct path with the gradient descent algorithm.
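The path-cost and path-probability formulas above are reconstructions, since the original expressions are not reproduced in this text; the Python sketch below assumes that standard formulation, scoring a path as the sum of per-position costs Q_t and transfer costs A[i, j], and normalizing over all label paths with a forward (log-sum-exp) recursion:

```python
import numpy as np

def path_cost(Q, A, labels):
    """Whole cost of one label path: per-position costs Q[t, label]
    plus the transfer costs A[i, j] between consecutive labels."""
    cost = Q[0, labels[0]]
    for t in range(1, len(labels)):
        cost += A[labels[t - 1], labels[t]] + Q[t, labels[t]]
    return cost

def correct_path_log_prob(Q, A, gold):
    """log p of the correct path `gold` under a softmax over all label
    paths; Q has shape (T, L), A has shape (L, L)."""
    T, L = Q.shape
    alpha = Q[0]  # log-scores of length-1 prefixes ending in each label
    for t in range(1, T):
        # alpha_new[j] = logsumexp_i(alpha[i] + A[i, j]) + Q[t, j]
        alpha = np.logaddexp.reduce(alpha[:, None] + A, axis=0) + Q[t]
    log_Z = np.logaddexp.reduce(alpha)  # log of the sum over all paths
    return path_cost(Q, A, gold) - log_Z
```

Gradient descent on the negative of this log-probability (for example via automatic differentiation) then updates θ as in the fourth step.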
CN201711446980.8A 2017-12-27 2017-12-27 A kind of name entity recognition method based on deep learning towards medical field Pending CN108170675A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711446980.8A CN108170675A (en) 2017-12-27 2017-12-27 A kind of name entity recognition method based on deep learning towards medical field

Publications (1)

Publication Number Publication Date
CN108170675A true CN108170675A (en) 2018-06-15

Family

ID=62518135

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711446980.8A Pending CN108170675A (en) 2017-12-27 2017-12-27 A kind of name entity recognition method based on deep learning towards medical field

Country Status (1)

Country Link
CN (1) CN108170675A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170024645A1 (en) * 2015-06-01 2017-01-26 Salesforce.Com, Inc. Dynamic Memory Network
CN106202054A (en) * 2016-07-25 2016-12-07 哈尔滨工业大学 A kind of name entity recognition method learnt based on the degree of depth towards medical field
CN106569998A (en) * 2016-10-27 2017-04-19 浙江大学 Text named entity recognition method based on Bi-LSTM, CNN and CRF
CN106557462A (en) * 2016-11-02 2017-04-05 数库(上海)科技有限公司 Name entity recognition method and system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Li Jianfeng: "Research on Chinese Named Entity Recognition Incorporating External Knowledge and Its Application in the Medical Field", China Master's Theses Full-text Database, Information Science and Technology Series *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109002436A (en) * 2018-07-12 2018-12-14 上海金仕达卫宁软件科技有限公司 Medical text terms automatic identifying method and system based on shot and long term memory network
CN109325225A (en) * 2018-08-28 2019-02-12 昆明理工大学 It is a kind of general based on associated part-of-speech tagging method
CN109325225B (en) * 2018-08-28 2022-04-12 昆明理工大学 Universal relevance-based part-of-speech tagging method
CN109284400A (en) * 2018-11-28 2019-01-29 电子科技大学 A kind of name entity recognition method based on Lattice LSTM and language model
CN109284400B (en) * 2018-11-28 2020-10-23 电子科技大学 Named entity identification method based on Lattice LSTM and language model
CN110598206A (en) * 2019-08-13 2019-12-20 平安国际智慧城市科技股份有限公司 Text semantic recognition method and device, computer equipment and storage medium
CN111444720A (en) * 2020-03-30 2020-07-24 华南理工大学 Named entity recognition method for English text
US20220067486A1 (en) * 2020-09-02 2022-03-03 Sap Se Collaborative learning of question generation and question answering

Similar Documents

Publication Publication Date Title
CN106202054B (en) A kind of name entity recognition method towards medical field based on deep learning
CN108170675A (en) A kind of name entity recognition method based on deep learning towards medical field
CN109948165B (en) Fine granularity emotion polarity prediction method based on mixed attention network
CN107239446B (en) A kind of intelligence relationship extracting method based on neural network Yu attention mechanism
CN108874782B (en) A kind of more wheel dialogue management methods of level attention LSTM and knowledge mapping
CN107168945B (en) Bidirectional cyclic neural network fine-grained opinion mining method integrating multiple features
CN106156003B (en) A kind of question sentence understanding method in question answering system
CN106886543B (en) Knowledge graph representation learning method and system combined with entity description
CN105894088B (en) Based on deep learning and distributed semantic feature medical information extraction system and method
CN108804654A (en) A kind of collaborative virtual learning environment construction method based on intelligent answer
CN111414461B (en) Intelligent question-answering method and system fusing knowledge base and user modeling
CN107562792A (en) A kind of question and answer matching process based on deep learning
CN108229582A (en) Entity recognition dual training method is named in a kind of multitask towards medical domain
CN109285562A (en) Speech-emotion recognition method based on attention mechanism
CN109977234A (en) A kind of knowledge mapping complementing method based on subject key words filtering
CN110232122A (en) A kind of Chinese Question Classification method based on text error correction and neural network
CN108197294A (en) A kind of text automatic generation method based on deep learning
CN108804677A (en) In conjunction with the deep learning question classification method and system of multi-layer attention mechanism
CN107644062A (en) The knowledge content Weight Analysis System and method of a kind of knowledge based collection of illustrative plates
CN111428481A (en) Entity relation extraction method based on deep learning
CN112364623A (en) Bi-LSTM-CRF-based three-in-one word notation Chinese lexical analysis method
CN114398976A (en) Machine reading understanding method based on BERT and gate control type attention enhancement network
He et al. Analysis of the communication method of national traditional sports culture based on deep learning
Etchells et al. Learning what is important: feature selection and rule extraction in a virtual course.
CN111400445B (en) Case complex distribution method based on similar text

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
AD01 Patent right deemed abandoned

Effective date of abandoning: 20210924