CN107967251A - A named entity recognition method based on Bi-LSTM-CNN - Google Patents

A named entity recognition method based on Bi-LSTM-CNN

Info

Publication number
CN107967251A
CN107967251A
Authority
CN
China
Prior art keywords
data
character
lstm
label
layers
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN201710946531.3A
Other languages
Chinese (zh)
Inventor
唐华阳
岳永鹏
刘林峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Know Future Information Technology Co ltd
Original Assignee
Beijing Know Future Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Know Future Information Technology Co ltd
Priority to CN201710946531.3A
Publication of CN107967251A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Machine Translation (AREA)

Abstract

The present invention relates to a named entity recognition method based on Bi-LSTM-CNN. In the training stage, the method converts labeled training corpus data into character-level corpus data and then trains a deep learning model based on Bi-LSTM-CNN; in the prediction stage, it converts unlabeled test corpus data into character-level corpus data and then predicts with the deep learning model trained in the training stage. By using character-level rather than word-level vectors, the present invention is immune to the influence of word segmentation precision and also avoids the unregistered word (out-of-vocabulary) problem; in addition, the combined model of the bidirectional long short-term memory neural network Bi-LSTM and the convolutional neural network CNN significantly improves precision compared with traditional algorithms.

Description

A named entity recognition method based on Bi-LSTM-CNN
Technical field
The invention belongs to the field of information technology, and in particular relates to a named entity recognition method based on Bi-LSTM-CNN.
Background technology
Named entity recognition (NER) refers to the process of identifying, in a given data set, the substantive nouns of specified categories that carry particular meaning. Practical scenarios for named entity recognition include:
Scene 1: event detection. Place, time and person are the basic components of an event. When constructing an event summary, the related persons, places, organizations and so on can be highlighted, and in an event search system they can serve as index keywords. The relations between the components of an event describe the event in more detail at the semantic level.
Scene 2: information retrieval. Named entities can be used to improve the effectiveness of search systems. For example, when a user inputs "重大" (an abbreviation of 重庆大学, Chongqing University), it can be inferred that the user more likely intends to retrieve "Chongqing University" than the corresponding adjective sense ("major"). Moreover, when building an inverted index, cutting a named entity into several words reduces retrieval efficiency. Search engines are also evolving toward semantic understanding and computing answers directly.
Scene 3: semantic networks. A semantic network generally contains concepts and instances and the relations between them. For example, "country" is a concept, "China" is an instance, and an instance-of relation holds between the entity "China" and the concept "country". A large proportion of the instances in a semantic network are named entities.
Scene 4: machine translation. The translation of named entities often follows special rules. For example, when Chinese person names are translated into English they are rendered in pinyin, following the rule that the surname comes first and the given name after, whereas ordinary words are translated into the corresponding English words. Accurately recognizing the named entities in a text is therefore important for improving the quality of machine translation.
Scene 5: question answering systems. Accurately identifying the parts of a question, its related domain and the related concepts is especially important. At present, most question answering systems can only search for answers rather than compute them: keywords are matched and the user extracts the answer manually from the search results, whereas the friendlier way is to compute the answer and present it to the user. Some questions require considering the relations between entities; for example, for "the 45th President of the United States", current search engines can return the answer "Donald Trump" in a specific format.
Traditional named entity recognition methods can be divided into dictionary-based methods, methods based on word-frequency statistics, and methods based on artificial neural network models. A dictionary-based method works by collecting as many entity words of different categories as possible into a dictionary; at recognition time the text is matched against the words in the dictionary, and the matched spans are labeled with the corresponding entity categories. Statistics-based methods, such as CRF (conditional random fields), learn the semantic information of the surrounding words and then make a classification decision.
Dictionary-based named entity recognition depends heavily on the dictionary and cannot recognize unregistered words. Statistics-based methods such as HMM (hidden Markov models) and CRF (conditional random fields) can only associate the semantics of the word immediately preceding the current word, so their recognition precision is not high enough, and the recognition rate for unregistered words in particular is low. Methods based on artificial neural network models suffer from the vanishing gradient problem during training, so in practice the number of network layers is small, and the final advantage in named entity recognition results is not obvious.
Summary of the invention
In view of the above problems, the present invention provides a named entity recognition method based on Bi-LSTM-CNN that can effectively improve the precision of named entity recognition. Here Bi-LSTM stands for Bi-directional Long Short-Term Memory, i.e. a bidirectional long short-term memory neural network, and CNN stands for Convolutional Neural Network.
In the present invention, a registered word is a word that already exists in the vocabulary, and an unregistered word is a word that does not appear in the vocabulary.
The technical solution adopted by the present invention is as follows:
A named entity recognition method based on Bi-LSTM-CNN comprises the following steps:
1) converting the original corpus data OrgData into character-level corpus data NewData;
2) counting the characters in NewData to obtain a character set CharSet, and numbering each character to obtain the character number set CharID corresponding to CharSet; counting the labels of the characters in NewData to obtain a label set LabelSet, and numbering each label to obtain the label number set LabelID corresponding to LabelSet;
3) grouping the sentences of NewData by sentence length to obtain a data set GroupData containing n groups of sentences;
4) randomly drawing, without replacement, BatchSize data items w and the corresponding labels y from one group of GroupData, converting the drawn data w into fixed-length data BatchData through CharID, and converting the corresponding labels into fixed-length labels y_ID through LabelID;
5) feeding the data BatchData and the labels y_ID into the deep learning model based on Bi-LSTM-CNN and training the parameters of the model; when the loss value produced by the deep learning model satisfies a set condition or the maximum number of iterations N is reached, terminating the training of the model; otherwise returning to step 4) to generate new data and continue training the model;
6) converting the data to be predicted, PreData, into data PreMData matching the deep learning model, feeding it into the trained model, and obtaining the named entity recognition result OrgResult.
Further, step 1) marks each character using the BMESO tagging scheme: if the label of a word is Label, the first character of the word is marked Label_B, the characters in the middle of the word are marked Label_M, the character at the end of the word is marked Label_E, a single-character word is marked Label_S, and a character is marked o if the word has no label or does not belong to an entity tag.
Further, in step 3), if l_i denotes the sentence length of the i-th sentence, then sentences with |l_i - l_j| < δ are put into the same group, where δ denotes the sentence length interval.
Further, step 4) includes:
4-1) converting the drawn data w into numbers, i.e. converting each character in w into the corresponding number through the correspondence between CharSet and CharID;
4-2) converting the labels y corresponding to the drawn data w into numbers, i.e. converting each character label in y into the corresponding number through the correspondence between LabelSet and LabelID;
4-3) assuming the fixed length is maxLen: when the sentence length l of a drawn data item satisfies l < maxLen, padding maxLen - l zeros after the sentence to obtain BatchData, and padding maxLen - l zeros after the corresponding label y of w to obtain y_ID.
Further, the deep learning model based on Bi-LSTM-CNN in step 5) includes:
an Embedding layer for converting the input character data into vectors;
a Bi-LSTM layer containing a number of forward and backward LSTM units, for extracting the semantic relations between characters;
a Concatenate layer for splicing together the semantic information extracted by the forward and backward LSTM units;
a first DropOut layer for preventing model over-fitting;
a Conv layer for extracting word features from the semantic information of whole words extracted by the LSTM and the current single character;
a second DropOut layer for preventing model over-fitting;
a SoftMax layer for classifying each character.
The named entity recognition method based on Bi-LSTM-CNN of the present invention uses character-level rather than word-level vectors, so it is immune to the influence of word segmentation precision and also avoids the unregistered word problem; in addition, it uses the combined model of the bidirectional long short-term memory network Bi-LSTM and the convolutional neural network CNN, which greatly improves precision compared with traditional algorithms.
Brief description of the drawings
Fig. 1 is a flow chart of the steps of the method of the present invention.
Fig. 2 is a schematic diagram of the deep learning model.
Fig. 3 is a schematic diagram of an LSTM unit.
Detailed description of the embodiments
In order to make the above objects, features and advantages of the present invention clearer and easier to understand, the present invention is described in further detail below through specific embodiments and the accompanying drawings.
The invention discloses a named entity recognition method based on Bi-LSTM-CNN, for example for recognizing named entities such as person names, place names and organization names in corpus data. The key problems addressed by the present invention are three: 1) the efficiency of named entity recognition, 2) the precision of named entity recognition, and 3) the recognition precision for unregistered words.
To solve the unregistered word problem, the present invention abandons the traditional vocabulary method and instead adopts the idea of embedding vectors, based on characters rather than words. To solve the low precision of traditional named entity recognition methods, the present invention adopts deep learning, combining a bidirectional long short-term memory neural network (Bi-LSTM) model with a convolutional neural network (CNN) model for named entity recognition. To solve the low efficiency of named entity recognition, the present invention avoids word-frequency statistics and string matching, and instead performs entity recognition in the manner of a function-like mapping.
The flow chart of the named entity recognition method of the present invention is shown in Fig. 1. The method is divided into two stages: a training stage and a prediction stage.
(1) Training stage (left dashed box of the flow chart):
Step 1: convert the labeled training corpus data into character-level corpus data.
Step 2: train the deep learning model using the Adam gradient descent algorithm. Other training algorithms, such as SGD (stochastic gradient descent), can also be used.
(2) Prediction stage (right dashed box of the flow chart):
Step 1: convert the unlabeled test corpus data into character-level corpus data.
Step 2: predict using the deep learning model trained in the training stage.
The specific implementation of the two stages is described below.
(1) training stage:
Step 1-1: convert the original data OrgData into character-level data NewData.
Specifically: using the BMESO (Begin, Middle, End, Single, Other) tagging scheme (other tagging schemes can also be used), each labeled word in the original corpus data is cut at the character level. If the label of a word is Label, the first character of the word is marked Label_B, the characters in the middle of the word are marked Label_M, the character at the end of the word is marked Label_E, a single-character word is marked Label_S, and a character is marked Other if the word has no label or does not belong to an entity tag.
For example, if the original data is "[张三]/pre [毕业]/o [于]/o [哈佛大学]/org [。]/o" (i.e. "[Zhang San]/pre [graduated]/o [from]/o [Harvard University]/org [.]/o"), then after conversion the character-level data is: "张/pre_B 三/pre_E 毕/o_B 业/o_E 于/o_S 哈/org_B 佛/org_M 大/org_M 学/org_E 。/o_S".
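For illustration only (this is not the patentee's code), the character-level BMESO conversion above might be sketched as follows; the (word, label) input format is an assumption based on the example.

```python
# A sketch (not the patent's code) of the character-level BMESO conversion.
# Input: (word, label) pairs, assumed parsed from "[word]/label" corpus lines.
def to_char_level(tagged_words):
    chars, tags = [], []
    for word, label in tagged_words:
        if not label:                          # untagged word: "o" for each character
            chars.extend(word)
            tags.extend(["o"] * len(word))
        elif len(word) == 1:                   # single-character word: Label_S
            chars.append(word)
            tags.append(label + "_S")
        else:                                  # Begin / Middle(s) / End
            chars.extend(word)
            tags.append(label + "_B")
            tags.extend([label + "_M"] * (len(word) - 2))
            tags.append(label + "_E")
    return chars, tags

chars, tags = to_char_level(
    [("张三", "pre"), ("毕业", "o"), ("于", "o"), ("哈佛大学", "org"), ("。", "o")])
# chars: ['张', '三', '毕', '业', '于', '哈', '佛', '大', '学', '。']
# tags:  ['pre_B', 'pre_E', 'o_B', 'o_E', 'o_S', 'org_B', 'org_M', 'org_M', 'org_E', 'o_S']
```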
Step 1-2: count the character set CharSet of NewData. To avoid encountering unknown characters at prediction time, a special symbol "null" is added to CharSet. Each character is then numbered with increasing natural numbers, obtaining the character number set CharID corresponding to CharSet.
Following the example in step 1-1, the CharSet after counting is {null, 张, 三, 毕, 业, 于, 哈, 佛, 大, 学, 。} (punctuation marks are also counted), and the CharID is {null:0, 张:1, 三:2, 毕:3, 业:4, 于:5, 哈:6, 佛:7, 大:8, 学:9, 。:10}.
The label set LabelSet is counted and each label is numbered, producing the corresponding label number set LabelID.
Following the example in step 1-1, the LabelSet after counting is {pre_B, pre_E, o_B, o_E, o_S, org_B, org_M, org_E}, and the LabelID is {pre_B:0, pre_E:1, o_B:2, o_E:3, o_S:4, org_B:5, org_M:6, org_E:7}.
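Continuing the sketch above (again an assumption, not the patent's code), CharID and LabelID can be built by numbering items with increasing natural numbers, reserving index 0 of CharID for the special symbol "null":

```python
# A sketch of building CharID / LabelID by numbering with natural numbers;
# "null" at index 0 covers characters unseen at prediction time.
def build_ids(items, reserve_null=False):
    ids = {"null": 0} if reserve_null else {}
    for item in items:
        if item not in ids:
            ids[item] = len(ids)
    return ids

# Reusing chars/tags from the sketch above:
char_id = build_ids(chars, reserve_null=True)   # {'null': 0, '张': 1, '三': 2, ...}
label_id = build_ids(tags)                      # {'pre_B': 0, 'pre_E': 1, 'o_B': 2, ...}
```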
Step 1-3: divide NewData into groups according to sentence length.
If l_i denotes the sentence length of the i-th sentence, then sentences with |l_i - l_j| < δ are put into the same group, where δ denotes the sentence length interval. Let the grouped data be GroupData, with n groups in total.
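The patent states only the |l_i - l_j| < δ criterion; one possible realization (an assumption) is to bucket samples by len // delta, which guarantees that any two sentences in the same group differ in length by less than delta:

```python
from collections import defaultdict

# An assumed realization of the grouping criterion: bucketing by len // delta.
def group_by_length(samples, delta):
    groups = defaultdict(list)
    for chars, tags in samples:          # each sample: (characters, labels)
        groups[len(chars) // delta].append((chars, tags))
    return list(groups.values())         # GroupData: n groups of samples
```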
Step 1-4: randomly draw, without replacement, BatchSize data items w and the corresponding labels y from one group of GroupData; the drawn data are converted into fixed-length data BatchData through CharID, and the corresponding labels are converted into fixed-length labels y_ID through LabelID.
Converting the drawn data into the fixed-length data BatchData through CharID and the corresponding labels into the fixed-length labels y_ID through LabelID is done as follows:
Step 1-4-1: convert the drawn data w into numbers, i.e. convert each character in w into the corresponding number through the correspondence between CharSet and CharID.
For example, the data in step 1-1, after conversion through CharID, becomes: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10].
Step 1-4-2: convert the labels y corresponding to the drawn data w into numbers, i.e. convert each character label in y into the corresponding number through the correspondence between LabelSet and LabelID.
For example, the labels in step 1-1, after conversion through LabelID, become: [0, 1, 2, 3, 4, 5, 6, 6, 7, 4].
Step 1-4-3: assuming the fixed length is maxLen, when the sentence length l of a drawn data item satisfies l < maxLen, pad maxLen - l zeros after the sentence to obtain BatchData, and pad maxLen - l zeros after the corresponding label y of w to obtain y_ID.
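A sketch of this batching step, reusing the char_id and label_id maps from the sketches above; that each group element is a (characters, labels) pair is an assumption:

```python
import random

# A sketch of step 1-4: draw BatchSize samples without replacement from one
# group, map characters and labels to their numbers, and right-pad with zeros.
def make_batch(group, char_id, label_id, batch_size, max_len):
    batch = random.sample(group, batch_size)        # sampling without replacement
    batch_data, y_id = [], []
    for chars, tags in batch:
        x = [char_id.get(c, char_id["null"]) for c in chars]  # unknown -> "null"
        y = [label_id[t] for t in tags]
        pad = max_len - len(x)                      # pad maxLen - l zeros
        batch_data.append(x + [0] * pad)
        y_id.append(y + [0] * pad)
    return batch_data, y_id                         # BatchData and y_ID
```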
Step 1-5: feed the data BatchData of step 1-4 into the deep learning model, producing the loss function Cost(y′, y_ID).
The deep learning model of the named entity recognition method of the present invention is shown in Fig. 2. The meaning of each part is explained as follows:
w_1~w_n: intuitively, the characters of a sentence, i.e. the data w in step 1-4; step 1-4 must be completed before they are passed into the Embedding layer.
y_1~y_n: intuitively, the predicted label corresponding to each character of the sentence, used together with the true labels y_ID to compute the loss value.
Embedding layer: the embedding layer, i.e. the vectorization step, which converts the input character data into vectors.
Bi-LSTM layer: contains a number of forward and backward LSTM units, extracting the semantic relations between characters.
Concatenate layer: splices together the semantic information extracted by the forward and backward LSTM units.
First DropOut layer: a filter layer for preventing model over-fitting.
Conv layer: the convolutional layer, which extracts word features from the semantic information of whole words extracted by the LSTM and the current single character.
Second DropOut layer: a filter layer for preventing model over-fitting.
SoftMax layer: the classification layer, which performs the final classification of each character; an illustrative sketch of this layer stack is given below.
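As a concrete illustration of this layer stack (an assumption, not the patentee's implementation, which the patent does not disclose as code), the model of Fig. 2 could be assembled with the Keras API; all hyper-parameter values are illustrative:

```python
import tensorflow as tf

# An assumed Keras sketch of the Fig. 2 stack: Embedding -> Bi-LSTM (forward
# and backward outputs concatenated) -> DropOut -> Conv + ReLU -> DropOut ->
# SoftMax. Hyper-parameters are illustrative; the patent does not fix them.
def build_model(vocab_size, num_labels, max_len,
                emb_dim=64, lstm_units=128, filters=128, eta=0.5):
    inputs = tf.keras.Input(shape=(max_len,))
    x = tf.keras.layers.Embedding(vocab_size, emb_dim)(inputs)       # Embedding layer
    x = tf.keras.layers.Bidirectional(                               # Bi-LSTM layer;
        tf.keras.layers.LSTM(lstm_units, return_sequences=True),     # merge_mode="concat"
        merge_mode="concat")(x)                                      # acts as Concatenate
    x = tf.keras.layers.Dropout(eta)(x)                              # first DropOut layer
    x = tf.keras.layers.Conv1D(filters, kernel_size=3, padding="same",
                               activation="relu")(x)                 # Conv layer + ReLU
    x = tf.keras.layers.Dropout(eta)(x)                              # second DropOut layer
    outputs = tf.keras.layers.Dense(num_labels,
                                    activation="softmax")(x)         # SoftMax layer
    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer="adam",                                  # Adam gradient descent
                  loss="sparse_categorical_crossentropy")
    return model
```

With integer-encoded labels y_ID, the sparse categorical cross-entropy here plays the role of the loss formula given further below, generalized to multiple label classes.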
Training the deep learning model proceeds as follows:
Step 1-5-1: vectorize the incoming data BatchData at the Embedding layer, i.e. convert each character of each data item in BatchData into a vector through a vector table Char2Vec, obtaining BatchVec.
Step 1-5-2: pass BatchVec into the Bi-LSTM layer. In detail: the first vector of each data item is passed into the first forward LSTM unit, the second vector into the second forward LSTM unit, and so on; besides the i-th vector of each data item, the input of the i-th forward LSTM unit also includes the output of the (i-1)-th forward LSTM unit. Likewise, the first vector of each data item is passed into the first backward LSTM unit, the second vector into the second backward LSTM unit, and so on; besides the i-th vector of each data item, the input of the i-th backward LSTM unit also includes the output of the (i-1)-th backward LSTM unit. Note that each LSTM unit receives not just one vector at a time but BatchSize vectors.
A more detailed description of an LSTM unit is shown in Fig. 3. The meaning of each symbol in Fig. 3 is as follows:
w: a character of the input data (e.g. of a sentence).
C_{i-1}, C_i: the semantic information accumulated over the first i-1 characters and over the first i characters, respectively.
h_{i-1}, h_i: the feature information of the (i-1)-th character and of the i-th character, respectively.
f: the forget gate, controlling how much of the accumulated semantic information of the first i-1 characters (C_{i-1}) is retained.
i: the input gate, controlling how much of the input data (w and h_{i-1}) is retained.
o: the output gate, controlling how much feature information is emitted in the output feature of the i-th character.
tanh: the hyperbolic tangent function.
u: the candidate state produced by tanh, which together with the input gate i controls how much information of the i-th character is retained into C_i.
*, +: element-wise multiplication and element-wise addition, respectively.
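For concreteness, one LSTM step under these definitions might be sketched in numpy as follows; the weight matrices W[...] and biases b[...] are illustrative assumptions, not symbols from the patent:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# A numpy sketch of one LSTM step using the Fig. 3 symbols.
def lstm_step(x, C_prev, h_prev, W, b):
    z = np.concatenate([x, h_prev])    # input: character vector w and h_{i-1}
    f = sigmoid(W["f"] @ z + b["f"])   # forget gate: how much of C_{i-1} to keep
    i = sigmoid(W["i"] @ z + b["i"])   # input gate: how much of the input to keep
    o = sigmoid(W["o"] @ z + b["o"])   # output gate: how much feature info to emit
    u = np.tanh(W["u"] @ z + b["u"])   # candidate written into the cell state
    C = f * C_prev + i * u             # "*" and "+" are element-wise
    h = o * np.tanh(C)                 # h_i: feature information of character i
    return C, h                        # C_i and h_i
```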
Step 1-5-3: pass the outputs of the forward and backward LSTM units into the Concatenate layer, i.e. splice the output results of the forward and backward LSTM units together into the combined output h_i.
Step 1-5-4: pass the output of the Concatenate layer into the first DropOut layer, i.e. randomly hide a fraction η (0 ≤ η ≤ 1) of the data in h_i so that it is not transmitted further backward.
Step 1-5-5: pass the output of DropOut into the Conv convolutional layer for convolution, applying the ReLU activation function f(x) = max(0, x); let the output of the convolutional layer be c_i.
Step 1-5-6: similarly to step 1-5-4, pass the output c_i of the Conv layer into the second DropOut layer, i.e. randomly hide a fraction η (0 ≤ η ≤ 1) of the data in c_i so that it is not transmitted further backward.
Step 1-5-7: pass the output of DropOut into the SoftMax layer and produce the final loss value Cost(y′, y_ID). The specific calculation formula is as follows:
Cost(y′, y_ID) = -[y_ID log(y′) + (1 - y_ID) log(1 - y′)]    (Formula 1)
where y′ denotes the output of BatchData after the classification layer (SoftMax layer) of the deep learning model, corresponding to y_1, y_2, ..., y_n in Fig. 2, and y_ID denotes the corresponding true labels.
Step 1-6: train the parameters of the deep learning model using the Adam gradient descent algorithm.
Step 1-7: if the Cost(y′, y_ID) produced by the deep learning model no longer decreases, or the maximum number of iterations N is reached, terminate the training of the deep learning model; otherwise jump to step 1-4.
Here Cost′_i(y′, y_ID) denotes the loss value i iterations before the current one and Cost(y′, y_ID) denotes the loss value produced by the current iteration; "no longer decreases" means that the difference between the current loss value and the average of the previous M loss values is smaller than a threshold θ.
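A sketch of this stopping test (function and variable names are illustrative):

```python
# Stop when the current loss differs from the mean of the previous M losses
# by less than theta, or when the maximum number of iterations N is reached.
def should_stop(history, cost, M, theta, step, N):
    if step >= N:
        return True
    if len(history) < M:               # not enough previous losses yet
        return False
    return abs(cost - sum(history[-M:]) / M) < theta
```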
(2) forecast period:
Step 2-1: convert the data to be predicted, PreData, into the data format PreMData matching the deep learning model. Specifically: convert the data to be predicted into character-level numeric data.
Step 2-2: feed PreMData into the deep learning model trained in the training stage and obtain the prediction result OrgResult.
The deep learning model in prediction-stage step 2-2 is the model trained in the training stage, except that in prediction the DropOut layers use the parameter η = 1, meaning that no data is hidden and everything is passed on to the next layer.
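For illustration, prediction with the sketch model defined earlier (names reused from those sketches); Keras applies Dropout only during training, which matches the "no data is hidden" behaviour described here:

```python
import numpy as np

# Dropout layers are inactive in model.predict, so nothing is hidden.
probs = model.predict(np.array(batch_data))   # shape: (BatchSize, maxLen, num_labels)
pred_ids = probs.argmax(axis=-1)              # numbered labels; decode via LabelID
```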
In the prior art, dictionary-based methods have no way to handle unregistered words at all, i.e. their recognition rate for unregistered words is 0, and the accuracy of statistics-based methods and of methods based on traditional artificial neural networks is around 90%. The present invention achieves an accuracy of about 99.2% on test data, a significant improvement.
The above embodiments are intended only to illustrate rather than limit the technical solution of the present invention. A person of ordinary skill in the art may modify the technical solution of the present invention or replace it with equivalents without departing from the spirit and scope of the present invention; the protection scope of the present invention shall be defined by the claims.

Claims (10)

1. A named entity recognition method based on Bi-LSTM-CNN, characterized by comprising the following steps:
1) converting the original corpus data OrgData into character-level corpus data NewData;
2) counting the characters in NewData to obtain a character set CharSet, and numbering each character to obtain the character number set CharID corresponding to CharSet; counting the labels of the characters in NewData to obtain a label set LabelSet, and numbering each label to obtain the label number set LabelID corresponding to LabelSet;
3) grouping the sentences of NewData by sentence length to obtain a data set GroupData containing n groups of sentences;
4) randomly drawing, without replacement, BatchSize data items w and the corresponding labels y from one group of GroupData, converting the drawn data w into fixed-length data BatchData through CharID, and converting the corresponding labels into fixed-length labels y_ID through LabelID;
5) feeding the data BatchData and the labels y_ID into the deep learning model based on Bi-LSTM-CNN and training the parameters of the model; when the loss value produced by the deep learning model satisfies a set condition or the maximum number of iterations N is reached, terminating the training of the model; otherwise returning to step 4) to generate new data and continue training the model;
6) converting the data to be predicted, PreData, into data PreMData matching the deep learning model, feeding it into the trained model, and obtaining the named entity recognition result OrgResult.
2. The method of claim 1, characterized in that step 1) marks each character using the BMESO tagging scheme: if the label of a word is Label, the first character of the word is marked Label_B, the characters in the middle of the word are marked Label_M, the character at the end of the word is marked Label_E, a single-character word is marked Label_S, and a character is marked o if the word has no label or does not belong to an entity tag.
3. The method of claim 1, characterized in that in step 3), if l_i denotes the sentence length of the i-th sentence, sentences with |l_i - l_j| < δ are put into the same group, where δ denotes the sentence length interval.
4. The method of claim 1, characterized in that step 4) includes:
4-1) converting the drawn data w into numbers, i.e. converting each character in w into the corresponding number through the correspondence between CharSet and CharID;
4-2) converting the labels y corresponding to the drawn data w into numbers, i.e. converting each character label in y into the corresponding number through the correspondence between LabelSet and LabelID;
4-3) assuming the fixed length is maxLen: when the sentence length l of a drawn data item satisfies l < maxLen, padding maxLen - l zeros after the sentence to obtain BatchData, and padding maxLen - l zeros after the corresponding label y of w to obtain y_ID.
5. The method of claim 1, characterized in that the deep learning model based on Bi-LSTM-CNN in step 5) includes:
an Embedding layer for converting the input character data into vectors;
a Bi-LSTM layer containing a number of forward and backward LSTM units, for extracting the semantic relations between characters;
a Concatenate layer for splicing together the semantic information extracted by the forward and backward LSTM units;
a first DropOut layer for preventing model over-fitting;
a Conv layer for extracting word features from the semantic information of whole words extracted by the LSTM and the current single character;
a second DropOut layer for preventing model over-fitting;
a SoftMax layer for classifying each character.
6. The method of claim 5, characterized in that training the deep learning model in step 5) includes:
5-1) vectorizing the incoming data BatchData at the Embedding layer, i.e. converting each character of each data item in BatchData into a vector through a vector table Char2Vec, obtaining BatchVec;
5-2) passing BatchVec into the Bi-LSTM layer;
5-3) passing the outputs of the forward and backward LSTM units into the Concatenate layer;
5-4) passing the output of the Concatenate layer into the first DropOut layer;
5-5) passing the output of the first DropOut layer into the Conv layer;
5-6) passing the output c_i of the Conv layer into the second DropOut layer;
5-7) passing the output of the second DropOut layer into the SoftMax layer and producing the final loss value.
7. The method of claim 6, characterized in that in step 5-2) the first vector of each data item is passed into the first forward LSTM unit, the second vector into the second forward LSTM unit, and so on, the input of the i-th forward LSTM unit including, besides the i-th vector of each data item, the output of the (i-1)-th forward LSTM unit; likewise, the first vector of each data item is passed into the first backward LSTM unit, the second vector into the second backward LSTM unit, and so on, the input of the i-th backward LSTM unit including, besides the i-th vector of each data item, the output of the (i-1)-th backward LSTM unit; each LSTM unit receives BatchSize vectors at a time.
8. The method of claim 6, characterized in that the calculation formula of the loss value is:
Cost(y′, y_ID) = -[y_ID log(y′) + (1 - y_ID) log(1 - y′)],
where y′ denotes the output of BatchData after the SoftMax layer of the deep learning model and y_ID denotes the corresponding true labels.
9. The method of claim 8, characterized in that the training of the deep learning model is terminated if the loss value Cost(y′, y_ID) no longer decreases, which is judged by the following formula:
|Cost(y′, y_ID) - (Σ_{i=-M}^{-1} Cost′_i(y′, y_ID)) / M| < θ,
where Cost′_i(y′, y_ID) denotes the loss value i iterations before the current one and Cost(y′, y_ID) denotes the loss value produced by the current iteration; if the difference between the current loss value and the average of the previous M loss values is smaller than the threshold θ, the loss value is considered to no longer decrease.
10. The method of claim 1, characterized in that step 5) trains the parameters of the deep learning model using the Adam gradient descent algorithm.
CN201710946531.3A 2017-10-12 2017-10-12 A named entity recognition method based on Bi-LSTM-CNN Withdrawn CN107967251A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710946531.3A CN107967251A (en) 2017-10-12 2017-10-12 A named entity recognition method based on Bi-LSTM-CNN

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710946531.3A CN107967251A (en) 2017-10-12 2017-10-12 A named entity recognition method based on Bi-LSTM-CNN

Publications (1)

Publication Number Publication Date
CN107967251A true CN107967251A (en) 2018-04-27

Family

ID=61997607

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710946531.3A Withdrawn CN107967251A (en) A named entity recognition method based on Bi-LSTM-CNN

Country Status (1)

Country Link
CN (1) CN107967251A (en)



Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140236578A1 (en) * 2013-02-15 2014-08-21 Nec Laboratories America, Inc. Question-Answering by Recursive Parse Tree Descent
CN104899304A (en) * 2015-06-12 2015-09-09 北京京东尚科信息技术有限公司 Named entity identification method and device
CN106569998A (en) * 2016-10-27 2017-04-19 浙江大学 Text named entity recognition method based on Bi-LSTM, CNN and CRF
CN106682220A (en) * 2017-01-04 2017-05-17 华南理工大学 Online traditional Chinese medicine text named entity identifying method based on deep learning
CN107203511A (en) * 2017-05-27 2017-09-26 中国矿业大学 A kind of network text name entity recognition method based on neutral net probability disambiguation

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
JASON P.C. CHIU et al.: "Named Entity Recognition with Bidirectional LSTM-CNNs", Transactions of the Association for Computational Linguistics *
ONUR KURU et al.: "CharNER: Character-Level Named Entity Recognition", The 26th International Conference on Computational Linguistics: Technical Papers *

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108829681B (en) * 2018-06-28 2022-11-11 鼎富智能科技有限公司 Named entity extraction method and device
CN108829681A (en) * 2018-06-28 2018-11-16 北京神州泰岳软件股份有限公司 A named entity extraction method and device
CN109657230A (en) * 2018-11-06 2019-04-19 众安信息技术服务有限公司 Named entity recognition method and device fusing word vectors and part-of-speech vectors
CN109284400A (en) * 2018-11-28 2019-01-29 电子科技大学 A named entity recognition method based on Lattice LSTM and a language model
CN110197195A (en) * 2019-04-15 2019-09-03 深圳大学 A novel deep network system and method for behavior recognition
CN110197195B (en) * 2019-04-15 2022-12-23 深圳大学 Novel deep network system and method for behavior recognition
CN110737952A (en) * 2019-09-17 2020-01-31 太原理工大学 Method for predicting the remaining life of key parts of mechanical equipment combining AE and Bi-LSTM
CN112052852A (en) * 2020-09-09 2020-12-08 国家气象信息中心 Character recognition method of handwritten meteorological archive data based on deep learning
CN112052852B (en) * 2020-09-09 2023-12-29 国家气象信息中心 Character recognition method of handwriting meteorological archive data based on deep learning
CN112508441A (en) * 2020-12-18 2021-03-16 哈尔滨工业大学 Urban high-density outdoor thermal comfort evaluation method based on deep learning three-dimensional reconstruction
CN112508441B (en) * 2020-12-18 2022-04-29 哈尔滨工业大学 Urban high-density outdoor thermal comfort evaluation method based on deep learning three-dimensional reconstruction
CN113255342A (en) * 2021-06-11 2021-08-13 云南大学 Method and system for identifying product name of 5G mobile service
CN114267337A (en) * 2022-03-02 2022-04-01 合肥讯飞数码科技有限公司 Voice recognition system and method for realizing forward operation

Similar Documents

Publication Publication Date Title
CN107967251A (en) A kind of name entity recognition method based on Bi-LSTM-CNN
CN107832289A (en) A kind of name entity recognition method based on LSTM CNN
CN107908614A (en) A kind of name entity recognition method based on Bi LSTM
CN107797987B (en) Bi-LSTM-CNN-based mixed corpus named entity identification method
CN107885721A (en) A kind of name entity recognition method based on LSTM
CN111931506B (en) Entity relationship extraction method based on graph information enhancement
CN108304372B (en) Entity extraction method and device, computer equipment and storage medium
CN109543178B (en) Method and system for constructing judicial text label system
CN110134771A (en) A kind of implementation method based on more attention mechanism converged network question answering systems
CN106844741A (en) A kind of answer method towards specific area
CN107818164A (en) A kind of intelligent answer method and its system
CN110362819B (en) Text emotion analysis method based on convolutional neural network
CN107608999A (en) A kind of Question Classification method suitable for automatically request-answering system
CN107977353A (en) A kind of mixing language material name entity recognition method based on LSTM-CNN
CN109284400A (en) A kind of name entity recognition method based on Lattice LSTM and language model
CN109271524B (en) Entity linking method in knowledge base question-answering system
CN107797988A (en) A kind of mixing language material name entity recognition method based on Bi LSTM
CN113505200B (en) Sentence-level Chinese event detection method combined with document key information
CN111444704B (en) Network safety keyword extraction method based on deep neural network
CN111274804A (en) Case information extraction method based on named entity recognition
CN113869053A (en) Method and system for recognizing named entities oriented to judicial texts
CN107894975A (en) A kind of segmenting method based on Bi LSTM
CN106445917A (en) Bootstrap Chinese entity extracting method based on modes
CN107894976A (en) A kind of mixing language material segmenting method based on Bi LSTM
CN107992468A (en) A kind of mixing language material name entity recognition method based on LSTM

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20180427