CN110348018A - Method for completing simple event extraction using partial learning - Google Patents
Method for completing simple event extraction using partial learning
- Publication number: CN110348018A (application number CN201910642480.4A)
- Authority
- CN
- China
- Prior art keywords
- model
- mark
- input
- crf
- data
- Prior art date: 2019-07-16
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
  - G06—COMPUTING; CALCULATING OR COUNTING
    - G06F—ELECTRIC DIGITAL DATA PROCESSING
      - G06F40/00—Handling natural language data
        - G06F40/20—Natural language analysis
          - G06F40/205—Parsing
            - G06F40/211—Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
          - G06F40/279—Recognition of textual entities
            - G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
              - G06F40/295—Named entity recognition
    - G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
      - G06N3/00—Computing arrangements based on biological models
        - G06N3/02—Neural networks
          - G06N3/04—Architecture, e.g. interconnection topology
            - G06N3/045—Combinations of networks
          - G06N3/08—Learning methods
Abstract
The invention discloses a method for completing simple event extraction using partial learning, comprising an annotation guideline construction process: according to the three classes under the framework (verb-object, double-verb, and other), specific event definitions are given. Beneficial effects of the present invention: it attempts to solve the missing-label and incorrect-label problems present in data acquired by distant supervision, and improves the model's recognition performance for named entities.
Description
Technical field
The present invention relates to the field of simple event extraction, and in particular to a method for completing simple event extraction using partial learning.
Background technique
A simple event is defined as an event in which a verb and its object are directly connected, used to describe a scene, for example: playing basketball, playing soccer, having breakfast, making a phone call. We convert the simple event extraction problem into a named entity recognition problem: instances of predefined event argument classes are recognized from sentences.
After many years of research, the entity recognition task has made good progress. The main research challenges at present are: in different domains and different applications, new entity classes usually need to be recognized, and it is difficult to rapidly build a high-performance system for them. When building a recognition system for new entity classes, an annotated corpus is usually needed to train the model; at that point it is difficult to formulate detailed and accurate entity annotation guidelines, and labeling data is time-consuming and laborious. In addition, domain adaptation is also a very prominent problem: the labeling performance of an entity recognition system drops substantially on text from a new domain.
Currently, common entity recognition methods can be roughly divided into: 1) rule-based and dictionary-based methods; 2) methods based on traditional machine learning models; 3) methods based on deep learning. On top of these three kinds of methods, some systems are also built as mixtures of them.
Existing related technologies:
1. Data construction:
Expert annotation: the annotators are experts in the relevant field or the people who formulated the annotation guidelines; high-quality labeled data is obtained this way.
Crowdsourced annotation: crowdsourcing is a distributed problem-solving and production model in which the data and annotation guidelines are provided to non-experts, who annotate after brief training; the labeled data is finally delivered to the crowdsourcing publisher. "Trap" items are often planted during the process, and rewards are given according to the non-experts' annotation performance.
Distant supervision: assuming a small amount of manually labeled data and an entity vocabulary are available at the start, the distant supervision method matches the vocabulary against a large unlabeled corpus, and the matched strings are taken as correct annotations.
2. Entity recognition methods based on deep learning:
The most common model is the BiLSTM-CRF model. The model is a chain structure composed of an embedding layer (representing the input characters or words as vectors), a bidirectional LSTM layer (modeling the whole sentence on top of the vector representations to extract hidden representations), a linear layer (connecting the mapping from characters to labels), and a final CRF layer (connecting the mapping from labels to labels).
Experimental results show that BiLSTM-CRF obtains better results, reaching or exceeding CRF models based on rich features. In terms of features, the model does not require elaborate feature engineering; very good results can be reached using only word vectors and character vectors.
The traditional technology has the following technical problems:
1. Data construction:
1) Experts are generally few in number and annotate slowly; large-scale annotated corpora cannot be obtained, which fails to meet practical application demands.
2) Crowdsourced annotators do not have much experience with the data's domain; detailed annotation guidelines must be formulated before annotation, and a period of training is needed. Different annotators have different understandings of the guidelines and different annotation habits with the corpus, which causes a large number of inconsistent or erroneous labels in the annotation results and leads to poor-quality labeled data.
Example:
Annotator 1: Packed tightly, delivered without any damage.
Annotator 2: {Packed@EVENT} tightly, delivered without any damage.
In the context of this sentence, "packed" does not express a simple event; this is an example of inconsistent annotation.
3) Distant supervision is limited by the scale and quality of the seed resources already built, and many out-of-vocabulary resources are easily missed. The data construction depends too heavily on the matching criteria and algorithm, so the data obtained by distant supervision has two problems: missing labels and incorrect labels.
Example 1: I like Beyond's {No Longer Hesitate@SONG} and Goodbye Ideal. [missing label]
Example 2: I {no longer hesitated@SONG} and went straight to the station. [incorrect label]
In Example 1, "Goodbye Ideal" is also a song, but because it is not in the vocabulary, its label is missed. In Example 2, "no longer hesitated" is not a song title; this is an incorrect label.
4) The annotation guidelines used for labeling must be closely combined with the actual task and data, and can only be finalized through continuous improvement. At present, there are almost no event annotation guidelines oriented to the e-commerce domain.
2. Named entity recognition models based on neural networks:
Neural network models have now been widely applied in many natural language processing tasks and have made considerable progress compared with conventional models. But they also expose many disadvantages:
1) Data problem: the good results of neural network models are built on the basis of big data; compared with traditional machine learning algorithms, neural networks need more data. The final model's effectiveness is largely related to the data provided, and data quality is particularly important.
2) Weak interpretability: there are no usable features with which to explain the predicted results.
3) Computation cost is often higher than for traditional algorithms; with the increase of training data and network depth, more computing resources are needed.
Summary of the invention
The technical problem to be solved by the present invention is to provide a method for completing simple event extraction using partial learning. The event extraction problem is converted into a named entity recognition problem. Then, based on the rich event resources of the e-commerce domain, a definition of simple events is given, and detailed entity annotation guidelines are produced through continuous iteration according to the actual annotation situation. Small-scale expert annotation and large-scale crowdsourced annotation are used, and an event resource list is extracted from them. The distant supervision method is then used to label large-scale unlabeled data. Partial learning is used to try to solve the missing-label and incorrect-label problems present in the data acquired by distant supervision, thereby improving the neural-network-based entity recognition model.
In order to solve the above technical problems, the present invention provides a method for completing simple event extraction using partial learning, comprising:
Annotation guideline construction process:
According to the three classes under the framework (verb-object, double-verb, and other), give specific event definitions.
On this basis, give examples that meet the definitions according to the actual corpus, and give notes for the places where ambiguity exists.
The construction of the guidelines requires continuous iteration and improvement according to the actual situation, finally forming a well-organized, intuitive, and clear document.
Distant supervision corpus construction process:
First obtain the simple event definitions and annotation guidelines.
Recruit annotators and train them according to the guidelines, then obtain manually labeled data of a certain scale; extract the entities in this part of the data and build an entity vocabulary.
Match the entity vocabulary against large-scale unlabeled text to obtain the distant supervision dataset; this part of the data contains a certain amount of noise (a minimal sketch of the matching step follows).
The goal is to use the above two parts of data as training data to reasonably train a simple event recognition model with good performance.
Identification model based on BiLSTM-CRF:
The BiLSTM-CRF model treats the recognition task as a sequence labeling task: the model's input is a Chinese character sequence, and the output is a label sequence. BiLSTM-CRF has already achieved good results in named entity recognition tasks. When element annotation is converted into sequence labeling, BIEO labels are used: B-XX marks the first character of element XX, E-XX marks the last character of the element, the other characters of the element are labeled I-XX, and non-element characters are all labeled O.
In the BiLSTM-CRF model, for the input character sequence, neuron features are first constructed by a bidirectional LSTM; these features are then combined and input to the CRF layer for label prediction. The entire model is divided into three major parts: 1) word vector representation: the input character string is represented as word vectors, i.e., the discrete input is converted into low-dimensional neuron input; 2) feature extraction: the word vectors are converted into neuron features by the bidirectional LSTM and a linear transformation; 3) entity labeling: the features are input to the CRF layer, and entity labels are obtained with the labeling module;
Word vector representation: the discrete Chinese character input is converted into low-dimensional neuron input through a neural representation layer. A lookup table is used, in which the vector representation of each Chinese character is stored. The initial values of the vectors can be initialized with random numbers, or trained in advance on a large unlabeled corpus with a tool. During model training, all values of the vectors serve as model parameters and are optimized together with the other parameters in the iterative process. Given the character sequence of a Chinese sentence, the corresponding word vector representations are obtained by table lookup.
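A minimal sketch of the lookup table, here realized with PyTorch's nn.Embedding (the framework choice, the tiny vocabulary, and the dimensions are assumptions, not specified by the patent):

```python
import torch
import torch.nn as nn

# Sketch of the word-vector (embedding) layer: each Chinese character id is
# mapped to a trainable low-dimensional vector via a lookup table.
vocab = {"<pad>": 0, "我": 1, "打": 2, "篮": 3, "球": 4}   # illustrative
embedding = nn.Embedding(num_embeddings=len(vocab), embedding_dim=100)

# The rows of embedding.weight are ordinary model parameters: they can be
# randomly initialized (the default) or overwritten with vectors pretrained
# on a large unlabeled corpus, then optimized with the other parameters.
char_ids = torch.tensor([[1, 2, 3, 4]])          # a batch of one sentence
char_vectors = embedding(char_ids)               # shape: (1, 4, 100)
print(char_vectors.shape)
```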
Feature extraction: based on the input word vector sequence, features o_1, o_2, ..., o_n are extracted by the bidirectional LSTM and a linear layer; these features will be used by the CRF entity labeling module. LSTM, the long short-term memory network, is a kind of recurrent neural network that can model natural language sentences well. The features extracted by the bidirectional LSTM over the sentence in the forward and reverse directions are concatenated to obtain the hidden-layer representation of each character:
h_i = [h_i(forward) ; h_i(backward)]
The label scores are then calculated by the following equation:
o_i = W · h_i + b
where W and b are model parameters. The above formula maps each character onto the labels, and the final sequence is composed of labels from the label set.
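A sketch of this feature-extraction step under the same assumed PyTorch realization; the hidden size and label set are illustrative:

```python
import torch
import torch.nn as nn

# Sketch of feature extraction: a bidirectional LSTM over the character
# vectors, followed by the linear transformation o_i = W * h_i + b that maps
# each character's hidden state to per-label scores.
num_labels = 4                                   # e.g. B-EVENT, I-EVENT, E-EVENT, O
lstm = nn.LSTM(input_size=100, hidden_size=128,
               bidirectional=True, batch_first=True)
linear = nn.Linear(2 * 128, num_labels)          # the W and b of the formula

char_vectors = torch.randn(1, 4, 100)            # (batch, seq_len, emb_dim)
h, _ = lstm(char_vectors)                        # h_i = [h_forward_i ; h_backward_i]
emissions = linear(h)                            # (1, 4, num_labels)
print(emissions.shape)
```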
Entity labeling: finally, a CRF layer is used for decoding, enabling the model to learn the dependencies between labels. The decoding formula is as follows:
y* = argmax_{y ∈ Y_X} score(X, y), where score(X, y) = Σ_i (o_{i, y_i} + T_{y_{i-1}, y_i}) and T is the label transition matrix.
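CRF decoding of this kind is standardly solved with the Viterbi algorithm; the patent does not name its decoder, so the NumPy sketch below is one conventional realization:

```python
import numpy as np

# Sketch of CRF decoding (Viterbi): find the label path maximizing
# sum_i (emission[i, y_i] + transition[y_{i-1}, y_i]).
def viterbi(emissions, transitions):
    n, k = emissions.shape                    # sequence length, label count
    score = emissions[0].copy()               # best score ending in each label
    back = np.zeros((n, k), dtype=int)
    for i in range(1, n):
        total = score[:, None] + transitions + emissions[i]   # (prev, cur)
        back[i] = total.argmax(axis=0)        # best previous label per current
        score = total.max(axis=0)
    path = [int(score.argmax())]
    for i in range(n - 1, 0, -1):             # backtrace
        path.append(int(back[i][path[-1]]))
    return path[::-1]

emissions = np.random.randn(4, 3)             # 4 characters, 3 labels
transitions = np.random.randn(3, 3)           # label-to-label scores
print(viterbi(emissions, transitions))
```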
In parameter training, the loss value is calculated using the log-likelihood. The probability of the manually annotated sequence is:
p(y | X) = exp(score(X, y)) / Σ_{y' ∈ Y_X} exp(score(X, y'))
The loss value is:
Loss(θ, X, y) = -log p(y | X)
The optimization objective of training is to minimize this loss value.
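This loss (negative log-likelihood, with the normalizer over all paths computed by the forward algorithm) can be sketched as follows; it is the textbook linear-chain CRF computation, not code from the patent:

```python
import numpy as np
from scipy.special import logsumexp

# Sketch of the CRF training loss: loss = -log p(y | X), where
# p(y|X) = exp(score(X, y)) / sum over all paths y' of exp(score(X, y')).
def crf_nll(emissions, transitions, gold):
    n, k = emissions.shape
    # Score of the gold path.
    gold_score = emissions[0, gold[0]]
    for i in range(1, n):
        gold_score += transitions[gold[i - 1], gold[i]] + emissions[i, gold[i]]
    # Log partition function via the forward algorithm.
    alpha = emissions[0].copy()
    for i in range(1, n):
        alpha = logsumexp(alpha[:, None] + transitions, axis=0) + emissions[i]
    log_z = logsumexp(alpha)
    return log_z - gold_score                 # = -log p(gold | X)

emissions = np.random.randn(4, 3)
transitions = np.random.randn(3, 3)
print(crf_nll(emissions, transitions, gold=[0, 1, 1, 2]))
```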
Annotator based on partial learning:
The basic idea of partial learning is to convert the incompletely annotated sentences in the partially labeled data into multi-path annotated sentences, and to improve the above optimization objective of the CRF layer accordingly; the BiLSTM-CRF model is used as the basic model.
In one of the embodiments, the probability of a completely annotated sentence is defined as follows:
p(y | X) = exp(score(X, y)) / Σ_{y' ∈ Y_X} exp(score(X, y'))
where X is the word vector sequence of the input sentence and Y_X denotes the set of all legal label paths for X.
In one of the embodiments, the probability of an incompletely annotated sentence is defined as follows:
p(D | X) = Σ_{y ∈ D} p(y | X)
where D denotes the multi-path annotation sequence; that is, the conditional probability of one training example is the sum of the probabilities of all paths contained in the multi-path annotation. Parameter estimation can then be completed in the same way as in the baseline system.
In one of the embodiments, the loss function is defined as follows:
Loss(θ, X, D) = -log p(D | X).
In one of the embodiments, specifically, if a character has no specified label in the partial annotation, the character's label is marked as UKN, indicating that all labels are possible.
In one of the embodiments, on this basis, an optimization objective function based on multi-path annotated sentences is designed, so that the partially labeled data is used efficiently for model training (see the sketch after this paragraph).
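A sketch of how the multi-path objective can be computed: a forward pass constrained to the labels allowed at each position (UKN allows all labels) yields log p(D|X), and an unconstrained pass yields the normalizer. This constrained-forward realization is an assumption consistent with the definitions above, not code from the patent:

```python
import numpy as np
from scipy.special import logsumexp

# Sketch of the partial-learning loss: loss = -log p(D | X), where D is the
# set of label paths consistent with a partial annotation. A position labeled
# "UKN" allows every label; a labeled position allows only its gold label.
def partial_nll(emissions, transitions, partial_labels):
    n, k = emissions.shape
    neg_inf = -1e30

    def mask(label):                          # allowed-label mask per position
        if label == "UKN":
            return np.zeros(k)                # every label allowed
        m = np.full(k, neg_inf)
        m[label] = 0.0                        # only the gold label allowed
        return m

    # Constrained forward pass: sums only over paths inside D.
    alpha = emissions[0] + mask(partial_labels[0])
    for i in range(1, n):
        alpha = (logsumexp(alpha[:, None] + transitions, axis=0)
                 + emissions[i] + mask(partial_labels[i]))
    log_p_d = logsumexp(alpha)
    # Unconstrained forward pass: log partition over all paths.
    beta = emissions[0].copy()
    for i in range(1, n):
        beta = logsumexp(beta[:, None] + transitions, axis=0) + emissions[i]
    return logsumexp(beta) - log_p_d          # = -log p(D | X)

emissions = np.random.randn(4, 3)
transitions = np.random.randn(3, 3)
print(partial_nll(emissions, transitions, [0, "UKN", "UKN", 2]))
```

Note that a sentence labeled UKN at every position contributes zero loss, since the constrained and unconstrained sums then coincide; only the positions that carry labels constrain the model.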
In one of the embodiments, the model mainly has two states: training and prediction. During training, data with annotations must be input, and the model constantly updates its parameters; under the optimization objective, the output annotations are made as consistent as possible with the true values. This requires updating the parameters through continuous loop iterations so that the loss value in the above formula keeps decreasing, allowing the model to learn better parameters. The other state is prediction: the input during prediction is data without annotations, and the trained model is used. Parameters are not updated in this process; the output of the model is taken as the final prediction result.
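A schematic sketch of these two states; `model`, `loss_fn`, `decode`, and the data iterators are placeholders standing in for the BiLSTM-CRF components sketched earlier:

```python
import torch

# Schematic sketch of the two model states: training updates parameters to
# shrink the loss; prediction runs the trained model without updates.
def train(model, loss_fn, labeled_batches, epochs=10, lr=1e-3):
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):                   # loop iteration: loss keeps decreasing
        for chars, labels in labeled_batches:
            optimizer.zero_grad()
            loss = loss_fn(model(chars), labels)
            loss.backward()                   # compute gradients
            optimizer.step()                  # update the parameters

def predict(model, decode, unlabeled_batches):
    model.eval()
    with torch.no_grad():                     # prediction: no parameter updates
        return [decode(model(chars)) for chars in unlabeled_batches]
```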
In one of the embodiments:
A computer device comprising a memory, a processor, and a computer program stored on the memory and runnable on the processor, wherein the processor realizes the steps of any one of the methods when executing the program.
A computer-readable storage medium on which a computer program is stored, wherein the program realizes the steps of any one of the methods when executed by a processor.
A processor for running a program, wherein the program executes any one of the methods when run.
Beneficial effects of the present invention:
The method attempts to solve the missing-label and incorrect-label problems present in the data acquired by distant supervision, and improves the model's recognition performance for named entities.
Brief description of the drawings
Fig. 1 is a schematic diagram of the BiLSTM-CRF model, widely used for entity recognition tasks, in the present invention's method for completing simple event extraction using partial learning.
Specific embodiments
The present invention will be further explained below with reference to the attached drawings and specific embodiments, so that those skilled in the art can better understand and practice the present invention; the illustrated embodiments, however, do not limit the invention.
To summarize the purpose of this patent: partial learning is used to reduce the missing-label problem present in the data obtained by distant supervision. To accomplish this, we define simple event annotation guidelines oriented to the e-commerce domain, obtain distant supervision data via crowdsourced labeled data, and add partial learning to the deep-learning-based named entity recognition method to improve event recognition performance.
Annotation guideline construction process:
1. According to the three classes under the framework (verb-object, double-verb, and other), give specific event definitions.
2. On this basis, give examples that meet the definitions according to the actual corpus, and give notes for the places where ambiguity exists.
3. The construction of the guidelines requires continuous iteration and improvement according to the actual situation, finally forming a well-organized, intuitive, and clear document.
Distant supervision corpus construction process:
1. First obtain the simple event definitions and annotation guidelines.
2. Recruit annotators and train them according to the guidelines, then obtain manually labeled data of a certain scale; extract the entities in this part of the data and build an entity vocabulary.
3. Match the entity vocabulary from step 2 against large-scale unlabeled text to obtain the distant supervision dataset; this part of the data contains a certain amount of noise.
Our goal is to use the above two parts of data as training data to reasonably train a simple event recognition model with good performance.
Identification model based on BiLSTM-CRF:
The BiLSTM-CRF model treats the recognition task as a sequence labeling task: the model's input is a Chinese character sequence, and the output is a label sequence. BiLSTM-CRF has already achieved good results in named entity recognition tasks, so we choose BiLSTM-CRF as the baseline model. When element annotation is converted into sequence labeling, BIEO labels are used: B-XX marks the first character of element XX, E-XX marks the last character of the element, the other characters of the element are labeled I-XX, and non-element characters are all labeled O.
In the BiLSTM-CRF model, for the input character sequence, we first construct neuron features with a bidirectional LSTM, then combine these features and input them to the CRF layer for label prediction. The entire model is divided into three major parts: 1) word vector representation: the input character string is represented as word vectors, i.e., the discrete input is converted into low-dimensional neuron input; 2) feature extraction: the word vectors are converted into neuron features by the bidirectional LSTM and a linear transformation; 3) entity labeling: the features are input to the CRF layer, and entity labels are obtained with the labeling module.
Word vector representation: the discrete Chinese character input is converted into low-dimensional neuron input through a neural representation layer. We use a lookup table in which the vector representation of each Chinese character is stored. The initial values of the vectors can be initialized with random numbers, or trained in advance on a large unlabeled corpus with a tool. During model training, all values of the vectors serve as model parameters and are optimized together with the other parameters in the iterative process. Given the character sequence of a Chinese sentence, we obtain the corresponding word vector representations by table lookup.
Feature extraction: based on the input word vector sequence, we extract features o_1, o_2, ..., o_n with the bidirectional LSTM and a linear layer; these features will be used by the CRF entity labeling module. LSTM, the long short-term memory network, is a kind of recurrent neural network that can model natural language sentences well. We concatenate the features extracted by the bidirectional LSTM over the sentence in the forward and reverse directions to obtain the hidden-layer representation of each character:
h_i = [h_i(forward) ; h_i(backward)]
The label scores are then calculated by the following equation:
o_i = W · h_i + b
where W and b are model parameters. The above formula maps each character onto the labels, and the final sequence is composed of labels from the label set.
Entity labeling: finally, a CRF layer is used for decoding, enabling the model to learn the dependencies between labels. The decoding formula is as follows:
y* = argmax_{y ∈ Y_X} score(X, y), where score(X, y) = Σ_i (o_{i, y_i} + T_{y_{i-1}, y_i}) and T is the label transition matrix.
In parameter training, we calculate the loss value using the log-likelihood. The probability of the manually annotated sequence is:
p(y | X) = exp(score(X, y)) / Σ_{y' ∈ Y_X} exp(score(X, y'))
The loss value is:
Loss(θ, X, y) = -log p(y | X)
The optimization objective of training is to minimize this loss value.
Annotator based on partial learning:
The basic idea of partial learning is to convert the incompletely annotated sentences in the partially labeled data into multi-path annotated sentences, and to improve the above optimization objective of the CRF layer accordingly. Specifically, if a character has no specified label in the partial annotation, the character's label is marked as UKN, indicating that all labels are possible. On this basis, we design an optimization objective function based on multi-path annotated sentences, so that the partially labeled data is used efficiently for model training. We use the BiLSTM-CRF model described in the previous section as the basic model. The probability of a completely annotated sentence is defined as follows:
p(y | X) = exp(score(X, y)) / Σ_{y' ∈ Y_X} exp(score(X, y'))
where X is the word vector sequence of the input sentence and Y_X denotes the set of all legal label paths for X. The probability of an incompletely annotated sentence is defined as follows:
p(D | X) = Σ_{y ∈ D} p(y | X)
where D denotes the multi-path annotation sequence; that is, the conditional probability of one training example is the sum of the probabilities of all paths contained in the multi-path annotation. Parameter estimation can then be completed in the same way as in the baseline system, where the loss function is defined as follows:
Loss(θ, X, D) = -log p(D | X)
Some supplements to the above scheme:
The model mainly has two states: training and prediction. During training, data with annotations must be input, and the model constantly updates its parameters. Under our optimization objective, the output annotations are made as consistent as possible with the true values. This requires updating the parameters through continuous loop iterations so that the loss value in the above formula keeps decreasing, allowing the model to learn better parameters. The other state is prediction: the input during prediction is data without annotations, and the trained model is used. Parameters are not updated in this process; the output of the model is taken as the final prediction result.
The embodiments described above are only preferred embodiments given to fully illustrate the present invention; the protection scope of the present invention is not limited thereto. Equivalent substitutions or transformations made by those skilled in the art on the basis of the present invention all fall within the protection scope of the present invention. The protection scope of the present invention is defined by the claims.
Claims (10)
1. A method for completing simple event extraction using partial learning, characterized by comprising:
An annotation guideline construction process:
According to the three classes under the framework (verb-object, double-verb, and other), give specific event definitions.
On this basis, give examples that meet the definitions according to the actual corpus, and give notes for the places where ambiguity exists.
The construction of the guidelines requires continuous iteration and improvement according to the actual situation, finally forming a well-organized, intuitive, and clear document.
A distant supervision corpus construction process:
First obtain the simple event definitions and annotation guidelines.
Recruit annotators and train them according to the guidelines, then obtain manually labeled data of a certain scale; extract the entities in this part of the data and build an entity vocabulary.
Match the entity vocabulary against large-scale unlabeled text to obtain the distant supervision dataset; this part of the data contains a certain amount of noise.
The goal is to use the above two parts of data as training data to reasonably train a simple event recognition model with good performance.
An identification model based on BiLSTM-CRF:
The BiLSTM-CRF model treats the recognition task as a sequence labeling task: the model's input is a Chinese character sequence, and the output is a label sequence. BiLSTM-CRF has already achieved good results in named entity recognition tasks. When element annotation is converted into sequence labeling, BIEO labels are used: B-XX marks the first character of element XX, E-XX marks the last character of the element, the other characters of the element are labeled I-XX, and non-element characters are all labeled O.
In the BiLSTM-CRF model, for the input character sequence, neuron features are first constructed by a bidirectional LSTM; these features are then combined and input to the CRF layer for label prediction. The entire model is divided into three major parts: 1) word vector representation: the input character string is represented as word vectors, i.e., the discrete input is converted into low-dimensional neuron input; 2) feature extraction: the word vectors are converted into neuron features by the bidirectional LSTM and a linear transformation; 3) entity labeling: the features are input to the CRF layer, and entity labels are obtained with the labeling module;
Word vector representation: the discrete Chinese character input is converted into low-dimensional neuron input through a neural representation layer; a lookup table is used, in which the vector representation of each Chinese character is stored. The initial values of the vectors can be initialized with random numbers, or trained in advance on a large unlabeled corpus with a tool. During model training, all values of the vectors serve as model parameters and are optimized together with the other parameters in the iterative process; given the character sequence of a Chinese sentence, the corresponding word vector representations are obtained by table lookup.
Feature extraction: based on the input word vector sequence, features o_1, o_2, ..., o_n are extracted by the bidirectional LSTM and a linear layer; these features will be used by the CRF entity labeling module. LSTM, the long short-term memory network, is a kind of recurrent neural network that can model natural language sentences well. The features extracted by the bidirectional LSTM over the sentence in the forward and reverse directions are concatenated to obtain the hidden-layer representation of each character:
h_i = [h_i(forward) ; h_i(backward)]
The label scores are then calculated by the following equation:
o_i = W · h_i + b
where W and b are model parameters. The above formula maps each character onto the labels, and the final sequence is composed of labels from the label set.
Entity labeling: finally, a CRF layer is used for decoding, so that the model learns the dependencies between labels. The decoding formula is as follows:
y* = argmax_{y ∈ Y_X} score(X, y), where score(X, y) = Σ_i (o_{i, y_i} + T_{y_{i-1}, y_i}) and T is the label transition matrix.
In parameter training, the loss value is calculated using the log-likelihood. The probability of the manually annotated sequence is:
p(y | X) = exp(score(X, y)) / Σ_{y' ∈ Y_X} exp(score(X, y'))
The loss value is:
Loss(θ, X, y) = -log p(y | X)
The optimization objective of training is to minimize this loss value.
An annotator based on partial learning:
The basic idea of partial learning is to convert the incompletely annotated sentences in the partially labeled data into multi-path annotated sentences, and to improve the above optimization objective of the CRF layer accordingly; the BiLSTM-CRF model is used as the basic model.
2. The method for completing simple event extraction using partial learning as described in claim 1, characterized in that the probability of a completely annotated sentence is defined as follows:
p(y | X) = exp(score(X, y)) / Σ_{y' ∈ Y_X} exp(score(X, y'))
where X is the word vector sequence of the input sentence and Y_X denotes the set of all legal label paths for X.
3. The method for completing simple event extraction using partial learning as described in claim 1, characterized in that the probability of an incompletely annotated sentence is defined as follows:
p(D | X) = Σ_{y ∈ D} p(y | X)
where D denotes the multi-path annotation sequence; that is, the conditional probability of one training example is the sum of the probabilities of all paths contained in the multi-path annotation. Parameter estimation can then be completed in the same way as in the baseline system.
4. The method for completing simple event extraction using partial learning as described in claim 1, characterized in that the loss function is defined as follows:
Loss(θ, X, D) = -log p(D | X).
5. The method for completing simple event extraction using partial learning as described in claim 1, characterized in that, specifically, if a character has no specified label in the partial annotation, the character's label is marked as UKN, indicating that all labels are possible.
6. The method for completing simple event extraction using partial learning as described in claim 1, characterized in that, on this basis, an optimization objective function based on multi-path annotated sentences is designed, so that the partially labeled data is used efficiently for model training.
7. The method for completing simple event extraction using partial learning as described in claim 1, characterized in that the model mainly has two states: training and prediction. During training, data with annotations must be input, and the model constantly updates its parameters; under the optimization objective, the output annotations are made as consistent as possible with the true values; this requires updating the parameters through continuous loop iterations so that the loss value in the above formula keeps decreasing, allowing the model to learn better parameters. The other state is prediction: the input during prediction is data without annotations, and the trained model is used; parameters are not updated in this process, and the output of the model is taken as the final prediction result.
8. A computer device comprising a memory, a processor, and a computer program stored on the memory and runnable on the processor, characterized in that the processor realizes the steps of the method of any one of claims 1 to 7 when executing the program.
9. A computer-readable storage medium on which a computer program is stored, characterized in that the program realizes the steps of the method of any one of claims 1 to 7 when executed by a processor.
10. A processor, characterized in that the processor is used for running a program, wherein the program executes the method of any one of claims 1 to 7 when run.
Priority Applications (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201910642480.4A | 2019-07-16 | 2019-07-16 | Method for completing simple event extraction using partial learning |
Publications (1)

| Publication Number | Publication Date |
|---|---|
| CN110348018A | 2019-10-18 |
Family
ID=68174810
Country Status (1)

| Country | Link |
|---|---|
| CN | CN110348018A (en) |
Legal Events

| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | RJ01 | Rejection of invention patent application after publication | Application publication date: 2019-10-18 |