CN109635280A - Event extraction method based on annotation - Google Patents
Event extraction method based on annotation
- Publication number
- CN109635280A, CN201811400437.9A
- Authority
- CN
- China
- Prior art keywords
- event
- annotation
- sentence
- entity
- corpus
- Prior art date
- 2018-11-22
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/049—Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
Abstract
The invention belongs to the field of information extraction technology and discloses an event extraction method based on annotation, which performs event extraction by combining an annotation scheme with a neural network. Data processing is first carried out to obtain annotation labels for the event entities; a neural network is then trained on the annotated data to obtain structured event extraction results. For an input sentence, the entity labels of an event are obtained directly through a neural-network-based joint entity recognition and event extraction model. Because the method obtains the entity labels of an event directly, it causes no error propagation and produces no redundant information, effectively reducing the error rate. Furthermore, the method builds its loss function by raising the weights of minority tag classes, which alleviates the tendency, caused by class imbalance, of the model to predict the majority class.
Description
Technical field
The invention belongs to the field of information extraction technology, and more particularly relates to an event extraction method based on annotation.
Background art
Information extraction technology extracts factual information of specified types, such as entities, relations, and events, from loose, unstructured plain text and outputs structured information. For example, in WeChat Reading, a reader's interest preferences can be obtained by extraction from the books the reader has read, and books relevant to those interests can be pushed. In the news field, event extraction lets a reader grasp, in the shortest time, the content a long news article expresses. Event extraction can be applied not only in the Internet field but also in other fields; in the medical field, for example, the disease event of a patient can be quickly determined from the diagnostic notes and the patient's description of symptoms, giving the patient a clearer understanding of the condition.
In the field of information extraction, event extraction (Event Extraction) is one of the most challenging tasks in information extraction research; it mainly studies how to extract structured event information from unstructured text. For a news event, for example, the time, place, and participants of the event are extracted to form a piece of structured text information.
Many event extraction methods currently exist. Some are based on traditional machine learning, such as the hidden Markov model (Hidden Markov Model, HMM) and the conditional random field (Conditional Random Field, CRF); others are deep learning methods based on neural networks, chiefly represented by the convolutional neural network (Convolutional Neural Network, CNN), the recurrent neural network (Recurrent Neural Network, RNN), and the long short-term memory network (Long Short-Term Memory, LSTM). Traditional machine learning methods use natural language processing tools to perform sentence segmentation, word segmentation, entity recognition, and syntactic and dependency analysis on the text, extract lexical and semantic features from the context of each candidate word, and construct feature vectors as the input of a classifier; the classifier predicts the trigger word of an event, and the event type is judged from the type of the trigger word. Neural-network-based deep learning methods first preprocess the raw training corpus; the event sentence, represented as a word-vector sequence, is fed into a neural network that is trained to obtain the semantic features of each candidate trigger word; the same word-vector sequence is fed into a convolutional neural network that is trained to obtain each candidate trigger word's global features over the sentence; finally, based on the semantic features of a candidate trigger word and its sentence-level global features, Softmax normalizes the prediction scores of each candidate trigger word into a classification result, and the event type is judged from the trigger word type.
Existing event extraction methods mainly suffer from two problems. First, event extraction is divided into two stages, entity extraction and event discrimination, so errors of named entity recognition affect event discrimination and accumulate. Second, for a specific field, a large number of manual features usually must be constructed; the feature selection process is very costly, and as model complexity grows, maintainability becomes worse and worse.
Summary of the invention
Aiming at the above defects or improvement needs of the prior art, the present invention provides an event extraction method based on annotation that extracts events and entities jointly; its object is to improve the event extraction method so as to reduce cumulative errors, and to replace manual feature selection with a neural network, reducing the labor cost of the feature selection process.
To achieve the above object, according to one aspect of the present invention, an event extraction method based on annotation is provided, comprising the following steps:
(1) Constructing a corpus: the content to be extracted is taken as corpus material, the set of all material forms the corpus for event extraction, and the corpus is segmented into sentences;
The sentences of the corpus are classified, and the sentences containing entities and events are filtered out;
Here an entity refers to a thing that exists objectively and can be distinguished from other things, and an event refers to a change of things or states occurring in a specific time period and composed of one or more actions in which one or more roles participate;
(2) Constructing the training set and test set: the corpus text is randomly divided into two data sets, a training set and a test set; in a preferred embodiment, the ratio of training-set text to test-set text is 4:1;
(3) Annotating the corpus: the entities in a sentence are annotated in the form {boundary position}-{event}-{entity}: the entity position information is one of {B (entity begin), I (entity inside), E (entity end), S (single-character entity)}, the event type is encoded according to the relation types predefined for the corpus, and the entity type information identifies the entity class; all other parts of the sentence are labeled "O";
(4) Constructing the neural network model: a Bi-LSTM (Bi-directional Long Short-Term Memory) is used as the neural network model; the Bi-LSTM is composed of two LSTM networks that are identical in structure but do not share weights;
The annotated corpus is input in the forward and reverse directions into the two LSTM networks respectively, producing forward and reverse feature vectors; the two feature vectors are concatenated into a contextual feature vector, the predicted tag probabilities are computed from the contextual feature vector, and the loss function is established from the tag probabilities;
(5) Training the neural network model: the network parameters are initialized, and the training data are input to optimize the parameters of the neural network model; the neural network model is evaluated on the test set, and when the extraction accuracy on the test set lies within a preset threshold interval, the neural network model is judged to have converged;
(6) Performing event prediction on the text to be extracted: the text to be extracted is input into the trained neural network model, and the tag of each character of the sequence is predicted; the text is spliced according to the tag predictions to obtain the structured event extraction result.
Preferably, in the above annotation-based event extraction method, in step (4), the method for obtaining the contextual feature vector of a sequence S comprises:
(4.1) For an input sequence $S = (x_1, x_2, \dots, x_n)$ of length $n$, the word vectors are input one by one into the forward LSTM network to obtain the forward feature vectors $\overrightarrow{h_t}$;
(4.2) For the input sequence $S = (x_1, x_2, \dots, x_n)$, the word vectors are input one by one into the reverse LSTM network to obtain the reverse feature vectors $\overleftarrow{h_t}$;
(4.3) The forward and reverse feature vectors are concatenated to obtain the contextual feature vector of the sequence S, $h_t = [\overrightarrow{h_t}; \overleftarrow{h_t}]$;
(4.4) Based on the above contextual feature vector, the predicted tag probability is computed through the normalization layer (Softmax) of the neural network:

$$y_t = W_y h_t + b_y, \qquad p_t^i = \frac{\exp(y_t^i)}{\sum_{k=1}^{N_t} \exp(y_t^k)}$$

where $W_y$ is the parameter matrix of the Softmax layer, $N_t$ is the number of all tags, $b_y$ is the bias of the linear layer, $i$ is the index of the tag, and $t$ is the index of the word in the sentence;
(4.5) The loss function is defined as:

$$L = \max \sum_{j=1}^{|D|} \sum_{t=1}^{L_j} L_{jt}, \qquad L_{jt} = \big( I(O) + \alpha \cdot (1 - I(O)) \big) \cdot \log p\big( y_t^{(j)} \mid x_j, \Theta \big)$$

where $|D|$ is the size of the training set; $L_j$ is the length of the $j$-th sentence and $L_{jt}$ is the cross-entropy term of the $t$-th word in the $j$-th sentence; $t$ is the index of the word within the sentence and $j$ is the index of the sentence within the training set; $y_t^{(j)}$ is the annotated tag of the $t$-th word in the $j$-th sentence and $p(y_t^{(j)} \mid x_j, \Theta)$ is the probability the model predicts for that tag; $I(O)$ is a switching function that equals 1 when the annotated tag is 'O' and 0 otherwise; $\alpha$ is the weight of the annotated tag classes: the larger the weight, the greater its influence on classification and the greater its influence on the model; 'O' is the label that step (3) assigns to the other parts of a sentence.
Unlike existing classification and sequence labeling approaches that use plain cross entropy as the loss function, this step assigns different weights to different event classes; by regulating the training weight of each event class, the influence of event class imbalance on the prediction result is effectively reduced.
Preferably, in the above annotation-based event extraction method, in step (1), to guarantee the generalization performance of the extraction model, some sentences containing no entity or event are also added to the corpus text, yielding a sentence-level corpus text; this increases the noise seen by the neural network model and improves the generalization ability of the neural network.
Preferably, in the above annotation-based event extraction method, in step (1), the proportion of sentences containing no event in the corpus text is at most 10%.
Preferably, in the above annotation-based event extraction method, the prediction results are stored in a data table, yielding structured text information.
In general, compared with the prior art, the above technical solutions conceived by the present invention can achieve the following beneficial effects:
(1) The annotation-based event extraction method provided by the invention performs event extraction by combining annotation with a neural network: data processing first produces the annotation labels of the event entities, a neural network is then trained on the annotated data, and structured event extraction results are obtained; for an input sentence, the entity labels of an event are obtained directly through the neural-network-based joint entity recognition and event extraction model;
The prior art extracts with a pipeline method: named entity recognition is first performed on the input sentence, the recognized entities and events are combined pairwise, relation classification is then carried out, and finally the sentences in which an event-entity relation exists are taken as input. This pipeline mode lets the errors of the entity recognition module affect the relation classification performance; it also ignores the relationship that exists between the two subtasks, and because the recognized events and entities are matched pairwise before classification, the unrelated entity pairs introduce redundant information and raise the error rate;
In comparison, because the method provided by the invention can obtain the entity labels of an event directly, it causes no error propagation and no redundancy, effectively reducing the error rate;
(2) The annotation-based event extraction method provided by the invention obtains its loss function by raising the weights of minority classes, which alleviates the tendency, caused by class imbalance, of the model to predict the majority class. Class imbalance is a common problem in natural language processing; the prior art enlarges or shrinks the data set, which suffers from information loss. In comparison, this method of the invention alleviates the class imbalance problem better;
(3) The annotation-based event extraction method provided by the invention uses a neural network structure that is simple to apply.
Brief description of the drawings
Fig. 1 is a schematic diagram of the corpus construction process in one embodiment of the annotation-based event extraction method provided by the invention;
Fig. 2 is a schematic diagram of the annotated corpus in one embodiment of the annotation-based event extraction method provided by the invention;
Fig. 3 is a schematic diagram of the structure of the Bi-LSTM in one embodiment of the annotation-based event extraction method provided by the invention.
Specific embodiment
To make the objectives, technical solutions, and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here only explain the present invention and are not intended to limit it. In addition, the technical features involved in the various embodiments of the present invention described below can be combined with each other as long as they do not conflict.
The annotation-based event extraction method provided by the invention comprises: obtaining a corpus, annotating the obtained corpus, and preprocessing it before input to the neural network model; setting the neural network parameters, obtaining effective features with the neural network, and continuously optimizing the parameters during training so that the generalization ability of the model keeps strengthening; and obtaining the prediction probabilities with the normalization function Softmax and storing the prediction results in a structured data table. The method is further described below with reference to Figs. 1 to 3 and an embodiment.
The annotation-based event extraction method provided by the embodiment is applied to news event extraction and comprises the following steps:
(1) Corpus construction: corpus material is obtained from news websites, and the acquired news text is segmented into sentences at the newline symbol "\n" and the period "。"; information useless to event extraction, such as the author and the source of a report, is removed. The sentences in the news corpus are then classified to filter out those containing entities and events. In a preferred embodiment, 10% of sentences containing no event are added to increase the noise seen by the neural network model and improve its generalization ability; a minimal sketch of the segmentation step is given below.
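The following Python sketch illustrates the segmentation step; the byline and source filter is an assumption, since the embodiment does not specify how useless information is detected.

```python
import re

def segment_corpus(raw_text):
    """Split raw news text into candidate sentences at newlines and the
    Chinese full stop, as described in step (1)."""
    sentences = []
    for piece in re.split(r"[\n。]", raw_text):
        piece = piece.strip()
        if not piece:
            continue
        # Hypothetical rule for dropping bylines and source lines; the
        # patent only says such "useless information" is removed.
        if piece.startswith(("记者", "来源")):
            continue
        sentences.append(piece)
    return sentences
```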
Because the sentences of a news article are closely connected, news corpora commonly contain references whose referent is not explicit, such as the referring nouns "the company", "said company", and "it"; in a preferred embodiment these referring nouns are converted to the actual entities. For example, in "On December 20, 2017, ** Technology completed a Round A financing of RMB 100 million; this is the company's first financing", "the company" is converted to "** Technology". In the embodiment, actual company names are replaced with one or more "*" characters.
(2) Constructing the training set and test set: the screened sentences are randomly divided at a ratio of 4:1 into two data sets, used as the training set and the test set respectively, for example as sketched below.
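A small sketch of the random 4:1 split, assuming the labeled sentences are held in a Python list; the fixed seed is illustrative.

```python
import random

def split_corpus(sentences, train_ratio=0.8, seed=42):
    """Randomly divide labeled sentences into a training set and a test
    set at the 4:1 ratio used in the embodiment."""
    rng = random.Random(seed)
    shuffled = list(sentences)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]
```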
(3) Corpus annotation:
(3.1) Determine the event types, such as "financing", "investment", "marriage";
(3.2) Determine the entity classes, such as "time", "person name", "company name", "organization name", "amount", "round";
(3.3) Annotate each sentence by the tagging method corresponding to each word; apart from noise text, a sentence must contain an event and at least one entity.
Referring to Fig. 2: the sentence "News on July 23: **** Education announced it has obtained RMB 130 million in Round A financing." contains four kinds of entities: the time is "July 23", the company name is "**** Education", the amount is "RMB 130 million", and the round is "Round A"; the event it contains is "financing".
When an entity is annotated, its position, event name, and entity type are marked. In the embodiment, "**** Education" is labeled "B-RZ-GS I-RZ-GS I-RZ-GS E-RZ-GS", "July 23" is labeled "B-RZ-SJ I-RZ-SJ I-RZ-SJ E-RZ-SJ", "RMB 130 million" is labeled "B-RZ-JE I-RZ-JE I-RZ-JE I-RZ-JE E-RZ-JE", and finally "Round A" is labeled "B-RZ-LC E-RZ-LC"; all other content is labeled "O". The event type in the example is financing, denoted "RZ" in the tags; the entity types involved are time, company name, amount, and round, denoted "SJ", "GS", "JE", "LC" respectively when annotating; the entity position is denoted with {B, I, E, S}. A sketch of this labeling rule follows.
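The labeling rule can be written compactly as below; the helper names are hypothetical, and entity spans are assumed to be given as character offsets.

```python
def bies_tags(entity_len, event, ent_type):
    """Generate the composite {B,I,E,S}-{event}-{entity type} labels of
    step (3), e.g. bies_tags(4, "RZ", "GS") returns
    ["B-RZ-GS", "I-RZ-GS", "I-RZ-GS", "E-RZ-GS"]."""
    if entity_len == 1:
        return [f"S-{event}-{ent_type}"]
    inside = [f"I-{event}-{ent_type}"] * (entity_len - 2)
    return [f"B-{event}-{ent_type}"] + inside + [f"E-{event}-{ent_type}"]

def label_sentence(sentence, entities):
    """Tag every character of a sentence; `entities` is a list of
    (start, end_exclusive, event, ent_type) spans, and every character
    outside a span receives the "O" tag."""
    tags = ["O"] * len(sentence)
    for start, end, event, ent_type in entities:
        tags[start:end] = bies_tags(end - start, event, ent_type)
    return list(zip(sentence, tags))
```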
(4) Constructing the neural network structure model:
The embodiment uses a Bi-LSTM network. A sentence $X = (w_1, w_2, \dots, w_t, \dots, w_n)$ of length $n$ serves as the input of the neural network, where $w_t$ is the $t$-th word of the sentence and the corresponding word-vector sequence is $X = (x_1, x_2, \dots, x_n)$. The word vectors of the sentence are input one by one into a single-layer long short-term memory network to obtain each word $w_t$'s preceding-context feature vector $\overrightarrow{h_t}$. In this embodiment the LSTM network computes over the input word vectors as follows:

$$i_t = \delta(W_{xi} x_t + W_{hi} h_{t-1} + b_i)$$
$$f_t = \delta(W_{xf} x_t + W_{hf} h_{t-1} + b_f)$$
$$o_t = \delta(W_{xo} x_t + W_{ho} h_{t-1} + b_o)$$
$$c_t = f_t \odot c_{t-1} + i_t \odot \tanh(W_{xc} x_t + W_{hc} h_{t-1} + b_c)$$
$$h_t = o_t \odot \tanh(c_t)$$
where $\delta$ is the sigmoid function and $W_{(\cdot)}$ and $b_{(\cdot)}$ are parameters of the neural network. In the same way, the sentence is input in reverse into another long short-term memory network, computing each word $w_t$'s following-context feature vector $\overleftarrow{h_t}$.
Here the first LSTM network is the forward LSTM and the second is the backward LSTM. The two LSTMs are trained in the same way, and their parameters $W_{(\cdot)}$ and $b_{(\cdot)}$ carry the same meanings, but the parameters of the neurons are not shared, so the values of $W_{(\cdot)}$ and $b_{(\cdot)}$ differ between the two networks. For an input word vector $x_t$, the forward LSTM and the backward LSTM respectively yield the feature vectors $\overrightarrow{h_t}$ and $\overleftarrow{h_t}$; the forward and reverse feature vectors output by the two LSTMs are concatenated to obtain the word $w_t$'s contextual feature vector $h_t = [\overrightarrow{h_t}; \overleftarrow{h_t}]$.
Based on the contextual feature vector $h_t$, the predicted tag probabilities are computed by the Softmax layer:

$$y_t = W_y h_t + b_y, \qquad p_t^i = \frac{\exp(y_t^i)}{\sum_{k=1}^{N_t} \exp(y_t^k)}$$

where $W_y$ is the parameter matrix of the Softmax layer, $N_t$ is the number of all tags, $b_y$ is the bias of the linear layer, $i$ is the index of the tag, and $t$ is the index of the word in the sentence. A model sketch is given below.
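A minimal PyTorch sketch of the Bi-LSTM tagger described in this step, assuming character-level inputs; the embedding and hidden sizes are illustrative, not values disclosed by the patent.

```python
import torch
import torch.nn as nn

class BiLSTMTagger(nn.Module):
    """Character embeddings feed a bidirectional LSTM; the forward and
    backward hidden states are concatenated into the contextual feature
    h_t, and a linear layer plus softmax yields per-character tag
    probabilities (y_t = W_y h_t + b_y)."""

    def __init__(self, vocab_size, num_tags, emb_dim=128, hidden_dim=128):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        # bidirectional=True runs two LSTMs with unshared weights,
        # matching the forward/backward networks described above.
        self.bilstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True,
                              bidirectional=True)
        self.linear = nn.Linear(2 * hidden_dim, num_tags)

    def forward(self, token_ids):
        x = self.embedding(token_ids)   # (batch, seq, emb_dim)
        h, _ = self.bilstm(x)           # (batch, seq, 2 * hidden_dim)
        logits = self.linear(h)         # (batch, seq, num_tags)
        return torch.log_softmax(logits, dim=-1)
```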
The objective function, i.e. the loss function, is defined as:

$$L = \max \sum_{j=1}^{|D|} \sum_{t=1}^{L_j} L_{jt}, \qquad L_{jt} = \big( I(O) + \alpha \cdot (1 - I(O)) \big) \cdot \log p\big( y_t^{(j)} \mid x_j, \Theta \big)$$

where $|D|$ is the size of the training set; $L_j$ is the length of the $j$-th sentence and $L_{jt}$ is the cross-entropy term of the $t$-th word in the $j$-th sentence; $t$ is the index of the word within the sentence and $j$ is the index of the sentence within the training set; $y_t^{(j)}$ is the annotated tag of the $t$-th word in the $j$-th sentence and $p(y_t^{(j)} \mid x_j, \Theta)$ is the probability predicted for that tag; $I(O)$ is a switching function that equals 1 when the annotated tag is 'O' and 0 otherwise; $\alpha$ is the weight of the annotated tag classes: the larger the weight, the greater its influence on classification and on the model; and O is the label of the "other" class. In this step, the class imbalance problem of the training data is alleviated by assigning different weights to the classes, as sketched below.
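A sketch of the weighted objective, continuing the PyTorch model above; tokens whose gold tag is not "O" are up-weighted, and alpha = 5.0 is an assumed value, since the patent does not disclose a concrete α.

```python
import torch

def weighted_tag_loss(log_probs, gold_tags, o_tag_id, alpha=5.0):
    """Weighted cross entropy over per-character tag predictions.

    log_probs: (batch, seq, num_tags) log-probabilities from the tagger.
    gold_tags: (batch, seq) integer tag ids.
    """
    # Negative log-likelihood of the gold tag at each position.
    nll = -log_probs.gather(-1, gold_tags.unsqueeze(-1)).squeeze(-1)
    # Weight 1 for 'O' positions, alpha for entity-bearing positions.
    weights = torch.where(gold_tags == o_tag_id,
                          torch.ones_like(nll),
                          torch.full_like(nll, alpha))
    return (weights * nll).mean()
```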
(5) Training the neural network model:
The parameters of the neural network are initialized randomly: the embedding weights and the parameters of the LSTM and the linear layer all receive random initial values. Different combinations of dropout rate, hidden-layer size, learning rate, and other hyperparameters are tried, and the training results are observed to obtain an optimized parameter combination; a compact training-loop sketch follows.
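A compact training-loop sketch under the same assumptions as above; train_loader and tag_vocab are hypothetical stand-ins for the batched annotated corpus and the tag inventory, and the learning rate and epoch count are illustrative.

```python
import torch

# Assumes BiLSTMTagger and weighted_tag_loss from the sketches above,
# a DataLoader `train_loader` yielding padded (token_ids, gold_tags)
# batches, and a dict `tag_vocab` mapping tag strings to ids.
model = BiLSTMTagger(vocab_size=5000, num_tags=len(tag_vocab))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for epoch in range(20):
    for token_ids, gold_tags in train_loader:
        optimizer.zero_grad()
        log_probs = model(token_ids)
        loss = weighted_tag_loss(log_probs, gold_tags,
                                 o_tag_id=tag_vocab["O"])
        loss.backward()
        optimizer.step()
    # After each epoch, evaluate extraction accuracy on the test set and
    # stop once it lies within the preset threshold interval (step 5).
```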
(6) Event prediction on the text to be extracted: the text to be extracted is input into the trained neural network to obtain the annotation result for each character of the text. The meanings represented by the labels are then read out to form pieces of text information, and these pieces are spliced into a text sentence, yielding structured text information; alternatively, they are stored separately in a data structure.
In a preferred embodiment, the extracted results are stored in a data table; the corresponding entities, events, and other information are written into the table, through which the final result of event extraction can be identified at a glance. A decoding sketch that assembles per-character tags into records is given below.
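The splicing of per-character tag predictions into structured records can be sketched as follows; the function name and the record format are assumptions.

```python
def decode_entities(chars, tags):
    """Assemble predicted per-character tags into (event, entity_type,
    text) triples, e.g. the tags B-RZ-SJ / I-RZ-SJ / I-RZ-SJ / E-RZ-SJ
    over "7月23日" yield ("RZ", "SJ", "7月23日")."""
    records, buf, current = [], [], None
    for ch, tag in zip(chars, tags):
        if tag == "O":
            buf, current = [], None
            continue
        boundary, event, ent_type = tag.split("-")
        if boundary in ("B", "S"):
            buf, current = [ch], (event, ent_type)
        elif current == (event, ent_type):
            buf.append(ch)   # I or E continuing the open entity
        if boundary in ("E", "S") and current is not None:
            records.append((event, ent_type, "".join(buf)))
            buf, current = [], None
    return records
```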
Referring to Table 1 below, the results of news event extraction are displayed in a data table; the basic content of a news event can be understood very clearly through the table.
Table 1. News event extraction result data table

Company name | Time | Event | Round | Amount
---|---|---|---|---
** Technology | 2017-12-20 | Financing | Round A | RMB 100 million
**** Education | July 23 | Financing | Round A | RMB 130 million
* Xiaomei | 2017-08-13 | Financing | … | RMB 30 million
In the annotation-based event extraction method provided by the invention, data processing first produces the annotation labels of the event entities; a neural network is then trained on the annotated data, and the structured event extraction results are finally obtained. A new annotation scheme is proposed and combined with a neural network method to perform event extraction; a corpus of the news domain is constructed, and the task of event extraction is converted into a classification and sequence labeling task, ultimately into a task similar to named entity recognition (NER). On the other hand, through the treatment of the loss function in this method, regulating the training weight of each event class against event class imbalance, the influence of class imbalance on the prediction result can be effectively reduced.
Those skilled in the art will readily understand that the foregoing is merely a description of preferred embodiments of the present invention and is not intended to limit it; any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall be included within the protection scope of the present invention.
Claims (6)
1. An event extraction method based on annotation, characterized by comprising the following steps:
(1) Constructing a corpus: taking the content to be extracted as corpus material, the set of all material forming the corpus for event extraction, and segmenting the corpus into sentences; classifying the sentences of the corpus and filtering out the sentences containing entities and events;
(2) Randomly dividing the corpus text into two data sets, a training set and a test set, respectively;
(3) Annotating the corpus: entities in a sentence are annotated in the form {boundary position}-{event}-{entity}: the entity position information is one of {B (entity begin), I (entity inside), E (entity end), S (single-character entity)}, the event type is encoded according to the relation types predefined for the corpus, and the entity type information identifies the entity class; all other parts of the sentence are marked with the label "O";
(4) Using a Bi-LSTM as the neural network model, inputting the annotated corpus in the forward and reverse directions into the two LSTM networks of the Bi-LSTM respectively, obtaining forward and reverse feature vectors, and concatenating the two feature vectors into a contextual feature vector; computing the predicted tag probabilities from the contextual feature vector, and establishing the loss function from the tag probabilities;
(5) Initializing the network parameters and inputting the training data to optimize the parameters of the neural network model; evaluating the neural network model on the test set, and judging that the neural network model has converged when the extraction accuracy on the test set lies within a preset threshold interval;
(6) Inputting the text to be extracted into the trained neural network model and predicting the tag of each character of the sequence; splicing the text according to the tag predictions to obtain the structured event extraction result.
2. The event extraction method based on annotation according to claim 1, characterized in that the entity refers to a thing that exists objectively and can be distinguished from other things, and an event refers to a change of things or states occurring in a specific time period and composed of one or more actions in which one or more roles participate.
3. The event extraction method based on annotation according to claim 1 or 2, characterized in that, in step (4), the method for obtaining the contextual feature vector comprises:
(4.1) for an input sequence $S = (x_1, x_2, \dots, x_j, \dots, x_n)$ of length $n$, inputting the word vectors one by one into the forward LSTM network to obtain the forward feature vectors $\overrightarrow{h_t}$;
(4.2) for the input sequence $S = (x_1, x_2, \dots, x_j, \dots, x_n)$, inputting the word vectors one by one into the reverse LSTM network to obtain the reverse feature vectors $\overleftarrow{h_t}$;
(4.3) concatenating the forward and reverse feature vectors to obtain the contextual feature vector of the sequence S, $h_t = [\overrightarrow{h_t}; \overleftarrow{h_t}]$;
(4.4) based on the contextual feature vector, computing the predicted tag probability through the normalization layer (Softmax) of the neural network:

$$y_t = W_y h_t + b_y, \qquad p_t^i = \frac{\exp(y_t^i)}{\sum_{k=1}^{N_t} \exp(y_t^k)}$$

wherein $W_y$ is the parameter matrix of the Softmax layer, $N_t$ is the number of all tags, $b_y$ is the bias of the linear layer, and $i$ is the index of the tag;
(4.5) defining the loss function as:

$$L = \max \sum_{j=1}^{|D|} \sum_{t=1}^{L_j} L_{jt}, \qquad L_{jt} = \big( I(O) + \alpha \cdot (1 - I(O)) \big) \cdot \log p\big( y_t^{(j)} \mid x_j, \Theta \big)$$

wherein $|D|$ is the size of the training set, $L_j$ is the length of the sentence, $y_t^{(j)}$ denotes the tag of the $t$-th word in the $j$-th sentence, $p(y_t^{(j)} \mid x_j, \Theta)$ denotes the predicted probability of that tag, $\alpha$ is the weight of the annotated tag classes, and $I(O)$ is a switching function expressing the relationship between the objective function and the tag 'O', the label with which step (3) marks the other parts of a sentence.
4. The event extraction method based on annotation according to claim 1 or 2, characterized in that, in step (1), some sentences containing no entity or event are also added to the corpus text to obtain a sentence-level corpus text, so as to increase the noise seen by the neural network model and improve the generalization ability of the neural network.
5. The event extraction method based on annotation according to claim 4, characterized in that, in step (1), the proportion of sentences containing no event in the corpus text is at most 10%.
6. The event extraction method based on annotation according to claim 1 or 2, characterized in that the prediction results are stored in a data table to obtain structured text information.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811400437.9A CN109635280A (en) | 2018-11-22 | 2018-11-22 | Event extraction method based on annotation
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811400437.9A CN109635280A (en) | 2018-11-22 | 2018-11-22 | Event extraction method based on annotation
Publications (1)
Publication Number | Publication Date |
---|---|
CN109635280A true CN109635280A (en) | 2019-04-16 |
Family
ID=66068948
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811400437.9A Pending CN109635280A (en) | Event extraction method based on annotation | 2018-11-22 | 2018-11-22
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109635280A (en) |
-
2018
- 2018-11-22 CN CN201811400437.9A patent/CN109635280A/en active Pending
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106095928A (en) * | 2016-06-12 | 2016-11-09 | 国家计算机网络与信息安全管理中心 | A kind of event type recognition methods and device |
CN107122416A (en) * | 2017-03-31 | 2017-09-01 | 北京大学 | A kind of Chinese event abstracting method |
CN107797993A (en) * | 2017-11-13 | 2018-03-13 | 成都蓝景信息技术有限公司 | A kind of event extraction method based on sequence labelling |
CN108304911A (en) * | 2018-01-09 | 2018-07-20 | 中国科学院自动化研究所 | Knowledge Extraction Method and system based on Memory Neural Networks and equipment |
CN108519976A (en) * | 2018-04-04 | 2018-09-11 | 郑州大学 | The method for generating extensive sentiment dictionary based on neural network |
Non-Patent Citations (1)
Title |
---|
QIAN ZHONG ET AL.: "Uncertainty and negation scope detection based on bidirectional LSTM networks" (基于双向LSTM网络的不确定和否定作用范围识别), 《软件学报》 (Journal of Software) *
Cited By (46)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110060247A (en) * | 2019-04-18 | 2019-07-26 | 深圳市深视创新科技有限公司 | Cope with the robust deep neural network learning method of sample marking error |
CN110134959B (en) * | 2019-05-15 | 2023-10-20 | 第四范式(北京)技术有限公司 | Named entity recognition model training method and equipment, and information extraction method and equipment |
CN110134959A (en) * | 2019-05-15 | 2019-08-16 | 第四范式(北京)技术有限公司 | Named Entity Extraction Model training method and equipment, information extraction method and equipment |
CN110209721A (en) * | 2019-06-04 | 2019-09-06 | 南方科技大学 | Judgement document transfers method, apparatus, server and storage medium |
CN110489514A (en) * | 2019-07-23 | 2019-11-22 | 成都数联铭品科技有限公司 | Promote system and method, the event extraction method and system of event extraction annotating efficiency |
CN110489514B (en) * | 2019-07-23 | 2023-05-23 | 成都数联铭品科技有限公司 | System and method for improving event extraction labeling efficiency, event extraction method and system |
CN110569506A (en) * | 2019-09-05 | 2019-12-13 | 清华大学 | Medical named entity recognition method based on medical dictionary |
CN110765265A (en) * | 2019-09-06 | 2020-02-07 | 平安科技(深圳)有限公司 | Information classification extraction method and device, computer equipment and storage medium |
CN110765265B (en) * | 2019-09-06 | 2023-04-11 | 平安科技(深圳)有限公司 | Information classification extraction method and device, computer equipment and storage medium |
CN111159336A (en) * | 2019-12-20 | 2020-05-15 | 银江股份有限公司 | Semi-supervised judicial entity and event combined extraction method |
CN111159336B (en) * | 2019-12-20 | 2023-09-12 | 银江技术股份有限公司 | Semi-supervised judicial entity and event combined extraction method |
CN111353306A (en) * | 2020-02-22 | 2020-06-30 | 杭州电子科技大学 | Entity relationship and dependency Tree-LSTM-based combined event extraction method |
CN113392967A (en) * | 2020-03-11 | 2021-09-14 | 富士通株式会社 | Training method of domain confrontation neural network |
US11880397B2 (en) | 2020-03-20 | 2024-01-23 | Beijing Baidu Netcom Science Technology Co., Ltd. | Event argument extraction method, event argument extraction apparatus and electronic device |
CN111414482B (en) * | 2020-03-20 | 2024-02-20 | 北京百度网讯科技有限公司 | Event argument extraction method and device and electronic equipment |
CN111325020A (en) * | 2020-03-20 | 2020-06-23 | 北京百度网讯科技有限公司 | Event argument extraction method and device and electronic equipment |
CN111414482A (en) * | 2020-03-20 | 2020-07-14 | 北京百度网讯科技有限公司 | Event argument extraction method and device and electronic equipment |
CN111325020B (en) * | 2020-03-20 | 2023-03-31 | 北京百度网讯科技有限公司 | Event argument extraction method and device and electronic equipment |
CN111460831B (en) * | 2020-03-27 | 2024-04-19 | 科大讯飞股份有限公司 | Event determination method, related device and readable storage medium |
CN111460831A (en) * | 2020-03-27 | 2020-07-28 | 科大讯飞股份有限公司 | Event determination method, related device and readable storage medium |
CN111581387A (en) * | 2020-05-09 | 2020-08-25 | 电子科技大学 | Entity relation joint extraction method based on loss optimization |
CN111581387B (en) * | 2020-05-09 | 2022-10-11 | 电子科技大学 | Entity relation joint extraction method based on loss optimization |
CN111694924B (en) * | 2020-06-17 | 2023-05-26 | 合肥中科类脑智能技术有限公司 | Event extraction method and system |
CN111694924A (en) * | 2020-06-17 | 2020-09-22 | 合肥中科类脑智能技术有限公司 | Event extraction method and system |
CN111985237A (en) * | 2020-06-29 | 2020-11-24 | 联想(北京)有限公司 | Entity extraction method, device and equipment |
CN111881303A (en) * | 2020-07-28 | 2020-11-03 | 内蒙古众城信息科技有限公司 | Graph network structure method for classifying urban heterogeneous nodes |
WO2021139239A1 (en) * | 2020-07-28 | 2021-07-15 | 平安科技(深圳)有限公司 | Mechanism entity extraction method, system and device based on multiple training targets |
CN111859903A (en) * | 2020-07-30 | 2020-10-30 | 苏州思必驰信息科技有限公司 | Event co-fingering model training method and event co-fingering resolution method |
CN111859903B (en) * | 2020-07-30 | 2024-01-12 | 思必驰科技股份有限公司 | Event same-index model training method and event same-index resolution method |
CN111950199A (en) * | 2020-08-11 | 2020-11-17 | 杭州叙简科技股份有限公司 | Earthquake data structured automation method based on earthquake news event |
CN112052646A (en) * | 2020-08-27 | 2020-12-08 | 安徽聚戎科技信息咨询有限公司 | Text data labeling method |
CN112052646B (en) * | 2020-08-27 | 2024-03-29 | 安徽聚戎科技信息咨询有限公司 | Text data labeling method |
CN112036183A (en) * | 2020-08-31 | 2020-12-04 | 湖南星汉数智科技有限公司 | Word segmentation method and device based on BilSTM network model and CRF model, computer device and computer storage medium |
CN112036183B (en) * | 2020-08-31 | 2024-02-02 | 湖南星汉数智科技有限公司 | Word segmentation method, device, computer device and computer storage medium based on BiLSTM network model and CRF model |
CN112269901A (en) * | 2020-09-14 | 2021-01-26 | 合肥中科类脑智能技术有限公司 | Fault distinguishing and reasoning method based on knowledge graph |
CN112269901B (en) * | 2020-09-14 | 2021-11-05 | 合肥中科类脑智能技术有限公司 | Fault distinguishing and reasoning method based on knowledge graph |
CN112257441A (en) * | 2020-09-15 | 2021-01-22 | 浙江大学 | Named entity identification enhancement method based on counterfactual generation |
CN112257441B (en) * | 2020-09-15 | 2024-04-05 | 浙江大学 | Named entity recognition enhancement method based on counterfactual generation |
CN112269949B (en) * | 2020-10-19 | 2023-09-22 | 杭州叙简科技股份有限公司 | Information structuring method based on accident disaster news |
CN112269949A (en) * | 2020-10-19 | 2021-01-26 | 杭州叙简科技股份有限公司 | Information structuring method based on accident disaster news |
CN112257449B (en) * | 2020-11-13 | 2023-01-03 | 腾讯科技(深圳)有限公司 | Named entity recognition method and device, computer equipment and storage medium |
CN112257449A (en) * | 2020-11-13 | 2021-01-22 | 腾讯科技(深圳)有限公司 | Named entity recognition method and device, computer equipment and storage medium |
CN113157978B (en) * | 2021-01-15 | 2023-03-28 | 浪潮云信息技术股份公司 | Data label establishing method and device |
CN113157978A (en) * | 2021-01-15 | 2021-07-23 | 浪潮云信息技术股份公司 | Data label establishing method and device |
CN116245139B (en) * | 2023-04-23 | 2023-07-07 | 中国人民解放军国防科技大学 | Training method and device for graph neural network model, event detection method and device |
CN116245139A (en) * | 2023-04-23 | 2023-06-09 | 中国人民解放军国防科技大学 | Training method and device for graph neural network model, event detection method and device |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109635280A (en) | Event extraction method based on annotation | |
Er et al. | Attention pooling-based convolutional neural network for sentence modelling | |
Swathi et al. | An optimal deep learning-based LSTM for stock price prediction using twitter sentiment analysis | |
CN109460473B (en) | Electronic medical record multi-label classification method based on symptom extraction and feature representation | |
CN108090049B (en) | Multi-document abstract automatic extraction method and system based on sentence vectors | |
Jang et al. | Recurrent neural network-based semantic variational autoencoder for sequence-to-sequence learning | |
CN104408153B (en) | A kind of short text Hash learning method based on more granularity topic models | |
CN111738003B (en) | Named entity recognition model training method, named entity recognition method and medium | |
Dashtipour et al. | Exploiting deep learning for Persian sentiment analysis | |
CN109325112B (en) | A kind of across language sentiment analysis method and apparatus based on emoji | |
CN107608956A (en) | A kind of reader's mood forecast of distribution algorithm based on CNN GRNN | |
CN112434535B (en) | Element extraction method, device, equipment and storage medium based on multiple models | |
CN111966917A (en) | Event detection and summarization method based on pre-training language model | |
CN113254610B (en) | Multi-round conversation generation method for patent consultation | |
Firdaus et al. | A multi-task hierarchical approach for intent detection and slot filling | |
CN111222318A (en) | Trigger word recognition method based on two-channel bidirectional LSTM-CRF network | |
CN110046356A (en) | Label is embedded in the application study in the classification of microblogging text mood multi-tag | |
CN116127056A (en) | Medical dialogue abstracting method with multi-level characteristic enhancement | |
CN112966117A (en) | Entity linking method | |
CN114936266A (en) | Multi-modal fusion rumor early detection method and system based on gating mechanism | |
CN111145914B (en) | Method and device for determining text entity of lung cancer clinical disease seed bank | |
Cao et al. | Knowledge guided short-text classification for healthcare applications | |
CN115906835B (en) | Chinese question text representation learning method based on clustering and contrast learning | |
He et al. | Distant supervised relation extraction via long short term memory networks with sentence embedding | |
CN113901172A (en) | Case-related microblog evaluation object extraction method based on keyword structure codes |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||