CN106599032B - Text event extraction method combining sparse coding and a structured perceptron - Google Patents
- Publication number
- CN106599032B CN106599032B CN201610955220.9A CN201610955220A CN106599032B CN 106599032 B CN106599032 B CN 106599032B CN 201610955220 A CN201610955220 A CN 201610955220A CN 106599032 B CN106599032 B CN 106599032B
- Authority
- CN
- China
- Prior art keywords
- event
- word
- text
- training
- extracting
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
- G06F16/31—Indexing; Data structures therefor; Storage structures
- G06F16/313—Selection or weighting of terms for indexing
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/211—Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
Abstract
The invention discloses a text event extraction method combining sparse coding and a structured perceptron. The method comprises the following steps: 1) labeling text data and constructing training samples according to the ACE (Automatic Content Extraction) or Rich ERE standard; 2) taking extracted entities as candidates for event trigger words and event parameters, and extracting text features; 3) further extracting distributed word vector features of the text and learning sparse coding features; 4) training a structured perceptron classifier with the training samples and the extracted text features to identify event-related trigger words and parameters in the text; 5) feeding new text data, after the processing of step 1), into the structured perceptron classifier to extract text event information. The method strengthens the text features with a sparse coding representation of neural-network-based distributed word vector features, and at the same time uses the structured perceptron model to jointly learn the recognition of event trigger words and event participants, thereby obtaining a better event extraction effect.
Description
Technical Field
The invention relates to event extraction, and in particular to a text event extraction method combining sparse coding and a structured perceptron.
Background
An event is something that occurs or happens; it involves entities (people, items, etc.) that participate in or are affected by it, as well as aspects of time and space. Understanding the events described in text data is very important, and event extraction is often a key component of applications such as machine reading, news summarization, information retrieval, and knowledge base construction.
Generally, the goal of the event extraction task is to extract event-related trigger words and participants (people or things) from text. Current state-of-the-art methods for event extraction generally involve three steps: first, entities such as persons, organizations, and locations are extracted from the text with a pre-trained named entity recognition tool; then trigger word recognition and classification, and event parameter recognition and classification, are completed step by step. An obvious drawback of this pipelined approach is that errors made upstream accumulate and propagate downstream, and downstream steps cannot correct upstream mistakes. Research has therefore turned to casting event extraction as a structured prediction problem, so that event trigger words and parameters are identified and classified simultaneously; the idea is similar to methods used in other common natural language processing tasks such as POS tagging and chunking.
As in other machine learning applications, natural language processing tasks usually require extracting text features for model training and testing. These features can largely be divided into two categories: lexical features and contextual features. Lexical features mainly refer to part-of-speech tags, entity information, morphology (stems, nominalized verb forms), and the like, and capture the semantic information of words. Contextual features mainly refer to features from syntactic dependency analysis and semantic role labeling, which preserve the grammatical and semantic structure of the text. However, most such features require manual intervention, and the extraction process is time-consuming and not general. In recent years, neural networks and deep learning have become research hotspots, and unsupervised distributed word vector methods are increasingly common in NLP. Learning distributed word vectors is simple and general and needs no manually labeled data, but the resulting vectors lack the interpretability and flexibility of the sparse feature representations common in traditional NLP. This motivates research on converting distributed word vectors into a sparse representation that is convenient to use in traditional NLP problems.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a text event extraction method combining sparse coding and a structured perceptron.
The text event extraction method combining sparse coding and a structured perceptron comprises the following steps:
1) labeling text data and constructing training samples according to the Automatic Content Extraction and/or Rich ERE event specification;
2) taking the extracted entity as a candidate entity of an event trigger word and an event parameter, and extracting text characteristics;
3) further extracting text distributed word vector characteristics and learning sparse coding characteristics;
4) training a structure perceptron classifier by using training samples and extracted text characteristics, and identifying trigger words and parameters related to events in the text;
5) feeding new text data, after the processing of step 1), into the structured perceptron classifier, and extracting text event information.
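The five steps above can be sketched as a minimal pipeline. This is an illustrative skeleton only: every function name and the trivial bodies are assumptions, standing in for the annotation tools, feature extractors, and perceptron described in the embodiments below.

```python
# Minimal sketch of the five-step pipeline; all names and bodies are
# illustrative stand-ins, not the patent's actual implementation.

def build_training_samples(docs):
    # Step 1: label text data per the ACE / Rich ERE specification.
    return [{"sentence": d, "entities": [], "triggers": [], "params": []}
            for d in docs]

def extract_text_features(sample):
    # Step 2: entities become trigger/parameter candidates; extract
    # lexical features (stems, POS tags, dependencies, ...).
    return {"tokens": sample["sentence"].split()}

def sparse_code_features(features):
    # Step 3: distributed word vectors -> sparse codes (placeholder).
    return {tok: [1.0] for tok in features["tokens"]}

def train_structured_perceptron(samples):
    # Step 4: jointly learn trigger and parameter recognition.
    feats = [sparse_code_features(extract_text_features(s)) for s in samples]
    return {"weights": feats}  # stand-in for the learned model

def extract_events(model, new_doc):
    # Step 5: preprocess new text as in step 1, then decode with the model.
    sample = build_training_samples([new_doc])[0]
    return {"triggers": [], "params": [],
            "tokens": extract_text_features(sample)["tokens"]}

model = train_structured_perceptron(
    build_training_samples(["Troops attacked the city"]))
result = extract_events(model, "A bomb exploded downtown")
```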
The steps of the invention may preferably be implemented as follows:
The sub-steps of step 1) are:
1.1) after preprocessing, feeding the original English corpus into a manual or rule-based annotation system, and labeling a certain number of training samples according to the Automatic Content Extraction and/or Rich ERE event specification; if training samples corresponding to the original corpus already exist, this step is skipped; the preprocessing comprises removing stop words and modal particles;
1.2) according to the Automatic Content Extraction and/or Rich ERE event specification, the event extraction task comprises extracting event trigger words, predicting the type of each event, and extracting event participants, where each event participant corresponds to a certain event role; for each document, all event mentions contained in it constitute a set of training samples, each training sample consisting of a set of entities, one or more trigger words, and a set of event parameters, denoted {{e_1, …, e_s}, {t_1, …, t_r}, {a_1, …, a_n}}, where e_1, …, e_s are the 1st to s-th entities, t_1, …, t_r are the 1st to r-th trigger words, and a_1, …, a_n are the 1st to n-th event parameters; the role corresponding to each parameter is represented by the pairing relationship between trigger words and parameters.
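For illustration, the sample structure {{e_1, …, e_s}, {t_1, …, t_r}, {a_1, …, a_n}} could be held in a small data class. The field names, role labels, and the choice to encode the trigger-parameter pairing as (entity, trigger, role) triples are hypothetical, chosen only to make the structure concrete.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class EventSample:
    # One training sample per sentence, per the ACE / Rich ERE labeling.
    # Field names are illustrative, not from the patent.
    entities: List[str] = field(default_factory=list)   # e_1 .. e_s
    triggers: List[str] = field(default_factory=list)   # t_1 .. t_r
    # Each parameter is paired with (trigger, role), encoding the
    # trigger-parameter pairing that determines the event role:
    params: List[Tuple[str, str, str]] = field(default_factory=list)

sample = EventSample(
    entities=["troops", "city"],
    triggers=["attacked"],
    params=[("troops", "attacked", "Attacker"),
            ("city", "attacked", "Target")],
)
```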
The sub-steps of step 2) are:
2.1) each training sample corresponds to a sentence S in a document; each document C_j corresponds to a set of training samples {S_1, …, S_i}; for each training sample S_i, perform word segmentation (tokenization) to obtain a corresponding group of words {T_1, …, T_k}; extract text features for each word (in English terms, each token, which is not limited to a single word and may be a phrase);
2.2) for each word, first extract basic features including the stem and the nominalized form, and use them to roughly predict, according to pre-constructed rules, the event type the word may correspond to;
2.3) then extract for each word a part-of-speech tag, WordNet synonyms, and the Brown cluster category;
2.4) perform syntactic dependency analysis on each sentence with the Stanford Parser, and take the word's dependencies in the syntactic dependency tree as features, namely its parent and child nodes in the dependency tree; the dependency relations in the tree also serve as features of the dependency between event trigger words and event parameters;
2.5) if the word corresponds to an entity, use information such as the entity's type as word features.
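A self-contained sketch of per-token feature extraction in the spirit of sub-steps 2.2–2.5. A real system would call a stemmer, WordNet, Brown clusters, and the Stanford Parser; here those are replaced by deliberately trivial stand-ins (crude suffix rules, a hand-made entity dictionary) so the sketch runs without external tools.

```python
# Toy feature extraction for one token; the "stemmer" and "POS tagger"
# are crude stand-ins for the real tools named in the description.

def token_features(tokens, i, entities):
    tok = tokens[i]
    feats = {
        "stem": tok.lower().rstrip("s"),           # stand-in for stemming
        "pos_guess": "VERB" if tok.endswith("ed") else "OTHER",
        "prev": tokens[i - 1] if i > 0 else "<S>",   # local context
        "next": tokens[i + 1] if i < len(tokens) - 1 else "</S>",
    }
    if tok.lower() in entities:                    # sub-step 2.5: entity type
        feats["entity_type"] = entities[tok.lower()]
    return feats

tokens = "Troops attacked the city".split()
feats = token_features(tokens, 1, {"troops": "ORG", "city": "GPE"})
```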
The sub-steps of step 3) are:
3.1) construct a language model with a neural network, take all documents as training corpus, and train the language model to obtain the distributed word vector representation x_i corresponding to each word;
3.2) convert each distributed word vector representation x_i into a sparse representation y_i by sparse coding; the transformation requires optimizing the objective function in equation (2):
where D is a randomly initialized model parameter and A is the matrix composed of all y_i; the latter two terms in equation (2) are regularization terms;
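Equation (2) itself is not reproduced in this text. A standard sparse-coding objective consistent with the surrounding description (a reconstruction term followed by two regularization terms over A and D) would take the following form, where V (vocabulary size) and the second regularization weight τ are assumed symbols, and λ matches the hyper-parameter named in sub-step 3.3:

```latex
\min_{D,\,A}\;\sum_{i=1}^{V}\left\lVert x_i - D\,y_i \right\rVert_2^2
\;+\;\lambda \sum_{i=1}^{V}\lVert y_i \rVert_1
\;+\;\tau \lVert D \rVert_2^2 \tag{2}
```

The ℓ1 term drives each y_i toward sparsity, while the term on D keeps the dictionary bounded; both match the text's remark that "the latter two terms are regularization terms".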
3.3) optimize the objective function in equation (2) with the AdaGrad stochastic gradient descent algorithm, defining:
where g_{t,i,j} is the gradient, η_t is the learning rate at time t, and λ is a hyper-parameter of the model; the parameters are updated as follows:
where y_{t+1,i,j} denotes the updated value of the j-th element of the sparse vector representation y_i at time t.
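A minimal numeric sketch of the AdaGrad update in sub-step 3.3, assuming the standard AdaGrad rule (per-element learning rate η/√(Σ g²), with g_{t,i,j} the accumulated gradients); the proximal step for the ℓ1 regularizer and the dictionary update are omitted for brevity, so this is a sketch of the optimizer, not the full sparse-coding trainer.

```python
import numpy as np

def adagrad_step(y, grad, accum, eta=0.05, eps=1e-8):
    # accum holds the running sum of squared gradients g_{t,i,j};
    # each element of y thus gets its own effective learning rate.
    accum += grad ** 2
    y -= eta * grad / (np.sqrt(accum) + eps)
    return y, accum

# Toy example: fit y so that D @ y reconstructs a target x.
rng = np.random.default_rng(0)
D = rng.normal(size=(4, 6))                 # stand-in dictionary
x = D @ np.array([1.0, 0, 0, 0.5, 0, 0])    # target built from a sparse code
y = np.zeros(6)
accum = np.zeros(6)
for _ in range(200):
    grad = 2 * D.T @ (D @ y - x)            # gradient of ||x - D y||^2
    y, accum = adagrad_step(y, grad, accum)
loss = float(np.sum((x - D @ y) ** 2))
```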
The sub-steps of step 4) are:
4.1) for each training sample, i.e. sentence instance s = S_i, take the entities conforming to the event parameter types as candidate event parameter values, and convert the prediction process of the structured perceptron into a decoding problem: finding the optimal configuration z ∈ Y(s) corresponding to the model parameters w,
z = argmax_{z′ ∈ Y(s)} w · f(s, z′)   (5)
where f(s, z′) denotes the feature vector of instance s under configuration z′, and Y(s) denotes the set of all possible configurations for instance s; a configuration describes the assignment of event trigger words and event parameters in a sentence instance.
4.2) for each training sample (s, y′), each iteration of the training process finds the optimal configuration for s according to equation (5); if the found configuration does not match the ground truth, the parameters are updated by the rule:
w = w + f(s, y′) − f(s, z)   (6)
The decoding problem is solved with an early-update beam-search strategy. The model decoding process comprises two sub-steps: first, enumerate the possible trigger word labels for the words in the sentence, compute the score w · f(s, z′) of each possible configuration z′ according to equation (5), and keep the top p highest-scoring configurations, where p is the beam size; then traverse each configuration in the beam; once a word s_i matching a trigger word label is found, search the roles that the entities {e_1, …, e_s} may play in the event, recompute the configuration scores, and select the p best results to join the beam.
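A toy sketch of the beam-search pass over trigger labels under a linear score w · f(s, z′). The label set, the indicator features, and the weights are invented for illustration, and the second pass (expanding argument roles for found triggers) is omitted, so this shows only the first sub-step of the decoding described above.

```python
# Toy beam search over trigger-label sequences for one sentence.
# "Configurations" are partial label sequences scored by a linear model.

TRIGGER_LABELS = ["O", "Attack", "Transport"]   # invented label set

def score(weights, tokens, labels):
    # w . f(s, z'): sum of per-token (word, label) indicator weights
    return sum(weights.get((tok, lab), 0.0)
               for tok, lab in zip(tokens, labels))

def beam_decode(weights, tokens, beam_size=2):
    beam = [()]                                  # start: empty configuration
    for i in range(len(tokens)):
        # expand every beam entry with every possible label for token i
        expanded = [cfg + (lab,) for cfg in beam for lab in TRIGGER_LABELS]
        expanded.sort(key=lambda cfg: score(weights, tokens[:len(cfg)], cfg),
                      reverse=True)
        beam = expanded[:beam_size]              # keep top-p configurations
    return beam[0]

w = {("attacked", "Attack"): 2.0, ("attacked", "O"): 0.5,
     ("troops", "O"): 1.0, ("city", "O"): 1.0}
best = beam_decode(w, ["troops", "attacked", "city"])
```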
The sub-step of step 5) comprises: first extract the entities contained in the document (entity detection), then extract text features as in steps 1)-4), and feed them into the trained structured perceptron model to obtain the extracted events.
Compared with the prior art, the invention has the following beneficial effects:
1) Traditional pipeline-based event extraction classifies the trigger words of a text and extracts the event parameters step by step; the method of the invention instead extracts trigger words and event parameters simultaneously with a structured perceptron model, avoiding the error propagation of pipeline methods, in which errors from an earlier step are passed on to later steps and the information obtained later cannot correct them.
2) The invention uses rich text features: not only traditional common text features such as stems, nominalizations, and part-of-speech tags, but also expert resources such as synonyms and hypernyms/hyponyms from WordNet, as well as sentence structure information extracted by syntactic dependency analysis, which is combined with the structured prediction formulation of event extraction; in addition, the invention uses the currently popular neural-network approach to extract distributed vector representations of words and trains a sparse coding model to improve the usability and interpretability of the word vectors, learning sparse vector features of the words.
Drawings
FIG. 1 is a schematic diagram of sparse representation learning based on word2vec.
Detailed Description
The invention is further described with reference to the following figures and detailed description.
The method strengthens the text features with a sparse coding representation of neural-network-based distributed word vector features, and at the same time uses a structured perceptron model to jointly learn the recognition of event trigger words and event participants, thereby realizing event extraction. The text event extraction method combining sparse coding and a structured perceptron comprises the following steps:
1) labeling text data and constructing training samples according to the Automatic Content Extraction (ACE) or Rich ERE (Rich Entities, Relations, and Events) specification;
2) taking the extracted entities as candidates for event trigger words and event parameters, and extracting text features (part-of-speech tags, dependency syntactic analysis, etc.);
3) further extracting distributed word vector features of the text and learning sparse coding features;
4) training a structure perceptron classifier by using training samples and extracted text characteristics, and identifying trigger words and parameters related to events in the text;
5) feeding new text data, after the processing of step 1), into the structured perceptron classifier, and extracting text event information.
The step 1) comprises the following steps:
1.1) after preprocessing such as removing stop words and modal particles, feeding the original English corpus into a manual or rule-based annotation system (such as the JET system), and labeling a certain number of training samples according to the ACE/Rich ERE standard; if training samples corresponding to the original corpus already exist, this step can be omitted;
1.2) according to the ACE/Rich ERE specification, the event extraction task includes extracting event trigger words (generally verbs) while predicting the type of the event and extracting event participants (i.e., event parameters), each event participant corresponding to a certain event role. Thus, for each document, all event mentions contained in it constitute a set of training samples, each training sample consisting of a set of entities, one or more trigger words, and a set of event parameters (arguments), denoted {{e_1, …, e_s}, {t_1, …, t_r}, {a_1, …, a_n}}, where e_1, …, e_s are the 1st to s-th entities, t_1, …, t_r are the 1st to r-th trigger words, and a_1, …, a_n are the 1st to n-th event parameters; the role corresponding to each parameter is represented by the pairing relationship between trigger words and parameters.
The step 2) comprises the following steps:
2.1) each training sample corresponds to a sentence S in a document; each document C_j corresponds to a set of training samples {S_1, …, S_i}; for each training sample S_i, perform word segmentation to obtain a corresponding group of tokens {T_1, …, T_k}; extract text features for each token;
2.2) for each token, first extract basic features such as the stem and the nominalized form, and use them to roughly predict, according to pre-constructed rules, the event type the token may correspond to;
2.3) then extract for each token text features such as the part-of-speech (POS) tag, WordNet synonyms, and the Brown cluster category;
2.4) perform syntactic dependency analysis on each sentence with the Stanford Parser, and take the token's dependencies in the syntactic dependency tree as features, namely its parent and child nodes in the dependency tree; these relations are easy to obtain from the Universal Dependencies format output by the Stanford Parser; the dependency relations in the tree can also serve as features of the dependency between event trigger words and event parameters;
2.5) if the token corresponds to an entity, take information such as the entity's type as token features.
The step 3) comprises the following steps:
3.1) construct a language model with a neural network, take all documents as training corpus, and train the language model to obtain the distributed word vector representation x_i corresponding to each word. The language model may take the form of equation (1); other existing forms can also be adopted, as long as x_i can be obtained.
3.2) since distributed word vectors lack the convenience and interpretability of the sparse features used in common natural language processing tasks, they can be converted into sparse representation features. Each distributed word vector x_i is converted into a sparse representation y_i by sparse coding; the transformation requires optimizing the objective function in equation (2):
where D is a randomly initialized model parameter and A is the matrix composed of all y_i; the last two terms in equation (2) are regularization terms to prevent overfitting of the model.
3.3) optimize the objective function in equation (2) with the AdaGrad stochastic gradient descent algorithm, defining:
where g_{t,i,j} is the gradient, η_t is the learning rate at time t, and λ is a hyper-parameter of the model; the parameters are updated as follows:
where y_{t+1,i,j} denotes the updated value of the j-th element of the sparse vector representation y_i at time t.
The step 4) comprises the following steps:
4.1) the structured perceptron is an extension of the standard linear perceptron for structured prediction. The idea of extracting event trigger words and event parameters simultaneously with a structured perceptron is as follows: for each training sample, i.e. sentence instance s = S_i, together with the candidate event parameter values (entities conforming to the event parameter types), the prediction process of the structured perceptron is transformed into a decoding problem, i.e. finding the optimal configuration z ∈ Y(s) corresponding to the model parameters w,
z=argmaxz′∈γ(s)w·f(s,z′) (5)
where f(s, z′) denotes the feature vector of instance s under configuration z′, and Y(s) denotes the set of all possible configurations for instance s; a configuration describes the assignment of event trigger words and event parameters in a sentence instance.
4.2) training process: the training of the structured perceptron can be performed online; for each training sample (s, y′), each iteration of the training process finds the optimal configuration for s according to equation (5); if the found configuration does not match the ground truth, the parameters are updated by the rule:
w = w + f(s, y′) − f(s, z)   (6)
The most critical step in both the training and testing of the model is finding the optimal configuration for a sentence instance under the current parameters; the invention solves this decoding problem with an early-update beam-search strategy. The process is summarized as Algorithm 1:
Algorithm 1: structured perceptron training algorithm
Input: training set D, number of rounds T
Output: model parameters w
Step 1: initialize the model parameters w to 0;
Step 2: repeat steps 3-6 for T rounds;
Step 3: execute steps 4-6 for each training sample in the set D;
Step 4: use beam-search to find the optimal configuration z for the current sentence instance;
Step 5: if z ≠ y, update the model parameters:
w = w + f(s, y[1:p]) − f(s, z)   (7)
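Algorithm 1 can be sketched as a short training loop. The helper `greedy_decode` is a hypothetical stand-in for the beam-search of Step 4, the indicator features and two-label tag set are invented, and the early-update prefix `y[1:p]` is simplified to a full update, so this illustrates the perceptron update rule rather than the exact algorithm.

```python
# Structured perceptron training loop (Algorithm 1, simplified sketch).

LABELS = ("O", "T")                 # invented tag set: non-trigger / trigger

def feat(s, y):
    # f(s, z): sparse count of (token, label) indicator features
    f = {}
    for tok, lab in zip(s, y):
        f[(tok, lab)] = f.get((tok, lab), 0.0) + 1.0
    return f

def greedy_decode(w, s):
    # stand-in for beam-search: per-token argmax under current weights
    return tuple(max(LABELS, key=lambda l: w.get((tok, l), 0.0)) for tok in s)

def perceptron_train(data, T=3):
    w = {}                          # Step 1: w = 0 (sparse dict)
    def add(vec, scale):
        for k, v in vec.items():
            w[k] = w.get(k, 0.0) + scale * v
    for _ in range(T):              # Step 2: T rounds
        for s, y in data:           # Step 3: each sample in D
            z = greedy_decode(w, s)         # Step 4: best configuration
            if z != y:                      # Step 5: update on mistakes
                add(feat(s, y), +1.0)       # w += f(s, y)
                add(feat(s, z), -1.0)       # w -= f(s, z)
    return w

data = [(("troops", "attacked"), ("O", "T")),
        (("they", "fled"), ("O", "T"))]
w = perceptron_train(data)
pred = greedy_decode(w, ("troops", "attacked"))
```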
4.3) define s = ⟨(s_1, s_2, …, s_n), ε⟩ as a training instance, where s_i is the i-th token of the sentence s and ε is the set of candidate event parameter entities; the ground-truth configuration corresponding to the instance is represented as:
y′ = (t_1, a_{1,1}, …, a_{1,m}, …, t_n, a_{n,1}, …, a_{n,m})   (8)
where t_i denotes the event trigger assignment of token s_i (i.e., whether the token is a trigger, and the corresponding type), and a_{i,k} denotes the event role relationship between s_i and candidate entity e_k.
The model decoding process comprises two sub-steps: enumerate all possible trigger word labels (representing event types) for the current token in the sentence, compute the score w · f(s, z′) of each possible configuration z′ according to equation (5), and keep the top p highest-scoring configurations, where p is the beam size; then traverse each configuration in the beam; once a word s_i matching a trigger word label is found, search the roles that the entities {e_1, …, e_s} may play in the event, recompute the configuration scores, and select the p best results to join the beam.
The step 5) comprises: first extract the entities contained in the document (entity detection), then extract text features as in the steps above, and feed them into the trained structured perceptron model to obtain the extracted events.
To verify the effect of the invention, the proposed method was tested on the ACE 2005 corpus. Following related research on ACE 2005, 40 English news articles in the corpus were used as the test set (672 sentences in total), 30 further documents were randomly selected as the validation set, and the remaining documents were used as the training set (14840 sentences in total). The tests use precision (P), recall (R), and F-measure (F1) as evaluation indices. A trigger word is considered correctly identified when its type and subtype match the ground truth; an event parameter is considered correctly identified when its type and subtype match the ground truth, and correctly classified when, in addition, its corresponding event role is correctly identified. Because the ACE 2005 corpus contains manual annotations of entities and events, step 1) of the invention was not executed during testing.
On the other hand, to verify the effect of the invention on other data sets, the method was also applied to the TAC 2016 Event Argument Extraction and Linking task test set, which comprises 30k English documents. The training data is still ACE 2005; due to the difference between the test data and the training data, the experimental results are considerably worse than those obtained directly on the ACE 2005 data set. The test results are shown in the table below.
| % | Precision | Recall | F1 |
| --- | --- | --- | --- |
| ACE 2005 | 64.7 | 44.4 | 52.7 |
| TAC 2016 | 26.6 | 5.2 | 8.7 |
Claims (5)
1. A text event extraction method combining sparse coding and a structured perceptron, characterized by comprising the following steps:
1) labeling text data and constructing training samples according to the Automatic Content Extraction and/or Rich ERE event specification;
2) taking the extracted entity as a candidate entity of an event trigger word and an event parameter, and extracting text characteristics;
3) further extracting text distributed word vector characteristics and learning sparse coding characteristics;
4) training a structure perceptron classifier by using training samples and extracted text characteristics, and identifying trigger words and parameters related to events in the text;
5) feeding new text data, after the processing of step 1), into the structured perceptron classifier, and extracting text event information;
the step 3) comprises the following steps:
3.1) constructing a language model with a neural network, taking all documents as training corpus, and training the language model to obtain the distributed word vector representation x_i corresponding to each word;
3.2) converting each distributed word vector representation x_i into a sparse representation y_i by sparse coding, the transformation requiring optimization of the objective function in equation (2):
where D is a randomly initialized model parameter and A is the matrix composed of all y_i; the latter two terms in equation (2) are regularization terms;
3.3) optimizing the objective function in equation (2) with the AdaGrad stochastic gradient descent algorithm, defining:
where g_{t,i,j} is the gradient, η_t is the learning rate at time t, and λ is a hyper-parameter of the model, the parameters being updated as follows:
where y_{t+1,i,j} denotes the updated value of the j-th element of the sparse vector representation y_i at time t.
2. The text event extraction method combining sparse coding and a structured perceptron according to claim 1, wherein the step 1) comprises:
1.1) after preprocessing, feeding the original English corpus into a manual or rule-based annotation system, and labeling a certain number of training samples according to the Automatic Content Extraction and/or Rich ERE event specification; if training samples corresponding to the original corpus already exist, skipping this step; the preprocessing comprising removing stop words and modal particles;
1.2) according to the Automatic Content Extraction and/or Rich ERE event specification, the event extraction task comprising extracting event trigger words, predicting the type of each event, and extracting event participants, each event participant corresponding to a certain event role; for each document, all event mentions contained in it constituting a set of training samples, each training sample consisting of a set of entities, one or more trigger words, and a set of event parameters, denoted {{e_1, …, e_s}, {t_1, …, t_r}, {a_1, …, a_n}}, where e_1, …, e_s are the 1st to s-th entities, t_1, …, t_r are the 1st to r-th trigger words, and a_1, …, a_n are the 1st to n-th event parameters; the role corresponding to each parameter being represented by the pairing relationship between trigger words and parameters.
3. The text event extraction method combining sparse coding and a structured perceptron according to claim 1, wherein step 2) comprises:
2.1) each training sample in document C corresponds to a sentence in the document, so each document C_j corresponds to a set of training samples {S_1,…,S_i}; each training sample S_i is segmented into a corresponding set of words {T_1,…,T_k}, and text features are extracted for each word;
2.2) for each word, basic features are first extracted, including the stem and the noun–verb conversion form; using these basic features, the event types the word may correspond to are roughly predicted according to pre-constructed rules;
2.3) for each word in turn, the part-of-speech tag, WordNet synonyms, and Brown cluster category are extracted;
2.4) each sentence is parsed for syntactic dependencies with the Stanford parser, and the word's position in the syntactic dependency tree is used as a feature, namely its parent and child nodes in the tree; the dependency relations in the tree also serve as features of the dependency between event trigger words and event parameters;
2.5) if the word corresponds to an entity, information including the entity's type is taken as a word feature.
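Steps 2.2)–2.5) assemble a per-word feature set; a toy sketch follows, with stand-in lookup dictionaries in place of a real stemmer, POS tagger, the Stanford parser, and an entity recognizer (the crude suffix-stripping "stemmer" is ours and only illustrative).

```python
def word_features(word, pos_tags, dep_parents, entity_types):
    """Collect the feature classes of steps 2.2-2.5 for one word.

    pos_tags, dep_parents, entity_types are placeholder dicts standing in
    for the outputs of real NLP components."""
    feats = {"stem": word.lower().rstrip("s")}   # step 2.2: crude stand-in stem
    feats["pos"] = pos_tags.get(word, "UNK")     # step 2.3: part-of-speech tag
    feats["dep_parent"] = dep_parents.get(word)  # step 2.4: head in dependency tree
    if word in entity_types:                     # step 2.5: entity type, if any
        feats["entity_type"] = entity_types[word]
    return feats
```

In a real pipeline the same dictionary of features would also carry WordNet synonyms and Brown cluster IDs per step 2.3); they are omitted here to keep the sketch dependency-free.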
4. The text event extraction method combining sparse coding and a structured perceptron according to claim 1, wherein step 4) comprises:
4.1) for each training sample, i.e. sentence instance s = s_i, the entities conforming to the event parameter types are taken as candidate event parameter values, and the prediction process of the structured perceptron is converted into the decoding problem of finding the optimal configuration z ∈ Y(s) for the model parameters w:

z = argmax_{z′ ∈ Y(s)} w · f(s, z′)  (5)

where f(s, z′) denotes the feature vector of instance s under configuration z′; Y(s) denotes the set of all possible configurations for instance s; a configuration describes the assignment of event trigger words and event parameters in the sentence instance;
4.2) for each training sample (s, y′), the optimal configuration for s is found according to equation (5) in each training iteration; if the found optimal configuration does not match the ground truth, the parameters are updated by the following rule:

w = w + f(s, y′) − f(s, z)  (6)
the decoding problem is solved with a beam-search strategy based on early update, and the model's decoding process comprises two sub-steps: first, all possible trigger word labels for the current word in the sentence are enumerated, the score w · f(s, z′) of each possible configuration z′ is computed according to equation (5), and the top p configurations with the highest scores are retained, p being the beam size; then each configuration in the beam is traversed, and once a trigger word label matching the sample word s_i is found, the roles that the entities {e_1,…,e_s} may play in the event are searched, at which point the configuration score is recomputed and the p best results are selected to join the beam.
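The decoding and update loop of step 4.2) can be sketched as follows. This is a simplified stand-in: role assignment and the early-update mechanism are omitted, `score` is any user-supplied function standing in for w · f(s, z′), and sparse {feature: value} dicts stand in for the vectors of equation (6).

```python
def beam_search(words, labels, score, p=4):
    """Keep the p highest-scoring partial label sequences per position."""
    beam = [([], 0.0)]                           # (partial label sequence, score)
    for i in range(len(words)):
        candidates = [(seq + [lab], s + score(words, i, seq + [lab]))
                      for seq, s in beam        # expand every beam entry
                      for lab in labels]        # with every trigger label
        candidates.sort(key=lambda c: c[1], reverse=True)
        beam = candidates[:p]                    # retain top-p (the beam size)
    return beam[0][0]

def perceptron_update(w, gold_feats, pred_feats):
    """Equation (6): w = w + f(s, y') - f(s, z), with sparse feature dicts."""
    for k, v in gold_feats.items():
        w[k] = w.get(k, 0.0) + v
    for k, v in pred_feats.items():
        w[k] = w.get(k, 0.0) - v
    return w
```

With a scoring function that rewards labeling "attacked" as a trigger and everything else as O, the decoder recovers the intended sequence; after a wrong prediction, `perceptron_update` shifts weight from the predicted configuration's features to the gold one's.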
5. The text event extraction method combining sparse coding and a structured perceptron according to claim 1, wherein step 5) comprises: first extracting the entities contained in the document, then extracting text features as in steps 1)–4), and feeding them into the trained structured perceptron classifier to obtain the event extraction result.
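The end-to-end flow of claim 5 might be glued together as below; every component here is a hypothetical stub standing in for the corresponding stage of the pipeline (entity extraction, feature extraction per steps 1)–4), and the trained classifier), so only the orchestration is meaningful.

```python
def extract_events(document, entity_recognizer, featurizer, classifier):
    """Claim 5 pipeline: entities first, then features, then classification."""
    entities = entity_recognizer(document)                 # extract entities
    sentences = [s for s in document.split(".") if s.strip()]
    features = [featurizer(sent, entities) for sent in sentences]
    return [classifier(f) for f in features]               # one result per sentence
```

Swapping in real components (a named-entity recognizer, the feature pipeline of claim 3, and the trained structured perceptron) yields the full extractor without changing this glue.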
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610955220.9A CN106599032B (en) | 2016-10-27 | 2016-10-27 | Text event extraction method combining sparse coding and structure sensing machine |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610955220.9A CN106599032B (en) | 2016-10-27 | 2016-10-27 | Text event extraction method combining sparse coding and structure sensing machine |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106599032A CN106599032A (en) | 2017-04-26 |
CN106599032B true CN106599032B (en) | 2020-01-14 |
Family
ID=58590466
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610955220.9A Active CN106599032B (en) | 2016-10-27 | 2016-10-27 | Text event extraction method combining sparse coding and structure sensing machine |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106599032B (en) |
Families Citing this family (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107908642B (en) * | 2017-09-29 | 2021-11-12 | 江苏华通晟云科技有限公司 | Industry text entity extraction method based on distributed platform |
CN107818141B (en) * | 2017-10-10 | 2020-07-14 | 大连理工大学 | Biomedical event extraction method integrated with structured element recognition |
CN110309256A (en) * | 2018-03-09 | 2019-10-08 | 北京国双科技有限公司 | The acquisition methods and device of event data in a kind of text |
CN110309168B (en) * | 2018-03-09 | 2021-08-17 | 北京国双科技有限公司 | Judgment document searching method and device |
CN108647525B (en) * | 2018-05-09 | 2022-02-01 | 西安电子科技大学 | Verifiable privacy protection single-layer perceptron batch training method |
CN110135457B (en) * | 2019-04-11 | 2021-04-06 | 中国科学院计算技术研究所 | Event trigger word extraction method and system based on self-encoder fusion document information |
CN110609896B (en) * | 2019-07-19 | 2022-03-22 | 中国人民解放军国防科技大学 | Military scenario text event information extraction method and device based on secondary decoding |
CN111581954B (en) * | 2020-05-15 | 2023-06-09 | 中国人民解放军国防科技大学 | Text event extraction method and device based on grammar dependency information |
CN112069819A (en) * | 2020-09-10 | 2020-12-11 | 杭州中奥科技有限公司 | Model training method, model training device, and event extraction method |
CN112183030A (en) * | 2020-10-10 | 2021-01-05 | 深圳壹账通智能科技有限公司 | Event extraction method and device based on preset neural network, computer equipment and storage medium |
CN112597366B (en) * | 2020-11-25 | 2022-03-18 | 中国电子科技网络信息安全有限公司 | Encoder-Decoder-based event extraction method |
CN112612871B (en) * | 2020-12-17 | 2023-09-15 | 浙江大学 | Multi-event detection method based on sequence generation model |
CN112906391B (en) * | 2021-03-16 | 2024-05-31 | 合肥讯飞数码科技有限公司 | Meta event extraction method, meta event extraction device, electronic equipment and storage medium |
CN113987163B (en) * | 2021-09-27 | 2024-06-07 | 浙江大学 | Lifelong event extraction method based on ontology guidance |
US20240143633A1 (en) * | 2021-09-28 | 2024-05-02 | Zhejiang University | Generative event extraction method based on ontology guidance |
CN114510928B (en) * | 2022-01-12 | 2022-09-23 | 中国科学院软件研究所 | Universal information extraction method and system based on unified structure generation |
CN114677749A (en) * | 2022-05-05 | 2022-06-28 | 南京大学 | Face recognition countermeasure sample generation method based on limited search space |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104965819A (en) * | 2015-07-12 | 2015-10-07 | 大连理工大学 | Biomedical event trigger word identification method based on syntactic word vector |
CN105512209A (en) * | 2015-11-28 | 2016-04-20 | 大连理工大学 | Biomedicine event trigger word identification method based on characteristic automatic learning |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7261695B2 (en) * | 2004-03-09 | 2007-08-28 | General Electric Company | Trigger extraction from ultrasound doppler signals |
- 2016-10-27: CN201610955220.9A filed; patent granted as CN106599032B (status: Active)
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104965819A (en) * | 2015-07-12 | 2015-10-07 | 大连理工大学 | Biomedical event trigger word identification method based on syntactic word vector |
CN105512209A (en) * | 2015-11-28 | 2016-04-20 | 大连理工大学 | Biomedicine event trigger word identification method based on characteristic automatic learning |
Also Published As
Publication number | Publication date |
---|---|
CN106599032A (en) | 2017-04-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106599032B (en) | Text event extraction method combining sparse coding and structure sensing machine | |
CN113011533B (en) | Text classification method, apparatus, computer device and storage medium | |
CN108363790B (en) | Method, device, equipment and storage medium for evaluating comments | |
Jung | Semantic vector learning for natural language understanding | |
CN111931506B (en) | Entity relationship extraction method based on graph information enhancement | |
CN108628828B (en) | Combined extraction method based on self-attention viewpoint and holder thereof | |
CN108549658B (en) | Deep learning video question-answering method and system based on attention mechanism on syntax analysis tree | |
CN109684642B (en) | Abstract extraction method combining page parsing rule and NLP text vectorization | |
CN111353306B (en) | Entity relationship and dependency Tree-LSTM-based combined event extraction method | |
CN107180026B (en) | Event phrase learning method and device based on word embedding semantic mapping | |
CN109101490B (en) | Factual implicit emotion recognition method and system based on fusion feature representation | |
CN110457690A (en) | A kind of judgment method of patent creativeness | |
CN107818173B (en) | Vector space model-based Chinese false comment filtering method | |
CN112818118A (en) | Reverse translation-based Chinese humor classification model | |
CN114818717A (en) | Chinese named entity recognition method and system fusing vocabulary and syntax information | |
CN117236338B (en) | Named entity recognition model of dense entity text and training method thereof | |
CN111339772B (en) | Russian text emotion analysis method, electronic device and storage medium | |
CN114997288A (en) | Design resource association method | |
CN115510230A (en) | Mongolian emotion analysis method based on multi-dimensional feature fusion and comparative reinforcement learning mechanism | |
CN117251524A (en) | Short text classification method based on multi-strategy fusion | |
CN117474703B (en) | Topic intelligent recommendation method based on social network | |
Yan et al. | Implicit emotional tendency recognition based on disconnected recurrent neural networks | |
CN114282592A (en) | Deep learning-based industry text matching model method and device | |
Hathout | Acquisition of morphological families and derivational series from a machine readable dictionary | |
Tolegen et al. | Voted-perceptron approach for Kazakh morphological disambiguation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
EE01 | Entry into force of recordation of patent licensing contract | Application publication date: 20170426; Assignee: TONGDUN HOLDINGS Co.,Ltd.; Assignor: ZHEJIANG University; Contract record no.: X2021990000612; Denomination of invention: A text event extraction method combining sparse coding and structure aware machine; Granted publication date: 20200114; License type: Common License; Record date: 20211012 ||