CN111274790B - Chapter-level event embedding method and device based on syntactic dependency graph


Info

Publication number
CN111274790B
Authority
CN
China
Prior art keywords
event
positive
syntactic dependency
weight
dependency graph
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010090488.7A
Other languages
Chinese (zh)
Other versions
CN111274790A (en)
Inventor
杨鹏
季冬
李幼平
纪雯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southeast University
Original Assignee
Southeast University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southeast University filed Critical Southeast University
Priority to CN202010090488.7A priority Critical patent/CN111274790B/en
Publication of CN111274790A publication Critical patent/CN111274790A/en
Application granted granted Critical
Publication of CN111274790B publication Critical patent/CN111274790B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval of unstructured textual data; database structures therefor; file system structures therefor
    • G06F16/35: Clustering; Classification
    • G06F16/355: Class or cluster creation or modification

Abstract

The invention discloses a chapter-level event embedding method and device based on a syntactic dependency graph. First, syntactic dependency analysis is performed on each news text with a natural language processing tool, and a syntactic dependency graph is constructed. Next, the weight of each node word in the graph is computed with an iterative update algorithm. Then, positive and negative training samples are constructed from the graph by negative sampling. Two models, an event element weight prediction model and an event element relation prediction model, are then built and trained to obtain a low-dimensional dense vector representation of the chapter-level event. Finally, the event embedding vector is fed into a machine learning model for related tasks such as event classification and clustering. Because the vector representation is learned from the syntactic dependency graph in an unsupervised manner, the invention overcomes the high-dimensional sparsity and the loss of semantic and syntactic structure information that afflict event representations based on the conventional bag-of-words model, and thereby improves downstream event analysis tasks.

Description

Chapter-level event embedding method and device based on syntactic dependency graph
Technical Field
The invention belongs to the technical field of event embedding, and particularly relates to a chapter-level event embedding method and device based on a syntactic dependency graph.
Background
Events are an important knowledge unit through which humans perceive the world. Processing and analyzing information with the event as the basic unit benefits efficient and intelligent applications of that information, such as dialogue understanding and information recommendation. The Internet contains a large amount of text describing events, such as news articles, microblogs, court judgment documents, and electronic medical records.
Event features are critical to event analysis. In natural language processing, the bag-of-words model is the most common feature representation method and has the virtue of being simple and easy to implement. In chapter-level text event analysis, researchers often add special processing tailored to event characteristics, such as filtering nouns and verbs by part of speech, extracting keywords, and extracting named entities. However, the bag-of-words model ignores the semantic information of words, and its feature representations are high-dimensional and sparse; even two semantically similar words are treated as completely different words. Consequently, for two documents that describe related events in different ways, an event feature representation based on the bag-of-words model may fail to capture the semantic association between them.
Embedding techniques (also known as representation learning) aim to learn a low-dimensional continuous vector for each discrete object, such that relationships between objects can be characterized through these vectors. In natural language processing, low-dimensional vector representations can be learned for semantic units of different granularities, such as words, sentences, paragraphs, and documents. For word embedding, common methods include Word2vec, GloVe, fastText, ELMo, and BERT. A chapter-level event can generally be treated as a document, so document embedding techniques such as Doc2vec and XLNet can be applied; alternatively, starting from the bag-of-words model, each word id can be replaced by its word vector, followed by a pooling operation such as average pooling or max pooling.
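As a concrete illustration of the pooling alternative just described, the sketch below averages (or max-pools) pre-trained word vectors into a single document vector; the `embeddings` lookup table and the `pool_document` helper are illustrative assumptions, not components of the invention:

```python
import numpy as np

def pool_document(tokens, embeddings, mode="mean"):
    """Pool per-word vectors into one fixed-size document vector."""
    vectors = np.stack([embeddings[t] for t in tokens if t in embeddings])
    # average pooling keeps the mean of every dimension; max pooling keeps
    # the strongest activation per dimension
    return vectors.mean(axis=0) if mode == "mean" else vectors.max(axis=0)

# toy usage with random stand-in vectors
embeddings = {"earthquake": np.random.rand(300), "rescue": np.random.rand(300)}
doc_vec = pool_document(["earthquake", "rescue", "unknown"], embeddings)
```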
However, most existing embedding techniques in the NLP field train low-dimensional vector representations of words or documents on the language-model principle of predicting a target word from its modeled context, while ignoring explicit semantic structure information. In event analysis, the entities involved in an event and the relations among them are essential for analyzing and understanding different chapter-level events and their relationships. An event feature representation should therefore capture the semantics of the entity words and trigger words involved in the event, and also characterize the semantic relations among the entities, so as to support deeper analysis.
Disclosure of Invention
The invention aims to solve the problems of chapter-level event feature representation in the prior art, and to this end provides a chapter-level event embedding method and device based on a syntactic dependency graph.
The technical scheme is as follows: the chapter-level event embedding method based on the syntactic dependency graph comprises the following steps:
(1) Acquiring event document corpus, sequentially performing word segmentation, part-of-speech tagging, entity identification, reference resolution and syntactic dependency analysis on each document by using a natural language processing tool, and constructing a vocabulary;
(2) Constructing an initial syntactic dependency graph based on the syntactic dependency analysis result; giving initial weights to nodes in the graph, and iteratively updating weights of all the nodes to generate a final syntactic dependency graph;
(3) Based on the syntactic dependency graph, respectively constructing an event element weight positive and negative sample and an event element relation positive and negative sample by adopting a negative sampling method, wherein the event element weight sample comprises an event id, a target word and a target word weight, and the event element relation sample comprises the event id, a subject, an object, a predicate, the target word and a label;
(4) Constructing an event element weight prediction model based on a Skip-Gram framework, and training the feature representation of an event and elements thereof by utilizing positive and negative samples of the event element weight;
(5) Constructing an event element relation prediction model based on a CBOW architecture, and training the feature representation of an event and elements thereof by utilizing positive and negative samples of the event element relation;
(6) Generating a corresponding event embedded vector for a newly input text based on the trained event element weight prediction model and the event element relation prediction model;
(7) Based on the event embedded vector, the event embedded vector is used as input of a machine learning algorithm to carry out event classification or clustering.
Further, in step (2), the initial syntactic dependency graph is constructed from the syntactic analysis result, specifically:
each word serves as a node, and the dependency relations among words define directed edges between the corresponding nodes; identical words are merged into a single node, except for verbs, and all dependency relations of the merged words are retained; the words belonging to one named entity are merged into a single node, the dependency relations among those words are discarded, and all dependency relations between them and other words are retained. A minimal construction sketch follows.
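The following is a hedged sketch of these merging rules, assuming networkx and a pre-computed parse (token list, POS tags, a token-index-to-entity map, and dependency arcs); the helper names are illustrative only:

```python
import networkx as nx

def build_dependency_graph(tokens, pos_tags, entities, arcs):
    """tokens: words of the document; pos_tags: parallel POS list;
    entities: dict token_index -> entity name for tokens inside a named
    entity; arcs: (head_index, dependent_index, relation) triples."""
    def node_key(i):
        if i in entities:                    # merge entity words into one node
            return entities[i]
        if pos_tags[i].startswith("V"):      # keep each verb occurrence distinct
            return f"{tokens[i]}@{i}"
        return tokens[i]                     # merge identical non-verb words
    g = nx.DiGraph()
    for head, dep, rel in arcs:
        u, v = node_key(head), node_key(dep)
        if u != v:                           # drop dependencies inside an entity
            g.add_edge(u, v, rel=rel)
    return g
```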
Further, in step (2), an initial syntactic dependency graph is constructed based on the syntactic dependency analysis result; initial weights are assigned to the nodes in the graph, and the weights of all nodes are updated iteratively to generate the final syntactic dependency graph. The specific steps are:
(2-1) assign each node v_i in the syntactic dependency graph an initial weight W_0(v_i); let the maximum number of iterations be K;
(2-2) update the weight of each node v_i:

    W_{n+1}(v_i) = f(G, W_n, v_i)

where f is the weight update function, G is the constructed syntactic dependency graph, W_n is the node weight mapping after the n-th iteration, and W_{n+1}(v_i) is the weight of node v_i after the (n+1)-th iteration;
(2-3) if, after an update, the absolute difference |W_{n+1}(v_i) - W_n(v_i)| is smaller than a threshold a for all nodes of the graph, or the number of iterations reaches the maximum, take the final node weight W(v_i) = W_{n+1}(v_i); otherwise, return to step (2-2).
Further, in step (3), positive and negative samples of event element weights and of event element relations are constructed from the syntactic dependency graph by negative sampling. The specific steps are:
(3-1) construct positive and negative samples of event element weights, each sample having the format (event id, target word, target word weight): select all noun and verb nodes from the syntactic dependency graph according to the part-of-speech tagging result and normalize their weights, yielding the regression positive sample set; randomly select from the vocabulary L nouns and M verbs that do not appear in the positive set and assign them weight 0, yielding the regression negative sample set;
(3-2) construct positive and negative samples of event element relations, each sample having the format (event id, subject, predicate, object, target word, label): for each verb in the dependency graph, select its direct subject and object to form a triple (subject, predicate, object); take each element of the triple in turn as the target word, replace it with the designated mask string [MASK], and add the resulting sample with label 1 to the classification positive sample set; for each positive sample, randomly select from the vocabulary N words that share the target word's part of speech but differ from it, substitute them for the target word, and add the resulting N samples with label 0 to the classification negative sample set.
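A hedged sketch of this sampling scheme is given below (graph nodes are assumed to carry normalized `weight` and `pos` attributes; L, M, N and the `vocab` list of (word, POS) pairs are as in the text):

```python
import random

def weight_samples(event_id, graph, vocab, L=5, M=5):
    positives = [(event_id, w, d["weight"])
                 for w, d in graph.nodes(data=True)
                 if d["pos"] in ("NOUN", "VERB")]            # regression positives
    seen = {w for _, w, _ in positives}
    nouns = [w for w, p in vocab if p == "NOUN" and w not in seen]
    verbs = [w for w, p in vocab if p == "VERB" and w not in seen]
    negatives = [(event_id, w, 0.0)                          # weight-0 negatives
                 for w in random.sample(nouns, min(L, len(nouns)))
                 + random.sample(verbs, min(M, len(verbs)))]
    return positives + negatives

def relation_samples(event_id, triples, vocab, N=3):
    samples = []
    for triple in triples:                                   # (subject, predicate, object)
        for slot, pos_tag in enumerate(("NOUN", "VERB", "NOUN")):
            target = triple[slot]
            masked = list(triple)
            masked[slot] = "[MASK]"                          # positive sample, label 1
            samples.append((event_id, *masked, target, 1))
            pool = [w for w, p in vocab if p == pos_tag and w != target]
            for w in random.sample(pool, min(N, len(pool))): # N negatives, label 0
                samples.append((event_id, *masked, w, 0))
    return samples
```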
Further, in step (4), an event element weight prediction model based on the Skip-Gram architecture is constructed, and the feature representations of the event and its elements are trained with the positive and negative samples of event element weights. The specific steps are:
(4-1) for the event id, obtain a d-dimensional embedding vector v_e via a lookup table; for the target word, obtain a k-dimensional word vector v_t with a pre-trained word embedding tool;
(4-2) apply separate linear transformations to v_e and v_t to obtain h_e and h_t of the same dimension:

    h_e = W_e v_e
    h_t = W_t v_t

where W_e and W_t are trainable parameter matrices;
(4-3) compute the inner product of h_e and h_t as the predicted target word weight; with the true weight y, the mean squared error serves as the objective function, formalized as:

    u = h_e · h_t
    loss = (y - u)^2

(4-4) optimize the objective function with a gradient descent algorithm, updating the event embedding v_e, the parameter matrices W_e and W_t, and the target word vector v_t.
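A minimal PyTorch sketch of this weight-prediction model, assuming d=100, k=300 and a 256-dimensional common space as in the embodiment below; for brevity the target word vectors are fed as fixed inputs here, whereas the method also updates them:

```python
import torch
import torch.nn as nn

class EventWeightModel(nn.Module):
    def __init__(self, num_events, d=100, k=300, h=256):
        super().__init__()
        self.event_emb = nn.Embedding(num_events, d)   # lookup table for v_e
        self.W_e = nn.Linear(d, h, bias=False)         # h_e = W_e v_e
        self.W_t = nn.Linear(k, h, bias=False)         # h_t = W_t v_t

    def forward(self, event_ids, word_vecs):
        h_e = self.W_e(self.event_emb(event_ids))
        h_t = self.W_t(word_vecs)
        return (h_e * h_t).sum(dim=-1)                 # u = inner product

# one mean-squared-error training step on a toy sample
model = EventWeightModel(num_events=10000)
opt = torch.optim.SGD(model.parameters(), lr=0.01)
u = model(torch.tensor([0]), torch.randn(1, 300))
loss = ((torch.tensor([0.7]) - u) ** 2).mean()         # y = 0.7 is a stand-in weight
loss.backward()
opt.step()
```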
Further, in step (5), an event element relation prediction model based on the CBOW architecture is constructed, and the feature representations of the event and its elements are trained with the positive and negative samples of event element relations. The specific steps are:
(5-1) for the event id, obtain a d-dimensional embedding vector v_e via a lookup table; for the subject, predicate and object words and the target word, obtain k-dimensional word vectors v_s, v_p, v_o and v_t with the open-source tool fastText;
(5-2) apply separate linear transformations to v_e, v_s, v_p, v_o and v_t to obtain h_e, h_s, h_p, h_o and h_t:

    h_e = W_e v_e, h_s = W_s v_s, h_p = W_p v_p, h_o = W_o v_o, h_t = W_t v_t

where W_e, W_s, W_p, W_o and W_t are trainable parameter matrices;
(5-3) sum and average h_e, h_s, h_p and h_o to obtain the context vector h_c; compute the inner product of h_c and h_t, and obtain the output probability through a sigmoid function; the cross-entropy loss serves as the objective function, formalized as:

    p_t = sigmoid(h_c · h_t)
    loss = -y log(p_t) - (1 - y) log(1 - p_t)

where p_t is the output probability of the target word and y is the true label of the sample;
(5-4) optimize the objective function with a gradient descent algorithm, updating the event feature representation v_e, the parameter matrices W_e, W_s, W_p, W_o and W_t, the subject, predicate and object word vectors, and the target word vector v_t.
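A companion PyTorch sketch of the relation-prediction model under the same assumed dimensions; the masked element among (subject, predicate, object) is assumed to carry the word vector of the literal [MASK] string:

```python
import torch
import torch.nn as nn

class EventRelationModel(nn.Module):
    def __init__(self, num_events, d=100, k=300, h=256):
        super().__init__()
        self.event_emb = nn.Embedding(num_events, d)
        self.W_e = nn.Linear(d, h, bias=False)
        self.W_s = nn.Linear(k, h, bias=False)
        self.W_p = nn.Linear(k, h, bias=False)
        self.W_o = nn.Linear(k, h, bias=False)
        self.W_t = nn.Linear(k, h, bias=False)

    def forward(self, event_ids, v_s, v_p, v_o, v_t):
        # context vector: average of the event part and the triple parts
        h_c = torch.stack([self.W_e(self.event_emb(event_ids)),
                           self.W_s(v_s), self.W_p(v_p),
                           self.W_o(v_o)]).mean(dim=0)
        h_t = self.W_t(v_t)
        return torch.sigmoid((h_c * h_t).sum(-1))      # p_t

# one cross-entropy training step on a toy positive sample
model = EventRelationModel(num_events=10000)
p_t = model(torch.tensor([0]), *(torch.randn(1, 300) for _ in range(4)))
loss = nn.functional.binary_cross_entropy(p_t, torch.tensor([1.0]))
loss.backward()
```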
Further, in step (6), based on the trained event element weight prediction model and event element relation prediction model, a corresponding event embedding vector is generated for a newly input text. The specific steps (with a parameter-freezing sketch after this list) are:
(6-1) construct, following step (3), the positive and negative samples of event element weights and of event element relations for the current text;
(6-2) following step (4), train the event element weight prediction model on the weight samples and update the event embedding vector; during training, all parameters other than the event embedding vector are fixed;
(6-3) following step (5), train the event element relation prediction model on the relation samples and update the event embedding vector; during training, all parameters other than the event embedding vector are fixed.
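The parameter-freezing sketch referenced above, assuming one of the PyTorch models sketched earlier with the new document's row already added to `event_emb`:

```python
# freeze everything except the event embedding table, then train as usual;
# only the event embedding vector receives gradient updates
for name, param in model.named_parameters():
    param.requires_grad = (name == "event_emb.weight")
optimizer = torch.optim.SGD([model.event_emb.weight], lr=0.01)
```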
Based on the same inventive concept, the chapter-level event embedding device based on the syntactic dependency graph comprises a memory, a processor, and a computer program stored on the memory and executable on the processor; when the computer program is loaded into the processor, it implements the above chapter-level event embedding method based on the syntactic dependency graph.
The beneficial effects are that: the invention uses embedding techniques to explicitly model the importance of entities, the importance of actions, and the relations among entities described in an event text. The low-dimensional event vector representation obtained by training captures event elements and their structural information at a deeper level, effectively resolving the high-dimensional sparsity and the loss of semantic and syntactic structure information that afflict event feature representations based on the conventional bag-of-words model, and thereby improving downstream tasks such as event classification and clustering.
Drawings
FIG. 1 is a flow chart of a method according to an embodiment of the invention.
FIG. 2 is a syntactic dependency analysis diagram according to an embodiment of the present invention.
FIG. 3 is a final syntactic dependency diagram according to an embodiment of the present invention.
Fig. 4 is a view of an event element weight prediction model based on Skip-Gram architecture according to an embodiment of the present invention.
Fig. 5 is a CBOW architecture-based event element relation prediction model diagram according to an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent.
As shown in fig. 1, the chapter level event embedding method based on the syntactic dependency graph disclosed by the embodiment of the invention comprises the following steps:
(1) Acquiring event document corpus, sequentially performing word segmentation, part-of-speech tagging, entity identification, reference resolution and syntactic dependency analysis on each document by using a natural language processing tool, and constructing a vocabulary;
(2) Constructing an initial syntactic dependency graph based on the syntactic dependency analysis result; giving initial weights to nodes in the graph, and iteratively updating weights of all the nodes to generate a final syntactic dependency graph;
(3) Based on the syntactic dependency graph, respectively constructing an event element weight positive and negative sample and an event element relation positive and negative sample by adopting a negative sampling method, wherein the event element weight sample comprises an event id, a target word and a target word weight, and the event element relation sample comprises the event id, a subject, an object, a predicate, the target word and a label;
(4) Constructing an event element weight prediction model based on a Skip-Gram framework, and training the feature representation of an event and elements thereof by utilizing positive and negative samples of the event element weight;
(5) Constructing an event element relation prediction model based on a CBOW architecture, and training the feature representation of an event and elements thereof by utilizing positive and negative samples of the event element relation;
(6) Generating a corresponding event embedded vector for a newly input text based on the two types of prediction models of the element weights and the element relations of the event after training;
(7) Based on the event embedded vector, the event embedded vector is used as input of a general machine learning algorithm to carry out event classification and clustering.
In an alternative embodiment of the present invention, step (1) downloads the Sogou news dataset from the Internet, which contains news from 18 channels (domestic, international, sports, society, entertainment, etc.) covering the period June to July 2012. Part of the document information of the dataset is shown in Table 1.
Table 1 Two example news texts
(The table is reproduced as an image in the original publication.)
In an alternative embodiment of the present invention, step (1) uses the Stanford CoreNLP natural language processing toolkit to perform word segmentation, part-of-speech tagging, entity recognition, coreference resolution, and syntactic dependency analysis, and adds all nouns and verbs extracted from the dataset to a vocabulary; each entry in the vocabulary has the form (word, part-of-speech set), with the word serving as the key.
The analysis result obtained for document 1 through step (1) is shown in Fig. 2.
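One possible way to run this preprocessing from Python is sketched below with Stanza, the Python companion to Stanford CoreNLP (coreference resolution would additionally require the CoreNLP server; the example sentence is illustrative):

```python
import stanza

# stanza.download("zh")  # fetch the Chinese models on first use
nlp = stanza.Pipeline(lang="zh", processors="tokenize,pos,lemma,depparse,ner")
doc = nlp("国际原子能机构与伊朗就核问题达成协议。")
for sent in doc.sentences:
    for word in sent.words:
        # head == 0 marks the root of the dependency tree
        print(word.text, word.upos, word.head, word.deprel)
```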
In an optional embodiment of the present invention, step (2) constructs the initial syntactic dependency graph from the syntactic dependency analysis result, specifically:
each word serves as a node, and the dependency relations among words define directed edges between the corresponding nodes; identical words are merged into a single node, except for verbs, and all dependency relations of the merged words are retained; the words belonging to one named entity are merged into a single node, the dependency relations among those words are discarded, and all dependency relations between them and other words are retained.
In an optional embodiment of the present invention, in step (2), starting from the initial syntactic dependency graph, the node weights are iteratively updated with the PageRank algorithm to generate the final syntactic dependency graph. The specific steps are:
(2-1) assign each node v_i in the syntactic dependency graph an initial weight W_0(v_i) = 1.0; let the maximum number of iterations be K = 100;
(2-2) update the weight of each node in the graph with the update formula:

    W_{n+1}(v_i) = (1 - d) + d * Σ_{v_j ∈ In(v_i)} W_n(v_j) / |Out(v_j)|

where d is the damping coefficient, set to 0.85; In(v_i) is the set of all nodes pointing to node v_i, and Out(v_j) is the set of all nodes that v_j points to; in an undirected graph, In(v_i) = Out(v_i);
(2-3) if, after an update, the absolute difference |W_{n+1}(v_i) - W_n(v_i)| is smaller than the threshold a for all nodes of the graph, or the number of iterations reaches the maximum, take the final node weight W(v_i) = W_{n+1}(v_i); otherwise, return to step (2-2).
The final syntactic dependency graph of one of the documents is shown in Fig. 3. A runnable sketch of this weighting procedure follows.
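A minimal sketch of the PageRank-style iteration over the networkx graph built earlier, with d = 0.85, K = 100 and a convergence threshold a as in the text:

```python
def iterate_weights(g, d=0.85, K=100, a=1e-4):
    w = {v: 1.0 for v in g}                        # W_0(v_i) = 1.0
    for _ in range(K):
        new = {v: (1 - d) + d * sum(w[u] / max(g.out_degree(u), 1)
                                    for u in g.predecessors(v))
               for v in g}                         # In(v_i) via predecessors
        converged = all(abs(new[v] - w[v]) < a for v in g)
        w = new
        if converged:
            break
    return w
```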
In an optional embodiment of the present invention, in step (3), positive and negative samples of event element weights and of event element relations are constructed from the syntactic dependency graph by negative sampling. The specific steps are:
(3-1) construct positive and negative samples of event element weights, each sample having the format (event id, target word, target word weight): select all noun and verb nodes from the syntactic dependency graph according to the part-of-speech tagging result and normalize their weights, yielding the regression positive sample set; randomly select from the vocabulary L nouns and M verbs that do not appear in the positive set and assign them weight 0, yielding the regression negative sample set;
(3-2) construct positive and negative samples of event element relations, each sample having the format (event id, subject, predicate, object, target word, label): for each verb in the dependency graph, select its direct subject and object to form a triple (subject, predicate, object); take each element of the triple in turn as the target word, replace it with the designated mask string [MASK], and add the resulting sample with label 1 to the classification positive sample set; for each positive sample, randomly select from the vocabulary N words that share the target word's part of speech but differ from it, substitute them for the target word, and construct N negative samples with label 0.
In an optional embodiment of the present invention, in step (4), an event element weight prediction model is constructed with the Skip-Gram architecture to predict the weight w_i of an entity word or verb from the event feature representation. The model structure is shown in Fig. 4, and the training process is as follows:
(4-1) for the event id, obtain a d-dimensional (e.g., 100-dimensional) embedding vector v_e via a lookup table; for the target word, obtain a k-dimensional (e.g., 300-dimensional) word vector v_t with the open-source tool fastText;
(4-2) apply separate linear transformations to v_e and v_t to obtain h_e and h_t of the same dimension (e.g., 256 each):

    h_e = W_e v_e
    h_t = W_t v_t

where W_e and W_t are trainable parameter matrices;
(4-3) compute the inner product of h_e and h_t as the predicted target word weight; the mean squared error serves as the objective function, formalized as:

    u = h_e · h_t
    loss = (y - u)^2

(4-4) optimize the objective function with a gradient descent algorithm, updating the event embedding v_e, the parameter matrices W_e and W_t, and the target word vector v_t.
In an optional embodiment of the present invention, in step (5), an event element relation prediction model is constructed with the CBOW architecture: from the event feature representation, given two entities, their relation is predicted, or given one entity and its associated verb, the other entity is predicted, thereby learning the chapter-level event and its element vector representations. The model structure is shown in Fig. 5, and the training process is as follows:
(5-1) for the event id, obtain a d-dimensional (e.g., 100-dimensional) embedding vector v_e via a lookup table; for the subject, predicate and object words and the target word, obtain k-dimensional (e.g., 300-dimensional) word vectors v_s, v_p, v_o and v_t with the open-source tool fastText;
(5-2) apply separate linear transformations to v_e, v_s, v_p, v_o and v_t to obtain h_e, h_s, h_p, h_o and h_t, each of dimension 256 after transformation:

    h_e = W_e v_e, h_s = W_s v_s, h_p = W_p v_p, h_o = W_o v_o, h_t = W_t v_t

where W_e, W_s, W_p, W_o and W_t are trainable parameter matrices;
(5-3) sum and average h_e, h_s, h_p and h_o to obtain the context vector h_c; compute the inner product of h_c and h_t, and obtain the output probability through a sigmoid function; the cross-entropy loss serves as the objective function, formalized as:

    p_t = sigmoid(h_c · h_t)
    loss = -y log(p_t) - (1 - y) log(1 - p_t)

where p_t is the output probability of the target word and y is the true label of the sample;
(5-4) optimize the objective function with a stochastic gradient descent algorithm, updating the event feature representation v_e, the parameter matrices W_e, W_s, W_p, W_o and W_t, the subject, predicate and object word vectors v_s, v_p, v_o, and the target word vector v_t.
In an optional embodiment of the present invention, in step (6), based on the two trained models, 2000 contemporaneous sports news texts are selected from the news corpus, and a corresponding event embedding vector is generated for each news text. The specific steps are:
(6-1) construct, following step (3), the positive and negative samples of event element weights and of event element relations for the current text;
(6-2) following step (4), train the event element weight prediction model on the weight samples and update the event embedding vector; during training, all parameters other than the event embedding vector are fixed;
(6-3) following step (5), train the event element relation prediction model on the relation samples and update the event embedding vector; during training, all parameters other than the event embedding vector are fixed.
In an optional embodiment of the present invention, in step (7), the event embedding vectors are used as input to a Single-Pass clustering algorithm to cluster the 2000 news texts, and the clustering quality is compared with that of an event feature representation based on TF-IDF; cosine similarity is chosen as the distance measure, and the similarity threshold is set to 0.8.
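A minimal Single-Pass clustering sketch consistent with this setup (cosine similarity, threshold 0.8); for simplicity each cluster is represented by its first document's normalized vector rather than an updated centroid:

```python
import numpy as np

def single_pass(vectors, threshold=0.8):
    reps, labels = [], []
    for v in vectors:
        v = v / np.linalg.norm(v)                 # cosine via normalized dot
        sims = [float(v @ r) for r in reps]
        if sims and max(sims) >= threshold:
            labels.append(int(np.argmax(sims)))   # join the closest cluster
        else:
            reps.append(v)                        # open a new cluster
            labels.append(len(reps) - 1)
    return labels
```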
Based on the same inventive concept, the chapter-level event embedding device based on the syntactic dependency graph disclosed by the embodiment of the invention comprises a memory, a processor, and a computer program stored on the memory and executable on the processor; when the computer program is loaded into the processor, it implements the above chapter-level event embedding method based on the syntactic dependency graph.
Those of ordinary skill in the art will appreciate that the embodiments described herein are intended to help the reader understand the principles of the present invention, and that the scope of the invention is not limited to these specific statements and embodiments. Various other modifications and combinations can be made based on the teachings of the present disclosure without departing from its spirit, and such modifications and combinations remain within the scope of the present disclosure.

Claims (7)

1. The chapter-level event embedding method based on the syntactic dependency graph is characterized by comprising the following steps:
(1) Acquiring event document corpus, sequentially performing word segmentation, part-of-speech tagging, entity identification, reference resolution and syntactic dependency analysis on each document by using a natural language processing tool, and constructing a vocabulary;
(2) Constructing an initial syntactic dependency graph based on the syntactic dependency analysis result; giving initial weights to nodes in the graph, and iteratively updating weights of all the nodes to generate a final syntactic dependency graph;
(3) Based on the syntactic dependency graph, respectively constructing positive and negative samples of event element weights and positive and negative samples of event element relationships by adopting a negative sampling method; the event element weight sample comprises an event id, a target word and a target word weight, and the event element relation sample comprises an event id, a subject, an object, a predicate, a target word and a label;
(4) Constructing an event element weight prediction model based on a Skip-Gram framework, and training the feature representation of an event and elements thereof by utilizing positive and negative samples of the event element weight;
(5) Constructing an event element relation prediction model based on a CBOW architecture, and training the feature representation of an event and elements thereof by utilizing positive and negative samples of the event element relation;
(6) Generating a corresponding event embedded vector for a newly input text based on the trained event element weight prediction model and the event element relation prediction model; comprising the following steps: (6-1) generating positive and negative samples of the weight of the constructed event element and positive and negative samples of the relation of the event element of the current text according to the step (3); (6-2) training an event element weight prediction model based on the event element weight training samples according to the step (4), and updating an event embedded vector; in the training process, except for event embedding vectors, all other parameters are fixed; (6-3) training an event element relation prediction model based on the event element relation training sample according to the step (5), and updating an event embedded vector; in the training process, except for event embedding vectors, all other parameters are fixed;
(7) Based on the event embedded vector, the event embedded vector is used as input of a machine learning algorithm to carry out event classification or clustering.
2. The chapter level event embedding method based on syntactic dependency according to claim 1, wherein in the step (2), an initial syntactic dependency is constructed according to syntactic dependency analysis result, specifically:
each word serves as a node, and the dependency relations among words define directed edges between the corresponding nodes; identical words are merged into a single node, except for verbs, and all dependency relations of the merged words are retained; the words belonging to one named entity are merged into a single node, the dependency relations among those words are discarded, and all dependency relations between them and other words are retained.
3. The chapter-level event embedding method based on a syntactic dependency graph according to claim 2, wherein in step (2) initial weights are assigned to the nodes in the graph and the weights of all nodes in the initial syntactic dependency graph are updated iteratively to generate the final syntactic dependency graph, with the specific steps of:
(2-1) assigning each node v_i in the syntactic dependency graph an initial weight W_0(v_i), with maximum number of iterations K;
(2-2) updating the weight of each node v_i:

    W_{n+1}(v_i) = f(G, W_n, v_i)

where f is the weight update function, G is the constructed syntactic dependency graph, W_n is the node weight mapping after the n-th iteration, and W_{n+1}(v_i) is the weight of node v_i after the (n+1)-th iteration;
(2-3) if, after an update, the absolute difference |W_{n+1}(v_i) - W_n(v_i)| is smaller than a threshold a for all nodes of the graph, or the number of iterations reaches the maximum, taking the final node weight W(v_i) = W_{n+1}(v_i); otherwise, executing step (2-2).
4. The chapter level event embedding method based on a syntactic dependency graph according to claim 1, wherein in the step (3), based on the syntactic dependency graph, a negative sampling method is adopted to construct positive and negative samples of event element weights and positive and negative samples of event element relationships, respectively, and the specific steps are as follows:
(3-1) constructing positive and negative samples of event element weights: selecting all noun and verb nodes from the syntactic dependency graph according to the part-of-speech tagging result and normalizing their weights to form the regression positive sample set; randomly selecting from the vocabulary L nouns and M verbs that are not in the regression positive sample set and assigning them weight 0 to form the regression negative sample set;
(3-2) constructing positive and negative samples of event element relations: for each verb in the dependency graph, selecting its direct subject and object to form a triple (subject, predicate, object); taking each element of the triple as the target word, replacing it with the designated mask string to construct a positive sample with label 1 and adding it to the classification positive sample set; and, for each positive sample, randomly selecting from the vocabulary N words that share the target word's part of speech but differ from it to replace the target word, constructing N negative samples with label 0 and adding them to the classification negative sample set.
5. The chapter-level event embedding method based on a syntactic dependency graph according to claim 1, wherein in step (4) an event element weight prediction model based on the Skip-Gram architecture is constructed and the feature representations of the event and its elements are trained with the positive and negative samples of event element weights, with the specific steps of:
(4-1) for the event id, obtaining a d-dimensional embedding vector v_e via a lookup table; for the target word, obtaining a k-dimensional word vector v_t with a pre-trained word embedding tool;
(4-2) applying separate linear transformations to v_e and v_t to obtain h_e and h_t of the same dimension:

    h_e = W_e v_e
    h_t = W_t v_t

where W_e and W_t are trainable parameter matrices;
(4-3) computing the inner product of h_e and h_t as the predicted target word weight, the true target word weight being y; the mean squared error serves as the objective function, formalized as:

    u = h_e · h_t
    loss = (y - u)^2

(4-4) optimizing the objective function with a gradient descent algorithm, updating the event embedding v_e, the parameter matrices W_e and W_t, and the target word vector v_t.
6. The chapter-level event embedding method based on a syntactic dependency graph according to claim 1, wherein in step (5) an event element relation prediction model based on the CBOW architecture is constructed and the feature representations of the event and its elements are trained with the positive and negative samples of event element relations, with the specific steps of:
(5-1) for the event id, obtaining a d-dimensional embedding vector v_e via a lookup table; for the subject, predicate and object words and the target word, obtaining k-dimensional word vectors v_s, v_p, v_o and v_t with a pre-trained word embedding tool;
(5-2) applying separate linear transformations to v_e, v_s, v_p, v_o and v_t to obtain h_e, h_s, h_p, h_o and h_t:

    h_e = W_e v_e, h_s = W_s v_s, h_p = W_p v_p, h_o = W_o v_o, h_t = W_t v_t

where W_e, W_s, W_p, W_o and W_t are trainable parameter matrices;
(5-3) summing and averaging h_e, h_s, h_p and h_o to obtain the context vector h_c; computing the inner product of h_c and h_t and obtaining the output probability through a sigmoid function; the cross-entropy loss serves as the objective function, formalized as:

    p_t = sigmoid(h_c · h_t)
    loss = -y log(p_t) - (1 - y) log(1 - p_t)

where p_t is the output probability of the target word and y is the true label of the sample;
(5-4) optimizing the objective function with a gradient descent algorithm, updating the event embedding vector v_e, the parameter matrices W_e, W_s, W_p, W_o and W_t, the subject, predicate and object word vectors, and the target word vector v_t.
7. A chapter-level event embedding device based on a syntactic dependency graph, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, characterized in that the computer program, when loaded into the processor, implements the chapter-level event embedding method based on a syntactic dependency graph according to any one of claims 1-6.
CN202010090488.7A 2020-02-13 2020-02-13 Chapter-level event embedding method and device based on syntactic dependency graph Active CN111274790B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010090488.7A CN111274790B (en) 2020-02-13 2020-02-13 Chapter-level event embedding method and device based on syntactic dependency graph

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010090488.7A CN111274790B (en) 2020-02-13 2020-02-13 Chapter-level event embedding method and device based on syntactic dependency graph

Publications (2)

Publication Number Publication Date
CN111274790A CN111274790A (en) 2020-06-12
CN111274790B true CN111274790B (en) 2023-05-16

Family

ID=71000232

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010090488.7A Active CN111274790B (en) 2020-02-13 2020-02-13 Chapter-level event embedding method and device based on syntactic dependency graph

Country Status (1)

Country Link
CN (1) CN111274790B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111783461A (en) * 2020-06-16 2020-10-16 北京工业大学 Named entity identification method based on syntactic dependency relationship
CN111611409B (en) * 2020-06-17 2023-06-02 中国人民解放军国防科技大学 Case analysis method integrated with scene knowledge and related equipment
CN111738008B (en) * 2020-07-20 2021-04-27 深圳赛安特技术服务有限公司 Entity identification method, device and equipment based on multilayer model and storage medium
CN112036439B (en) * 2020-07-30 2023-09-01 平安科技(深圳)有限公司 Dependency relationship classification method and related equipment
CN113312922B (en) * 2021-04-14 2023-10-24 中国电子科技集团公司第二十八研究所 Improved chapter-level triple information extraction method
CN113515624B (en) * 2021-04-28 2023-07-21 乐山师范学院 Text classification method for emergency news

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10073834B2 (en) * 2016-02-09 2018-09-11 International Business Machines Corporation Systems and methods for language feature generation over multi-layered word representation
CN105930318B (en) * 2016-04-11 2018-10-19 深圳大学 A kind of term vector training method and system
CN108628834B (en) * 2018-05-14 2022-04-15 国家计算机网络与信息安全管理中心 Word expression learning method based on syntactic dependency relationship
CN109815497B (en) * 2019-01-23 2023-04-18 四川易诚智讯科技有限公司 Character attribute extraction method based on syntactic dependency
CN110705612A (en) * 2019-09-18 2020-01-17 重庆邮电大学 Sentence similarity calculation method, storage medium and system with mixed multi-features

Also Published As

Publication number Publication date
CN111274790A (en) 2020-06-12

Similar Documents

Publication Publication Date Title
CN109472024B (en) Text classification method based on bidirectional circulation attention neural network
CN111274790B (en) Chapter-level event embedding method and device based on syntactic dependency graph
CN108363743B (en) Intelligent problem generation method and device and computer readable storage medium
CN108897857B (en) Chinese text subject sentence generating method facing field
CN106980683B (en) Blog text abstract generating method based on deep learning
CN110210037B (en) Syndrome-oriented medical field category detection method
CN110321563B (en) Text emotion analysis method based on hybrid supervision model
CN113239700A (en) Text semantic matching device, system, method and storage medium for improving BERT
CN112347268A (en) Text-enhanced knowledge graph joint representation learning method and device
CN111143576A (en) Event-oriented dynamic knowledge graph construction method and device
CN111177383B (en) Text entity relation automatic classification method integrating text grammar structure and semantic information
CN112232087B (en) Specific aspect emotion analysis method of multi-granularity attention model based on Transformer
CN111078833A (en) Text classification method based on neural network
CN112765952A (en) Conditional probability combined event extraction method under graph convolution attention mechanism
CN113168499A (en) Method for searching patent document
CN111581368A (en) Intelligent expert recommendation-oriented user image drawing method based on convolutional neural network
CN113196277A (en) System for retrieving natural language documents
Ren et al. Detecting the scope of negation and speculation in biomedical texts by using recursive neural network
CN113515632B (en) Text classification method based on graph path knowledge extraction
CN113343690B (en) Text readability automatic evaluation method and device
CN111858898A (en) Text processing method and device based on artificial intelligence and electronic equipment
CN111881256B (en) Text entity relation extraction method and device and computer readable storage medium equipment
CN111753088A (en) Method for processing natural language information
CN115858750A (en) Power grid technical standard intelligent question-answering method and system based on natural language processing
CN113220865B (en) Text similar vocabulary retrieval method, system, medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant