CN110674642A - Semantic relation extraction method for noisy sparse text - Google Patents


Info

Publication number
CN110674642A
Authority
CN
China
Prior art keywords
semantic
participle
vector
layer
sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910806205.1A
Other languages
Chinese (zh)
Other versions
CN110674642B (en)
Inventor
赵翔
庞宁
谭真
郭爱博
殷风景
唐九阳
葛斌
肖卫东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National University of Defense Technology
Priority to CN201910806205.1A
Publication of CN110674642A
Application granted
Publication of CN110674642B
Current legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a semantic relation extraction method for noisy sparse text, comprising the following steps: establishing a training sample set; constructing a semantic relation extraction model; training the semantic relation extraction model; establishing a data set whose semantics are to be extracted; and extracting semantic relations from that data set with the trained model. The method uses different convolutional neural networks to extract features from the word segmentation sequence and from the corresponding dependency path respectively, which avoids error accumulation and clearly outperforms traditional feature-based and kernel-based relation extraction. The two information representations of a relation instance are fully exploited and effectively combined by a feature fusion layer, providing more comprehensive information for accurately predicting the semantic relation of the target entity pair. In addition, a multi-instance learning method is added to suppress noise when samples are sparse; unlike an attention mechanism, it does not suffer from under-fitting and is therefore better suited to semantic relation extraction from sparse samples.

Description

Semantic relation extraction method for noisy sparse text
Technical Field
The invention belongs to the field of extraction of semantic relations of Chinese texts, and particularly relates to a method for extracting entity semantic relations in sparse Chinese texts containing noise.
Background
In recent years, knowledge graphs have played an extremely important role in a series of knowledge-driven applications such as machine translation, recommendation systems and question-answering systems, and relation extraction is a key link in constructing knowledge graphs automatically, with important practical significance. Relation extraction is the process of obtaining the semantic relation of a labeled entity pair by understanding the semantic information contained in unstructured text. Currently, the mainstream relation extraction methods are based on supervision and remote supervision.
To avoid the error accumulation that traditional supervised relation extraction inherits from natural language processing tools, neural networks are widely used to embed and represent text and to extract its semantic features automatically. Supervised methods, however, require explicit manual annotation of text, and the annotation process is time-consuming and labor-intensive. To solve this problem, an alternative paradigm, remote supervision, was proposed: an existing knowledge graph such as Freebase provides the supervision, and text is heuristically aligned with Freebase to generate large amounts of weakly annotated data. Clearly, this heuristic alignment introduces noisy data, which can seriously affect the performance of the relation extractor.
To address wrong annotation, multi-instance learning was proposed to alleviate mislabeling under remote supervision; in addition, a selective attention mechanism with trainable parameters can learn to fit the probability distribution over the noise and dynamically weaken the influence of noise instances. However, when data are sparse, conventional attention mechanisms and multi-instance learning fit the probability distribution over the noisy data poorly, so semantic relation extraction from noisy sparse text remains unsatisfactory. Moreover, existing relation extraction methods are well developed for English corpora, while research on Chinese corpora lags behind.
Disclosure of Invention
In view of the above, the present invention provides a semantic relation extraction method for noisy sparse text, which is used for extracting structured knowledge from unstructured corpora, in particular for extracting semantic relations from noisy sparse Chinese text.
Based on the above purpose, the semantic relation extraction method for noisy sparse text provided by the invention comprises the following steps:
step 1, establishing a Chinese text training sample set;
step 2, constructing a semantic relation extraction model;
step 3, training a semantic relation extraction model;
step 4, establishing a data set of semantics to be extracted;
and step 5, extracting the semantic relation from the data set of the semantics to be extracted by using the trained semantic relation extraction model.
The training sample set consists of data weakly labeled by remotely supervising Wikipedia corpora with a knowledge graph; each training instance comprises a target entity pair, a word segmentation sequence, a dependency path and a weak supervision label;
the dependency path is the shortest dependency path and is defined as: shortest paths between pairs of entities in the syntactic analysis dependency tree.
Furthermore, the semantic relation extraction model comprises an input layer, an embedding layer, a convolution layer, a feature fusion layer and a fully connected layer, which are connected in sequence. The input layer provides the input interface for the instance package formed by all word segmentation sequences of an entity pair and the corresponding dependency paths; the embedding layer maps the input word segmentation sequence and the corresponding dependency path into a low-dimensional vector space by representation learning; the convolution layer consists of two independent convolution networks that respectively extract the semantic features of all participle sequences and all corresponding dependency paths in the instance package; the feature fusion layer fuses the complementary semantic features from the word segmentation sequence and the corresponding dependency path; and the fully connected layer maps the instance onto the defined relation set to obtain the semantic relation between the entity pair.
Furthermore, the semantic relation extraction model also comprises a multi-instance learning mechanism module, which takes data from the fully connected layer, feeds the learning result back to the convolution layer and guides the computation of the convolution layer; during model learning, the module selects the best instance in each instance package as the training and prediction instance, discards the other instances, and thereby suppresses the influence of noise instances.
Specifically, in step 3, the semantic relation extraction model is trained as follows: after initialization, with cross entropy as the loss function, the model parameters are updated iteratively by stochastic gradient descent under the multi-instance learning method; the gradient is evaluated at each iteration to find the optimal weights and biases of every network layer, and after multiple iterations the optimal semantic relation extraction model of this training run is obtained.
Thus, in step 5, the trained semantic relation extraction model is used to extract the semantic relations of noisy Chinese text, obtaining structured knowledge from unstructured text data.
Compared with the prior art, the invention has the following advantages and beneficial effects:
(1) The invention uses different convolutional neural networks to extract the features of the word segmentation sequence and of the corresponding dependency path respectively, automatically generating the embedded representations; this avoids error accumulation and clearly outperforms traditional feature-based and kernel-based relation extraction methods.
(2) The invention fully exploits the two information representations of a relation instance, namely the word segmentation sequence and the dependency path, and combines them effectively through the feature fusion layer, providing more comprehensive information for accurately predicting the semantic relation of the target entity pair.
(3) On top of the model, a multi-instance learning method is added to suppress noise when Chinese samples are sparse; unlike an attention mechanism, it does not suffer from under-fitting and is better suited to semantic relation extraction from sparse samples.
The method thus offers specific solutions to three problems of the prior art, namely that data construction depends on manual labor, that denoising methods under-fit when Chinese samples are sparse, and that semantic information is not fully exploited; it effectively reduces the influence of noise, captures semantic information more fully, predicts relations more accurately, and has high reliability.
Drawings
FIG. 1 is a schematic overall flow chart of an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of the semantic relation extraction model of the present invention.
Detailed Description
The invention is further described with reference to the accompanying drawings, but the invention is not limited in any way, and any alterations or substitutions based on the teaching of the invention are within the scope of the invention.
As shown in fig. 1, a semantic relationship extraction method for noisy sparse text includes the following steps:
step 1, establishing a Chinese text training sample set;
step 2, constructing a semantic relation extraction model;
step 3, training a semantic relation extraction model;
step 4, establishing a data set of semantics to be extracted;
and step 5, extracting the semantic relation from the data set of the semantics to be extracted by using the trained semantic relation extraction model.
The training sample set consists of data weakly labeled by remotely supervising Wikipedia corpora with a knowledge graph; each training instance comprises a target entity pair, a word segmentation sequence, a dependency path and a weak supervision label. For each Chinese text, the entity pairs it contains are predetermined, a word segmentation sequence of the original text is obtained with a word segmentation tool, a syntactic analysis tree is obtained with a syntactic analysis tool, and the dependency path is extracted from that tree. Instances of the same entity pair are put together to form an instance package, preparing the data for the denoising of the subsequent multi-instance learning mechanism. The dependency path is the shortest dependency path, defined as the shortest path between the entity pair in the syntactic dependency tree.
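For concreteness, the following minimal sketch (in Python) shows how the shortest dependency path of one instance could be extracted once the word segmentation tool and syntactic analysis tool have produced the tokens and their dependency heads; the example sentence, the head indices and the helper name are illustrative assumptions, not taken from the patent.

    # Minimal sketch: shortest dependency path between an entity pair,
    # given segmented tokens and dependency-parse head indices.
    import networkx as nx

    def shortest_dependency_path(tokens, heads, e1_idx, e2_idx):
        """tokens: segmented words; heads[i]: head index of token i (-1 = root);
        e1_idx, e2_idx: token positions of the target entity pair."""
        g = nx.Graph()  # treat the dependency tree as an undirected graph
        g.add_nodes_from(range(len(tokens)))
        for i, h in enumerate(heads):
            if h >= 0:
                g.add_edge(i, h)
        path = nx.shortest_path(g, source=e1_idx, target=e2_idx)
        return [tokens[i] for i in path]

    # Hypothetical parsed sentence: "乔布斯 创立 了 苹果公司"
    tokens = ["乔布斯", "创立", "了", "苹果公司"]
    heads = [1, -1, 1, 1]  # "创立" is the root of the dependency tree
    print(shortest_dependency_path(tokens, heads, 0, 3))
    # -> ['乔布斯', '创立', '苹果公司']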
As shown in fig. 2, the semantic relation extraction model includes an input layer, an embedding layer, a convolution layer, a feature fusion layer and a fully connected layer, connected in sequence. The input layer provides the input interface for the instance package formed by all word segmentation sequences of an entity pair and the corresponding dependency paths; the embedding layer maps the input word segmentation sequence and the corresponding dependency path into a low-dimensional vector space by representation learning; the convolution layer consists of two independent convolution networks that respectively extract the semantic features of all participle sequences and all corresponding dependency paths in the instance package; the feature fusion layer fuses the complementary semantic features from the word segmentation sequence and the corresponding dependency path; and the fully connected layer maps the instance onto the defined relation set to obtain the semantic relation between the entity pair.
The semantic relation extraction model also comprises a multi-instance learning mechanism module, which takes data from the fully connected layer, feeds the learning result back to the convolution layer and guides the computation of the convolution layer; during model learning, the module selects the best instance in each instance package as the training and prediction instance, discards the other instances, and thereby suppresses the influence of noise instances.
Specifically, the input layer provides the input interface for the instance package composed of all word segmentation sequences of an entity pair and the corresponding dependency paths. In this embodiment the number of input interfaces is 2, corresponding to the word segmentation sequence and the dependency path respectively, and the input of each instance is defined as:

x = {x_1, x_2, …, x_m},  s = {s_1, s_2, …, s_n}

where x denotes the input word segmentation sequence, x_i the ith participle in the word segmentation sequence, s the input dependency path, and s_i the ith participle on the dependency path; m and n are set to the fixed values 100 and 40 in this embodiment.
Specifically, the embedding layer maps the input word segmentation sequence and the corresponding dependency path into a low-dimensional vector space by representation learning: each participle on the input word segmentation sequence and the dependency path is mapped to a vector representation. In this embodiment the vector representation of each participle consists of a word vector, a position vector and a part-of-speech tagging vector. The word vector, of dimension 50, is pre-trained with the Word2Vec algorithm and carries the semantic information of the participle; the position vector, of dimension 10, is randomly initialized and carries the position of the participle within the word segmentation sequence or the dependency path; the part-of-speech tagging vector, of dimension 15, is a unit (one-hot) vector carrying the part-of-speech information of the participle. Any participle in the word segmentation sequence or the dependency path can therefore be represented by the vector

w_i = [v_word : v_position : v_tag]

where v_word, v_position and v_tag denote the word vector, the position vector and the part-of-speech tagging vector of the participle respectively; the dimension of w_i is k, which in this embodiment is 75.
The participle vector representations are concatenated horizontally in the order of the word segmentation sequence and of the dependency path, giving the vector representations of the two inputs:

X = [w_1^x : w_2^x : … : w_m^x]
S = [w_1^s : w_2^s : … : w_n^s]

where X denotes the vector representation of the word segmentation sequence after the embedding layer, w_i^x the vector representation of the ith participle in the word segmentation sequence, S the vector representation of the dependency path after the embedding layer, and w_i^s the vector representation of the ith participle on the dependency path.
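The following minimal PyTorch sketch illustrates this embedding layer under the stated dimensions (50 + 10 + 15 = 75); the class and parameter names are illustrative, and the randomly initialized word table merely stands in for the Word2Vec vectors that the patent pre-trains.

    # Minimal sketch of the embedding layer: word vector (50) + position
    # vector (10) + one-hot part-of-speech vector (15) = 75 per participle.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ParticipleEmbedding(nn.Module):
        def __init__(self, vocab_size, max_len=100, n_pos_tags=15):
            super().__init__()
            self.word = nn.Embedding(vocab_size, 50)   # Word2Vec weights in practice
            self.position = nn.Embedding(max_len, 10)  # randomly initialized
            self.n_pos_tags = n_pos_tags

        def forward(self, word_ids, pos_ids, tag_ids):
            tag_onehot = F.one_hot(tag_ids, num_classes=self.n_pos_tags).float()
            # w_i = [v_word : v_position : v_tag], dimension k = 75
            return torch.cat([self.word(word_ids),
                              self.position(pos_ids),
                              tag_onehot], dim=-1)

    emb = ParticipleEmbedding(vocab_size=20000)
    X = emb(torch.randint(0, 20000, (1, 100)),  # word ids of a length-m sequence
            torch.arange(100).unsqueeze(0),     # positions 0 .. m-1
            torch.randint(0, 15, (1, 100)))     # part-of-speech tag ids
    print(X.shape)  # torch.Size([1, 100, 75])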
The convolution layer consists of two independent convolution networks, which extract the semantic features of all participle sequences and of all corresponding dependency paths in the instance package respectively. Since the two convolution networks share the same operation mechanism, the definition and operation of this layer are illustrated only on the word segmentation sequence. To obtain more useful information from the data, each convolution network is equipped with multiple convolution filters, denoted

F = {f_1, f_2, …, f_d},  f_i ∈ R^{w×k}

In this embodiment the number of convolution filters d is set to 230 and the window size w to 3. The convolution operation is defined as

c_ij = f_i · s_{j:j+w-1},  1 ≤ i ≤ d,  1 ≤ j ≤ m − w + 1

where f_i is the ith convolution filter, s_{j:j+w-1} is the horizontal concatenation of the jth through (j+w−1)th participle vector representations, and · denotes the matrix dot-product operation. Each convolution filter finally produces an intermediate feature vector c_i = {c_i1, c_i2, …, c_i(m−w+1)}, so the sequence of intermediate feature vectors produced by all convolution filters is C = {c_1, c_2, …, c_d}. After convolution, max pooling is used to extract the most significant feature in each dimension, defined as

p_i = max_j(c_ij)

where c_ij is the element at the corresponding position of C. This finally produces the feature vector p^x ∈ R^d of each participle sequence; similarly, a feature vector p^s ∈ R^d is produced for each dependency path.
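A minimal PyTorch sketch of one such convolution network, with d = 230 filters and window w = 3, followed by max pooling over time; the class name and the random inputs are illustrative.

    # Minimal sketch of one convolution network of the convolution layer.
    import torch
    import torch.nn as nn

    class ConvEncoder(nn.Module):
        def __init__(self, k=75, d=230, w=3):
            super().__init__()
            self.conv = nn.Conv1d(in_channels=k, out_channels=d, kernel_size=w)

        def forward(self, X):                 # X: (batch, m, k) embedded input
            c = self.conv(X.transpose(1, 2))  # C: (batch, d, m - w + 1)
            p, _ = c.max(dim=2)               # max pooling over each feature map
            return p                          # feature vector, (batch, d)

    enc_seq, enc_dep = ConvEncoder(), ConvEncoder()  # two independent networks
    p_x = enc_seq(torch.randn(4, 100, 75))  # participle-sequence features
    p_s = enc_dep(torch.randn(4, 40, 75))   # dependency-path features
    print(p_x.shape, p_s.shape)             # (4, 230) (4, 230)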
The feature fusion layer fuses the complementary semantic features from the word segmentation sequence and the corresponding dependency path; essentially, it is a weighted sum of the feature vectors of the two, defined as p = α·p^x + (1 − α)·p^s, where α is the weight parameter, set to 0.5 in this embodiment.
The fully connected layer maps the instance onto the defined relation set to obtain the semantic relation between the entity pair, defined as

o = U·p + v

where U ∈ R^{n_r×d} is the coefficient matrix, v ∈ R^{n_r} is the bias, and o ∈ R^{n_r} contains the confidence scores of all relation types, n_r being the number of relations, set to 5 in this embodiment. The relation with the highest confidence score is taken as the semantic relation between the entity pair.
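A minimal sketch of the feature fusion layer and the fully connected layer under the stated settings (α = 0.5, n_r = 5); the random tensors stand in for the outputs p^x and p^s of the two convolution networks above.

    # Minimal sketch of feature fusion and the fully connected layer.
    import torch
    import torch.nn as nn

    p_x, p_s = torch.randn(4, 230), torch.randn(4, 230)  # stand-in features

    alpha = 0.5
    p = alpha * p_x + (1 - alpha) * p_s  # p = alpha * p^x + (1 - alpha) * p^s

    fc = nn.Linear(230, 5)               # o = U p + v, U in R^{5 x 230}
    o = fc(p)                            # confidence scores over relation types
    print(o.argmax(dim=1))               # highest-scoring relation per instance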
The multi-instance learning mechanism module selects the best instance in each instance package as the training and prediction instance during model learning, discards the other instances, and thereby suppresses the influence of noise instances. The training data consist of a series of instance packages, denoted B = {B_1, B_2, …, B_N}, where any instance package B_i contains |B_i| instances. Under this mechanism, the loss function is defined as

J(θ) = −Σ_{i=1}^{N} log p(r_i | b_i^k; θ),  p(r | b^k; θ) = exp(o_kr) / Σ_{j=1}^{n_r} exp(o_kj)

where b_i^k is the best instance selected from instance package B_i, o_kr is the confidence score of that instance for the corresponding relation r, the denominator sums the exponentiated confidence scores o_kj of that instance over all relations j, and θ denotes all parameters of the model. The update principle of θ is

θ ← θ − η · ∂J(θ)/∂θ

where η is the learning rate.
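A minimal sketch of this best-instance objective as reconstructed above: within a bag, only the instance whose score for the bag's weak label is highest contributes to the cross-entropy loss, and the other instances are discarded; the function name and the random scores are illustrative.

    # Minimal sketch of the multi-instance (best-instance) loss for one bag.
    import torch
    import torch.nn.functional as F

    def bag_loss(o, r):
        """o: (|B_i|, n_r) confidence scores of one bag; r: weak relation label."""
        log_p = F.log_softmax(o, dim=1)  # log p(j | b) = o_kj - log sum_j exp(o_kj)
        k = log_p[:, r].argmax()         # select the best instance b^k for r
        return -log_p[k, r]              # -log p(r | b^k; theta)

    o = torch.randn(7, 5, requires_grad=True)  # a bag of 7 instances, n_r = 5
    loss = bag_loss(o, 2)
    loss.backward()  # gradients feed the stochastic gradient descent update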
Therefore, in step 3, the semantic relation extraction model is trained as follows: after initialization, with cross entropy as the loss function, the model parameters are updated iteratively by stochastic gradient descent under the multi-instance learning method; the gradient is evaluated at each iteration to find the optimal weights and biases of every network layer, and after multiple iterations the optimal semantic relation extraction model of this training run is obtained.
Because the model is trained by stochastic gradient descent under different initialization conditions, the prediction results differ from run to run; the predictions of models trained under different initializations can therefore be statistically averaged and taken as the output of the whole system, finally yielding the semantic relation prediction system.
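A minimal sketch of this statistical averaging; `models` would hold the same architecture trained under different random initializations, and the two-argument call signature is an assumption made for illustration.

    # Minimal sketch: average the predicted distributions of several models.
    import torch
    import torch.nn.functional as F

    def ensemble_predict(models, X, S):
        probs = [F.softmax(m(X, S), dim=1) for m in models]  # per-model outputs
        return torch.stack(probs).mean(dim=0).argmax(dim=1)  # averaged prediction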
Specifically, the steps of training the semantic relation extraction model are as follows:
step 301, writing the instance packages in the training sample data set into a data file whose data format conforms to the data-reading interface of the semantic relation extraction model;
step 302, setting training parameters: reading a file path, iteration times and a learning rate, setting the dimension and size of each network layer, and setting an initial training weight and a training bias;
step 303, loading a training file: loading a training set consisting of a semantic relation extraction model definition file, a network layer parameter definition file and training data;
step 304, iteratively updating the semantic relation extraction model by stochastic gradient descent under the multi-instance learning method, checking the gradient at each iteration to find the optimal weights and biases of every network layer, and iterating multiple times to obtain the optimal semantic relation extraction model of this training run (a minimal sketch of this loop follows step 305);
and step 305, taking 30% of the data in the sample set as a test sample set, preprocessing it in the same way as the training sample set, and testing the data in the test sample set with the obtained semantic relation prediction system.
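A minimal self-contained sketch of the loop of steps 302 to 304, with a linear scorer and random bags standing in for the full pipeline and the real instance packages; all hyperparameters are illustrative.

    # Minimal sketch of training with SGD and best-instance selection.
    import torch
    import torch.nn.functional as F

    torch.manual_seed(0)
    n_r, feat = 5, 230
    model = torch.nn.Linear(feat, n_r)                 # stand-in scorer
    opt = torch.optim.SGD(model.parameters(), lr=0.1)  # eta = 0.1

    bags = [(torch.randn(torch.randint(2, 8, ()).item(), feat),
             torch.randint(0, n_r, ()).item()) for _ in range(20)]

    for epoch in range(5):                             # iteration count illustrative
        for feats, label in bags:
            log_p = F.log_softmax(model(feats), dim=1) # (|B_i|, n_r)
            k = log_p[:, label].argmax()               # best instance in the bag
            loss = -log_p[k, label]                    # cross entropy on it
            opt.zero_grad()
            loss.backward()
            opt.step()                                 # theta <- theta - eta * grad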
Existing relation extraction methods are well developed for English corpora, while research on Chinese corpora lags behind. Existing supervised relation extraction relies on manually annotated data, and the manual annotation process is time-consuming and labor-intensive; to address this, the invention adopts remote supervision, heuristically aligning unlabeled text with a knowledge graph to generate weakly annotated data automatically. Existing remote-supervision-based methods generally use an attention mechanism to suppress the influence of wrongly labeled instances on the extraction result; in essence, the attention mechanism learns the probability distribution over the noisy data from a large amount of data so as to remove noise dynamically. In fact, knowledge graphs in the Chinese domain are developing slowly and are small in scale, so the training data constructed by remote supervision are relatively few and insufficient for an attention mechanism to fit fully; addressing this under-fitting, the invention adopts the multi-instance learning method, a mechanism that needs no learned parameters and is better suited to sparse samples. In addition, current relation extraction methods take a single input, either the word sequence or the dependency path; in fact the two are complementary, the word sequence supplying information the dependency path lacks and the dependency path removing noise participles from the word sequence. The invention uses a knowledge graph of the Chinese entertainment domain and weakly labeled data constructed from the Chinese Wikipedia, and combines the above improvements after preprocessing such as word segmentation and syntactic analysis, thereby solving the existing problems.
The above embodiment is one implementation of the method on noisy sparse Chinese text, but the implementation of the present invention is not limited to it; any other change, modification, substitution, combination or simplification that does not depart from the spirit and principle of the present invention shall be regarded as an equivalent substitution and is included within the scope of the present invention.

Claims (9)

1. A semantic relation extraction method for noisy sparse text is characterized by comprising the following steps:
step 1, establishing a Chinese text training sample set;
step 2, constructing a semantic relation extraction model;
step 3, training a semantic relation extraction model;
step 4, establishing a data set of semantics to be extracted;
and step 5, extracting the semantic relation from the data set of the semantics to be extracted by using the trained semantic relation extraction model.
The training sample set consists of data weakly labeled by remotely supervising Wikipedia corpora with a knowledge graph; each training instance comprises a target entity pair, a word segmentation sequence, a dependency path and a weak supervision label;
the dependency path is the shortest dependency path, defined as the shortest path between the entity pair in the syntactic dependency tree.
2. The semantic relation extraction method according to claim 1, wherein the semantic relation extraction model comprises an input layer, an embedding layer, a convolution layer, a feature fusion layer and a fully connected layer, which are connected in sequence; the input layer provides the input interface for the instance package composed of all word segmentation sequences of a certain entity pair and the corresponding dependency paths; the embedding layer maps the input word segmentation sequence and the corresponding dependency path into a low-dimensional vector space by representation learning; the convolution layer consists of two independent convolution networks that respectively extract the semantic features of all participle sequences and all corresponding dependency paths in the instance package; the feature fusion layer fuses the complementary semantic features from the word segmentation sequence and the corresponding dependency path; and the fully connected layer maps the instance onto the defined relation set to obtain the semantic relation between the entity pair.
3. The semantic relation extraction method according to claim 2, wherein the semantic relation extraction model further comprises a multi-instance learning mechanism module, which takes data from the fully connected layer, feeds the learning result back to the convolution layer and guides the computation of the convolution layer; during model learning, the module selects the best instance in each instance package as the training and prediction instance, discards the other instances, and thereby suppresses the influence of noise instances.
4. The semantic relation extraction method according to claim 3, wherein the semantic relation extraction model is trained as follows: after initialization, with cross entropy as the loss function, the model parameters are updated iteratively by stochastic gradient descent under the multi-instance learning method; the gradient is evaluated at each iteration to find the optimal weights and biases of every network layer, and after multiple iterations the optimal semantic relation extraction model of this training run is obtained.
5. The semantic relation extraction method according to claim 2 or 3, wherein the number of input interfaces of the input layer is 2, the input interfaces corresponding to the word segmentation sequence and the dependency path respectively, and the input of each instance is defined as follows:

x = {x_1, x_2, …, x_m},  s = {s_1, s_2, …, s_n}

where x denotes the input word segmentation sequence, x_i the ith participle in the word segmentation sequence, s the input dependency path, and s_i the ith participle on the dependency path;

the embedding layer maps each participle on the input word segmentation sequence and the dependency path to a vector representation consisting of a word vector, a position vector and a part-of-speech tagging vector, wherein the word vector is pre-trained with the Word2Vec algorithm and carries the semantic information of the participle, the position vector is randomly initialized and carries the position information of the participle within the word segmentation sequence or the dependency path, and the part-of-speech tagging vector is a unit vector carrying the part-of-speech information of the participle; any participle in the word segmentation sequence or the dependency path can be represented by the vector w_i = [v_word : v_position : v_tag], where v_word, v_position and v_tag denote the word vector, the position vector and the part-of-speech tagging vector of the participle respectively, and the dimension of w_i is k;

the participle vector representations are concatenated horizontally in the order of the word segmentation sequence and of the dependency path, giving the vector representations

X = [w_1^x : w_2^x : … : w_m^x],  S = [w_1^s : w_2^s : … : w_n^s]

where X denotes the vector representation of the word segmentation sequence after the embedding layer, w_i^x the vector representation of the ith participle in the word segmentation sequence, S the vector representation of the dependency path after the embedding layer, and w_i^s the vector representation of the ith participle on the dependency path.
6. The semantic relation extraction method according to claim 5, wherein the two independent convolution networks of the convolution layer share the same operation mechanism, each convolution network being equipped with multiple convolution filters, denoted F = {f_1, f_2, …, f_d} with f_i ∈ R^{w×k}, where the number of convolution filters is d and the window size is w; the convolution operation is defined as

c_ij = f_i · s_{j:j+w-1},  1 ≤ i ≤ d,  1 ≤ j ≤ m − w + 1

where f_i is the ith convolution filter, s_{j:j+w-1} is the horizontal concatenation of the jth through (j+w−1)th participle vector representations, and · denotes the matrix dot-product operation; each convolution filter finally produces an intermediate feature vector c_i = {c_i1, c_i2, …, c_i(m−w+1)}, and the sequence of intermediate feature vectors produced by all convolution filters is C = {c_1, c_2, …, c_d}; max pooling, used to extract the most significant feature in each dimension, is defined as

p_i = max_j(c_ij)

where c_ij is the element at the corresponding position of C, finally producing the feature vector p^x of each participle sequence and, likewise, the feature vector p^s of each dependency path.
7. The semantic relation extraction method according to claim 6, wherein the feature fusion layer computes a weighted sum of the feature vectors from the word segmentation sequence and the corresponding dependency path, defined as p = α·p^x + (1 − α)·p^s, where α is the weight parameter, p^s is the feature vector of each dependency path, and p^x is the feature vector of each participle sequence.
8. The semantic relation extraction method according to claim 7, wherein the fully connected layer maps the instance onto the defined relation set to obtain the semantic relation between the entity pair, defined as

o = U·p + v

where U ∈ R^{n_r×d} is the coefficient matrix, v ∈ R^{n_r} is the bias, and o ∈ R^{n_r} contains the confidence scores of all relation types, n_r being the number of relations; the relation with the highest confidence score is taken as the semantic relation between the entity pair.
9. The semantic relation extraction method according to claim 8, wherein the training data in the multi-instance learning mechanism module consist of a series of instance packages, denoted B = {B_1, B_2, …, B_N}, any instance package B_i containing |B_i| instances; under this mechanism, the loss function is defined as

J(θ) = −Σ_{i=1}^{N} log p(r_i | b_i^k; θ),  p(r | b^k; θ) = exp(o_kr) / Σ_{j=1}^{n_r} exp(o_kj)

where b_i^k is the best instance selected from instance package B_i, o_kr is the confidence score of that instance for the corresponding relation r, and θ denotes all parameters of the model; θ is updated according to

θ ← θ − η · ∂J(θ)/∂θ

where η is the learning rate; the process of training the semantic relation extraction model is as follows: after initialization, with cross entropy as the loss function, the model parameters of the semantic relation extraction model are updated iteratively by stochastic gradient descent under the multi-instance learning method, the gradient is evaluated at each iteration to find the optimal weights and biases of every network layer, and after multiple iterations the optimal semantic relation extraction model of this training run is obtained.
CN201910806205.1A, filed 2019-08-29 (priority date 2019-08-29): Semantic relation extraction method for noisy sparse text; granted as CN110674642B, legal status Active.

Priority Applications (1)

CN201910806205.1A (granted as CN110674642B): Semantic relation extraction method for noisy sparse text


Publications (2)

CN110674642A, published 2020-01-10
CN110674642B, published 2023-04-18

Family

Family ID: 69076445

Family Applications (1)

CN201910806205.1A: Semantic relation extraction method for noisy sparse text (priority date 2019-08-29, filing date 2019-08-29, status Active)

Country Status (1)

CN: CN110674642B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2009017464A1 (en) * 2007-07-31 2009-02-05 Agency For Science, Technology And Research Relation extraction system
US20190005026A1 (en) * 2016-10-28 2019-01-03 Boe Technology Group Co., Ltd. Information extraction method and apparatus
US20190122145A1 (en) * 2017-10-23 2019-04-25 Baidu Online Network Technology (Beijing) Co., Ltd. Method, apparatus and device for extracting information
CN109408642A (en) * 2018-08-30 2019-03-01 昆明理工大学 A kind of domain entities relation on attributes abstracting method based on distance supervision
CN109783799A (en) * 2018-12-13 2019-05-21 杭州电子科技大学 A kind of relationship extracting method based on semantic dependency figure

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111753546A (en) * 2020-06-23 2020-10-09 深圳市华云中盛科技股份有限公司 Document information extraction method and device, computer equipment and storage medium
CN111753546B (en) * 2020-06-23 2024-03-26 深圳市华云中盛科技股份有限公司 Method, device, computer equipment and storage medium for extracting document information
CN113392216A (en) * 2021-06-23 2021-09-14 武汉大学 Remote supervision relation extraction method and device based on consistency text enhancement
CN113392216B (en) * 2021-06-23 2022-06-17 武汉大学 Remote supervision relation extraction method and device based on consistency text enhancement
CN117095825A (en) * 2023-10-20 2023-11-21 鲁东大学 Human immune state prediction method based on multi-instance learning
CN117095825B (en) * 2023-10-20 2024-01-05 鲁东大学 Human immune state prediction method based on multi-instance learning

Also Published As

Publication number Publication date
CN110674642B (en) 2023-04-18

Similar Documents

Publication Publication Date Title
CN110134757B (en) Event argument role extraction method based on multi-head attention mechanism
CN110633467B (en) Semantic relation extraction method based on improved feature fusion
WO2018028077A1 (en) Deep learning based method and device for chinese semantics analysis
WO2020211720A1 (en) Data processing method and pronoun resolution neural network training method
CN111325029B (en) Text similarity calculation method based on deep learning integrated model
CN111709242B (en) Chinese punctuation mark adding method based on named entity recognition
CN111274394A (en) Method, device and equipment for extracting entity relationship and storage medium
CN110674642B (en) Semantic relation extraction method for noisy sparse text
CN111966812B (en) Automatic question answering method based on dynamic word vector and storage medium
CN112270196A (en) Entity relationship identification method and device and electronic equipment
CN112905795A (en) Text intention classification method, device and readable medium
CN112507039A (en) Text understanding method based on external knowledge embedding
CN110968725B (en) Image content description information generation method, electronic device and storage medium
CN113255320A (en) Entity relation extraction method and device based on syntax tree and graph attention machine mechanism
CN113434683B (en) Text classification method, device, medium and electronic equipment
CN107357785A (en) Theme feature word abstracting method and system, feeling polarities determination methods and system
CN111475622A (en) Text classification method, device, terminal and storage medium
WO2023137911A1 (en) Intention classification method and apparatus based on small-sample corpus, and computer device
CN111709225B (en) Event causal relationship discriminating method, device and computer readable storage medium
CN115374845A (en) Commodity information reasoning method and device
CN114880307A (en) Structured modeling method for knowledge in open education field
CN113761875B (en) Event extraction method and device, electronic equipment and storage medium
CN114091406A (en) Intelligent text labeling method and system for knowledge extraction
CN113779966A (en) Mongolian emotion analysis method of bidirectional CNN-RNN depth model based on attention
US20240028828A1 (en) Machine learning model architecture and user interface to indicate impact of text ngrams

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant