CN112860882B - Book concept front-rear order relation extraction method based on neural network - Google Patents


Info

Publication number
CN112860882B
Authority
CN
China
Prior art keywords
concept
chapter
word
nodes
node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110061782.XA
Other languages
Chinese (zh)
Other versions
CN112860882A (en)
Inventor
鲁伟明
贾程皓
庄越挺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU filed Critical Zhejiang University ZJU
Priority to CN202110061782.XA
Publication of CN112860882A
Application granted
Publication of CN112860882B
Legal status: Active
Anticipated expiration

Classifications

    • G06F16/345 — Physics; Computing; Electric digital data processing; Information retrieval of unstructured textual data; Browsing/visualisation; Summarisation for human users
    • G06F40/205 — Physics; Computing; Electric digital data processing; Handling natural language data; Natural language analysis; Parsing
    • G06F40/30 — Physics; Computing; Electric digital data processing; Handling natural language data; Semantic analysis
    • G06N3/045 — Physics; Computing; Computing arrangements based on specific computational models; Biological models; Neural networks; Architecture; Combinations of networks
    • G06N3/08 — Physics; Computing; Computing arrangements based on specific computational models; Biological models; Neural networks; Learning methods

Abstract

The invention discloses a neural-network-based method for extracting the front-rear order (i.e., prerequisite) relations between concepts in books. First, a graph structure containing concept nodes and chapter nodes is constructed from the book text, and features between concept pairs are extracted from the text. The graph structure and the concept-pair features are then used to train a neural network model, forming a concept front-rear order relation extractor. Finally, a book whose concept front-rear order relations are to be extracted is converted into the same kind of graph structure and fed to the extractor, which outputs the front-rear order relations between the concepts in the book. The method has good extensibility: for a new field, only a small number of books in that field are needed to construct the graph structure, extract the concept-pair features, and train the extractor on these inputs, enabling automatic judgment of the front-rear order relations between concepts in different fields.

Description

Book concept front-rear order relation extraction method based on neural network
Technical Field
The invention relates to the field of book concept extraction, in particular to a book concept front-rear order relation extraction method based on a neural network.
Background
With the spread of Internet educational resources, people now have access to very rich educational materials, especially book resources. Any single field typically involves many concepts. To improve learning efficiency and optimize learning paths, artificial intelligence techniques can assist in extracting the front-rear order relations among concepts, helping people understand the dependency relations among the concepts of a field more quickly and thoroughly.
A book text, however, has many chapters and contains many concept words. The relationships between concepts and chapters are varied and complex, and they carry rich information. In addition, there are many statistical features between concepts that can be used to help determine their front-rear order relations.
In view of the above, on the one hand a graph structure is constructed from the book text, a graph convolution network is used to integrate the interactions between concepts and between chapters, and a twin network is used to predict the front-rear order relations of concepts. On the other hand, statistical features between concepts are also extracted and used to help judge the front-rear order relations between concepts.
Disclosure of Invention
The invention aims to provide a book concept front-rear order relation extraction method based on a neural network, so that people can understand the dependency relations between the concepts of a field more quickly and easily.
The technical scheme adopted by the invention for solving the technical problems is as follows: a book concept front-rear order relation extraction method based on a neural network comprises the following steps:
1) Construction of the graph structure: based on the book text, a graph structure containing concept nodes and chapter nodes is constructed. The concepts and chapters in the book text are taken as the vertices of the graph structure, and the PMI value between each concept pair, the TF-IDF value of each concept in each chapter, and the distance between the chapters of each chapter pair in the book are respectively calculated as the weights of the edges between the corresponding vertices, yielding the graph structure.
2) Extraction of concept-pair features: semantic and structural features between concept pairs are extracted from the book text and expressed as a feature vector for the subsequent training of the neural network model.
3) Constructing a neural network model: the graph structure obtained in step 1) and the feature vectors extracted in step 2) serve as the inputs of the neural network model. First, graph convolution operations are performed to obtain the hidden-layer vectors of the chapter nodes and concept nodes. The hidden-layer vectors of each pair of concept nodes are input into a twin network, the prediction probability that the pair of concept nodes has a front-rear order relation is obtained with a Sigmoid function, and the cross entropy of this prediction probability and the true label of the concept pair's front-rear order relation is calculated as the first partial loss function. The feature vector of each concept pair extracted in step 2) is passed through a fully connected layer, the prediction probability that the pair of concepts has a front-rear order relation is obtained with a Sigmoid function, and the cross entropy of this prediction probability and the true label is calculated as the second partial loss function. The hidden-layer vectors of each pair of chapter nodes are input into the twin network, the prediction probability that the pair of chapters has a front-rear order relation is obtained with a Sigmoid function, and the cross entropy of this prediction probability and the true label of the chapter pair's front-rear order relation is calculated as the third partial loss function. The three partial loss functions are weighted and summed to obtain the loss function of the neural network model.
4) Extracting concept front-rear order relations: the neural network model constructed in step 3) is trained to obtain an extractor of concept front-rear order relations. A book text is preprocessed by word segmentation, stop-word removal and the like; the graph structure corresponding to the book text is then constructed and the feature vectors between its concept pairs are extracted; these are input into the extractor, which judges and outputs whether each concept pair of the book text has a front-rear order relation.
Further, the step 1) is specifically as follows:
First, the book text is preprocessed by word segmentation, stop-word removal and the like. Then, a graph structure is constructed from the preprocessed text. The graph contains two types of nodes: concept nodes and chapter nodes. For a concept node, the pre-trained word vector of the concept word is used as its initial feature vector; for a chapter node, the average word vector of all words retained in the chapter is used as its initial feature vector. The weights of the edges between nodes in the graph structure are defined as follows: if node i and node j are both concept nodes, the weight of the edge is the PMI value between them; if node i and node j are the same node, the weight of the edge is defined as 1. PMI is defined as follows:
$$\mathrm{PMI}(i,j)=\log\frac{p(i,j)}{p(i)\,p(j)}$$
$$p(i,j)=\frac{\#W(i,j)}{\#W}$$
$$p(i)=\frac{\#W(i)}{\#W}$$
where #W(i,j) denotes the number of sliding windows that contain the concept words of both node i and node j, #W(i) denotes the number of sliding windows that contain the concept word of node i, and #W denotes the total number of sliding windows in the book text. If node i is a concept node and node j is a chapter node, the weight of the edge is the TF-IDF value of the concept in the chapter text. If node i and node j are both chapter nodes, the weight of the edge is defined as:
$$A_{ij}=1-\frac{|d_i-d_j|}{M}$$
where $d_i$ denotes the serial number of the chapter represented by chapter node i in the book directory, and M denotes the number of chapters of the book.
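For concreteness, the concept-concept edge weights can be computed as in the following sketch. This is a minimal illustration, assuming `chapters` is a list of token lists after segmentation and stop-word removal and `concepts` is the set of concept words; all function and variable names here are illustrative, not taken from the patent.

```python
from collections import Counter
from itertools import combinations
import math

def pmi_edges(chapters, concepts, window=20):
    """Compute PMI(i, j) over fixed-size sliding windows."""
    total_windows = 0
    single = Counter()   # #W(i): windows containing concept i
    pair = Counter()     # #W(i, j): windows containing both i and j
    for tokens in chapters:
        for start in range(max(1, len(tokens) - window + 1)):
            total_windows += 1
            seen = {t for t in tokens[start:start + window] if t in concepts}
            for c in seen:
                single[c] += 1
            for a, b in combinations(sorted(seen), 2):
                pair[(a, b)] += 1
    edges = {}
    for (a, b), n_ab in pair.items():
        # PMI(i, j) = log( p(i, j) / (p(i) * p(j)) )
        pmi = math.log((n_ab / total_windows) /
                       ((single[a] / total_windows) * (single[b] / total_windows)))
        if pmi > 0:   # keeping only positively associated pairs is a common convention
            edges[(a, b)] = pmi
    return edges
```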
Further, the step 2) is specifically as follows:
features between pairs of concepts are extracted. For each pair of concepts, the following features are calculated:
Suffix feature: if concept $c_i$ is a suffix of concept $c_j$ in the book text, the value is recorded as 1; otherwise it is recorded as 0.
TOC distance feature:
$$\mathrm{Toc}(a,b)=\beta^{\,k}\left(a_k-b_k\right)$$
where a and b denote two concept words and $a_k$ denotes the k-th level serial number of the directory entry in which concept word a is located. In the formula, k is the smallest level index such that $a_k\neq b_k$, and $\beta$ is a decay parameter.
Chapter-RefD feature:
$$\mathrm{Crw}(c_i,c_j)=\frac{\sum_{o\in D} f(c_i,o)\,r(o,c_j)}{\sum_{o\in D} f(c_i,o)}$$
$$\mathrm{Crd}(c_i,c_j)=\mathrm{Crw}(c_j,c_i)-\mathrm{Crw}(c_i,c_j)$$
where o denotes a chapter, $c_i$ denotes the i-th concept word in the text, $f(c_i,o)$ denotes the frequency with which concept word $c_i$ occurs in chapter o, and $r(o,c_j)$ indicates whether concept word $c_j$ appears in chapter o: the value is 1 if it appears and 0 otherwise.
Wiki-RefD feature:
$$\mathrm{Wrw}(c_i,c_j)=\frac{\sum_{w} f(c_i,w)\,r(w,c_j)}{\sum_{w} f(c_i,w)}$$
$$\mathrm{Wrd}(c_i,c_j)=\mathrm{Wrw}(c_j,c_i)-\mathrm{Wrw}(c_i,c_j)$$
where w denotes a Wikipedia document, $f(c_i,w)$ denotes the frequency with which concept word $c_i$ occurs in document w, and $r(w,c_j)$ indicates whether concept word $c_j$ appears in document w: the value is 1 if it appears and 0 otherwise.
Complexity feature: the complexity feature consists of two parts:
$$\mathrm{CldFrequency}(c_i,c_j)=\mathrm{ava}(c_i)-\mathrm{ava}(c_j),\qquad \mathrm{ava}(c_i)=\frac{|I(D,c_i)|}{|D|}$$
$$\mathrm{CldDistribution}(c_i,c_j)=\mathrm{KL}\!\left(P(c_i)\,\|\,P(c_j)\right)$$
where |D| denotes the total number of chapters in the book text D, $|I(D,c_i)|$ denotes the number of chapters of D in which concept word $c_i$ appears, $P(c_i)$ denotes the probability distribution of concept word $c_i$ over the chapters of the book text D, and $\mathrm{KL}(P(c_i)\,\|\,P(c_j))$ denotes the KL distance between concept words $c_i$ and $c_j$.
The feature values computed for each pair of concept words are concatenated to obtain the inter-word feature vector of the pair.
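As an illustration of the Chapter-RefD computation under the reconstruction above, the following sketch derives Crw and Crd from chapter-level counts; the data layout (`freq[c][o]` as an occurrence count) is an assumption for illustration, not part of the patent.

```python
def crw(freq, ci, cj, chapters):
    # Chapter-weighted fraction of ci's occurrences falling in chapters
    # where cj also appears: r(o, cj) = 1 iff freq[cj][o] > 0.
    num = sum(freq[ci][o] * (1 if freq[cj][o] > 0 else 0) for o in chapters)
    den = sum(freq[ci][o] for o in chapters)
    return num / den if den else 0.0

def crd(freq, ci, cj, chapters):
    # Asymmetry of the two reference directions: Crw(cj, ci) - Crw(ci, cj).
    return crw(freq, cj, ci, chapters) - crw(freq, ci, cj, chapters)
```

The Wiki-RefD feature follows the same pattern with Wikipedia documents in place of chapters.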
Further, the step 3) is specifically as follows:
3.1) The graph structure obtained in step 1) is input into the neural network model, convolution is performed with a relational graph convolution network, and the information of each node's neighbor nodes is integrated, thereby extracting the spatial features of the graph structure. The relational graph convolution operation can be represented as follows:
$$h_i^{(l+1)}=\sigma\!\left(W_0^{(l)}h_i^{(l)}+\sum_{r\in\mathcal{R}}\sum_{j\in\mathcal{N}_i^r}A_{ij}\,W_r^{(l)}h_j^{(l)}\right)$$
where $\mathcal{N}_i^r$ denotes the set of neighbor nodes of node i under relation type r; $W_r^{(l)}$ denotes the weight matrix associated with relation type r in the l-th convolution layer; $W_0^{(l)}$ denotes the weight matrix of the l-th convolution layer that is independent of the relation type r; $h_i^{(l)}$ denotes the hidden-layer vector of node i at layer l; $A_{ij}$ denotes the weight of the edge between node i and node j; and σ denotes the activation function.
After several layers of convolution operations, hidden-layer vectors of the concept nodes and chapter nodes that integrate neighbor-node information are obtained.
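A minimal PyTorch sketch of one such weighted relational graph convolution layer is given below; the dense-adjacency formulation and all names are illustrative assumptions following the reconstruction above, not the patent's reference implementation.

```python
import torch
import torch.nn as nn

class WeightedRGCNLayer(nn.Module):
    """One relational graph convolution layer with edge weights A_ij.

    `adj[r]` is a dense (N, N) weighted adjacency matrix for relation
    type r (e.g. concept-concept PMI, concept-chapter TF-IDF,
    chapter-chapter distance).
    """
    def __init__(self, in_dim, out_dim, num_rels):
        super().__init__()
        self.self_loop = nn.Linear(in_dim, out_dim)          # W_0^(l)
        self.rel = nn.ModuleList(
            [nn.Linear(in_dim, out_dim, bias=False) for _ in range(num_rels)]
        )                                                     # W_r^(l)

    def forward(self, h, adj):
        out = self.self_loop(h)                # self-loop term W_0 h_i
        for r, lin in enumerate(self.rel):
            out = out + adj[r] @ lin(h)        # weighted neighbor aggregation
        return torch.relu(out)                 # sigma = ReLU here
```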
3.2) After the hidden-layer vector expressions of the concept nodes and chapter nodes are obtained, a twin network is used to predict whether concepts have a front-rear order relation. Specifically, each pair of concepts is first passed through the twin network; the process can be expressed as follows:
$$f_s=\mathrm{ReLU}\!\left(W_s\cdot\left[v_i;\,v_j;\,v_i-v_j;\,v_i\odot v_j\right]+b_s\right)$$
$$f_{GCN}=W\cdot f_s+b$$
$$p_{GCN}=\mathrm{sigmoid}(f_{GCN})$$
where $v_i$ denotes the hidden-layer vector of concept word $c_i$ that integrates neighbor-node information; $W_s$, $b_s$ denote the weight and bias of the first layer of the twin network, and W, b denote the weight and bias of the second layer, all of which are parameters to be trained; $\odot$ denotes the Hadamard product of vectors; and $[v_i;\,v_j;\,v_i-v_j;\,v_i\odot v_j]$ denotes the concatenation of the vectors $v_i$, $v_j$, $v_i-v_j$ and $v_i\odot v_j$. The computed result $p_{GCN}$ represents, based on the graph structure, the probability that concept word $c_i$ is a preceding concept of concept word $c_j$. The first partial loss function is defined as follows:
$$L_{GCN}=-\sum_{(c_i,c_j,y_{ij})\in T}\left[y_{ij}\log p_{GCN}+(1-y_{ij})\log\left(1-p_{GCN}\right)\right]$$
where $y_{ij}$ is the true label of the relation between concept words $c_i$ and $c_j$; that is, if concept word $c_i$ is a preceding concept of concept word $c_j$ the value is 1, otherwise 0. T denotes the set of triples $(c_i, c_j, y_{ij})$.
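A sketch of the twin scoring head and the first partial loss, under the reconstruction above; the hidden size and all names are illustrative. The same head shape can be reused for the chapter pairs of step 3.4).

```python
import torch
import torch.nn as nn

class TwinHead(nn.Module):
    """Siamese scoring head: the pair (v_i, v_j) is represented as
    [v_i; v_j; v_i - v_j; v_i * v_j] and mapped to a precedence logit."""
    def __init__(self, dim, hidden=64):
        super().__init__()
        self.fc1 = nn.Linear(4 * dim, hidden)   # W_s, b_s
        self.fc2 = nn.Linear(hidden, 1)         # W, b

    def forward(self, vi, vj):
        x = torch.cat([vi, vj, vi - vj, vi * vj], dim=-1)
        return self.fc2(torch.relu(self.fc1(x))).squeeze(-1)  # logits

# Cross-entropy over labelled pairs; BCEWithLogitsLoss folds the sigmoid
# into the loss for numerical stability.
criterion = nn.BCEWithLogitsLoss()
# loss_gcn = criterion(head(v_i, v_j), y)   # y: float tensor of 0/1 labels
```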
3.3) After the feature vectors between concept word pairs are obtained in step 2), the feature vector corresponding to each pair of concept words is passed through a fully connected neural network and a Sigmoid function to obtain a predicted value, based on text features, of the front-rear order relation of the concept words. This process can be expressed as follows:
$$f_F=\mathrm{ReLU}(W_F\cdot v_{ij}+b_F)$$
$$p_F=\mathrm{sigmoid}(f_F)$$
where $v_{ij}$ denotes the inter-word feature vector of concept words $c_i$ and $c_j$; $W_F$ and $b_F$ are the weight and bias parameters of the fully connected network to be trained; and $p_F$ represents, based on text features, the probability that concept word $c_i$ is a preceding concept of concept word $c_j$. The second partial loss function is defined as follows:
$$L_F=-\sum_{(c_i,c_j,y_{ij})\in T}\left[y_{ij}\log p_F+(1-y_{ij})\log\left(1-p_F\right)\right]$$
3.4) If two chapters have a front-rear order relation, there is a high probability that the concept words contained in the two chapters also have a front-rear order relation. Therefore, after the hidden-layer vector expressions of the concept nodes and chapter nodes are obtained, the twin network is also used to predict whether chapters have a front-rear order relation. Specifically, the hidden vectors of each pair of chapter nodes are passed through the twin network; the process can be expressed as follows:
$$f_s'=\mathrm{ReLU}\!\left(W_s'\cdot\left[u_i;\,u_j;\,u_i-u_j;\,u_i\odot u_j\right]+b_s'\right)$$
$$f_o=W'\cdot f_s'+b'$$
$$p_o=\mathrm{sigmoid}(f_o)$$
where $u_i$ denotes the hidden vector of the i-th chapter $o_i$ after the relational graph convolution operation. The computed result $p_o$ represents, based on the graph structure, the probability that chapter $o_i$ is a preceding chapter of chapter $o_j$. The third partial loss function is defined as follows:
$$L_o=-\sum_{(o_i,o_j,y'_{ij})\in T'}\left[y'_{ij}\log p_o+(1-y'_{ij})\log\left(1-p_o\right)\right]$$
where $y'_{ij}$ is the true label of the relation between chapters $o_i$ and $o_j$: if chapter $o_i$ is a preceding chapter of chapter $o_j$ the value is 1, otherwise 0. T' denotes the set of triples $(o_i, o_j, y'_{ij})$.
3.5) The loss function of the neural network is defined as follows:
$$L=L_{GCN}+\lambda L_F+\mu L_o$$
where λ and μ are weighting parameters selected according to actual requirements.
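Put together, one training step over the three partial losses could look like the following sketch, with λ and μ as in step 3.5 (the embodiment below uses 0.2 and 0.1); the model interface is an assumption for illustration.

```python
def training_step(model, optimizer, batch, lam=0.2, mu=0.1):
    # `model` is assumed to return the three partial losses of steps
    # 3.2-3.4: L_GCN (concept twin), L_F (feature head), L_o (chapter twin).
    loss_gcn, loss_f, loss_o = model(batch)
    loss = loss_gcn + lam * loss_f + mu * loss_o   # L = L_GCN + lambda*L_F + mu*L_o
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```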
Further, the distance of a chapter pair in the book specifically means the distance |i − j| between the i-th chapter $o_i$ and the j-th chapter $o_j$.
Compared with the prior art, the method has the following beneficial effects:
1. The method extracts the front-rear order relations among concepts by means of artificial intelligence, reducing manual work and making the process more systematic and scientific.
2. The workflow of the method is completed automatically by machine learning without manual intervention, reducing the burden on the user.
3. The method introduces a graph neural network into the model and makes full use of the spatial features between concepts and between concepts and texts; at the same time, it introduces statistical features between concepts, exploiting the interaction information of concept words in the text.
4. The method has high prediction accuracy and can accurately judge the front-rear order relations among concepts.
5. The method has good extensibility: for a new field, only a small number of books in that field are needed to construct the graph structure, extract the concept-pair features, and train the extractor on these inputs, enabling automatic judgment of the front-rear order relations between concepts in different fields.
Drawings
FIG. 1 is a general flow diagram of the present invention;
FIG. 2 is a schematic structural diagram of the neural network model of step 3).
Detailed Description
The invention is described in further detail below with reference to the figures and specific examples.
As shown in FIG. 1, the invention provides a book concept pre-and post-order relation extraction method based on a graph neural network and a twin network model. The method comprises the following steps:
1) Construction of the graph structure: based on the book text, a graph structure containing concept nodes and chapter nodes is constructed. The concepts and chapters in the book text are taken as the vertices of the graph structure, and the PMI (point-wise mutual information) value between each concept pair, the TF-IDF value of each concept in each chapter, and the distance between the chapters of each chapter pair in the book are respectively calculated as the weights of the edges between the corresponding vertices, yielding the graph structure.
2) Extraction of concept-pair features: semantic and structural features between concept pairs are extracted from the book text and expressed as a feature vector for the subsequent training of the neural network model.
3) Constructing a neural network model, as shown in FIG. 2: the graph structure obtained in step 1) and the feature vectors extracted in step 2) serve as the inputs of the neural network model. First, graph convolution operations are performed to obtain the hidden-layer vectors of the chapter nodes and concept nodes. The hidden-layer vectors of each pair of concept nodes are input into a twin network, the prediction probability that the pair of concept nodes has a front-rear order relation is obtained with a Sigmoid function, and the cross entropy of this prediction probability and the true label of the concept pair's front-rear order relation is calculated as the first partial loss function. The feature vector of each concept pair extracted in step 2) is passed through a fully connected layer, the prediction probability that the pair of concepts has a front-rear order relation is obtained with a Sigmoid function, and the cross entropy of this prediction probability and the true label is calculated as the second partial loss function. The hidden-layer vectors of each pair of chapter nodes are input into the twin network, the prediction probability that the pair of chapters has a front-rear order relation is obtained with a Sigmoid function, and the cross entropy of this prediction probability and the true label of the chapter pair's front-rear order relation is calculated as the third partial loss function. The three partial loss functions are weighted and summed to obtain the loss function of the neural network model.
4) Extracting concept front-rear order relations: the neural network model constructed in step 3) is trained to obtain an extractor of concept front-rear order relations. A book text is preprocessed by word segmentation, stop-word removal and the like; the graph structure corresponding to the book text is then constructed and the feature vectors between its concept pairs are extracted; these are input into the extractor, which judges and outputs whether each concept pair of the book text has a front-rear order relation.
Further, the construction of the graph structure in step 1) specifically includes:
First, the book text is preprocessed by word segmentation, stop-word removal and the like. Then, a graph structure is constructed from the preprocessed text. The graph contains two types of nodes: concept nodes and chapter nodes. For a concept node, the pre-trained word vector of the concept word is used as its initial feature vector; for a chapter node, the average word vector of all words retained in the chapter is used as its initial feature vector. The weights of the edges between nodes in the graph structure are defined as follows: if node i and node j are both concept nodes, the weight of the edge is the PMI value between them; if node i and node j are the same node, the weight of the edge is defined as 1. PMI is defined as follows:
$$\mathrm{PMI}(i,j)=\log\frac{p(i,j)}{p(i)\,p(j)}$$
$$p(i,j)=\frac{\#W(i,j)}{\#W}$$
$$p(i)=\frac{\#W(i)}{\#W}$$
where #W(i,j) denotes the number of sliding windows that contain the concept words of both node i and node j, #W(i) denotes the number of sliding windows that contain the concept word of node i, and #W denotes the total number of sliding windows in the book text. If node i is a concept node and node j is a chapter node, the weight of the edge is the TF-IDF value of the concept in the chapter text. If node i and node j are both chapter nodes, the weight of the edge is defined as:
$$A_{ij}=1-\frac{|d_i-d_j|}{M}$$
where $d_i$ denotes the serial number of the chapter represented by chapter node i in the book directory, and M denotes the number of chapters of the book.
Further, the extracting of the features between the concept pairs in step 2) specifically includes:
features between pairs of concepts are extracted. For each pair of concepts, the following features are calculated:
Suffix feature: if concept $c_i$ is a suffix of concept $c_j$ in the book text, the value is recorded as 1; otherwise it is recorded as 0.
TOC distance feature:
$$\mathrm{Toc}(a,b)=\beta^{\,k}\left(a_k-b_k\right)$$
where a and b denote two concept words and $a_k$ denotes the k-th level serial number of the directory entry in which concept word a is located. In the formula, k is the smallest level index such that $a_k\neq b_k$, and $\beta$ is a decay parameter.
Chapter-RefD feature:
$$\mathrm{Crw}(c_i,c_j)=\frac{\sum_{o\in D} f(c_i,o)\,r(o,c_j)}{\sum_{o\in D} f(c_i,o)}$$
$$\mathrm{Crd}(c_i,c_j)=\mathrm{Crw}(c_j,c_i)-\mathrm{Crw}(c_i,c_j)$$
where o denotes a chapter, $c_i$ denotes the i-th concept word in the text, $f(c_i,o)$ denotes the frequency with which concept word $c_i$ occurs in chapter o, and $r(o,c_j)$ indicates whether concept word $c_j$ appears in chapter o: the value is 1 if it appears and 0 otherwise.
Wiki-RefD feature:
$$\mathrm{Wrw}(c_i,c_j)=\frac{\sum_{w} f(c_i,w)\,r(w,c_j)}{\sum_{w} f(c_i,w)}$$
$$\mathrm{Wrd}(c_i,c_j)=\mathrm{Wrw}(c_j,c_i)-\mathrm{Wrw}(c_i,c_j)$$
where w denotes a Wikipedia document, $f(c_i,w)$ denotes the frequency with which concept word $c_i$ occurs in document w, and $r(w,c_j)$ indicates whether concept word $c_j$ appears in document w: the value is 1 if it appears and 0 otherwise.
Complexity feature: the complexity feature consists of two parts:
$$\mathrm{CldFrequency}(c_i,c_j)=\mathrm{ava}(c_i)-\mathrm{ava}(c_j),\qquad \mathrm{ava}(c_i)=\frac{|I(D,c_i)|}{|D|}$$
$$\mathrm{CldDistribution}(c_i,c_j)=\mathrm{KL}\!\left(P(c_i)\,\|\,P(c_j)\right)$$
where |D| denotes the total number of chapters in the book text D, $|I(D,c_i)|$ denotes the number of chapters of D in which concept word $c_i$ appears, $P(c_i)$ denotes the probability distribution of concept word $c_i$ over the chapters of the book text D, and $\mathrm{KL}(P(c_i)\,\|\,P(c_j))$ denotes the KL distance between concept words $c_i$ and $c_j$.
The feature values computed for each pair of concept words are concatenated to obtain the inter-word feature vector of the pair.
Further, the constructing of the neural network in the step 3) specifically includes:
3.1) The graph structure obtained in step 1) is input into the neural network model, convolution is performed with a relational graph convolution network, and the information of each node's neighbor nodes is integrated, thereby extracting the spatial features of the graph structure. The relational graph convolution operation can be represented as follows:
$$h_i^{(l+1)}=\sigma\!\left(W_0^{(l)}h_i^{(l)}+\sum_{r\in\mathcal{R}}\sum_{j\in\mathcal{N}_i^r}A_{ij}\,W_r^{(l)}h_j^{(l)}\right)$$
where $\mathcal{N}_i^r$ denotes the set of neighbor nodes of node i under relation type r; $W_r^{(l)}$ denotes the weight matrix associated with relation type r in the l-th convolution layer; $W_0^{(l)}$ denotes the weight matrix of the l-th convolution layer that is independent of the relation type r; $h_i^{(l)}$ denotes the hidden-layer vector of node i at layer l; $A_{ij}$ denotes the weight of the edge between node i and node j; and σ denotes the activation function.
After several layers of convolution operations, hidden-layer vectors of the concept nodes and chapter nodes that integrate neighbor-node information are obtained.
3.2) After the hidden-layer vector expressions of the concept nodes and chapter nodes are obtained, a twin network is used to predict whether concepts have a front-rear order relation. Specifically, each pair of concepts is first passed through the twin network; the process can be expressed as follows:
$$f_s=\mathrm{ReLU}\!\left(W_s\cdot\left[v_i;\,v_j;\,v_i-v_j;\,v_i\odot v_j\right]+b_s\right)$$
$$f_{GCN}=W\cdot f_s+b$$
$$p_{GCN}=\mathrm{sigmoid}(f_{GCN})$$
where $v_i$ denotes the hidden-layer vector of concept word $c_i$ that integrates neighbor-node information; $W_s$, $b_s$ denote the weight and bias of the first layer of the twin network, and W, b denote the weight and bias of the second layer, all of which are parameters to be trained; $\odot$ denotes the Hadamard product of vectors; and $[v_i;\,v_j;\,v_i-v_j;\,v_i\odot v_j]$ denotes the concatenation of the vectors $v_i$, $v_j$, $v_i-v_j$ and $v_i\odot v_j$. The computed result $p_{GCN}$ represents, based on the graph structure, the probability that concept word $c_i$ is a preceding concept of concept word $c_j$. The first partial loss function is defined as follows:
$$L_{GCN}=-\sum_{(c_i,c_j,y_{ij})\in T}\left[y_{ij}\log p_{GCN}+(1-y_{ij})\log\left(1-p_{GCN}\right)\right]$$
where $y_{ij}$ is the true label of the relation between concept words $c_i$ and $c_j$; that is, if concept word $c_i$ is a preceding concept of concept word $c_j$ the value is 1, otherwise 0. T denotes the set of triples $(c_i, c_j, y_{ij})$.
3.3) After the feature vectors between concept word pairs are obtained in step 2), the feature vector corresponding to each pair of concept words is passed through a fully connected neural network and a Sigmoid function to obtain a predicted value, based on text features, of the front-rear order relation of the concept words. This process can be expressed as follows:
$$f_F=\mathrm{ReLU}(W_F\cdot v_{ij}+b_F)$$
$$p_F=\mathrm{sigmoid}(f_F)$$
where $v_{ij}$ denotes the inter-word feature vector of concept words $c_i$ and $c_j$; $W_F$ and $b_F$ are the weight and bias parameters of the fully connected network to be trained; and $p_F$ represents, based on text features, the probability that concept word $c_i$ is a preceding concept of concept word $c_j$. The second partial loss function is defined as follows:
$$L_F=-\sum_{(c_i,c_j,y_{ij})\in T}\left[y_{ij}\log p_F+(1-y_{ij})\log\left(1-p_F\right)\right]$$
3.4) If two chapters have a front-rear order relation, there is a high probability that the concept words contained in the two chapters also have a front-rear order relation. Therefore, after the hidden-layer vector expressions of the concept nodes and chapter nodes are obtained, the twin network is also used to predict whether chapters have a front-rear order relation. Specifically, the hidden vectors of each pair of chapter nodes are passed through the twin network; the process can be expressed as follows:
$$f_s'=\mathrm{ReLU}\!\left(W_s'\cdot\left[u_i;\,u_j;\,u_i-u_j;\,u_i\odot u_j\right]+b_s'\right)$$
$$f_o=W'\cdot f_s'+b'$$
$$p_o=\mathrm{sigmoid}(f_o)$$
where $u_i$ denotes the hidden vector of the i-th chapter $o_i$ after the relational graph convolution operation. The computed result $p_o$ represents, based on the graph structure, the probability that chapter $o_i$ is a preceding chapter of chapter $o_j$. The third partial loss function is defined as follows:
$$L_o=-\sum_{(o_i,o_j,y'_{ij})\in T'}\left[y'_{ij}\log p_o+(1-y'_{ij})\log\left(1-p_o\right)\right]$$
where $y'_{ij}$ is the true label of the relation between chapters $o_i$ and $o_j$: if chapter $o_i$ is a preceding chapter of chapter $o_j$ the value is 1, otherwise 0. T' denotes the set of triples $(o_i, o_j, y'_{ij})$.
3.5) The loss function of the neural network is defined as follows:
$$L=L_{GCN}+\lambda L_F+\mu L_o$$
where λ and μ are weighting parameters selected according to actual requirements.
Examples
The specific steps performed in this example are described in detail below with reference to the method of the present invention, as follows:
in this embodiment, the method of the present invention is applied to a book in the field of data structures to predict the context relationship of concept words therein.
1) The book has 66 chapters and 89 concept words. The book text is first preprocessed by word segmentation and stop-word removal, and the graph structure is then constructed. The graph contains 155 nodes in total: 89 concept nodes and 66 chapter nodes. Next, the weights of all edges in the graph are calculated. For concept-concept edges, the sliding-window size is set to 20 and the PMI value between concept nodes is used as the edge weight. For concept-chapter edges, the TF-IDF value of each concept word in the chapter text is computed with the TF-IDF algorithm and used as the edge weight. For chapter-chapter edges, the distance between chapters in the book directory, computed from the chapter serial numbers, is used as the edge weight. Each node in the graph also has a self-loop edge with weight 1.
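The concept-chapter TF-IDF weights can be obtained with standard tooling; below is a sketch using scikit-learn, where restricting the vocabulary to the concept words is an illustrative choice rather than something mandated by the patent.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# chapter_texts: list of 66 preprocessed chapter strings;
# concepts: list of the 89 concept words.
vectorizer = TfidfVectorizer(vocabulary=concepts)
tfidf = vectorizer.fit_transform(chapter_texts)   # sparse matrix, shape (66, 89)
# tfidf[o, c] is the weight of the edge between chapter node o and concept node c.
```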
2) For the 89 concept words involved in the book, inter-word features are extracted for all 7921 (89 × 89) concept pairs, including the suffix feature, TOC distance feature, Chapter-RefD feature, Wiki-RefD feature and complexity feature; the computed feature values are concatenated into a feature vector.
3) Constructing the neural network. In the method of the invention, the network model is built with the PyTorch framework. The labeled data contain 449 positive pairs (concept pairs in which one concept is a preceding concept of the other) and 7472 negative pairs. From the labeled data, 314 positive pairs and 471 negative pairs are selected as the training set, and the remaining 135 positive pairs together with another 202 negative pairs form the test set. For the loss-function weights, this embodiment uses λ = 0.2 and μ = 0.1. When evaluating the model, the precision, recall, F1 score and accuracy of the prediction results are calculated; these indices characterize how accurately the front-rear order relations between concepts are predicted. The results are shown in Table 1.
TABLE 1. Evaluation of prediction results

Precision   Recall   F1-score   Accuracy
0.757       0.743    0.750      0.801
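The evaluation indices in Table 1 (and Table 2 below) can be computed with standard tooling; a sketch with scikit-learn, assuming `y_true` holds the gold labels of the test pairs and `y_prob` the extractor's predicted probabilities:

```python
from sklearn.metrics import (accuracy_score, f1_score,
                             precision_score, recall_score)

y_pred = [int(p >= 0.5) for p in y_prob]   # threshold the probabilities
print("Precision:", precision_score(y_true, y_pred))
print("Recall:   ", recall_score(y_true, y_pred))
print("F1-score: ", f1_score(y_true, y_pred))
print("Accuracy: ", accuracy_score(y_true, y_pred))
```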
4) After training of the neural network is complete, samples that do not appear in the training set can be input into the network, and the model outputs whether the first concept of each concept pair is a preceding concept of the second. The method was tested in three different fields: data structures, calculus and physics. The test results are shown in Table 2.
TABLE 2. Evaluation of test results

Field             Precision   Recall   F1-score   Accuracy
Data structures   0.795       0.809    0.802      0.838
Calculus          0.778       0.798    0.788      0.828
Physics           0.770       0.825    0.797      0.827
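Inference on a new book follows steps 1), 2) and 4); the sketch below is illustrative only, with `build_graph`, `extract_pair_features` and `extractor` as assumed helpers corresponding to those steps rather than names defined by the patent.

```python
import torch

graph = build_graph(book_text)                   # step 1: graph structure
pair_feats = extract_pair_features(book_text)    # step 2: concept-pair features
with torch.no_grad():
    probs = extractor(graph, pair_feats)         # trained model from step 3
# Keep pairs scored as having a front-rear order relation.
precedes = [(ci, cj) for (ci, cj), p in probs.items() if p >= 0.5]
```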
The above-described embodiments are intended to illustrate rather than to limit the invention, and any modifications and variations of the present invention are within the spirit of the invention and the scope of the appended claims.

Claims (4)

1. A book concept front-rear order relation extraction method based on a neural network is characterized by comprising the following steps:
1) construction of graph Structure: constructing a graph structure containing concept nodes and chapter nodes based on book texts; taking the concepts and the chapters in the book text as vertexes of the graph structure, and respectively calculating PMI values between concept pairs, TF-IDF values of the concepts in the chapters and distances of the chapter pairs in the book as weights of edges between corresponding vertexes in the graph structure to obtain the graph structure; PMI is defined as follows:
$$\mathrm{PMI}(i,j)=\log\frac{p(i,j)}{p(i)\,p(j)}$$
$$p(i,j)=\frac{\#W(i,j)}{\#W}$$
$$p(i)=\frac{\#W(i)}{\#W}$$
wherein #W(i,j) denotes the number of sliding windows containing the concept words of both node i and node j, #W(i) denotes the number of sliding windows containing the concept word of node i, and #W denotes the total number of sliding windows in the book text; if node i is a concept node and node j is a chapter node, the weight of the edge is the TF-IDF value of the concept in the chapter text; if node i and node j are both chapter nodes, the weight of the edge is defined as:
$$A_{ij}=1-\frac{|d_i-d_j|}{M}$$
wherein $d_i$ denotes the serial number of the chapter represented by chapter node i in the book directory, and M denotes the number of chapters of the book;
2) extracting concept-pair features: extracting semantic and structural features between concept pairs from the book text and expressing them as a feature vector for training the neural network model;
3) constructing a neural network model: taking the graph structure obtained in step 1) and the feature vectors extracted in step 2) as the input of the neural network model; first performing graph convolution operations to obtain hidden-layer vectors of the chapter nodes and concept nodes; inputting the hidden-layer vectors of each pair of concept nodes into a twin network, obtaining with a Sigmoid function the prediction probability that the pair of concept nodes has a front-rear order relation, and calculating the cross entropy of the prediction probability and the true label of the concept pair's front-rear order relation as a first partial loss function; passing the feature vectors between the concept pairs extracted in step 2) through a fully connected layer, obtaining with a Sigmoid function the prediction probability that each pair of concepts has a front-rear order relation, and calculating the cross entropy of the prediction probability and the true label as a second partial loss function; inputting the hidden-layer vectors of each pair of chapter nodes into the twin network, obtaining with a Sigmoid function the prediction probability that the pair of chapter nodes has a front-rear order relation, and calculating the cross entropy of the prediction probability and the true label of the chapter pair's front-rear order relation as a third partial loss function; weighting and summing the three partial loss functions to obtain the loss function of the neural network model; the specific process is as follows:
3.1) inputting the graph structure obtained in step 1) into the neural network model, performing convolution with a relational graph convolution network, and integrating the information of each node's neighbor nodes, thereby extracting the spatial features of the graph structure; the relational graph convolution operation is represented as follows:
$$h_i^{(l+1)}=\sigma\!\left(W_0^{(l)}h_i^{(l)}+\sum_{r\in\mathcal{R}}\sum_{j\in\mathcal{N}_i^r}A_{ij}\,W_r^{(l)}h_j^{(l)}\right)$$
wherein $\mathcal{N}_i^r$ denotes the set of neighbor nodes of node i under relation type r; $W_r^{(l)}$ denotes the weight matrix associated with relation type r in the l-th convolution layer; $W_0^{(l)}$ denotes the weight matrix of the l-th convolution layer that is independent of the relation type r; $h_i^{(l)}$ denotes the hidden-layer vector of node i at layer l; $A_{ij}$ denotes the weight of the edge between node i and node j; σ denotes the activation function;
after several layers of convolution operations, hidden-layer vectors of the concept nodes and chapter nodes that integrate neighbor-node information can be obtained;
3.2) after the hidden-layer vector expressions of the concept nodes and chapter nodes are obtained, using a twin network to predict whether concepts have a front-rear order relation; specifically, first passing each pair of concepts through the twin network, the process being expressed as follows:
$$f_s=\mathrm{ReLU}\!\left(W_s\cdot\left[v_i;\,v_j;\,v_i-v_j;\,v_i\odot v_j\right]+b_s\right)$$
$$f_{GCN}=W\cdot f_s+b$$
$$p_{GCN}=\mathrm{sigmoid}(f_{GCN})$$
wherein $v_i$ denotes the hidden-layer vector of concept word $c_i$ that integrates neighbor-node information; $W_s$, $b_s$ denote the weight and bias of the first layer of the twin network, and W, b denote the weight and bias of the second layer, all of which are parameters to be trained; $\odot$ denotes the Hadamard product of vectors, and $[v_i;\,v_j;\,v_i-v_j;\,v_i\odot v_j]$ denotes the concatenation of the vectors $v_i$, $v_j$, $v_i-v_j$ and $v_i\odot v_j$; the computed result $p_{GCN}$ represents, based on the graph structure, the probability that concept word $c_i$ is a preceding concept of concept word $c_j$; the first partial loss function is defined as follows:
$$L_{GCN}=-\sum_{(c_i,c_j,y_{ij})\in T}\left[y_{ij}\log p_{GCN}+(1-y_{ij})\log\left(1-p_{GCN}\right)\right]$$
wherein $y_{ij}$ is the true label of the relation between concept words $c_i$ and $c_j$, that is, if concept word $c_i$ is a preceding concept of concept word $c_j$ the value is 1, otherwise 0; T denotes the set of triples $(c_i, c_j, y_{ij})$;
3.3) after the feature vectors between concept word pairs are obtained in step 2), passing the feature vector corresponding to each pair of concept words through a fully connected neural network and a Sigmoid function to obtain a predicted value, based on text features, of the front-rear order relation of the concept words; the process is expressed as follows:
$$f_F=\mathrm{ReLU}(W_F\cdot v_{ij}+b_F)$$
$$p_F=\mathrm{sigmoid}(f_F)$$
wherein $v_{ij}$ denotes the inter-word feature vector of concept words $c_i$ and $c_j$; $W_F$ and $b_F$ are the weight and bias parameters of the fully connected network to be trained; $p_F$ represents, based on text features, the probability that concept word $c_i$ is a preceding concept of concept word $c_j$; the second partial loss function is defined as follows:
$$L_F=-\sum_{(c_i,c_j,y_{ij})\in T}\left[y_{ij}\log p_F+(1-y_{ij})\log\left(1-p_F\right)\right]$$
3.4) if two chapters have a front-rear order relation, there is a high probability that the concept words contained in the two chapters also have a front-rear order relation; therefore, after the hidden-layer vector expressions of the concept nodes and chapter nodes are obtained, also using the twin network to predict whether chapters have a front-rear order relation; specifically, passing the hidden vectors of each pair of chapter nodes through the twin network, the process being expressed as follows:
$$f_s'=\mathrm{ReLU}\!\left(W_s'\cdot\left[u_i;\,u_j;\,u_i-u_j;\,u_i\odot u_j\right]+b_s'\right)$$
$$f_o=W'\cdot f_s'+b'$$
$$p_o=\mathrm{sigmoid}(f_o)$$
wherein $u_i$ denotes the hidden vector of the i-th chapter $o_i$ after the relational graph convolution operation; the computed result $p_o$ represents, based on the graph structure, the probability that chapter $o_i$ is a preceding chapter of chapter $o_j$; the third partial loss function is defined as follows:
$$L_o=-\sum_{(o_i,o_j,y'_{ij})\in T'}\left[y'_{ij}\log p_o+(1-y'_{ij})\log\left(1-p_o\right)\right]$$
wherein $y'_{ij}$ is the true label of the relation between chapters $o_i$ and $o_j$; if chapter $o_i$ is a preceding chapter of chapter $o_j$ the value is 1, otherwise 0; T' denotes the set of triples $(o_i, o_j, y'_{ij})$;
3.5) defining the loss function of the neural network as follows:
$$L=L_{GCN}+\lambda L_F+\mu L_o$$
wherein λ and μ are weight parameters selected according to actual requirements;
4) extracting concept front-rear order relations: training the neural network model constructed in step 3) to obtain an extractor of concept front-rear order relations; performing word segmentation and stop-word-removal preprocessing on a book text, then constructing the graph structure corresponding to the book text and extracting the feature vectors between the concept pairs of the book text, then feeding these as input into the extractor, the extractor judging and outputting whether the concept pairs of the book text have a front-rear order relation.
2. The book concept pre-and-post relationship extraction method based on the neural network as claimed in claim 1, wherein the step 1) is specifically as follows:
firstly, performing word segmentation and word stop removal pretreatment on a book text; then, constructing a graph structure by using the preprocessed text; the graph contains two types of nodes: concept nodes and chapter nodes; for the concept node, using a pre-training word vector of the concept word as an initial feature vector of the concept node; for the chapter nodes, using the average word vector of all reserved words in the chapter as the initial feature vector of the chapter nodes; the weights of the edges between nodes in the graph structure are defined as follows: if the node i and the node j are both concept nodes, the weight of the edge is a PMI value between the node i and the node j; if node i and node j are the same node, the weight of the edge is defined as 1.
3. The book concept pre-and-post relationship extraction method based on neural network as claimed in claim 1, wherein the step 2) is specifically:
extracting features between the concept pairs; for each pair of concepts, the following features are calculated:
suffix feature: if concept $c_i$ is a suffix of concept $c_j$ in the book text, the value is 1, otherwise 0;
TOC distance feature:
$$\mathrm{Toc}(a,b)=\beta^{\,k}\left(a_k-b_k\right)$$
wherein a and b denote two concept words and $a_k$ denotes the k-th level serial number of the directory entry in which concept word a is located; in the formula, k is the smallest level index such that $a_k\neq b_k$; $\beta$ is a decay parameter;
Chapter-RefD feature:
$$\mathrm{Crw}(c_i,c_j)=\frac{\sum_{o\in D} f(c_i,o)\,r(o,c_j)}{\sum_{o\in D} f(c_i,o)}$$
$$\mathrm{Crd}(c_i,c_j)=\mathrm{Crw}(c_j,c_i)-\mathrm{Crw}(c_i,c_j)$$
wherein o denotes a chapter, $c_i$ denotes the i-th concept word in the text, $f(c_i,o)$ denotes the frequency with which concept word $c_i$ occurs in chapter o, and $r(o,c_j)$ indicates whether concept word $c_j$ appears in chapter o: the value is 1 if it appears, otherwise 0;
Wiki-RefD feature:
$$\mathrm{Wrw}(c_i,c_j)=\frac{\sum_{w} f(c_i,w)\,r(w,c_j)}{\sum_{w} f(c_i,w)}$$
$$\mathrm{Wrd}(c_i,c_j)=\mathrm{Wrw}(c_j,c_i)-\mathrm{Wrw}(c_i,c_j)$$
wherein w denotes a Wikipedia document, $f(c_i,w)$ denotes the frequency with which concept word $c_i$ occurs in document w, and $r(w,c_j)$ indicates whether concept word $c_j$ appears in document w: the value is 1 if it appears, otherwise 0;
complexity feature: the complexity feature consists of two parts:
$$\mathrm{CldFrequency}(c_i,c_j)=\mathrm{ava}(c_i)-\mathrm{ava}(c_j),\qquad \mathrm{ava}(c_i)=\frac{|I(D,c_i)|}{|D|}$$
$$\mathrm{CldDistribution}(c_i,c_j)=\mathrm{KL}\!\left(P(c_i)\,\|\,P(c_j)\right)$$
wherein |D| denotes the total number of chapters in the book text D, $|I(D,c_i)|$ denotes the number of chapters of D in which concept word $c_i$ appears, $P(c_i)$ denotes the probability distribution of concept word $c_i$ over the chapters of the book text D, and $\mathrm{KL}(P(c_i)\,\|\,P(c_j))$ denotes the KL distance between concept words $c_i$ and $c_j$;
and splicing the characteristic values obtained by calculating each pair of concept words to obtain the inter-word characteristic vector of each pair of concept words.
4. The book concept front-rear order relation extraction method based on a neural network as claimed in claim 1, wherein the distance of a chapter pair in the book specifically means the distance |i − j| between the i-th chapter $o_i$ and the j-th chapter $o_j$.
CN202110061782.XA 2021-01-18 2021-01-18 Book concept front-rear order relation extraction method based on neural network Active CN112860882B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110061782.XA CN112860882B (en) 2021-01-18 2021-01-18 Book concept front-rear order relation extraction method based on neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110061782.XA CN112860882B (en) 2021-01-18 2021-01-18 Book concept front-rear order relation extraction method based on neural network

Publications (2)

Publication Number Publication Date
CN112860882A CN112860882A (en) 2021-05-28
CN112860882B (en) 2022-05-10

Family

ID=76006295

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110061782.XA Active CN112860882B (en) 2021-01-18 2021-01-18 Book concept front-rear order relation extraction method based on neural network

Country Status (1)

Country Link
CN (1) CN112860882B (en)


Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101853310A (en) * 2010-06-21 2010-10-06 北京大学 Method for producing preorder and postorder code of single traversing tree
US10319364B2 (en) * 2017-05-18 2019-06-11 Telepathy Labs, Inc. Artificial intelligence-based text-to-speech system and method
CN107491541A (en) * 2017-08-24 2017-12-19 北京丁牛科技有限公司 File classification method and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Diverse Image Captioning via GroupTalk; Zhuhao Wang et al.; Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence (IJCAI-16); 2016-12-31; full text *
Research and application of book-based domain concept extraction and front-rear order relation mining algorithms; Zhou Yangfan (周洋帆); China Master's Theses Full-text Database, Information Science and Technology; 2019-08-15; full text *

Also Published As

Publication number Publication date
CN112860882A (en) 2021-05-28

Similar Documents

Publication Publication Date Title
CN108363790B (en) Method, device, equipment and storage medium for evaluating comments
CN108073568B (en) Keyword extraction method and device
CN111783474B (en) Comment text viewpoint information processing method and device and storage medium
CN106599032B (en) Text event extraction method combining sparse coding and structure sensing machine
CN108549658B (en) Deep learning video question-answering method and system based on attention mechanism on syntax analysis tree
CN111325029B (en) Text similarity calculation method based on deep learning integrated model
CN107818164A (en) A kind of intelligent answer method and its system
CN110188047B (en) Double-channel convolutional neural network-based repeated defect report detection method
CN110222163A (en) A kind of intelligent answer method and system merging CNN and two-way LSTM
CN108874896B (en) Humor identification method based on neural network and humor characteristics
CN111368088A (en) Text emotion classification method based on deep learning
CN108108347B (en) Dialogue mode analysis system and method
CN114818703B (en) Multi-intention recognition method and system based on BERT language model and TextCNN model
Sartakhti et al. Persian language model based on BiLSTM model on COVID-19 corpus
CN110516070A (en) A kind of Chinese Question Classification method based on text error correction and neural network
CN111428481A (en) Entity relation extraction method based on deep learning
CN114818717A (en) Chinese named entity recognition method and system fusing vocabulary and syntax information
CN110569355B (en) Viewpoint target extraction and target emotion classification combined method and system based on word blocks
Liu et al. Revisit word embeddings with semantic lexicons for modeling lexical contrast
CN110874392A (en) Text network information fusion embedding method based on deep bidirectional attention mechanism
CN106815209B (en) Uygur agricultural technical term identification method
Tianxiong et al. Identifying chinese event factuality with convolutional neural networks
CN110334204B (en) Exercise similarity calculation recommendation method based on user records
CN113378024A (en) Deep learning-based public inspection field-oriented related event identification method
CN112860882B (en) Book concept front-rear order relation extraction method based on neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant