CN112860882B - Book concept front-rear order relation extraction method based on neural network - Google Patents


Info

Publication number
CN112860882B
Authority
CN
China
Prior art keywords
concept
chapter
word
nodes
node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110061782.XA
Other languages
Chinese (zh)
Other versions
CN112860882A (en)
Inventor
鲁伟明
贾程皓
庄越挺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU filed Critical Zhejiang University ZJU
Priority to CN202110061782.XA
Publication of CN112860882A
Application granted
Publication of CN112860882B
Legal status: Active
Anticipated expiration

Classifications

    • G06F16/345 — Physics; Computing; Electric digital data processing; Information retrieval of unstructured textual data; Browsing/visualisation; Summarisation for human users
    • G06F40/205 — Physics; Computing; Electric digital data processing; Handling natural language data; Natural language analysis; Parsing
    • G06F40/30 — Physics; Computing; Electric digital data processing; Handling natural language data; Semantic analysis
    • G06N3/045 — Physics; Computing; Computing arrangements based on specific computational models; Biological models; Neural networks; Architecture; Combinations of networks
    • G06N3/08 — Physics; Computing; Computing arrangements based on specific computational models; Biological models; Neural networks; Learning methods

Abstract

The invention discloses a neural-network-based method for extracting the front-rear order (i.e., prerequisite) relations between concepts in books. First, a graph structure containing concept nodes and chapter nodes is constructed from the book text, and features between concept pairs are extracted from the text. The graph structure and the concept-pair features are then used to train a neural network model, forming a concept front-rear order relation extractor. Finally, a book whose concept front-rear order relations are to be extracted is converted into the same kind of graph structure and fed to the extractor, which outputs the front-rear order relations between the concepts in the book. The method has good extensibility: for a new field, only a small number of books in that field are needed to construct the graph structure, extract the concept-pair features, and train the extractor on these inputs, enabling automatic judgment of the front-rear order relations between concepts in different fields.

Description

Book concept front-rear order relation extraction method based on neural network
Technical Field
The invention relates to the field of book concept extraction, in particular to a book concept front-rear order relation extraction method based on a neural network.
Background
With the spread of Internet educational resources, people now have access to very rich educational materials, especially book resources. Any single field typically involves many concepts. To improve learning efficiency and optimize learning paths, artificial intelligence techniques can assist in extracting the front-rear order relations among concepts, helping people understand the dependency relations among the concepts of a field more quickly and thoroughly.
A book text, however, has many chapters and contains many concept words. The relationships between concepts and chapters are varied and complex, and they carry rich information. In addition, there are many statistical features between concepts that can be used to help determine their front-rear order relations.
In view of the above, on the one hand a graph structure is constructed from the book text, a graph convolution network is used to integrate the interactions between concepts and between chapters, and a twin network is used to predict the front-rear order relations of concepts. On the other hand, statistical features between concepts are also extracted and used to help judge the front-rear order relations between concepts.
Disclosure of Invention
The invention aims to provide a book concept front-rear order relation extraction method based on a neural network, so that people can understand the dependency relations between the concepts of a field more quickly and easily.
The technical scheme adopted by the invention for solving the technical problems is as follows: a book concept front-rear order relation extraction method based on a neural network comprises the following steps:
1) Construction of the graph structure: based on the book text, a graph structure containing concept nodes and chapter nodes is constructed. The concepts and chapters in the book text are taken as the vertices of the graph structure, and the PMI value between each concept pair, the TF-IDF value of each concept in each chapter, and the distance between the chapters of each chapter pair in the book are respectively calculated as the weights of the edges between the corresponding vertices, yielding the graph structure.
2) Extraction of concept-pair features: semantic and structural features between concept pairs are extracted from the book text and expressed as a feature vector for the subsequent training of the neural network model.
3) Constructing a neural network model: the graph structure obtained in step 1) and the feature vectors extracted in step 2) serve as the inputs of the neural network model. First, graph convolution operations are performed to obtain the hidden-layer vectors of the chapter nodes and concept nodes. The hidden-layer vectors of each pair of concept nodes are input into a twin network, the prediction probability that the pair of concept nodes has a front-rear order relation is obtained with a Sigmoid function, and the cross entropy of this prediction probability and the true label of the concept pair's front-rear order relation is calculated as the first partial loss function. The feature vector of each concept pair extracted in step 2) is passed through a fully connected layer, the prediction probability that the pair of concepts has a front-rear order relation is obtained with a Sigmoid function, and the cross entropy of this prediction probability and the true label is calculated as the second partial loss function. The hidden-layer vectors of each pair of chapter nodes are input into the twin network, the prediction probability that the pair of chapters has a front-rear order relation is obtained with a Sigmoid function, and the cross entropy of this prediction probability and the true label of the chapter pair's front-rear order relation is calculated as the third partial loss function. The three partial loss functions are weighted and summed to obtain the loss function of the neural network model.
4) Extracting concept front-rear order relations: the neural network model constructed in step 3) is trained to obtain an extractor of concept front-rear order relations. A book text is preprocessed by word segmentation, stop-word removal and the like; the graph structure corresponding to the book text is then constructed and the feature vectors between its concept pairs are extracted; these are input into the extractor, which judges and outputs whether each concept pair of the book text has a front-rear order relation.
Further, the step 1) is specifically as follows:
First, the book text is preprocessed by word segmentation, stop-word removal and the like. Then, a graph structure is constructed from the preprocessed text. The graph contains two types of nodes: concept nodes and chapter nodes. For a concept node, the pre-trained word vector of the concept word is used as its initial feature vector; for a chapter node, the average word vector of all words retained in the chapter is used as its initial feature vector. The weights of the edges between nodes in the graph structure are defined as follows: if node i and node j are both concept nodes, the weight of the edge is the PMI value between them; if node i and node j are the same node, the weight of the edge is defined as 1. PMI is defined as follows:
$$\mathrm{PMI}(i,j)=\log\frac{p(i,j)}{p(i)\,p(j)}$$
$$p(i,j)=\frac{\#W(i,j)}{\#W}$$
$$p(i)=\frac{\#W(i)}{\#W}$$
where #W(i,j) denotes the number of sliding windows that contain the concept words of both node i and node j, #W(i) denotes the number of sliding windows that contain the concept word of node i, and #W denotes the total number of sliding windows in the book text. If node i is a concept node and node j is a chapter node, the weight of the edge is the TF-IDF value of the concept in the chapter text. If node i and node j are both chapter nodes, the weight of the edge is defined as:
$$A_{ij}=1-\frac{|d_i-d_j|}{M}$$
where $d_i$ denotes the serial number of the chapter represented by chapter node i in the book directory, and M denotes the number of chapters of the book.
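For concreteness, the concept-concept edge weights can be computed as in the following sketch. This is a minimal illustration, assuming `chapters` is a list of token lists after segmentation and stop-word removal and `concepts` is the set of concept words; all function and variable names here are illustrative, not taken from the patent.

```python
from collections import Counter
from itertools import combinations
import math

def pmi_edges(chapters, concepts, window=20):
    """Compute PMI(i, j) over fixed-size sliding windows."""
    total_windows = 0
    single = Counter()   # #W(i): windows containing concept i
    pair = Counter()     # #W(i, j): windows containing both i and j
    for tokens in chapters:
        for start in range(max(1, len(tokens) - window + 1)):
            total_windows += 1
            seen = {t for t in tokens[start:start + window] if t in concepts}
            for c in seen:
                single[c] += 1
            for a, b in combinations(sorted(seen), 2):
                pair[(a, b)] += 1
    edges = {}
    for (a, b), n_ab in pair.items():
        # PMI(i, j) = log( p(i, j) / (p(i) * p(j)) )
        pmi = math.log((n_ab / total_windows) /
                       ((single[a] / total_windows) * (single[b] / total_windows)))
        if pmi > 0:   # keeping only positively associated pairs is a common convention
            edges[(a, b)] = pmi
    return edges
```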
Further, the step 2) is specifically as follows:
features between pairs of concepts are extracted. For each pair of concepts, the following features are calculated:
Suffix feature: if concept $c_i$ is a suffix of concept $c_j$ in the book text, the value is recorded as 1; otherwise it is recorded as 0.
TOC distance feature:
$$\mathrm{Toc}(a,b)=\beta^{\,k}\left(a_k-b_k\right)$$
where a and b denote two concept words and $a_k$ denotes the k-th level serial number of the directory entry in which concept word a is located. In the formula, k is the smallest level index such that $a_k\neq b_k$, and $\beta$ is a decay parameter.
Chapter-RefD feature:
$$\mathrm{Crw}(c_i,c_j)=\frac{\sum_{o\in D} f(c_i,o)\,r(o,c_j)}{\sum_{o\in D} f(c_i,o)}$$
$$\mathrm{Crd}(c_i,c_j)=\mathrm{Crw}(c_j,c_i)-\mathrm{Crw}(c_i,c_j)$$
where o denotes a chapter, $c_i$ denotes the i-th concept word in the text, $f(c_i,o)$ denotes the frequency with which concept word $c_i$ occurs in chapter o, and $r(o,c_j)$ indicates whether concept word $c_j$ appears in chapter o: the value is 1 if it appears and 0 otherwise.
Wiki-RefD feature:
$$\mathrm{Wrw}(c_i,c_j)=\frac{\sum_{w} f(c_i,w)\,r(w,c_j)}{\sum_{w} f(c_i,w)}$$
$$\mathrm{Wrd}(c_i,c_j)=\mathrm{Wrw}(c_j,c_i)-\mathrm{Wrw}(c_i,c_j)$$
where w denotes a Wikipedia document, $f(c_i,w)$ denotes the frequency with which concept word $c_i$ occurs in document w, and $r(w,c_j)$ indicates whether concept word $c_j$ appears in document w: the value is 1 if it appears and 0 otherwise.
Complexity feature: the complexity feature consists of two parts:
$$\mathrm{CldFrequency}(c_i,c_j)=\mathrm{ava}(c_i)-\mathrm{ava}(c_j),\qquad \mathrm{ava}(c_i)=\frac{|I(D,c_i)|}{|D|}$$
$$\mathrm{CldDistribution}(c_i,c_j)=\mathrm{KL}\!\left(P(c_i)\,\|\,P(c_j)\right)$$
where |D| denotes the total number of chapters in the book text D, $|I(D,c_i)|$ denotes the number of chapters of D in which concept word $c_i$ appears, $P(c_i)$ denotes the probability distribution of concept word $c_i$ over the chapters of the book text D, and $\mathrm{KL}(P(c_i)\,\|\,P(c_j))$ denotes the KL distance between concept words $c_i$ and $c_j$.
The feature values computed for each pair of concept words are concatenated to obtain the inter-word feature vector of the pair.
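As an illustration of the Chapter-RefD computation under the reconstruction above, the following sketch derives Crw and Crd from chapter-level counts; the data layout (`freq[c][o]` as an occurrence count) is an assumption for illustration, not part of the patent.

```python
def crw(freq, ci, cj, chapters):
    # Chapter-weighted fraction of ci's occurrences falling in chapters
    # where cj also appears: r(o, cj) = 1 iff freq[cj][o] > 0.
    num = sum(freq[ci][o] * (1 if freq[cj][o] > 0 else 0) for o in chapters)
    den = sum(freq[ci][o] for o in chapters)
    return num / den if den else 0.0

def crd(freq, ci, cj, chapters):
    # Asymmetry of the two reference directions: Crw(cj, ci) - Crw(ci, cj).
    return crw(freq, cj, ci, chapters) - crw(freq, ci, cj, chapters)
```

The Wiki-RefD feature follows the same pattern with Wikipedia documents in place of chapters.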
Further, the step 3) is specifically as follows:
3.1) The graph structure obtained in step 1) is input into the neural network model, convolution is performed with a relational graph convolution network, and the information of each node's neighbor nodes is integrated, thereby extracting the spatial features of the graph structure. The relational graph convolution operation can be represented as follows:
$$h_i^{(l+1)}=\sigma\!\left(W_0^{(l)}h_i^{(l)}+\sum_{r\in\mathcal{R}}\sum_{j\in\mathcal{N}_i^r}A_{ij}\,W_r^{(l)}h_j^{(l)}\right)$$
where $\mathcal{N}_i^r$ denotes the set of neighbor nodes of node i under relation type r; $W_r^{(l)}$ denotes the weight matrix associated with relation type r in the l-th convolution layer; $W_0^{(l)}$ denotes the weight matrix of the l-th convolution layer that is independent of the relation type r; $h_i^{(l)}$ denotes the hidden-layer vector of node i at layer l; $A_{ij}$ denotes the weight of the edge between node i and node j; and σ denotes the activation function.
After several layers of convolution operations, hidden-layer vectors of the concept nodes and chapter nodes that integrate neighbor-node information are obtained.
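A minimal PyTorch sketch of one such weighted relational graph convolution layer is given below; the dense-adjacency formulation and all names are illustrative assumptions following the reconstruction above, not the patent's reference implementation.

```python
import torch
import torch.nn as nn

class WeightedRGCNLayer(nn.Module):
    """One relational graph convolution layer with edge weights A_ij.

    `adj[r]` is a dense (N, N) weighted adjacency matrix for relation
    type r (e.g. concept-concept PMI, concept-chapter TF-IDF,
    chapter-chapter distance).
    """
    def __init__(self, in_dim, out_dim, num_rels):
        super().__init__()
        self.self_loop = nn.Linear(in_dim, out_dim)          # W_0^(l)
        self.rel = nn.ModuleList(
            [nn.Linear(in_dim, out_dim, bias=False) for _ in range(num_rels)]
        )                                                     # W_r^(l)

    def forward(self, h, adj):
        out = self.self_loop(h)                # self-loop term W_0 h_i
        for r, lin in enumerate(self.rel):
            out = out + adj[r] @ lin(h)        # weighted neighbor aggregation
        return torch.relu(out)                 # sigma = ReLU here
```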
3.2) After the hidden-layer vector expressions of the concept nodes and chapter nodes are obtained, a twin network is used to predict whether concepts have a front-rear order relation. Specifically, each pair of concepts is first passed through the twin network; the process can be expressed as follows:
$$f_s=\mathrm{ReLU}\!\left(W_s\cdot\left[v_i;\,v_j;\,v_i-v_j;\,v_i\odot v_j\right]+b_s\right)$$
$$f_{GCN}=W\cdot f_s+b$$
$$p_{GCN}=\mathrm{sigmoid}(f_{GCN})$$
where $v_i$ denotes the hidden-layer vector of concept word $c_i$ that integrates neighbor-node information; $W_s$, $b_s$ denote the weight and bias of the first layer of the twin network, and W, b denote the weight and bias of the second layer, all of which are parameters to be trained; $\odot$ denotes the Hadamard product of vectors; and $[v_i;\,v_j;\,v_i-v_j;\,v_i\odot v_j]$ denotes the concatenation of the vectors $v_i$, $v_j$, $v_i-v_j$ and $v_i\odot v_j$. The computed result $p_{GCN}$ represents, based on the graph structure, the probability that concept word $c_i$ is a preceding concept of concept word $c_j$. The first partial loss function is defined as follows:
$$L_{GCN}=-\sum_{(c_i,c_j,y_{ij})\in T}\left[y_{ij}\log p_{GCN}+(1-y_{ij})\log\left(1-p_{GCN}\right)\right]$$
where $y_{ij}$ is the true label of the relation between concept words $c_i$ and $c_j$; that is, if concept word $c_i$ is a preceding concept of concept word $c_j$ the value is 1, otherwise 0. T denotes the set of triples $(c_i, c_j, y_{ij})$.
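A sketch of the twin scoring head and the first partial loss, under the reconstruction above; the hidden size and all names are illustrative. The same head shape can be reused for the chapter pairs of step 3.4).

```python
import torch
import torch.nn as nn

class TwinHead(nn.Module):
    """Siamese scoring head: the pair (v_i, v_j) is represented as
    [v_i; v_j; v_i - v_j; v_i * v_j] and mapped to a precedence logit."""
    def __init__(self, dim, hidden=64):
        super().__init__()
        self.fc1 = nn.Linear(4 * dim, hidden)   # W_s, b_s
        self.fc2 = nn.Linear(hidden, 1)         # W, b

    def forward(self, vi, vj):
        x = torch.cat([vi, vj, vi - vj, vi * vj], dim=-1)
        return self.fc2(torch.relu(self.fc1(x))).squeeze(-1)  # logits

# Cross-entropy over labelled pairs; BCEWithLogitsLoss folds the sigmoid
# into the loss for numerical stability.
criterion = nn.BCEWithLogitsLoss()
# loss_gcn = criterion(head(v_i, v_j), y)   # y: float tensor of 0/1 labels
```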
3.3) After the feature vectors between concept word pairs are obtained in step 2), the feature vector corresponding to each pair of concept words is passed through a fully connected neural network and a Sigmoid function to obtain a predicted value, based on text features, of the front-rear order relation of the concept words. This process can be expressed as follows:
$$f_F=\mathrm{ReLU}(W_F\cdot v_{ij}+b_F)$$
$$p_F=\mathrm{sigmoid}(f_F)$$
where $v_{ij}$ denotes the inter-word feature vector of concept words $c_i$ and $c_j$; $W_F$ and $b_F$ are the weight and bias parameters of the fully connected network to be trained; and $p_F$ represents, based on text features, the probability that concept word $c_i$ is a preceding concept of concept word $c_j$. The second partial loss function is defined as follows:
$$L_F=-\sum_{(c_i,c_j,y_{ij})\in T}\left[y_{ij}\log p_F+(1-y_{ij})\log\left(1-p_F\right)\right]$$
3.4) If two chapters have a front-rear order relation, there is a high probability that the concept words contained in the two chapters also have a front-rear order relation. Therefore, after the hidden-layer vector expressions of the concept nodes and chapter nodes are obtained, the twin network is also used to predict whether chapters have a front-rear order relation. Specifically, the hidden vectors of each pair of chapter nodes are passed through the twin network; the process can be expressed as follows:
$$f_s'=\mathrm{ReLU}\!\left(W_s'\cdot\left[u_i;\,u_j;\,u_i-u_j;\,u_i\odot u_j\right]+b_s'\right)$$
$$f_o=W'\cdot f_s'+b'$$
$$p_o=\mathrm{sigmoid}(f_o)$$
where $u_i$ denotes the hidden vector of the i-th chapter $o_i$ after the relational graph convolution operation. The computed result $p_o$ represents, based on the graph structure, the probability that chapter $o_i$ is a preceding chapter of chapter $o_j$. The third partial loss function is defined as follows:
$$L_o=-\sum_{(o_i,o_j,y'_{ij})\in T'}\left[y'_{ij}\log p_o+(1-y'_{ij})\log\left(1-p_o\right)\right]$$
where $y'_{ij}$ is the true label of the relation between chapters $o_i$ and $o_j$: if chapter $o_i$ is a preceding chapter of chapter $o_j$ the value is 1, otherwise 0. T' denotes the set of triples $(o_i, o_j, y'_{ij})$.
3.5) The loss function of the neural network is defined as follows:
$$L=L_{GCN}+\lambda L_F+\mu L_o$$
where λ and μ are weighting parameters selected according to actual requirements.
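Put together, one training step over the three partial losses could look like the following sketch, with λ and μ as in step 3.5 (the embodiment below uses 0.2 and 0.1); the model interface is an assumption for illustration.

```python
def training_step(model, optimizer, batch, lam=0.2, mu=0.1):
    # `model` is assumed to return the three partial losses of steps
    # 3.2-3.4: L_GCN (concept twin), L_F (feature head), L_o (chapter twin).
    loss_gcn, loss_f, loss_o = model(batch)
    loss = loss_gcn + lam * loss_f + mu * loss_o   # L = L_GCN + lambda*L_F + mu*L_o
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```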
Further, the distance of a chapter pair in the book specifically means the distance |i − j| between the i-th chapter $o_i$ and the j-th chapter $o_j$.
Compared with the prior art, the method has the following beneficial effects:
1. The method extracts the front-rear order relations among concepts by means of artificial intelligence, reducing manual work and making the process more systematic and scientific.
2. The workflow of the method is completed automatically by machine learning without manual intervention, reducing the burden on the user.
3. The method introduces a graph neural network into the model and makes full use of the spatial features between concepts and between concepts and texts; at the same time, it introduces statistical features between concepts, exploiting the interaction information of concept words in the text.
4. The method has high prediction accuracy and can accurately judge the front-rear order relations among concepts.
5. The method has good extensibility: for a new field, only a small number of books in that field are needed to construct the graph structure, extract the concept-pair features, and train the extractor on these inputs, enabling automatic judgment of the front-rear order relations between concepts in different fields.
Drawings
FIG. 1 is a general flow diagram of the present invention;
FIG. 2 is a schematic structural diagram of the neural network model of step 3).
Detailed Description
The invention is described in further detail below with reference to the figures and specific examples.
As shown in FIG. 1, the invention provides a book concept pre-and post-order relation extraction method based on a graph neural network and a twin network model. The method comprises the following steps:
1) Construction of the graph structure: based on the book text, a graph structure containing concept nodes and chapter nodes is constructed. The concepts and chapters in the book text are taken as the vertices of the graph structure, and the PMI (point-wise mutual information) value between each concept pair, the TF-IDF value of each concept in each chapter, and the distance between the chapters of each chapter pair in the book are respectively calculated as the weights of the edges between the corresponding vertices, yielding the graph structure.
2) Extraction of concept-pair features: semantic and structural features between concept pairs are extracted from the book text and expressed as a feature vector for the subsequent training of the neural network model.
3) Constructing a neural network model, as shown in FIG. 2: the graph structure obtained in step 1) and the feature vectors extracted in step 2) serve as the inputs of the neural network model. First, graph convolution operations are performed to obtain the hidden-layer vectors of the chapter nodes and concept nodes. The hidden-layer vectors of each pair of concept nodes are input into a twin network, the prediction probability that the pair of concept nodes has a front-rear order relation is obtained with a Sigmoid function, and the cross entropy of this prediction probability and the true label of the concept pair's front-rear order relation is calculated as the first partial loss function. The feature vector of each concept pair extracted in step 2) is passed through a fully connected layer, the prediction probability that the pair of concepts has a front-rear order relation is obtained with a Sigmoid function, and the cross entropy of this prediction probability and the true label is calculated as the second partial loss function. The hidden-layer vectors of each pair of chapter nodes are input into the twin network, the prediction probability that the pair of chapters has a front-rear order relation is obtained with a Sigmoid function, and the cross entropy of this prediction probability and the true label of the chapter pair's front-rear order relation is calculated as the third partial loss function. The three partial loss functions are weighted and summed to obtain the loss function of the neural network model.
4) Extracting concept front-rear order relations: the neural network model constructed in step 3) is trained to obtain an extractor of concept front-rear order relations. A book text is preprocessed by word segmentation, stop-word removal and the like; the graph structure corresponding to the book text is then constructed and the feature vectors between its concept pairs are extracted; these are input into the extractor, which judges and outputs whether each concept pair of the book text has a front-rear order relation.
Further, the construction of the graph structure in step 1) specifically includes:
First, the book text is preprocessed by word segmentation, stop-word removal and the like. Then, a graph structure is constructed from the preprocessed text. The graph contains two types of nodes: concept nodes and chapter nodes. For a concept node, the pre-trained word vector of the concept word is used as its initial feature vector; for a chapter node, the average word vector of all words retained in the chapter is used as its initial feature vector. The weights of the edges between nodes in the graph structure are defined as follows: if node i and node j are both concept nodes, the weight of the edge is the PMI value between them; if node i and node j are the same node, the weight of the edge is defined as 1. PMI is defined as follows:
$$\mathrm{PMI}(i,j)=\log\frac{p(i,j)}{p(i)\,p(j)}$$
$$p(i,j)=\frac{\#W(i,j)}{\#W}$$
$$p(i)=\frac{\#W(i)}{\#W}$$
where #W(i,j) denotes the number of sliding windows that contain the concept words of both node i and node j, #W(i) denotes the number of sliding windows that contain the concept word of node i, and #W denotes the total number of sliding windows in the book text. If node i is a concept node and node j is a chapter node, the weight of the edge is the TF-IDF value of the concept in the chapter text. If node i and node j are both chapter nodes, the weight of the edge is defined as:
$$A_{ij}=1-\frac{|d_i-d_j|}{M}$$
where $d_i$ denotes the serial number of the chapter represented by chapter node i in the book directory, and M denotes the number of chapters of the book.
Further, the extracting of the features between the concept pairs in step 2) specifically includes:
features between pairs of concepts are extracted. For each pair of concepts, the following features are calculated:
Suffix feature: if concept $c_i$ is a suffix of concept $c_j$ in the book text, the value is recorded as 1; otherwise it is recorded as 0.
TOC distance feature:
$$\mathrm{Toc}(a,b)=\beta^{\,k}\left(a_k-b_k\right)$$
where a and b denote two concept words and $a_k$ denotes the k-th level serial number of the directory entry in which concept word a is located. In the formula, k is the smallest level index such that $a_k\neq b_k$, and $\beta$ is a decay parameter.
Chapter-RefD feature:
$$\mathrm{Crw}(c_i,c_j)=\frac{\sum_{o\in D} f(c_i,o)\,r(o,c_j)}{\sum_{o\in D} f(c_i,o)}$$
$$\mathrm{Crd}(c_i,c_j)=\mathrm{Crw}(c_j,c_i)-\mathrm{Crw}(c_i,c_j)$$
where o denotes a chapter, $c_i$ denotes the i-th concept word in the text, $f(c_i,o)$ denotes the frequency with which concept word $c_i$ occurs in chapter o, and $r(o,c_j)$ indicates whether concept word $c_j$ appears in chapter o: the value is 1 if it appears and 0 otherwise.
Wiki-RefD feature:
$$\mathrm{Wrw}(c_i,c_j)=\frac{\sum_{w} f(c_i,w)\,r(w,c_j)}{\sum_{w} f(c_i,w)}$$
$$\mathrm{Wrd}(c_i,c_j)=\mathrm{Wrw}(c_j,c_i)-\mathrm{Wrw}(c_i,c_j)$$
where w denotes a Wikipedia document, $f(c_i,w)$ denotes the frequency with which concept word $c_i$ occurs in document w, and $r(w,c_j)$ indicates whether concept word $c_j$ appears in document w: the value is 1 if it appears and 0 otherwise.
Complexity feature: the complexity feature consists of two parts:
$$\mathrm{CldFrequency}(c_i,c_j)=\mathrm{ava}(c_i)-\mathrm{ava}(c_j),\qquad \mathrm{ava}(c_i)=\frac{|I(D,c_i)|}{|D|}$$
$$\mathrm{CldDistribution}(c_i,c_j)=\mathrm{KL}\!\left(P(c_i)\,\|\,P(c_j)\right)$$
where |D| denotes the total number of chapters in the book text D, $|I(D,c_i)|$ denotes the number of chapters of D in which concept word $c_i$ appears, $P(c_i)$ denotes the probability distribution of concept word $c_i$ over the chapters of the book text D, and $\mathrm{KL}(P(c_i)\,\|\,P(c_j))$ denotes the KL distance between concept words $c_i$ and $c_j$.
The feature values computed for each pair of concept words are concatenated to obtain the inter-word feature vector of the pair.
Further, the constructing of the neural network in the step 3) specifically includes:
3.1) The graph structure obtained in step 1) is input into the neural network model, convolution is performed with a relational graph convolution network, and the information of each node's neighbor nodes is integrated, thereby extracting the spatial features of the graph structure. The relational graph convolution operation can be represented as follows:
$$h_i^{(l+1)}=\sigma\!\left(W_0^{(l)}h_i^{(l)}+\sum_{r\in\mathcal{R}}\sum_{j\in\mathcal{N}_i^r}A_{ij}\,W_r^{(l)}h_j^{(l)}\right)$$
where $\mathcal{N}_i^r$ denotes the set of neighbor nodes of node i under relation type r; $W_r^{(l)}$ denotes the weight matrix associated with relation type r in the l-th convolution layer; $W_0^{(l)}$ denotes the weight matrix of the l-th convolution layer that is independent of the relation type r; $h_i^{(l)}$ denotes the hidden-layer vector of node i at layer l; $A_{ij}$ denotes the weight of the edge between node i and node j; and σ denotes the activation function.
After several layers of convolution operations, hidden-layer vectors of the concept nodes and chapter nodes that integrate neighbor-node information are obtained.
3.2) After the hidden-layer vector expressions of the concept nodes and chapter nodes are obtained, a twin network is used to predict whether concepts have a front-rear order relation. Specifically, each pair of concepts is first passed through the twin network; the process can be expressed as follows:
$$f_s=\mathrm{ReLU}\!\left(W_s\cdot\left[v_i;\,v_j;\,v_i-v_j;\,v_i\odot v_j\right]+b_s\right)$$
$$f_{GCN}=W\cdot f_s+b$$
$$p_{GCN}=\mathrm{sigmoid}(f_{GCN})$$
where $v_i$ denotes the hidden-layer vector of concept word $c_i$ that integrates neighbor-node information; $W_s$, $b_s$ denote the weight and bias of the first layer of the twin network, and W, b denote the weight and bias of the second layer, all of which are parameters to be trained; $\odot$ denotes the Hadamard product of vectors; and $[v_i;\,v_j;\,v_i-v_j;\,v_i\odot v_j]$ denotes the concatenation of the vectors $v_i$, $v_j$, $v_i-v_j$ and $v_i\odot v_j$. The computed result $p_{GCN}$ represents, based on the graph structure, the probability that concept word $c_i$ is a preceding concept of concept word $c_j$. The first partial loss function is defined as follows:
$$L_{GCN}=-\sum_{(c_i,c_j,y_{ij})\in T}\left[y_{ij}\log p_{GCN}+(1-y_{ij})\log\left(1-p_{GCN}\right)\right]$$
where $y_{ij}$ is the true label of the relation between concept words $c_i$ and $c_j$; that is, if concept word $c_i$ is a preceding concept of concept word $c_j$ the value is 1, otherwise 0. T denotes the set of triples $(c_i, c_j, y_{ij})$.
3.3) After the feature vectors between concept word pairs are obtained in step 2), the feature vector corresponding to each pair of concept words is passed through a fully connected neural network and a Sigmoid function to obtain a predicted value, based on text features, of the front-rear order relation of the concept words. This process can be expressed as follows:
$$f_F=\mathrm{ReLU}(W_F\cdot v_{ij}+b_F)$$
$$p_F=\mathrm{sigmoid}(f_F)$$
where $v_{ij}$ denotes the inter-word feature vector of concept words $c_i$ and $c_j$; $W_F$ and $b_F$ are the weight and bias parameters of the fully connected network to be trained; and $p_F$ represents, based on text features, the probability that concept word $c_i$ is a preceding concept of concept word $c_j$. The second partial loss function is defined as follows:
$$L_F=-\sum_{(c_i,c_j,y_{ij})\in T}\left[y_{ij}\log p_F+(1-y_{ij})\log\left(1-p_F\right)\right]$$
3.4) If two chapters have a front-rear order relation, there is a high probability that the concept words contained in the two chapters also have a front-rear order relation. Therefore, after the hidden-layer vector expressions of the concept nodes and chapter nodes are obtained, the twin network is also used to predict whether chapters have a front-rear order relation. Specifically, the hidden vectors of each pair of chapter nodes are passed through the twin network; the process can be expressed as follows:
$$f_s'=\mathrm{ReLU}\!\left(W_s'\cdot\left[u_i;\,u_j;\,u_i-u_j;\,u_i\odot u_j\right]+b_s'\right)$$
$$f_o=W'\cdot f_s'+b'$$
$$p_o=\mathrm{sigmoid}(f_o)$$
where $u_i$ denotes the hidden vector of the i-th chapter $o_i$ after the relational graph convolution operation. The computed result $p_o$ represents, based on the graph structure, the probability that chapter $o_i$ is a preceding chapter of chapter $o_j$. The third partial loss function is defined as follows:
$$L_o=-\sum_{(o_i,o_j,y'_{ij})\in T'}\left[y'_{ij}\log p_o+(1-y'_{ij})\log\left(1-p_o\right)\right]$$
where $y'_{ij}$ is the true label of the relation between chapters $o_i$ and $o_j$: if chapter $o_i$ is a preceding chapter of chapter $o_j$ the value is 1, otherwise 0. T' denotes the set of triples $(o_i, o_j, y'_{ij})$.
3.5) The loss function of the neural network is defined as follows:
$$L=L_{GCN}+\lambda L_F+\mu L_o$$
where λ and μ are weighting parameters selected according to actual requirements.
Examples
The specific steps performed in this example are described in detail below with reference to the method of the present invention, as follows:
in this embodiment, the method of the present invention is applied to a book in the field of data structures to predict the context relationship of concept words therein.
1) The book has 66 chapters and 89 concept words. The book text is first preprocessed by word segmentation and stop-word removal, and the graph structure is then constructed. The graph contains 155 nodes in total: 89 concept nodes and 66 chapter nodes. Next, the weights of all edges in the graph are calculated. For concept-concept edges, the sliding-window size is set to 20 and the PMI value between concept nodes is used as the edge weight. For concept-chapter edges, the TF-IDF value of each concept word in the chapter text is computed with the TF-IDF algorithm and used as the edge weight. For chapter-chapter edges, the distance between chapters in the book directory, computed from the chapter serial numbers, is used as the edge weight. Each node in the graph also has a self-loop edge with weight 1.
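The concept-chapter TF-IDF weights can be obtained with standard tooling; below is a sketch using scikit-learn, where restricting the vocabulary to the concept words is an illustrative choice rather than something mandated by the patent.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# chapter_texts: list of 66 preprocessed chapter strings;
# concepts: list of the 89 concept words.
vectorizer = TfidfVectorizer(vocabulary=concepts)
tfidf = vectorizer.fit_transform(chapter_texts)   # sparse matrix, shape (66, 89)
# tfidf[o, c] is the weight of the edge between chapter node o and concept node c.
```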
2) For the 89 concept words involved in the book, inter-word features are extracted for all 7921 (89 × 89) concept pairs, including the suffix feature, TOC distance feature, Chapter-RefD feature, Wiki-RefD feature and complexity feature; the computed feature values are concatenated into a feature vector.
3) Constructing the neural network. In the method of the invention, the network model is built with the PyTorch framework. The labeled data contain 449 positive pairs (concept pairs in which one concept is a preceding concept of the other) and 7472 negative pairs. From the labeled data, 314 positive pairs and 471 negative pairs are selected as the training set, and the remaining 135 positive pairs together with another 202 negative pairs form the test set. For the loss-function weights, this embodiment uses λ = 0.2 and μ = 0.1. When evaluating the model, the precision, recall, F1 score and accuracy of the prediction results are calculated; these indices characterize how accurately the front-rear order relations between concepts are predicted. The results are shown in Table 1.
TABLE 1. Evaluation of prediction results

Precision   Recall   F1-score   Accuracy
0.757       0.743    0.750      0.801
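The evaluation indices in Table 1 (and Table 2 below) can be computed with standard tooling; a sketch with scikit-learn, assuming `y_true` holds the gold labels of the test pairs and `y_prob` the extractor's predicted probabilities:

```python
from sklearn.metrics import (accuracy_score, f1_score,
                             precision_score, recall_score)

y_pred = [int(p >= 0.5) for p in y_prob]   # threshold the probabilities
print("Precision:", precision_score(y_true, y_pred))
print("Recall:   ", recall_score(y_true, y_pred))
print("F1-score: ", f1_score(y_true, y_pred))
print("Accuracy: ", accuracy_score(y_true, y_pred))
```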
4) After training of the neural network is complete, samples that do not appear in the training set can be input into the network, and the model outputs whether the first concept of each concept pair is a preceding concept of the second. The method was tested in three different fields: data structures, calculus and physics. The test results are shown in Table 2.
TABLE 2. Evaluation of test results

Field             Precision   Recall   F1-score   Accuracy
Data structures   0.795       0.809    0.802      0.838
Calculus          0.778       0.798    0.788      0.828
Physics           0.770       0.825    0.797      0.827
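Inference on a new book follows steps 1), 2) and 4); the sketch below is illustrative only, with `build_graph`, `extract_pair_features` and `extractor` as assumed helpers corresponding to those steps rather than names defined by the patent.

```python
import torch

graph = build_graph(book_text)                   # step 1: graph structure
pair_feats = extract_pair_features(book_text)    # step 2: concept-pair features
with torch.no_grad():
    probs = extractor(graph, pair_feats)         # trained model from step 3
# Keep pairs scored as having a front-rear order relation.
precedes = [(ci, cj) for (ci, cj), p in probs.items() if p >= 0.5]
```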
The above-described embodiments are intended to illustrate rather than to limit the invention, and any modifications and variations of the present invention are within the spirit of the invention and the scope of the appended claims.

Claims (4)

1. A book concept front-rear order relation extraction method based on a neural network is characterized by comprising the following steps:
1) construction of graph Structure: constructing a graph structure containing concept nodes and chapter nodes based on book texts; taking the concepts and the chapters in the book text as vertexes of the graph structure, and respectively calculating PMI values between concept pairs, TF-IDF values of the concepts in the chapters and distances of the chapter pairs in the book as weights of edges between corresponding vertexes in the graph structure to obtain the graph structure; PMI is defined as follows:
$$\mathrm{PMI}(i,j)=\log\frac{p(i,j)}{p(i)\,p(j)}$$
$$p(i,j)=\frac{\#W(i,j)}{\#W}$$
$$p(i)=\frac{\#W(i)}{\#W}$$
wherein #W(i,j) denotes the number of sliding windows containing the concept words of both node i and node j, #W(i) denotes the number of sliding windows containing the concept word of node i, and #W denotes the total number of sliding windows in the book text; if node i is a concept node and node j is a chapter node, the weight of the edge is the TF-IDF value of the concept in the chapter text; if node i and node j are both chapter nodes, the weight of the edge is defined as:
$$A_{ij}=1-\frac{|d_i-d_j|}{M}$$
wherein $d_i$ denotes the serial number of the chapter represented by chapter node i in the book directory, and M denotes the number of chapters of the book;
2) extracting concept-pair features: extracting semantic and structural features between concept pairs from the book text and expressing them as a feature vector for training the neural network model;
3) constructing a neural network model: taking the graph structure obtained in step 1) and the feature vectors extracted in step 2) as the input of the neural network model; first performing graph convolution operations to obtain hidden-layer vectors of the chapter nodes and concept nodes; inputting the hidden-layer vectors of each pair of concept nodes into a twin network, obtaining with a Sigmoid function the prediction probability that the pair of concept nodes has a front-rear order relation, and calculating the cross entropy of the prediction probability and the true label of the concept pair's front-rear order relation as a first partial loss function; passing the feature vectors between the concept pairs extracted in step 2) through a fully connected layer, obtaining with a Sigmoid function the prediction probability that each pair of concepts has a front-rear order relation, and calculating the cross entropy of the prediction probability and the true label as a second partial loss function; inputting the hidden-layer vectors of each pair of chapter nodes into the twin network, obtaining with a Sigmoid function the prediction probability that the pair of chapter nodes has a front-rear order relation, and calculating the cross entropy of the prediction probability and the true label of the chapter pair's front-rear order relation as a third partial loss function; weighting and summing the three partial loss functions to obtain the loss function of the neural network model; the specific process is as follows:
3.1) inputting the graph structure obtained in step 1) into the neural network model, performing convolution with a relational graph convolution network, and integrating the information of each node's neighbor nodes, thereby extracting the spatial features of the graph structure; the relational graph convolution operation is represented as follows:
$$h_i^{(l+1)}=\sigma\!\left(W_0^{(l)}h_i^{(l)}+\sum_{r\in\mathcal{R}}\sum_{j\in\mathcal{N}_i^r}A_{ij}\,W_r^{(l)}h_j^{(l)}\right)$$
wherein $\mathcal{N}_i^r$ denotes the set of neighbor nodes of node i under relation type r; $W_r^{(l)}$ denotes the weight matrix associated with relation type r in the l-th convolution layer; $W_0^{(l)}$ denotes the weight matrix of the l-th convolution layer that is independent of the relation type r; $h_i^{(l)}$ denotes the hidden-layer vector of node i at layer l; $A_{ij}$ denotes the weight of the edge between node i and node j; σ denotes the activation function;
after several layers of convolution operations, hidden-layer vectors of the concept nodes and chapter nodes that integrate neighbor-node information can be obtained;
3.2) after the hidden-layer vector expressions of the concept nodes and chapter nodes are obtained, using a twin network to predict whether concepts have a front-rear order relation; specifically, first passing each pair of concepts through the twin network, the process being expressed as follows:
$$f_s=\mathrm{ReLU}\!\left(W_s\cdot\left[v_i;\,v_j;\,v_i-v_j;\,v_i\odot v_j\right]+b_s\right)$$
$$f_{GCN}=W\cdot f_s+b$$
$$p_{GCN}=\mathrm{sigmoid}(f_{GCN})$$
wherein $v_i$ denotes the hidden-layer vector of concept word $c_i$ that integrates neighbor-node information; $W_s$, $b_s$ denote the weight and bias of the first layer of the twin network, and W, b denote the weight and bias of the second layer, all of which are parameters to be trained; $\odot$ denotes the Hadamard product of vectors, and $[v_i;\,v_j;\,v_i-v_j;\,v_i\odot v_j]$ denotes the concatenation of the vectors $v_i$, $v_j$, $v_i-v_j$ and $v_i\odot v_j$; the computed result $p_{GCN}$ represents, based on the graph structure, the probability that concept word $c_i$ is a preceding concept of concept word $c_j$; the first partial loss function is defined as follows:
$$L_{GCN}=-\sum_{(c_i,c_j,y_{ij})\in T}\left[y_{ij}\log p_{GCN}+(1-y_{ij})\log\left(1-p_{GCN}\right)\right]$$
wherein $y_{ij}$ is the true label of the relation between concept words $c_i$ and $c_j$, that is, if concept word $c_i$ is a preceding concept of concept word $c_j$ the value is 1, otherwise 0; T denotes the set of triples $(c_i, c_j, y_{ij})$;
3.3) after the feature vectors between concept word pairs are obtained in step 2), passing the feature vector corresponding to each pair of concept words through a fully connected neural network and a Sigmoid function to obtain a predicted value, based on text features, of the front-rear order relation of the concept words; the process is expressed as follows:
$$f_F=\mathrm{ReLU}(W_F\cdot v_{ij}+b_F)$$
$$p_F=\mathrm{sigmoid}(f_F)$$
wherein $v_{ij}$ denotes the inter-word feature vector of concept words $c_i$ and $c_j$; $W_F$ and $b_F$ are the weight and bias parameters of the fully connected network to be trained; $p_F$ represents, based on text features, the probability that concept word $c_i$ is a preceding concept of concept word $c_j$; the second partial loss function is defined as follows:
$$L_F=-\sum_{(c_i,c_j,y_{ij})\in T}\left[y_{ij}\log p_F+(1-y_{ij})\log\left(1-p_F\right)\right]$$
3.4) if two chapters have a front-rear order relation, there is a high probability that the concept words contained in the two chapters also have a front-rear order relation; therefore, after the hidden-layer vector expressions of the concept nodes and chapter nodes are obtained, also using the twin network to predict whether chapters have a front-rear order relation; specifically, passing the hidden vectors of each pair of chapter nodes through the twin network, the process being expressed as follows:
$$f_s'=\mathrm{ReLU}\!\left(W_s'\cdot\left[u_i;\,u_j;\,u_i-u_j;\,u_i\odot u_j\right]+b_s'\right)$$
$$f_o=W'\cdot f_s'+b'$$
$$p_o=\mathrm{sigmoid}(f_o)$$
wherein $u_i$ denotes the hidden vector of the i-th chapter $o_i$ after the relational graph convolution operation; the computed result $p_o$ represents, based on the graph structure, the probability that chapter $o_i$ is a preceding chapter of chapter $o_j$; the third partial loss function is defined as follows:
$$L_o=-\sum_{(o_i,o_j,y'_{ij})\in T'}\left[y'_{ij}\log p_o+(1-y'_{ij})\log\left(1-p_o\right)\right]$$
wherein $y'_{ij}$ is the true label of the relation between chapters $o_i$ and $o_j$; if chapter $o_i$ is a preceding chapter of chapter $o_j$ the value is 1, otherwise 0; T' denotes the set of triples $(o_i, o_j, y'_{ij})$;
3.5) defining the loss function of the neural network as follows:
$$L=L_{GCN}+\lambda L_F+\mu L_o$$
wherein λ and μ are weight parameters selected according to actual requirements;
4) extracting concept front-rear order relations: training the neural network model constructed in step 3) to obtain an extractor of concept front-rear order relations; performing word segmentation and stop-word-removal preprocessing on a book text, then constructing the graph structure corresponding to the book text and extracting the feature vectors between the concept pairs of the book text, then feeding these as input into the extractor, the extractor judging and outputting whether the concept pairs of the book text have a front-rear order relation.
2. The book concept pre-and-post relationship extraction method based on the neural network as claimed in claim 1, wherein the step 1) is specifically as follows:
firstly, performing word segmentation and word stop removal pretreatment on a book text; then, constructing a graph structure by using the preprocessed text; the graph contains two types of nodes: concept nodes and chapter nodes; for the concept node, using a pre-training word vector of the concept word as an initial feature vector of the concept node; for the chapter nodes, using the average word vector of all reserved words in the chapter as the initial feature vector of the chapter nodes; the weights of the edges between nodes in the graph structure are defined as follows: if the node i and the node j are both concept nodes, the weight of the edge is a PMI value between the node i and the node j; if node i and node j are the same node, the weight of the edge is defined as 1.
3. The book concept pre-and-post relationship extraction method based on neural network as claimed in claim 1, wherein the step 2) is specifically:
extracting features between the concept pairs; for each pair of concepts, the following features are calculated:
suffix feature: if concept $c_i$ is a suffix of concept $c_j$ in the book text, the value is 1, otherwise 0;
TOC distance feature:
$$\mathrm{Toc}(a,b)=\beta^{\,k}\left(a_k-b_k\right)$$
wherein a and b denote two concept words and $a_k$ denotes the k-th level serial number of the directory entry in which concept word a is located; in the formula, k is the smallest level index such that $a_k\neq b_k$; $\beta$ is a decay parameter;
Chapter-RefD feature:
$$\mathrm{Crw}(c_i,c_j)=\frac{\sum_{o\in D} f(c_i,o)\,r(o,c_j)}{\sum_{o\in D} f(c_i,o)}$$
$$\mathrm{Crd}(c_i,c_j)=\mathrm{Crw}(c_j,c_i)-\mathrm{Crw}(c_i,c_j)$$
wherein o denotes a chapter, $c_i$ denotes the i-th concept word in the text, $f(c_i,o)$ denotes the frequency with which concept word $c_i$ occurs in chapter o, and $r(o,c_j)$ indicates whether concept word $c_j$ appears in chapter o: the value is 1 if it appears, otherwise 0;
Wiki-RefD feature:
$$\mathrm{Wrw}(c_i,c_j)=\frac{\sum_{w} f(c_i,w)\,r(w,c_j)}{\sum_{w} f(c_i,w)}$$
$$\mathrm{Wrd}(c_i,c_j)=\mathrm{Wrw}(c_j,c_i)-\mathrm{Wrw}(c_i,c_j)$$
wherein w denotes a Wikipedia document, $f(c_i,w)$ denotes the frequency with which concept word $c_i$ occurs in document w, and $r(w,c_j)$ indicates whether concept word $c_j$ appears in document w: the value is 1 if it appears, otherwise 0;
complexity feature: the complexity feature consists of two parts:
$$\mathrm{CldFrequency}(c_i,c_j)=\mathrm{ava}(c_i)-\mathrm{ava}(c_j),\qquad \mathrm{ava}(c_i)=\frac{|I(D,c_i)|}{|D|}$$
$$\mathrm{CldDistribution}(c_i,c_j)=\mathrm{KL}\!\left(P(c_i)\,\|\,P(c_j)\right)$$
wherein |D| denotes the total number of chapters in the book text D, $|I(D,c_i)|$ denotes the number of chapters of D in which concept word $c_i$ appears, $P(c_i)$ denotes the probability distribution of concept word $c_i$ over the chapters of the book text D, and $\mathrm{KL}(P(c_i)\,\|\,P(c_j))$ denotes the KL distance between concept words $c_i$ and $c_j$;
and splicing the characteristic values obtained by calculating each pair of concept words to obtain the inter-word characteristic vector of each pair of concept words.
4. The book concept front-rear order relation extraction method based on a neural network as claimed in claim 1, wherein the distance of a chapter pair in the book specifically means the distance |i − j| between the i-th chapter $o_i$ and the j-th chapter $o_j$.
CN202110061782.XA 2021-01-18 2021-01-18 Book concept front-rear order relation extraction method based on neural network Active CN112860882B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110061782.XA CN112860882B (en) 2021-01-18 2021-01-18 Book concept front-rear order relation extraction method based on neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110061782.XA CN112860882B (en) 2021-01-18 2021-01-18 Book concept front-rear order relation extraction method based on neural network

Publications (2)

Publication Number Publication Date
CN112860882A CN112860882A (en) 2021-05-28
CN112860882B (en) 2022-05-10

Family

ID=76006295

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110061782.XA Active CN112860882B (en) 2021-01-18 2021-01-18 Book concept front-rear order relation extraction method based on neural network

Country Status (1)

Country Link
CN (1) CN112860882B (en)


Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101853310A (en) * 2010-06-21 2010-10-06 北京大学 Method for producing preorder and postorder code of single traversing tree
US10319364B2 (en) * 2017-05-18 2019-06-11 Telepathy Labs, Inc. Artificial intelligence-based text-to-speech system and method
CN107491541A (en) * 2017-08-24 2017-12-19 北京丁牛科技有限公司 File classification method and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Diverse Image Captioning via GroupTalk; Zhuhao Wang et al.; Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence (IJCAI-16); 2016-12-31; full text *
Research and application of book-based domain concept extraction and front-rear order relation mining algorithms; Zhou Yangfan (周洋帆); China Master's Theses Full-text Database, Information Science and Technology; 2019-08-15; full text *

Also Published As

Publication number Publication date
CN112860882A (en) 2021-05-28

Similar Documents

Publication Publication Date Title
CN108363790B (en) Method, device, equipment and storage medium for evaluating comments
CN108073568B (en) Keyword extraction method and device
CN111783474B (en) Comment text viewpoint information processing method and device and storage medium
CN106599032B (en) Text event extraction method combining sparse coding and structure sensing machine
CN108549658B (en) Deep learning video question-answering method and system based on attention mechanism on syntax analysis tree
CN111325029B (en) Text similarity calculation method based on deep learning integrated model
CN107818164A (en) A kind of intelligent answer method and its system
CN110188047B (en) Double-channel convolutional neural network-based repeated defect report detection method
CN110222163A (en) A kind of intelligent answer method and system merging CNN and two-way LSTM
CN108874896B (en) Humor identification method based on neural network and humor characteristics
CN111368088A (en) Text emotion classification method based on deep learning
CN108108347B (en) Dialogue mode analysis system and method
CN114818703B (en) Multi-intention recognition method and system based on BERT language model and TextCNN model
Sartakhti et al. Persian language model based on BiLSTM model on COVID-19 corpus
CN110516070A (en) A kind of Chinese Question Classification method based on text error correction and neural network
CN111428481A (en) Entity relation extraction method based on deep learning
CN114818717A (en) Chinese named entity recognition method and system fusing vocabulary and syntax information
CN110569355B (en) Viewpoint target extraction and target emotion classification combined method and system based on word blocks
Liu et al. Revisit word embeddings with semantic lexicons for modeling lexical contrast
CN110874392A (en) Text network information fusion embedding method based on deep bidirectional attention mechanism
CN106815209B (en) Uygur agricultural technical term identification method
Tianxiong et al. Identifying chinese event factuality with convolutional neural networks
CN110334204B (en) Exercise similarity calculation recommendation method based on user records
CN113378024A (en) Deep learning-based public inspection field-oriented related event identification method
CN112860882B (en) Book concept front-rear order relation extraction method based on neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant