CN116882502B - Professional text inference method and system integrating structured knowledge and text semantics
Info
- Publication number: CN116882502B
- Application number: CN202311146912.5A
- Authority: CN (China)
- Prior art keywords: text, professional, knowledge, inference, graph
- Legal status: Active
Classifications
- G06N5/041 — Computing arrangements using knowledge-based models; Inference or reasoning models; Abduction
- G06F40/30 — Handling natural language data; Semantic analysis
- G06N5/02 — Computing arrangements using knowledge-based models; Knowledge representation; Symbolic representation
- Y02D10/00 — Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
A professional text inference method and system fusing structured knowledge and text semantics, belonging to the technical field of natural language processing, comprising the following steps: first, two kinds of features are constructed for a professional text: the knowledge in the professional text is represented in the form of a knowledge graph to obtain structured knowledge, and the structured knowledge is encoded with a self-encoder to obtain professional-text knowledge features; the text itself is encoded with a large pre-trained language model to obtain complete text semantic features. Then, a consistency loss function is introduced during inference to train a professional text inference model, realizing professional text inference that fuses the professional-text knowledge features and the text semantic features. The method makes joint use of an understandable knowledge representation and the overall semantic information, and provides an interpretable knowledge form as the basis of inference while achieving better professional text inference.
Description
Technical Field
The invention discloses a professional text inference method and system integrating structured knowledge and text semantics, belonging to the technical field of natural language processing.
Background
When examinees answer the subjective questions of a professional examination, a large number of examinee texts are produced. These texts reflect the examinees' understanding and application of knowledge; they generally contain professional concepts and terms and therefore belong to professional texts. The professional text inference task is an important task in the field of natural language processing and enables intelligent review of examinee texts: given an examinee text, a reference answer text and predefined score categories, a neural network is used to infer the score category of the examinee text. Intelligent review can improve review efficiency and reduce the influence of reviewers' subjective factors, and thus has important practical application value.
Professional text inference methods face the following main technical difficulties:
The results of professional text inference are difficult to justify from a knowledge perspective. The results of a professional examination directly affect the evaluation and selection of examinees and have an important influence on their personal development. Therefore, to maintain the authority and seriousness of the examination, a professional text inference method that lacks a degree of interpretability will be difficult for examination supervisors to adopt in real scenarios. Existing professional text inference methods are generally based on neural network models and produce results through operations and transformations on implicit vectors; they can hardly provide explicit knowledge representations or give the basis of inference from a knowledge perspective, which makes improving model interpretability technically difficult. In addition, existing methods require a large number of labeled samples for training, while in practical scenarios the labeled samples that review experts can provide to an intelligent review system through pre-review are limited.
The prior art, such as Chinese patent document CN115422920A, discloses a method for recognizing the dispute focus of judgment documents based on BERT and GAT, and Chinese patent document CN114996463A discloses an intelligent case classification method and device. These methods are mainly based on semantic features and usually require abundant labeled training data in the professional field. In the professional text inference task, they lack an interpretation of the inference results from the knowledge perspective and are difficult to apply to the actual scenario of intelligent review of subjective questions.
In conclusion, professional text inference has important practical value in various professional fields, and a professional text inference method that meets real-scenario requirements is an urgent need for the intelligent review of subjective questions in professional examinations.
Disclosure of Invention
Aiming at the deficiencies of the prior art, the invention discloses a professional text inference method fusing structured knowledge and text semantics;
the invention also discloses a system for realizing the inference method.
Summary of the invention:
The invention provides a professional text inference method and system fusing structured knowledge and text semantics. First, two kinds of features are constructed for a professional text: the knowledge in the professional text is represented in the form of a knowledge graph to obtain structured knowledge, and the structured knowledge is encoded with a self-encoder to obtain professional-text knowledge features; the text itself is encoded with a large pre-trained language model to obtain complete text semantic features. Then, a consistency loss function is introduced during inference to train a professional text inference model, realizing professional text inference that fuses the professional-text knowledge features and the text semantic features. The method makes joint use of an understandable knowledge representation and the overall semantic information, and provides an interpretable knowledge form as the basis of inference while achieving better professional text inference.
The technical scheme of the invention is as follows:
A professional text inference method fusing structured knowledge and text semantics, comprising:
S1: constructing structured knowledge features: for a given professional text, construct the knowledge graph of the knowledge in the professional text using a professional-text-oriented knowledge graph construction method to obtain structured knowledge; encode the structured knowledge with a knowledge graph encoder to obtain the professional-text knowledge features;
S2: constructing text semantic features: encode the given professional text with a deep learning model to obtain the text semantic features;
S3: introducing a consistency loss, fuse the professional-text knowledge features and the text semantic features to realize professional text inference and provide the basis of inference, forming a professional text inference model fusing structured knowledge and text semantics;
The professional text inference task fusing professional-text knowledge features and text semantic features is defined as follows: a given professional text is taken as input, the professional-text knowledge features are obtained through S1 and the text semantic features through S2, inference is performed by fusing the two kinds of features according to S3, and finally the inference result and the structured knowledge are output, the latter serving as the basis of inference. In the subjective-question review scenario, the fused interpretable inference over examinee texts comprises:
For a subjective question Q, an examinee text X = (x_1, x_2, …, x_n) and the corresponding reference answer text Y = (y_1, y_2, …, y_m) are given, where the examinee text contains n words and the reference answer text contains m words;
The professional text inference model extracts the knowledge graph G_X = (V_X, E_X) of the examinee text X and the knowledge graph G_Y = (V_Y, E_Y) of the reference answer text Y, where V_X is the node set of G_X, E_X is the set of edges and corresponding weights of G_X, V_Y is the node set of G_Y, and E_Y is the set of edges and corresponding weights of G_Y. The model infers a score ŝ ∈ S for the examinee text X, where S is the set of inference results and |S| is the number of inferred score categories. At the same time, since the professional text inference model uses the knowledge graph G_X to infer the score of the examinee text X, the knowledge graph G_X serves as a knowledge-form explanation of why the examinee text X is inferred to receive the score ŝ;
In a knowledge graph, the node set is V = {v_1, v_2, …, v_k}, where k is the total number of professional elements in the examinee text and each v_i is a node of the knowledge graph representing one professional element; the edge set with its corresponding weights is E; the structured knowledge consists of the node set and the edge set: G = (V, E);
The knowledge graph comprises professional elements and the association relations among them: the professional elements include terms, entities and important general-purpose words in the professional text, and the association relations include the important associations among professional elements;
According to the method, the professional elements in the professional text are extracted by the knowledge graph construction method, and the association relations among them are constructed, thereby obtaining the structured knowledge.
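For concreteness, the structured knowledge G = (V, E) maps naturally onto an ordinary weighted-graph data structure. The sketch below is illustrative only: the patent prescribes no particular container, and the element names and weights in the usage example are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class KnowledgeGraph:
    """Structured knowledge G = (V, E): professional elements and weighted associations."""
    nodes: set[str] = field(default_factory=set)                        # V: professional elements
    edges: dict[tuple[str, str], float] = field(default_factory=dict)   # E: (v_i, v_j) -> weight e_ij

    def add_edge(self, vi: str, vj: str, weight: float) -> None:
        self.nodes.update((vi, vj))
        self.edges[(vi, vj)] = weight

# Hypothetical fragment of a reference-answer graph from the worked example below:
g_y = KnowledgeGraph()
g_y.add_edge("general partner", "limited partner", 0.82)
g_y.add_edge("debt", "unlimited liability", 0.77)
```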
According to a preferred embodiment of the present invention, step S1 specifically includes:
S11, constructing the professional-text-oriented knowledge graph to obtain the professional-text structured knowledge;
S11-1, extracting the professional elements in the professional text to obtain the node set V of the professional-text knowledge graph. The professional elements in the text are obtained by matching the professional text against a professional-element list; the elements in this list should be selected to play an important role in expressing professional knowledge, including but not limited to measuring each word by its information gain value and adding words with high information gain to the list, as sketched below;
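The patent leaves the exact selection criterion open ("including but not limited to" information gain). The following sketch assumes the standard definition IG(w) = H(C) − H(C | w) computed over a small labeled corpus; every function and parameter name here is an illustrative assumption.

```python
import math
from collections import Counter

def entropy(labels: list[int]) -> float:
    """H(C) over a list of category labels."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def information_gain(word: str, docs: list[set[str]], labels: list[int]) -> float:
    """IG(word) = H(C) - H(C | word present/absent) over labeled documents."""
    with_w = [y for d, y in zip(docs, labels) if word in d]
    without_w = [y for d, y in zip(docs, labels) if word not in d]
    h_cond = sum(len(part) / len(labels) * entropy(part)
                 for part in (with_w, without_w) if part)
    return entropy(labels) - h_cond

def build_element_list(vocabulary: set[str], docs: list[set[str]],
                       labels: list[int], top_k: int = 500) -> list[str]:
    """Keep the top_k words by information gain as the professional-element list."""
    return sorted(vocabulary,
                  key=lambda w: information_gain(w, docs, labels),
                  reverse=True)[:top_k]
```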
S11-2, constructing the association relations among the professional elements to obtain the edge set and corresponding weights E. The association relations among the professional elements are constructed as follows:
Using formula (1), the cosine similarity is computed for every pair of professional elements and used as the edge weight e_ij. Cosine similarity is an algorithm for measuring the degree of similarity between two words in a text, taking the cosine of the angle between their vectors as the similarity metric. The pre-trained language model BERT encodes the examinee text X into the vector sequence (h_1, h_2, …, h_n), in which professional element v_i is encoded as the vector h_i and professional element v_j corresponds to the vector h_j. The edge weight e_ij is computed from the cosine of the angle between the two vectors h_i and h_j in the vector space:

e_{ij} = \cos(h_i, h_j) = \frac{h_i \cdot h_j}{\|h_i\| \, \|h_j\|}    (1)

After professional element extraction and association-relation construction are completed, the structured knowledge, i.e., the knowledge graph G_X, has been extracted from the examinee text X.
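A minimal sketch of the edge-weight construction of formula (1) with the Hugging Face transformers library. The checkpoint name is an assumption, and for simplicity the sketch assumes each professional element aligns with a single token of the encoded text; a real implementation would need span alignment and pooling over multi-token elements.

```python
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-chinese")    # assumed checkpoint
bert = AutoModel.from_pretrained("bert-base-chinese")

def edge_weights(text: str, elements: list[str]) -> dict[tuple[str, str], float]:
    """Formula (1): e_ij = cos(h_i, h_j) over BERT token vectors of the elements."""
    enc = tok(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        h = bert(**enc).last_hidden_state[0]                # (seq_len, hidden)
    tokens = tok.convert_ids_to_tokens(enc["input_ids"][0])
    vecs = {e: h[tokens.index(e)] for e in elements if e in tokens}
    weights = {}
    for i, vi in enumerate(elements):
        for vj in elements[i + 1:]:
            if vi in vecs and vj in vecs:
                weights[(vi, vj)] = torch.cosine_similarity(
                    vecs[vi], vecs[vj], dim=0).item()
    return weights
```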
S12, encoding the professional-text structured knowledge to obtain the professional-text knowledge features:
For the knowledge graph G_X extracted from a professional text, in order for the method to make further use of the structured knowledge, it must be converted into a form that is convenient to compute with in the inference process. The graph encoder is pre-trained in the manner of a self-encoder, encoding the high-dimensional G_X into a vector in a low-dimensional semantic space that represents the structured knowledge in the examinee text. A multi-layer convolutional neural network is adopted as the encoder in the self-encoder structure, and a multi-layer perceptron (MLP, Multilayer Perceptron) is adopted as the decoder;
S12-1, the convolutional neural network is a neural network model widely applied in image processing; through local receptive fields, shared weights, pooling and other techniques it can effectively extract local features of an image, reducing and compressing them at the same time. The structured knowledge extracted from the examinee text is expressed in the form of the adjacency matrix of a graph, which has a certain similarity to the conventional representation structure of an image, so a multi-layer convolutional neural network is adopted as the encoder;
An L-layer convolutional neural network serves as the knowledge graph encoder. The encoder first converts the knowledge graph G_X into a graph over a fixed node set, forming the adjacency matrix A; the node set of this graph is all the words in the professional lexicon, and its edges are consistent with the association relations of the knowledge graph G_X. The adjacency matrix A is input to the graph encoder, and after L rounds of convolution and max-pooling operations the encoding result M^{(L)} of the knowledge graph G_X is obtained. Finally, M^{(L)} is flattened and linearly transformed to obtain the encoding vector z_X of the knowledge graph G_X, as in formulas (2)-(5).
In formula (2), the initial input of the graph encoder is

M^{(0)} = A    (2)

The convolution performed by a convolution layer of the graph encoder on an input matrix is denoted Conv. The convolution kernel K of a convolution layer has size a × b; K_{\Delta p, \Delta q} is the value at row \Delta p, column \Delta q of the kernel. The layer receives a representation matrix M as input, and Conv(M)_{i,j} is the value at row i, column j of the representation matrix obtained after the convolution, with \Delta p the row offset and \Delta q the column offset:

Conv(M)_{i,j} = \sum_{\Delta p = 1}^{a} \sum_{\Delta q = 1}^{b} K_{\Delta p, \Delta q} \, M_{i + \Delta p, \, j + \Delta q}    (3)

In the l-th layer, with convolution kernel K^{(l)}, the convolution layer receives M^{(l-1)} as input, convolves it, and feeds the result into the pooling layer for pooling; MaxPool denotes the max-pooling operation, and M^{(l)} is the matrix obtained from the adjacency matrix A after l layers of convolution:

M^{(l)} = MaxPool(Conv(M^{(l-1)}))    (4)

When l < L, M^{(l)} continues into the (l+1)-th layer through formula (4), until the operations of all L layers are completed and M^{(L)} is obtained. M^{(L)} then undergoes the flattening operation Flatten, and a multi-layer perceptron MLP reduces the dimension of the encoding result, yielding the encoding vector z_X of the knowledge graph G_X:

z_X = MLP(Flatten(M^{(L)}))    (5)
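A sketch of the L-layer convolutional graph encoder of formulas (2)-(5) in PyTorch. The lexicon size, channel count, kernel size and output dimension are illustrative assumptions, not values fixed by the patent.

```python
import torch
import torch.nn as nn

class GraphEncoder(nn.Module):
    """Encode an N x N adjacency matrix A into a low-dimensional vector z (formulas 2-5)."""
    def __init__(self, n_nodes: int = 512, n_layers: int = 3, dim: int = 128):
        super().__init__()
        blocks, ch = [], 1
        for _ in range(n_layers):        # L rounds of convolution + max pooling (formula 4)
            blocks += [nn.Conv2d(ch, 8, kernel_size=3, padding=1), nn.ReLU(),
                       nn.MaxPool2d(2)]
            ch = 8
        self.conv = nn.Sequential(*blocks)
        side = n_nodes // (2 ** n_layers)
        self.mlp = nn.Linear(8 * side * side, dim)   # Flatten + linear reduction (formula 5)

    def forward(self, adj: torch.Tensor) -> torch.Tensor:
        m = self.conv(adj.unsqueeze(1))              # M^(0) = A  (formula 2)
        return self.mlp(m.flatten(start_dim=1))      # z = MLP(Flatten(M^(L)))

z = GraphEncoder()(torch.rand(1, 512, 512))          # z.shape == (1, 128)
```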
S12-2, training of the graph encoder: the adjacency matrix A of the knowledge graph G_X passes through the graph encoder to obtain the encoding vector z_X, and then through the decoder Dec, composed of a multi-layer perceptron, to obtain the reconstructed adjacency matrix \hat{A}. In order for the vector to fully represent the information in the structured knowledge, a decoder Dec structured as two linear transformation layers is designed following the self-encoder form; the decoder decodes the graph encoding vector z_X into the reconstructed adjacency matrix \hat{A}:

\hat{A} = Dec(z_X)    (6)

The mean squared error between the original adjacency matrix A and the reconstructed adjacency matrix \hat{A} is used as the loss function for training the graph encoder, where L_{AE} is the loss function of the graph encoder and N is the number of nodes in the knowledge graph G_X:

L_{AE} = \frac{1}{N^2} \sum_{i=1}^{N} \sum_{j=1}^{N} (A_{ij} - \hat{A}_{ij})^2    (7)
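A sketch of the self-encoder pre-training of formulas (6)-(7), reusing the GraphEncoder sketch above: a decoder built from two linear transformation layers reconstructs the adjacency matrix, and the mean squared error between A and \hat{A} trains encoder and decoder jointly. Dimensions, hidden width and learning rate are assumptions.

```python
import torch
import torch.nn as nn

n, dim = 512, 128
encoder = GraphEncoder(n_nodes=n, dim=dim)           # from the sketch above
decoder = nn.Sequential(nn.Linear(dim, 1024), nn.ReLU(),
                        nn.Linear(1024, n * n))      # Dec: two linear layers (formula 6)
opt = torch.optim.Adam([*encoder.parameters(), *decoder.parameters()], lr=1e-3)

def train_step(adj_batch: torch.Tensor) -> float:
    """One autoencoder step: L_AE = mean((A - A_hat)^2)  (formula 7)."""
    z = encoder(adj_batch)
    a_hat = decoder(z).view(-1, n, n)
    loss = nn.functional.mse_loss(a_hat, adj_batch)  # MSE over all N*N entries
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()
```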
According to a preferred embodiment of the present invention, step S2 specifically includes:
Through text semantic encoding of the given professional text, the various features in the text are comprehensively extracted and utilized, supplementing the features of the other elements in the text, providing more comprehensive text semantic features for inference and improving inference accuracy;
The text is comprehensively understood and processed by a deep learning model, which extracts the semantic information features in the text;
Preferably, the deep learning model is a large pre-trained language model with strong semantic feature extraction capability, such as BERT, used as an encoder to encode the semantics of the complete original text:
The given professional text refers to the examinee text X and the reference answer Y. The examinee text X and the reference answer Y are spliced and then semantically encoded to obtain the text semantic feature t of the examinee text:

t = BERT([X; Y])    (8)
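A sketch of the spliced encoding of formula (8). It takes the [CLS] vector of a BERT-style encoder over the concatenated pair as the semantic feature t; the checkpoint name and the choice of [CLS] pooling are assumptions.

```python
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-chinese")    # assumed checkpoint
bert = AutoModel.from_pretrained("bert-base-chinese")

def semantic_feature(examinee_text: str, reference_answer: str) -> torch.Tensor:
    """t = BERT([X; Y]): encode the spliced pair and take the [CLS] representation."""
    enc = tok(examinee_text, reference_answer, return_tensors="pt", truncation=True)
    with torch.no_grad():
        out = bert(**enc)
    return out.last_hidden_state[0, 0]                      # [CLS] vector as t
```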
According to a preferred embodiment of the present invention, step S3 specifically includes:
S31, performing professional text inference based on the professional-text knowledge features:
The knowledge graphs of the examinee text and of the reference answer are converted through S11 and S12 into the encoding vectors z_X and z_Y respectively. In order to consider both the difference and the similarity between the examinee text and the reference answer, the difference between features z_X − z_Y and the correlation between features z_X ⊙ z_Y are introduced when the encoded features are spliced; here ⊙ denotes the element-wise product of two vectors, which reflects the similarity and interaction between the vectors by multiplying their corresponding elements, and [ ; ] denotes the vector splicing operation. After splicing, the feature f used for inference is obtained:

f = [z_X ; z_Y ; z_X - z_Y ; z_X \odot z_Y]    (9)

f is input to the classifier F_k for knowledge-feature-based inference, obtaining the probability distribution p^k of the inference result:

p^k = softmax(F_k(f))    (10)

The loss function of the knowledge-feature-based classifier F_k is the cross-entropy:

L_k = - \sum_{c=1}^{|S|} y_c \log p^k_c    (11)

In formula (11), |S| is the number of inferred score categories; when the true label of the examinee text sample is category c, y_c = 1 and the other components of y are 0; p^k_c is the probability that the knowledge-based inference assigns the examinee text to category c;
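A sketch of formulas (9)-(11): the inference feature f splices z_X, z_Y, their difference and their element-wise product, and a linear classification head trained with cross-entropy produces p^k. The feature dimension and number of score categories are illustrative.

```python
import torch
import torch.nn as nn

class KnowledgeClassifier(nn.Module):
    """F_k of formulas (9)-(11): infer the score category from graph encodings."""
    def __init__(self, dim: int = 128, n_classes: int = 3):
        super().__init__()
        self.head = nn.Linear(4 * dim, n_classes)

    def forward(self, z_x: torch.Tensor, z_y: torch.Tensor) -> torch.Tensor:
        f = torch.cat([z_x, z_y, z_x - z_y, z_x * z_y], dim=-1)   # formula (9)
        return torch.log_softmax(self.head(f), dim=-1)            # log p^k (formula 10)

f_k = KnowledgeClassifier()
log_pk = f_k(torch.rand(4, 128), torch.rand(4, 128))
loss_k = nn.functional.nll_loss(log_pk, torch.tensor([0, 2, 1, 0]))  # formula (11)
```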
S32, performing professional text inference based on the text semantic features:
The text semantic feature t is input to the classifier F_t based on text semantic features, obtaining the probability distribution p^t of the inference result:

p^t = softmax(F_t(t))    (12)

The loss function of the text-semantic classifier F_t is:

L_t = - \sum_{c=1}^{|S|} y_c \log p^t_c    (13)

where |S| is the number of inferred score categories; when the true label of the sample is category c, y_c = 1 and the other components of y are 0; p^t_c is the probability that the text-semantics-based inference assigns the examinee text to category c;
S33, fusing the professional-text knowledge features and the text semantic features for interpretable inference:
S331, in the fusion training stage, in order to realize fused inference over the professional-text knowledge features and the text semantic features, the consistency loss function L_con is introduced to train the professional text inference model. A consistency loss is typically used to ensure consistency between different representations and to facilitate the flow of information between them; here it helps fuse the structured knowledge and the text semantics effectively and make maximal use of the information in both. A KL-divergence constraint pushes the inference-result probability distributions of the knowledge-feature classifier and the text-semantic classifier close to each other:

L_{con} = KL(p^k \| p^t)    (14)

In formula (14), L_con is the consistency loss function.
The consistency loss is fused into the loss function, and the professional text inference model built according to S1-S2 is trained:

L = L_k + L_t + \lambda L_{con}    (15)

\hat{s}_k = \arg\max_{c} p^k_c    (16)

\hat{s}_t = \arg\max_{c} p^t_c    (17)

In formulas (15)-(17), L is the loss function for training the professional text inference model according to S1-S2 and λ is a hyperparameter. Through formulas (10) and (12), the inference result ŝ_k based on the professional-text knowledge features and the inference result ŝ_t based on the text semantic features are obtained. The consistency loss cannot guarantee that the inference results ŝ_k and ŝ_t are ultimately identical; in the actual scenario, the inference result ŝ_k based on the professional-text knowledge features is taken as the final result.
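A sketch of the joint objective of formulas (11)-(15): the KL divergence between the two classifiers' distributions is added to the two cross-entropy losses with weight λ. The default value of λ is an assumption; the patent only identifies it as a hyperparameter.

```python
import torch
import torch.nn.functional as F

def joint_loss(log_pk: torch.Tensor, log_pt: torch.Tensor,
               labels: torch.Tensor, lam: float = 0.5) -> torch.Tensor:
    """L = L_k + L_t + lam * KL(p^k || p^t)   (formulas 14-15)."""
    loss_k = F.nll_loss(log_pk, labels)                     # formula (11)
    loss_t = F.nll_loss(log_pt, labels)                     # formula (13)
    # kl_div(input=log q, target=p) computes sum p * (log p - log q) = KL(p^k || p^t)
    l_con = F.kl_div(log_pt, log_pk.exp(), reduction="batchmean")
    return loss_k + loss_t + lam * l_con
```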
In the professional text inference fusing structured knowledge and text semantics, the knowledge graph serves as prior, structured knowledge that guides the model's understanding of and reasoning over the professional text. As a structured and understandable form of knowledge representation, the knowledge graph is more intuitive than implicit vectors and similar representations, and can therefore be used to interpret the model's predictions: the interpretability of the model is improved, for example, by analyzing the nodes and edges of the knowledge graph that relate to the predicted result. The model outputs the inference result together with the knowledge graph extracted from the examinee text as the explanation of that result. Because the knowledge graph is readable, the examinee text can be analyzed against the scoring standard in terms of the text content reflected in the graph.
The system for realizing the inference method comprises an examinee text processing path, a reference answer processing path, a fusion inference module and an output module;
the examinee text processing path comprises a knowledge graph construction module and a structured knowledge coding module;
the reference answer processing path comprises a text preprocessing module and a text semantic coding module;
the output module is used for outputting the knowledge graph and the inferred result.
The technical advantages of the invention include:
1. Compared with existing subjective-question review methods that use a neural network to fit the similarity between the reference answer and the examinee text, the method can explicitly display the professional knowledge in the text and use it for the subsequent text inference. The knowledge contained in the text is its key information and an important basis for evaluating the examinee text and determining the score. The explicit knowledge representation provides a degree of interpretability for the inference result and meets the requirements of actual application scenarios.
2. The invention meets the real-scenario requirement for a lightweight model: the model has few parameters and runs fast. Actual professional examination review must be carried out in a physically closed environment to ensure the confidentiality and fairness of the review, and the model is easy to deploy in such a closed review scenario.
3. The invention can improve the performance of subjective-question review tasks. The method still achieves good inference performance with a small amount of labeled data, and is suitable for practical application scenarios where experts provide few pre-reviewed samples.
Drawings
FIG. 1 is a flow chart of the inference method of the present invention;
FIG. 2 is the knowledge graph of the reference answer in an application scenario of the present invention;
FIG. 3 is the knowledge graph of the examinee text scored 2 points in an application scenario of the present invention;
FIG. 4 is the knowledge graph of the examinee text scored 1 point in an application scenario of the present invention;
FIG. 5 is the knowledge graph of the examinee text scored 0 points in an application scenario of the present invention.
Detailed Description
The present invention will be described in detail with reference to examples and drawings, but is not limited thereto.
Embodiment 1
As shown in FIG. 1, a professional text inference method fusing structured knowledge and text semantics is carried out according to steps S1-S3 exactly as described in the technical scheme above: the structured knowledge features are constructed and encoded (S1, comprising S11 and S12), the text semantic features are constructed (S2), and the consistency loss is introduced to fuse the two kinds of features for interpretable inference (S3), with the details, formulas (1)-(17) and preferred embodiments as given above.
Embodiment 2
A system for implementing the inference method of embodiment 1, comprising an examinee text processing path, a reference answer processing path, a fusion inference module, and an output module;
the examinee text processing path comprises a knowledge graph construction module and a structured knowledge coding module;
the reference answer processing path comprises a text preprocessing module and a text semantic coding module;
the output module is used for outputting the knowledge graph and the inferred result.
Embodiment 1 and Embodiment 2 above were applied to real data from a national-level professional qualification examination. As an application scenario, one subjective question is taken as an example, as shown in Table 1; the model outputs the inference result and the structured knowledge, and the structured knowledge is used to explain the inference result:
Table 1 Examinee text inference example
The nodes in FIG. 2, FIG. 3, FIG. 4 and FIG. 5 represent the professional elements contained in the corresponding professional text; an edge between two professional elements indicates an association between them in the text, and an edge with a higher weight indicates a possibly closer association between the professional elements.
When this question is reviewed manually, an examinee who expresses the judgment that Wang Mou's defense is "not established" earns 1 point. For the second scoring point, the examinee should answer with the professional knowledge that "a general partner who converts into a limited partner shall bear unlimited liability for the debts incurred by the partnership enterprise during the period in which he was a general partner". This point is earned by expressing similar semantics with professional elements such as "general partner", "limited partner", "debt", "bear" and "unlimited liability".
The knowledge graph of the examinee text scored 2 points is shown in FIG. 3. The nodes "Wang Mou", "defense" and "not established" in the knowledge graph are pairwise connected with relatively high weights, indicating that the examinee correctly judged that Wang Mou's defense is not established. The association relations among the nodes "need", "debt", "bear" and "unlimited liability" in FIG. 3 show that the examinee text analyzed, and made a correct determination of, whether Wang Mou needs to bear unlimited liability. Comparing FIG. 2 with FIG. 3, the repetition rate of professional elements is high and the structures of the two knowledge graphs are very similar, which explains that the examinee expressed the same knowledge as the reference answer; it is therefore reasonable for the model to infer 2 points for this examinee text.
The knowledge graph of the examinee text scored 1 point is shown in FIG. 4. This examinee also made the correct judgment that Wang Mou's defense is "not established". However, the knowledge graph lacks the professional element "unlimited liability", which plays a key role in the inference; the text only expresses the semantics of "should bear liability for the debt" and does not analyze the second scoring point, so it is reasonable for the model to infer 1 point for this examinee text.
The knowledge graph of the examinee text scored 0 points is shown in FIG. 5. In this knowledge graph, the three nodes "Wang Mou", "defense" and "established" are pairwise connected with relatively high weights, indicating that the examinee made the erroneous judgment that Wang Mou's defense is "established". From professional elements in the figure such as "no need", "debt" and "unlimited liability", it can be seen that the examinee considered that Wang Mou does not need to bear unlimited liability for the debt, so it is reasonable for the model to infer 0 points for this examinee text.
Further, in order to verify the technical advantages of the invention, a validity verification was performed for this application scenario, covering 8 data sets in total. For each subjective question, all data are divided into a training set, a validation set and a test set in the proportions 70%, 20% and 10% of the data set. To verify the performance of the model with different training set sizes, training sets amounting to 5% and 1% of the data set were also constructed, simulating the small number of review samples provided by review experts in real scenarios. The data amounts are shown in the following table:
Table 2 Data volumes of the training, validation and test sets at different proportions
Accuracy is used as the overall evaluation index; it is calculated as:

Acc = \frac{TP + TN}{TP + TN + FP + FN}    (18)

In formula (18), Acc is the accuracy used as the overall evaluation index; TP (True Positive) is the number of samples the classifier predicts as positive that are actually positive, i.e., correctly identified positive samples; FP (False Positive) is the number of samples predicted as positive that are actually negative, i.e., false-alarm negative samples; TN (True Negative) is the number of samples predicted as negative that are actually negative, i.e., correctly identified negative samples; FN (False Negative) is the number of samples predicted as negative that are actually positive, i.e., missed positive samples.
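A minimal sketch of the accuracy computation of formula (18); over predicted and gold categories it reduces to the fraction of correct predictions.

```python
def accuracy(pred: list[int], gold: list[int]) -> float:
    """Acc = (TP + TN) / (TP + TN + FP + FN), i.e. correct / total (formula 18)."""
    return sum(p == g for p, g in zip(pred, gold)) / len(gold)

assert accuracy([2, 1, 0, 2], [2, 0, 0, 2]) == 0.75
```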
The selected comparison methods are:
BERT: the examinee text is encoded with the pre-trained language model BERT and a classifier then reviews the examinee answer. Jacob Devlin, Ming-Wei Chang, Kenton Lee, et al. BERT: Pre-training of deep bidirectional transformers for language understanding [C]. The 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT), Minneapolis, MN, USA, 2019: 4171-4186.
RoBERTa: the examinee text is encoded with the pre-trained language model RoBERTa and a classifier then reviews the examinee answer. Yinhan Liu, Myle Ott, Naman Goyal, et al. RoBERTa: A robustly optimized BERT pretraining approach [J]. arXiv preprint arXiv:1907.11692, 2019.
LR+: to improve the performance of the general-domain pre-trained BERT model on professional text inference, the model is fine-tuned using textbooks as the corpus; the fine-tuned BERT model encodes the examinee answer and the reference answer, and a classifier predicts the score. Chul Sung, Tejas Dhamecha, Swarnadeep Saha, et al. Pre-training BERT on domain resources for short answer grading [C]. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China, 2019: 6071-6075.
Conv-GRNN: the model first uses a convolutional neural network to encode the sentences of the examinee text at the word level, then generates the document vector of the examinee text at the sentence level through a GRU, and finally classifies the encoded document vector. Duyu Tang, Bing Qin, Ting Liu. Document modeling with gated recurrent neural network for sentiment classification [C]. Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, Lisbon, Portugal, 2015: 1422-1432.
KnowSTI: the model formed by the method of the present invention.
The performance of each method is shown in the following tables:
Table 3 Accuracy of the models on the professional subjective-question data sets (70% training set)
Table 4 Accuracy of the models on the professional subjective-question data sets (5% training set)
Table 5 Accuracy of the models on the professional subjective-question data sets (1% training set)
As can be seen from Table 3, the model formed by the method of the present invention achieves the best results in most cases, demonstrating the effectiveness of the method. Considering that in an actual subjective-question review scenario the samples available for model learning on each question come mainly from experts' pre-review and are limited in number, the robustness of the model over training sets of different sizes was verified by comparing against the baselines on training sets of different sizes. The accuracy of the model exceeds 95% with the 70% training set, and remains essentially above 90% with the 1% and 5% training sets. As shown in Table 4, with a 5% training set the model achieves more than 95% accuracy on data sets I, III and VI, which demonstrates that the model can be trained adequately with limited data. As shown in Tables 4 and 5, as the amount of training data decreases, every model is affected to some degree and the accuracy differences between models gradually narrow, but the model of the method of the present invention remains in the lead in most cases. These experimental results show that the method has the advantage of performance robustness.
Claims (2)
1. A professional text inference method fusing structured knowledge and text semantics, comprising:
S1: constructing structured knowledge features: for a given professional text, constructing the knowledge graph of the knowledge in the professional text using a professional-text-oriented knowledge graph construction method to obtain structured knowledge; encoding the structured knowledge with a knowledge graph encoder to obtain the professional-text knowledge features;
S2: constructing text semantic features: encoding the given professional text with a deep learning model to obtain the text semantic features;
S3: fused interpretable inference, forming a professional text inference model fusing structured knowledge and text semantics;
Step S1 specifically includes:
S11, constructing the professional-text-oriented knowledge graph to obtain the professional-text structured knowledge;
S11-1, extracting the professional elements in the professional text to obtain the node set V of the professional-text knowledge graph, namely obtaining the professional elements in the text by matching the professional text against the professional-element list;
S11-2, constructing the association relations among the professional elements to obtain the edge set and corresponding weights E; the association relations among the professional elements are constructed as follows:
Using formula (1), the cosine similarity is computed for every pair of professional elements and used as the edge weight e_ij. The pre-trained language model BERT encodes the examinee text X into the vector sequence (h_1, h_2, …, h_n), in which professional element v_i is encoded as the vector h_i and professional element v_j corresponds to the vector h_j; the edge weight e_ij is computed from the cosine of the angle between the two vectors h_i and h_j in the vector space:

e_{ij} = \cos(h_i, h_j) = \frac{h_i \cdot h_j}{\|h_i\| \, \|h_j\|}    (1)

After professional element extraction and association-relation construction are completed, the structured knowledge, i.e., the knowledge graph G_X, has been extracted from the examinee text X;
S12, encoding the professional-text structured knowledge to obtain the professional-text knowledge features:
For the knowledge graph G_X extracted from the professional text, the graph encoder is pre-trained in the manner of a self-encoder, with a multi-layer convolutional neural network as the encoder in the self-encoder structure and a multi-layer perceptron MLP as the decoder;
S12-1, a multi-layer convolutional neural network is adopted as the encoder;
An L-layer convolutional neural network serves as the knowledge graph encoder: the encoder first converts the knowledge graph G_X into a graph over a fixed node set, forming the adjacency matrix A; the node set of this graph is all the words in the professional lexicon, and its edges are consistent with the association relations of the knowledge graph G_X; the adjacency matrix A is input to the graph encoder, and after L rounds of convolution and max-pooling operations the encoding result M^{(L)} of the knowledge graph G_X is obtained; finally, M^{(L)} is flattened and linearly transformed to obtain the encoding vector z_X of the knowledge graph G_X, as in formulas (2)-(5):

M^{(0)} = A    (2)

In formula (2), M^{(0)} is the initial input of the graph encoder;
The convolution performed by a convolution layer of the graph encoder on an input matrix is denoted Conv; the convolution kernel K of a convolution layer has size a × b; K_{\Delta p, \Delta q} is the value at row \Delta p, column \Delta q of the kernel; the layer receives a representation matrix M as input, Conv(M)_{i,j} is the value at row i, column j of the representation matrix obtained after the convolution, \Delta p is the row offset and \Delta q is the column offset:

Conv(M)_{i,j} = \sum_{\Delta p = 1}^{a} \sum_{\Delta q = 1}^{b} K_{\Delta p, \Delta q} \, M_{i + \Delta p, \, j + \Delta q}    (3)

In the l-th layer, with convolution kernel K^{(l)}, the convolution layer receives M^{(l-1)} as input, convolves it and feeds the result into the pooling layer for pooling, where MaxPool denotes the max-pooling operation and M^{(l)} is the matrix obtained from the adjacency matrix A after l layers of convolution:

M^{(l)} = MaxPool(Conv(M^{(l-1)}))    (4)

When l < L, M^{(l)} continues into the (l+1)-th layer through formula (4), until the operations of all L layers are completed and M^{(L)} is obtained; M^{(L)} undergoes the flattening operation Flatten, and a multi-layer perceptron MLP reduces the dimension of the encoding result to obtain the encoding vector z_X of the knowledge graph G_X:

z_X = MLP(Flatten(M^{(L)}))    (5)

S12-2, training of the graph encoder: the adjacency matrix A of the knowledge graph G_X passes through the graph encoder to obtain the encoding vector z_X, and then through the decoder Dec, composed of a multi-layer perceptron, to obtain the reconstructed adjacency matrix \hat{A}; a decoder Dec structured as two linear transformation layers is designed, and it decodes the graph encoding vector z_X into the reconstructed adjacency matrix \hat{A}:

\hat{A} = Dec(z_X)    (6)

The mean squared error between the original adjacency matrix A and the reconstructed adjacency matrix \hat{A} is used as the loss function for training the graph encoder, where L_{AE} is the loss function of the graph encoder and N is the number of nodes in the knowledge graph G_X:

L_{AE} = \frac{1}{N^2} \sum_{i=1}^{N} \sum_{j=1}^{N} (A_{ij} - \hat{A}_{ij})^2    (7)
The step S2 specifically comprises the following steps:
performing text semantic coding on the given professional text;
comprehensively understanding and processing the text through the deep learning model, and extracting semantic information features in the text;
the deep learning model is a large pre-trained language model, used as the encoder to encode the semantics of the complete original text:
The given professional text refers to the examinee text $X$ and the reference answer $Y$. The examinee text $X$ and the reference answer $Y$ are spliced together, and text semantic coding is performed on the spliced sequence to obtain the text semantic feature $t$ of the examinee text:

$$t=\mathrm{Encoder}\bigl([X;Y]\bigr)\qquad(8)$$
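A sketch of this splice-and-encode step (formula (8)) is given below; taking the [CLS] vector as $t$ and the bert-base-chinese checkpoint are assumptions of the sketch.

```python
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
bert = BertModel.from_pretrained("bert-base-chinese")

def semantic_feature(examinee_text: str, reference_answer: str) -> torch.Tensor:
    """Encode the spliced pair (X, Y) and return the text semantic feature t."""
    inputs = tokenizer(examinee_text, reference_answer,
                       return_tensors="pt", truncation=True)
    with torch.no_grad():
        out = bert(**inputs)
    return out.last_hidden_state[:, 0, :].squeeze(0)   # [CLS] vector as t
```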
The step S3 specifically comprises the following steps:
S31, performing professional text inference based on the professional text knowledge features:
The knowledge graphs of the examinee text and of the reference answer are converted through steps S11 and S12 into the vector encodings $z_X$ and $z_Y$, respectively. To take into account both the difference and the similarity between the examinee text and the reference answer, the difference between features, $d=z_X-z_Y$, and the correlation between features, $c=z_X\otimes z_Y$, are introduced when the vector encoding features are concatenated, where $\otimes$ denotes the outer product of the two vectors and $[\,\cdot\,;\,\cdot\,]$ denotes vector concatenation; the feature $r$ used for inference is obtained after concatenation:

$$r=[\,z_X;\,z_Y;\,d;\,c\,]\qquad(9)$$
The feature $r$ is input to the classifier $f_k$ based on knowledge feature inference, obtaining the probability distribution of the inference result $p_k$:

$$p_k=\mathrm{softmax}\bigl(f_k(r)\bigr)\qquad(10)$$
The loss function of the classifier $f_k$ based on knowledge feature inference is:

$$\mathcal{L}_k=-\sum_{c=1}^{|S|}y_c\log p_k^{(c)}\qquad(11)$$
In formula (11), $|S|$ is the number of inferred score categories; when the true label of the examinee text sample is category $c$, $y_c=1$ and all other entries are 0; $p_k^{(c)}$ is the probability that the examinee text is inferred as category $c$ in the knowledge-based inference.
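A minimal sketch of the knowledge branch in formulas (9)-(11); the dimensions, the single-linear-layer classifier, and $|S|=5$ score categories are illustrative assumptions.

```python
import torch
import torch.nn as nn

def inference_feature(z_x: torch.Tensor, z_y: torch.Tensor) -> torch.Tensor:
    d = z_x - z_y                           # difference between features
    c = torch.outer(z_x, z_y).flatten()     # correlation: flattened outer product
    return torch.cat([z_x, z_y, d, c])      # r = [z_X; z_Y; d; c], formula (9)

enc_dim, num_classes = 128, 5
z_x, z_y = torch.randn(enc_dim), torch.randn(enc_dim)
r = inference_feature(z_x, z_y)             # dim = 3*enc_dim + enc_dim**2

f_k = nn.Linear(r.numel(), num_classes)     # classifier on knowledge features
logits_k = f_k(r)
p_k = logits_k.softmax(dim=-1)              # probability distribution, formula (10)

label = torch.tensor(2)                     # true score category of this sample
loss_k = nn.functional.cross_entropy(logits_k.unsqueeze(0),
                                     label.unsqueeze(0))   # L_k, formula (11)
```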
S32, performing professional text inference based on the text semantic features:
The text semantic feature $t$ is input to the classifier $f_t$ based on text semantic features, obtaining the probability distribution of the inference result $p_t$:

$$p_t=\mathrm{softmax}\bigl(f_t(t)\bigr)\qquad(12)$$
The loss function of the classifier $f_t$ based on text semantic features is:

$$\mathcal{L}_t=-\sum_{c=1}^{|S|}y_c\log p_t^{(c)}\qquad(13)$$
where $|S|$ is the number of inferred score categories; when the true label of the sample is category $c$, $y_c=1$ and all other entries are 0; $p_t^{(c)}$ is the probability that the examinee text is inferred as category $c$ in the text-semantics-based inference;
S33, fusing the professional text knowledge features and the text semantic features for interpretable inference:
S331, in the fusion training stage, a KL divergence is used to constrain the probability distribution of the inference result of the classifier based on knowledge feature inference and that of the classifier based on text semantic features to be close to each other:

$$\mathcal{L}_{con}=\mathrm{KL}\bigl(p_k\,\Vert\,p_t\bigr)\qquad(14)$$
In formula (14), $\mathcal{L}_{con}$ is the consistency loss function;
the consistency loss is fused into the overall loss function, and the professional text inference model is trained according to S1-S2:

$$\mathcal{L}=\mathcal{L}_k+\mathcal{L}_t+\lambda\,\mathcal{L}_{con}\qquad(15)$$
In formulas (15)-(17), $\mathcal{L}$ is the loss function of the professional text inference model trained according to S1-S2, and $\lambda$ is a hyper-parameter. The inference result based on the professional text knowledge features, $\hat{y}_k$, and the inference result based on the text semantic features, $\hat{y}_t$, are obtained by formulas (16) and (17):

$$\hat{y}_k=\arg\max_{c}\,p_k^{(c)}\qquad(16)$$

$$\hat{y}_t=\arg\max_{c}\,p_t^{(c)}\qquad(17)$$

The inference result based on the professional text knowledge features, $\hat{y}_k$, is taken as the final result.
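A sketch of the semantic branch and the fusion objective in formulas (12)-(17); the batch size, feature dimensions, and $\lambda=0.1$ are assumed values for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

batch, num_classes = 16, 5
t = torch.randn(batch, 768)                    # stand-in semantic features, formula (8)
r = torch.randn(batch, 512)                    # stand-in knowledge features, formula (9)
labels = torch.randint(0, num_classes, (batch,))

f_k, f_t = nn.Linear(512, num_classes), nn.Linear(768, num_classes)
logits_k, logits_t = f_k(r), f_t(t)
p_k = logits_k.softmax(dim=-1)                 # formula (10)
p_t = logits_t.softmax(dim=-1)                 # formula (12)

loss_k = F.cross_entropy(logits_k, labels)     # L_k, formula (11)
loss_t = F.cross_entropy(logits_t, labels)     # L_t, formula (13)

# Consistency loss, formula (14): F.kl_div takes log-probabilities as its first
# argument and target probabilities as its second, so this computes KL(p_k || p_t).
loss_con = F.kl_div(p_t.log(), p_k, reduction="batchmean")

lam = 0.1                                      # hyper-parameter (assumed value)
loss = loss_k + loss_t + lam * loss_con        # joint training loss, formula (15)

# Branch-wise inference results, formulas (16)-(17); the knowledge-feature
# result is taken as the final output.
y_hat_k = p_k.argmax(dim=-1)
y_hat_t = p_t.argmax(dim=-1)
final_result = y_hat_k
```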
2. A system for implementing the inference method of claim 1, comprising an examinee text processing path, a reference answer processing path, a fusion inference module, and an output module;
the examinee text processing path comprises a knowledge graph construction module and a structured knowledge coding module;
the reference answer processing path comprises a text preprocessing module and a text semantic coding module;
the output module is used for outputting the knowledge graph and the inferred result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311146912.5A CN116882502B (en) | 2023-09-07 | 2023-09-07 | Professional text inference method and system integrating structured knowledge and text semantics |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116882502A CN116882502A (en) | 2023-10-13 |
CN116882502B true CN116882502B (en) | 2023-11-28 |
Family
ID=88259108
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311146912.5A Active CN116882502B (en) | 2023-09-07 | 2023-09-07 | Professional text inference method and system integrating structured knowledge and text semantics |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116882502B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117932039B (en) * | 2024-03-21 | 2024-08-06 | 山东大学 | Interpretable text review method and system based on heuristic question-answer reasoning |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108009285A (en) * | 2017-12-22 | 2018-05-08 | 重庆邮电大学 | Forest Ecology man-machine interaction method based on natural language processing |
CN108304911A (en) * | 2018-01-09 | 2018-07-20 | 中国科学院自动化研究所 | Knowledge Extraction Method and system based on Memory Neural Networks and equipment |
US11194972B1 (en) * | 2021-02-19 | 2021-12-07 | Institute Of Automation, Chinese Academy Of Sciences | Semantic sentiment analysis method fusing in-depth features and time sequence models |
CN114880434A (en) * | 2022-05-24 | 2022-08-09 | 昆明理工大学 | Knowledge graph information guidance-based chapter-level event role identification method |
CN114969278A (en) * | 2022-03-18 | 2022-08-30 | 华东师范大学 | Knowledge enhancement graph neural network-based text question-answering model |
CN116882494A (en) * | 2023-09-07 | 2023-10-13 | 山东山大鸥玛软件股份有限公司 | Method and device for establishing non-supervision knowledge graph oriented to professional text |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11715487B2 (en) * | 2021-03-31 | 2023-08-01 | Accenture Global Solutions Limited | Utilizing machine learning models to provide cognitive speaker fractionalization with empathy recognition |
Non-Patent Citations (1)
Title |
---|
Research on Chinese Terminology Extraction Based on the BERT-Embedded BiLSTM-CRF Model; Wu Jun; Cheng Yao; Hao Han; Ailiyaer Aizezi; Liu Feixue; Su Yipo; Journal of the China Society for Scientific and Technical Information (No. 04); 69-78 *
Also Published As
Publication number | Publication date |
---|---|
CN116882502A (en) | 2023-10-13 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||