CN116340543A - Knowledge graph construction method and system for mathematical theorem-oriented adaptive derivation - Google Patents
Knowledge graph construction method and system for mathematical theorem-oriented adaptive derivation
- Publication number
- CN116340543A (CN202310344991.4A)
- Authority
- CN
- China
- Prior art keywords
- knowledge graph
- representation
- entity
- relation
- embedded
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/295—Named entity recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Artificial Intelligence (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- General Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Animal Behavior & Ethology (AREA)
- Devices For Executing Special Programs (AREA)
Abstract
The invention relates to a knowledge graph construction method and system for the adaptive derivation of mathematical theorems. The method comprises: parsing acquired text data, extracting the entities and relations in the text data, and obtaining head entity embedded representations, tail entity embedded representations and relation embedded representations; constructing candidate triples from the head entity embedded representations, the tail entity embedded representations and the feature vector corresponding to each relation of the calculus knowledge graph; and calculating the energy values of all candidate triples with the trained knowledge graph reasoning model, and outputting the mathematical conclusion corresponding to the candidate triple with the optimal energy value as the answer. Parsing the text data mines the deep semantic interaction between entities and relations in the mathematical knowledge graph. Constructing candidate triples, calculating their energy values, and outputting the conclusion of the triple with the optimal energy value then yields the optimal answer and can improve the efficiency of learning mathematics.
Description
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a knowledge graph construction method and system for self-adaptive deduction of mathematical theorem.
Background
Calculus is an important foundational course and a prerequisite for management majors at institutions of higher education. It lies at the intersection of economics and mathematics and is widely applied in economics.
The principal task of the calculus course is to give students a basic knowledge of calculus, cultivate their basic computational ability, strengthen their preliminary ability to handle economic problems by combining qualitative and quantitative methods, and lay the mathematical foundation needed for subsequent courses and for acquiring further mathematical knowledge. However, the basic concepts and theory of functions, limits and continuity, and the differentiation and integration of functions are often obscure, and students who are new to calculus often have difficulty grasping the logical relationships between its concepts. How to build a calculus knowledge graph that helps students learn and master calculus has therefore become a technical problem to be solved urgently.
Disclosure of Invention
The invention aims to solve the technical problem of providing a knowledge graph construction method and a system for self-adaptive deduction of mathematical theorem aiming at the defects of the prior art.
The technical scheme for solving the technical problems is as follows: a knowledge graph construction method for self-adaptive deduction of mathematical theorem comprises the following steps:
parsing the acquired text data, extracting the entities and relations in the text data, and obtaining the embedded representation $\mathbf{h}_i$ of each head entity $h_i$, the embedded representation $\mathbf{t}_i$ of each tail entity $t_i$, and the embedded representation $\mathbf{r}_i$ of each relation $r_i$ in the text data;
combining the head entity embedded representation $\mathbf{h}_i$ and the tail entity embedded representation $\mathbf{t}_j$ with the feature vector $\mathbf{r}_u$ corresponding to each relation $r_u$ of the calculus knowledge graph to construct the vector representation $(\mathbf{h}_i, \mathbf{r}_u, \mathbf{t}_j)$ of each candidate triple $(h_i, r_u, t_j)$;
inputting the vector representations $(\mathbf{h}_i, \mathbf{r}_u, \mathbf{t}_j)$ of all candidate triples $(h_i, r_u, t_j)$ into the trained knowledge graph reasoning model, calculating the energy values of all candidate triples, and outputting the mathematical conclusion corresponding to the candidate triple $(h_i, r_u, t_j)$ with the optimal energy value as the answer;
wherein the energy value characterizes the probability that the head entity $h_i$ and the tail entity $t_j$ have the relation $r_u$.
The beneficial effects of the invention are as follows: the knowledge graph construction method for the adaptive derivation of mathematical theorems parses the text data and mines the deep semantic interaction between entities and relations in the mathematical knowledge graph to obtain the head entity representation $\mathbf{h}_i$, the tail entity representation $\mathbf{t}_j$ and the relation representation $\mathbf{r}_i$. These are then recombined to construct candidate triples $(h_i, r_u, t_j)$, whose energy values are calculated by the trained knowledge graph reasoning model. A reasoning path through the mathematical knowledge graph is established by means of TransR, so that the mathematical conclusion corresponding to the candidate triple $(h_i, r_u, t_j)$ with the optimal energy value is output as the answer. The optimal answer is thereby located, which can improve the efficiency of learning mathematics.
Based on the technical scheme, the invention can also be improved as follows:
further: the analyzing the acquired text data specifically comprises the following steps:
converting the text data into character vectors;
extracting the characteristics of the converted character vector;
and outputting labeling results of the entities and the relations in the text data according to the extracted features.
The beneficial effects of the above further scheme are: the text data is converted into character vectors by a pre-trained language model, features are extracted from the converted character vectors, and the extracted features are then labeled. The labeling results for the entities and relations in the text data are thus obtained accurately, and with them the embedded representation $\mathbf{h}_i$ of the head entity $h_i$, the embedded representation $\mathbf{t}_i$ of the tail entity $t_i$, and the embedded representation $\mathbf{r}_i$ of the relation $r_i$.
Further: the training of the knowledge graph reasoning model specifically comprises the following steps:
acquiring a pre-input entity set and a pre-input relation set of the calculus knowledge graph, and converting the entities and relations in the calculus knowledge graph into embedded representations with a TransE model, wherein the entity embedded representations in the calculus knowledge graph are denoted $\mathbf{h}_i$, $\mathbf{t}_i$ and the relation embedded representations are denoted $\mathbf{r}'_u$;
classifying the entity pairs $(h, t)$ in the calculus knowledge graph according to the entity embedded representations $\mathbf{h}_i$, $\mathbf{t}_i$, wherein the entity pairs in each class share the same relation $r$;
embedding the relations $r$ into a d-dimensional space using CTransR, and learning a relation vector $\mathbf{r}_c$ and a mapping matrix $\mathbf{M}_r$ for each relation $r$.
The beneficial effects of the above further scheme are: the entities and relations in the calculus knowledge graph are converted into embedded representations by a TransE model, and the entity pairs $(h, t)$ are classified so that the pairs in each class share the same relation $r$. Each relation $r$ is then embedded into a d-dimensional space by CTransR, and a relation vector $\mathbf{r}_c$ and a mapping matrix $\mathbf{M}_r$ are learned for it, so that the energy values of the candidate triples can be calculated accurately from $\mathbf{r}_c$ and $\mathbf{M}_r$.
Further: calculating the energy values of all the candidate triples specifically comprises:
$$\mathbf{h}_{r,c} = \mathbf{h}\mathbf{M}_r$$
$$\mathbf{t}_{r,c} = \mathbf{t}\mathbf{M}_r$$
wherein $\mathbf{h}$ is the embedded representation of the head entity $h$, $\mathbf{t}$ is the embedded representation of the tail entity $t$, $\mathbf{r}_c$ is the embedded representation of the relation $r$, and $\alpha$ is a constant.
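The energy function that these two projections feed into is not reproduced in this text. As a hedged reconstruction, assuming the method uses the standard CTransR score (an assumption, though it is consistent with the symbols $\mathbf{r}_c$, $\mathbf{M}_r$ and $\alpha$ defined here), it would take the following form, where $\mathbf{r}$ denotes the base vector of relation $r$:

```latex
f_r(h, t) = \left\lVert \mathbf{h}_{r,c} + \mathbf{r}_c - \mathbf{t}_{r,c} \right\rVert_2^2
          + \alpha \left\lVert \mathbf{r}_c - \mathbf{r} \right\rVert_2^2
```

Under this assumed form, the first term measures how well the projected head plus the cluster-specific relation vector lands on the projected tail, and the second term keeps $\mathbf{r}_c$ close to $\mathbf{r}$. A smaller value then indicates a more plausible triple.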
The beneficial effects of the above further scheme are: using the relation vector $\mathbf{r}_c$ and the mapping matrix $\mathbf{M}_r$ of each relation $r$ together with the energy function, the energy values of all candidate triples can be calculated accurately. This makes it easy to select the candidate triple $(h_i, r_u, t_j)$ with the optimal energy value and thereby obtain a new mathematical conclusion.
The invention also provides a knowledge graph construction system for the self-adaptive deduction of the mathematical theorem, which comprises an analysis module, a construction module and a calculation module;
the analysis module is used for parsing the acquired text data, extracting the entities and relations in the text data, and obtaining the embedded representation $\mathbf{h}_i$ of each head entity $h_i$, the embedded representation $\mathbf{t}_i$ of each tail entity $t_i$, and the embedded representation $\mathbf{r}_i$ of each relation $r_i$ in the text data;
the construction module is used for combining the head entity embedded representation $\mathbf{h}_i$ and the tail entity embedded representation $\mathbf{t}_j$ with the feature vector $\mathbf{r}_u$ corresponding to each relation $r_u$ of the calculus knowledge graph to construct the vector representation $(\mathbf{h}_i, \mathbf{r}_u, \mathbf{t}_j)$ of each candidate triple $(h_i, r_u, t_j)$;
the calculation module is used for inputting the vector representations $(\mathbf{h}_i, \mathbf{r}_u, \mathbf{t}_j)$ of all candidate triples $(h_i, r_u, t_j)$ into the trained knowledge graph reasoning model, calculating the energy values of all candidate triples, and outputting the mathematical conclusion corresponding to the candidate triple $(h_i, r_u, t_j)$ with the optimal energy value as the answer;
wherein the energy value characterizes the probability that the head entity $h_i$ and the tail entity $t_j$ have the relation $r_u$.
The knowledge graph construction system for the adaptive derivation of mathematical theorems parses the text data and mines the deep semantic interaction between entities and relations in the mathematical knowledge graph to obtain the head entity $h_i$ and its embedded representation $\mathbf{h}_i$, the tail entity $t_i$ and its embedded representation $\mathbf{t}_i$, and the relation $r_i$ and its embedded representation $\mathbf{r}_i$. These are then recombined to construct candidate triples $(h_i, r_u, t_j)$, whose energy values are calculated by the trained knowledge graph reasoning model. A reasoning path through the mathematical knowledge graph is established by means of TransR, so that the mathematical conclusion corresponding to the candidate triple $(h_i, r_u, t_j)$ with the optimal energy value is output as the answer. The optimal answer is thereby located, which can improve the efficiency of learning mathematics.
Based on the technical scheme, the invention can also be improved as follows:
further: the specific implementation of the analysis module for analyzing the acquired text data is as follows:
converting the text data into character vectors;
extracting the characteristics of the converted character vector;
and outputting labeling results of the entities and the relations in the text data according to the extracted features.
The beneficial effects of the above further scheme are: the text data is converted into character vectors by a pre-trained language model, features are extracted from the converted character vectors, and the extracted features are then labeled. The labeling results for the entities and relations in the text data are thus obtained accurately, and with them the embedded representation $\mathbf{h}_i$ of the head entity $h_i$, the embedded representation $\mathbf{t}_i$ of the tail entity $t_i$, and the embedded representation $\mathbf{r}_i$ of the relation $r_i$.
Further: the training of the knowledge graph reasoning model by the calculation module is specifically realized as follows:
acquiring a pre-input entity set and a pre-input relation set of the calculus knowledge graph, and converting the entities and relations in the calculus knowledge graph into embedded representations with a TransE model, wherein the entity embedded representations in the calculus knowledge graph are denoted $\mathbf{h}_i$, $\mathbf{t}_i$ and the relation embedded representations are denoted $\mathbf{r}'_u$;
classifying the entity pairs $(h, t)$ in the calculus knowledge graph according to the entity embedded representations $\mathbf{h}_i$, $\mathbf{t}_i$, wherein the entity pairs in each class share the same relation $r$;
embedding the relations $r$ into a d-dimensional space using CTransR, and learning a relation vector $\mathbf{r}_c$ and a mapping matrix $\mathbf{M}_r$ for each relation $r$.
The beneficial effects of the above further scheme are: the entities and relations in the calculus knowledge graph are converted into embedded representations by a TransE model, and the entity pairs $(h, t)$ are classified so that the pairs in each class share the same relation $r$. Each relation $r$ is then embedded into a d-dimensional space by CTransR, and a relation vector $\mathbf{r}_c$ and a mapping matrix $\mathbf{M}_r$ are learned for it, so that the energy values of the candidate triples can be calculated accurately from $\mathbf{r}_c$ and $\mathbf{M}_r$.
Further: the calculation module calculates the energy values of all the candidate triples as follows:
$$\mathbf{h}_{r,c} = \mathbf{h}\mathbf{M}_r$$
$$\mathbf{t}_{r,c} = \mathbf{t}\mathbf{M}_r$$
where $\mathbf{h}$ is the embedded representation of the head entity $h$, $\mathbf{t}$ is the embedded representation of the tail entity $t$, $r$ is the relation between $h$ and $t$, and $\alpha$ is a constant.
The beneficial effects of the above further scheme are: using the relation vector $\mathbf{r}_c$ and the mapping matrix $\mathbf{M}_r$ of each relation $r$ together with the energy function, the energy values of all candidate triples can be calculated accurately. This makes it easy to select the candidate triple $(h_i, r_u, t_j)$ with the optimal energy value and thereby obtain a new mathematical conclusion.
The invention also provides a computer-readable storage medium, characterized in that the computer-readable storage medium stores computer instructions which, when executed by a processor, implement the knowledge graph construction method for the adaptive derivation of mathematical theorems.
The invention also provides knowledge graph construction equipment for the adaptive derivation of mathematical theorems, comprising:
at least one processor and a storage medium, the storage medium being communicatively coupled to the processor;
wherein the storage medium stores a computer program executable by the at least one processor, the computer program being executable by the at least one processor to enable the at least one processor to perform the mathematical theorem-oriented adaptive derivation knowledge graph construction method.
Drawings
Fig. 1 is a schematic diagram of an application scenario of a knowledge graph construction method for adaptive derivation of mathematical theorem according to an embodiment of the present invention;
FIG. 2 is a flow chart of a knowledge graph construction method for adaptive derivation of mathematical theorem according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a mathematical knowledge graph reasoning path according to an embodiment of the present invention;
fig. 4 is a schematic structural diagram of a knowledge graph construction system for adaptive derivation based on mathematical theorem according to an embodiment of the present invention.
Detailed Description
The principles and features of the present invention are described below with reference to the drawings. The examples serve only to illustrate the invention and are not to be construed as limiting its scope.
An application scenario of a knowledge graph construction method for self-adaptive deduction of mathematical theorem in the embodiment of the invention is shown in fig. 1.
As shown in fig. 2, a knowledge graph construction method for adaptively deriving a mathematical theorem includes the following steps:
S1: parsing the acquired text data, extracting the entities and relations in the text data, and obtaining the embedded representation $\mathbf{h}_i$ of each head entity $h_i$, the embedded representation $\mathbf{t}_i$ of each tail entity $t_i$, and the embedded representation $\mathbf{r}_i$ of each relation $r_i$ in the text data;
In the embodiment of the invention, a text parser is used to parse the text data and extract the entities $e_i$ and relations $r_i$ in known propositions. For example, parsing "the necessary and sufficient condition for a univariate function to be differentiable" yields the head entity "univariate function is differentiable", the tail entity "univariate function is derivable" and the relation "necessary and sufficient condition"; parsing "derivability implies continuity" yields the head entity "univariate function is derivable", the tail entity "univariate function is continuous" and the relation "necessary condition".
After parsing, the embedded representations are obtained by mapping into a vector space: the known entities $e_i$ and relations $r_i$ of the mathematics learner are mapped to a low-dimensional real-valued vector space, giving the corresponding embedded representations $\mathbf{e}_i$, $\mathbf{r}_i$. For example, mapping the head entity "univariate function is differentiable", the tail entity "univariate function is derivable" and the relation "necessary and sufficient condition" to the vector space gives their embedded representations. Likewise, mapping the head entity "univariate function is derivable", the tail entity "univariate function is continuous" and the relation parsed from "derivability implies continuity" to the vector space gives the embedded representations of that head entity, tail entity and relation.
Specifically, in one or more embodiments of the present invention, the parsing the acquired text data specifically includes the following steps:
S11: converting the text data into character vectors using a pre-trained language model;
S12: extracting features from the converted character vectors using a bidirectional gated recurrent unit;
S13: outputting the labeling results for the entities and relations in the text data from the extracted features using a sequence labeling model.
The text data is converted into character vectors by the pre-trained language model, features are extracted from the converted character vectors, and the extracted features are then labeled. The labeling results for the entities and relations in the text data are thus obtained accurately, and with them the embedded representation $\mathbf{h}_i$ of the head entity $h_i$, the embedded representation $\mathbf{t}_i$ of the tail entity $t_i$, and the embedded representation $\mathbf{r}_i$ of the relation $r_i$;
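The last part of S13, turning per-character labels into entity and relation spans, can be sketched as follows. This is only an illustration: it assumes a BIO tagging scheme with the hypothetical labels `HEAD`, `REL` and `TAIL`, none of which is specified in the text, and it leaves out the pre-trained language model and the recurrent feature extractor.

```python
from typing import List, Tuple


def decode_bio(chars: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collect (label, text) spans from per-character BIO tags.

    The BIO scheme and the label names are assumptions made for this sketch.
    """
    spans: List[Tuple[str, str]] = []
    label, buffer = None, []
    for ch, tag in zip(chars, tags):
        if tag.startswith("B-"):
            # A new span starts; close the one in progress, if any.
            if label is not None:
                spans.append((label, "".join(buffer)))
            label, buffer = tag[2:], [ch]
        elif tag.startswith("I-") and label == tag[2:]:
            # Continuation of the current span.
            buffer.append(ch)
        else:
            # "O" or an inconsistent tag ends the current span.
            if label is not None:
                spans.append((label, "".join(buffer)))
            label, buffer = None, []
    if label is not None:
        spans.append((label, "".join(buffer)))
    return spans


# Toy input: two labeled spans separated by an unlabeled character.
example = decode_bio(list("ABxCD"), ["B-HEAD", "I-HEAD", "O", "B-TAIL", "I-TAIL"])
```

The decoded spans would then be looked up or encoded to obtain the embedded representations used in S2.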
S2: combining the head entity embedded representation $\mathbf{h}_i$ and the tail entity embedded representation $\mathbf{t}_j$ with the feature vector $\mathbf{r}_u$ corresponding to each relation $r_u$ of the calculus knowledge graph to construct the vector representation $(\mathbf{h}_i, \mathbf{r}_u, \mathbf{t}_j)$ of each candidate triple $(h_i, r_u, t_j)$;
S3: inputting the vector representations $(\mathbf{h}_i, \mathbf{r}_u, \mathbf{t}_j)$ of all candidate triples $(h_i, r_u, t_j)$ into the trained knowledge graph reasoning model, calculating the energy values of all candidate triples, and outputting the mathematical conclusion corresponding to the candidate triple $(h_i, r_u, t_j)$ with the optimal energy value as the answer;
wherein the energy value characterizes the probability that the head entity $h_i$ and the tail entity $t_j$ have the relation $r_u$.
In one or more embodiments of the present invention, the training of the knowledge graph inference model specifically includes the following steps:
S31: acquiring a pre-input entity set and a pre-input relation set of the calculus knowledge graph, and converting the entities and relations in the calculus knowledge graph into embedded representations with a TransE model, wherein the entity embedded representations in the calculus knowledge graph are denoted $\mathbf{h}_i$, $\mathbf{t}_i$ and the relation embedded representations are denoted $\mathbf{r}'_u$;
First, information is extracted from materials such as mathematics curriculum standards, textbooks, lessons, teaching plans and test question sets to construct a mathematical knowledge graph covering limits, differentiation, integration and so on. Then the entity set $E$ and the relation set $R$ of the triples (head entity, relation, tail entity) in the mathematical knowledge graph are given m-dimensional embedded representations, yielding the entity embedded representations $\mathbf{e}_i$ and the relation embedded representations $\mathbf{r}'_i$.
Here, we use TransE for the embedding process:
First, initial vectors $\mathbf{e}_i$ and $\mathbf{r}'_i$ are assigned to all entities and relations. The triples in the mathematical knowledge graph are then used for training and testing. Following the common machine learning strategy, the data set of the mathematical knowledge graph is divided into a training set and a test set, with 80% used as the training set and 20% as the test set. The training function for the training set is:
wherein $\gamma > 0$ is a constant, $S$ is the set of correct triples, $\mathbf{h}$, $\mathbf{r}'$, $\mathbf{t}$ are the embedded representations of $h$, $r$, $t$ respectively, $S'_{(h,r,t)}$ is the set of erroneous triples constructed from $(h, r, t)$, $S'_{(h,r,t)} = \{(h, r, t') \mid t' \in E\} \cup \{(h', r, t) \mid h' \in E\}$, $\mathbf{h}'$, $\mathbf{r}'$, $\mathbf{t}'$ are the embedded representations of $h'$, $r$, $t'$ respectively, and $E$ is the set of all entities.
Training yields the entity embedded representations $\mathbf{e}_i$.
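The training function itself is not reproduced in this text. A minimal sketch of what it may look like, assuming the standard TransE margin-based ranking loss (an assumption, chosen because it matches the symbols $\gamma$, $S$ and $S'_{(h,r,t)}$ defined above), is given below. The function names are hypothetical, and the L2 distance is one of the two norms TransE commonly uses.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]
EmbeddedTriple = Tuple[Vector, Vector, Vector]  # (head, relation, tail) vectors


def transe_distance(h: Vector, r: Vector, t: Vector) -> float:
    """L2 norm of h + r - t; small when the triple fits the translation."""
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def margin_ranking_loss(
    positives: List[EmbeddedTriple],
    negatives: List[EmbeddedTriple],
    gamma: float = 1.0,
) -> float:
    """Sum of [gamma + d(correct) - d(corrupted)]_+ over paired triples.

    negatives[i] is assumed to be a corrupted version of positives[i].
    """
    total = 0.0
    for (h, r, t), (h_neg, r_neg, t_neg) in zip(positives, negatives):
        gap = gamma + transe_distance(h, r, t) - transe_distance(h_neg, r_neg, t_neg)
        total += max(0.0, gap)
    return total
```

The loss is zero once every correct triple is closer than its corrupted counterpart by at least the margin $\gamma$, which is why $\gamma$ must be positive.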
S32: embedding representation h according to entities i ,t i Classifying the entity pairs (h, t) in the calculus knowledge graph, wherein each class has the same relation r;
here we replace the entity pair (h, t) with x=h-t, and the intermediate variable x is clustered with K-means, i.e. K samples (u 1 ,u 2 ,…,u k ) As an initial mean vector, training is performed such that E is minimized:
and E is the mean clustering distance between the current intermediate variable x and k samples by taking k samples as the center.
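The clustering step can be sketched as below. The expression for $E$ is not reproduced in this text, so the sketch assumes the usual K-means objective, the sum of squared distances from each offset to its assigned center. The helper names, the fixed iteration count and the seeding are choices made for this illustration only.

```python
import random
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def pair_offsets(pairs: List[Tuple[Vector, Vector]]) -> List[List[float]]:
    """Replace each (h, t) embedding pair by the offset x = h - t."""
    return [[hi - ti for hi, ti in zip(h, t)] for h, t in pairs]


def squared_distance(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points: List[List[float]], centers: List[List[float]]) -> List[int]:
    """Index of the nearest center for every point."""
    return [
        min(range(len(centers)), key=lambda c: squared_distance(p, centers[c]))
        for p in points
    ]


def kmeans(points: List[List[float]], k: int, iters: int = 20, seed: int = 0):
    """Plain K-means: k sampled points act as the initial mean vectors."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        labels = assign(points, centers)
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign(points, centers), centers


def clustering_objective(points, labels, centers) -> float:
    """Assumed form of E: total squared distance to the assigned centers."""
    return sum(squared_distance(p, centers[lab]) for p, lab in zip(points, labels))
```

Each resulting cluster of offsets is then treated as one class of entity pairs when the cluster-specific relation vectors are learned in S33.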
S33: embedding the relations $r$ into a d-dimensional space using CTransR, and learning a relation vector $\mathbf{r}_c$ and a mapping matrix $\mathbf{M}_r$ for each relation $r$.
Because entities and relations are objects of different types, representing them in the same space is not appropriate, so we use TransR to embed the relation $r$ into a d-dimensional space.
The entities and relations in the calculus knowledge graph are converted into embedded representations by the TransE model, and the entity pairs $(h, t)$ are classified so that the pairs in each class share the same relation $r$. Each relation $r$ is then embedded into a d-dimensional space by CTransR, and a relation vector $\mathbf{r}_c$ and a mapping matrix $\mathbf{M}_r$ are learned for it, so that the energy values of the candidate triples can be calculated accurately from $\mathbf{r}_c$ and $\mathbf{M}_r$.
In one or more embodiments of the present invention, calculating the energy values of all the candidate triples specifically comprises:
$$\mathbf{h}_{r,c} = \mathbf{h}\mathbf{M}_r$$
$$\mathbf{t}_{r,c} = \mathbf{t}\mathbf{M}_r$$
where $\mathbf{h}$ is the embedded representation of the head entity $h$, $\mathbf{t}$ is the embedded representation of the tail entity $t$, $r$ is the relation between $h$ and $t$, and $\alpha$ is a constant.
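The two projections above can be turned into an energy computation as in the following sketch. It assumes the standard CTransR score, the squared norm of $\mathbf{h}_{r,c} + \mathbf{r}_c - \mathbf{t}_{r,c}$ plus $\alpha$ times the squared norm of $\mathbf{r}_c - \mathbf{r}$, because the energy function itself is not reproduced in this text. The function names and the default value of `alpha` are hypothetical.

```python
from typing import List, Sequence

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]


def project(v: Vector, m: Matrix) -> List[float]:
    """Row vector v (length m) times mapping matrix M_r (m x d)."""
    return [sum(v[i] * m[i][j] for i in range(len(v))) for j in range(len(m[0]))]


def ctransr_energy(
    h: Vector, t: Vector, r_c: Vector, r: Vector, m_r: Matrix, alpha: float = 0.01
) -> float:
    """Assumed CTransR energy: lower means the triple is more plausible."""
    h_rc = project(h, m_r)  # h_{r,c} = h M_r
    t_rc = project(t, m_r)  # t_{r,c} = t M_r
    fit = sum((a + b - c) ** 2 for a, b, c in zip(h_rc, r_c, t_rc))
    # Keeps the cluster-specific vector r_c close to the base relation vector r.
    closeness = sum((a - b) ** 2 for a, b in zip(r_c, r))
    return fit + alpha * closeness
```

With an identity mapping matrix the first term reduces to a TransE-style translation error, which is a convenient sanity check for an implementation.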
We learn the relation vector $\mathbf{r}_c$ and the mapping matrix $\mathbf{M}_r$ of each relation through the scoring function $f_r(h, t)$, training and testing on the triples in the mathematical knowledge graph. Following the common machine learning strategy, the data set of the mathematical knowledge graph is divided into a training set and a test set, with 80% used as the training set and 20% as the test set. The training function is:
wherein $\gamma > 0$ is a constant, $S$ is the set of correct triples, $\mathbf{h}_{r,c} = \mathbf{h}\mathbf{M}_r$, $\mathbf{t}_{r,c} = \mathbf{t}\mathbf{M}_r$, $\mathbf{M}_r$ is the mapping matrix, $\mathbf{h}$, $\mathbf{r}'$, $\mathbf{t}$ are the embedded representations of $h$, $r$, $t$, $S'_{(h,r,t)}$ is the set of erroneous triples constructed from $(h, r, t)$, $S'_{(h,r,t)} = \{(h, r, t') \mid t' \in E\} \cup \{(h', r, t) \mid h' \in E\}$, $\mathbf{h}'_{r,c} = \mathbf{h}'\mathbf{M}_r$, $\mathbf{t}'_{r,c} = \mathbf{t}'\mathbf{M}_r$, $\mathbf{h}'$, $\mathbf{r}'$, $\mathbf{t}'$ are the embedded representations of $h'$, $r$, $t'$ respectively, and $E$ is the set of all entities.
Some relations are one-to-many and some are many-to-one. When the erroneous set $S'$ is constructed, the head entity $h$ is replaced wherever possible if the relation $r$ is one-to-many, and the tail entity $t$ is replaced wherever possible if the relation $r$ is many-to-one. Training uses gradient descent and yields the relation vector $\mathbf{r}_c$ and the mapping matrix $\mathbf{M}_r$.
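The text states only the preference (replace heads for one-to-many relations, tails for many-to-one relations), not how strongly it is applied. One way to realize it, assumed here, is the Bernoulli sampling scheme often used with translation-based models: a head is replaced with probability tph / (tph + hpt), where tph is the average number of tails per head and hpt the average number of heads per tail for that relation. The function names are hypothetical.

```python
import random
from collections import defaultdict
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail)


def head_replace_probability(triples: List[Triple]) -> Dict[str, float]:
    """Per-relation probability of corrupting the head: tph / (tph + hpt)."""
    tails_of = defaultdict(set)  # (relation, head) -> set of tails
    heads_of = defaultdict(set)  # (relation, tail) -> set of heads
    for h, r, t in triples:
        tails_of[(r, h)].add(t)
        heads_of[(r, t)].add(h)
    probs: Dict[str, float] = {}
    for rel in {r for _, r, _ in triples}:
        tph_counts = [len(v) for (r, _), v in tails_of.items() if r == rel]
        hpt_counts = [len(v) for (r, _), v in heads_of.items() if r == rel]
        tph = sum(tph_counts) / len(tph_counts)  # average tails per head
        hpt = sum(hpt_counts) / len(hpt_counts)  # average heads per tail
        probs[rel] = tph / (tph + hpt)
    return probs


def corrupt(triple: Triple, entities: List[str], probs: Dict[str, float],
            rng: random.Random) -> Triple:
    """Build one erroneous triple by replacing either the head or the tail."""
    h, r, t = triple
    if rng.random() < probs[r]:
        return (rng.choice([e for e in entities if e != h]), r, t)
    return (h, r, rng.choice([e for e in entities if e != t]))
```

For a one-to-many relation tph exceeds hpt, so the head is replaced more often, which lowers the chance of accidentally generating a triple that is actually correct.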
The test set is used for testing, and the performance of the model in the test stage is evaluated with the indexes MR (mean rank), MRR (mean reciprocal rank) and Hits@k (the proportion of correct results that rank in the top k by energy value). MR is the average rank of the correct results in the energy value ranking; a smaller MR means that the correct answers rank higher and the model performs better. MRR is the average of the reciprocals of the ranks of the correct results; in contrast to MR, a larger MRR means that the correct answers rank higher and the model performs better. Hits@k is the proportion of correct results that rank in the top k of the energy value ranking; it is commonly used to measure both entity prediction and relation prediction, and a larger Hits@k value indicates a better model.
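The three indexes can be computed from the rank that each correct result obtains in the energy value ranking. The following is a minimal sketch; the function name is hypothetical, and ranks are assumed to be 1-based, with rank 1 the best.

```python
from typing import Sequence, Tuple


def ranking_metrics(ranks: Sequence[int], k: int = 10) -> Tuple[float, float, float]:
    """Return (MR, MRR, Hits@k) for 1-based ranks of the correct results."""
    n = len(ranks)
    mean_rank = sum(ranks) / n
    mean_reciprocal_rank = sum(1.0 / rank for rank in ranks) / n
    hits_at_k = sum(1 for rank in ranks if rank <= k) / n
    return mean_rank, mean_reciprocal_rank, hits_at_k


# Three test triples whose correct answers ranked 1st, 2nd and 4th.
mr, mrr, hits = ranking_metrics([1, 2, 4], k=3)
```

In this toy case two of the three correct answers fall within the top 3, so Hits@3 is 2/3, while MR is 7/3.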
After the test is completed, the vector representations $(\mathbf{h}_i, \mathbf{r}_u, \mathbf{t}_j)$ of the candidate triples $(h_i, r_u, t_j)$ are input to the trained knowledge graph reasoning module, and the energy values of all candidate triples are calculated with the energy function. This makes it possible to judge whether a triple is correct and to give an interpretable probabilistic prediction.
For example, after the statement on the condition for a univariate function to be differentiable is parsed, a head entity ("the univariate function is differentiable") and a tail entity are obtained, with corresponding entity embeddings $\mathbf{h}_1$ and $\mathbf{t}_1$. After "derivable implies continuous" is parsed, the head entity and the tail entity are "the univariate function is derivable" and "the univariate function is continuous", with corresponding entity embeddings $\mathbf{h}_2$ and $\mathbf{t}_2$. To obtain more mathematical conclusions, the entities are recombined and matched one by one with the relation embeddings $\mathbf{r}_u$ of the existing mathematical knowledge graph, such as "unrelated condition", "sufficient and necessary condition" and "sufficient but not necessary condition", to form candidate triples $(h_i, r_u, t_j)$, for example (the univariate function is differentiable, unrelated condition, the univariate function is continuous) and (the univariate function is differentiable, sufficient but not necessary condition, the univariate function is continuous). The candidate triples are input to the trained knowledge graph reasoning module, the energy value of each triple is calculated with the energy function, and the triples are sorted in ascending order of energy value. The mathematical conclusion corresponding to the triple $(h_i, r_u, t_j)$ with the optimal energy value is the best answer, as shown in fig. 3.
For example, when the two conclusions on the condition for a univariate function to be differentiable and "derivable implies continuous" are input into the system, the triple with the optimal energy value is (the univariate function is differentiable, necessary but not sufficient condition, the univariate function is continuous), so the mathematical conclusion obtained by the learner is that a differentiable univariate function is continuous.
Using the relation vector $\mathbf{r}_c$ and the mapping matrix $M_r$ of each relation $r$ together with the energy function, the energy values of all triples can be calculated accurately, so that the candidate triple $(h_i, r_u, t_j)$ with the optimal energy value is obtained, and then the optimal feature vector in the text data is obtained.
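The recombination and ranking described above can be sketched as follows. This is a minimal illustration under stated assumptions: the energy function follows the published CTransR form (the patent's own formula is not reproduced in this text), a lower energy is treated as better, and all vectors, the identity mapping matrix and the value of `alpha` are made-up toy values chosen so that the arithmetic is easy to follow.

```python
from itertools import product

def project(vec, matrix):
    """Row vector times matrix: maps an entity vector into the relation space."""
    return [sum(v * row[j] for v, row in zip(vec, matrix))
            for j in range(len(matrix[0]))]

def energy(h, t, r_c, r, M_r, alpha=0.1):
    """Assumed CTransR-style energy; lower means the triple is more plausible."""
    h_rc, t_rc = project(h, M_r), project(t, M_r)
    translation = sum((a + b - c) ** 2 for a, b, c in zip(h_rc, r_c, t_rc))
    closeness = sum((a - b) ** 2 for a, b in zip(r_c, r))
    return translation + alpha * closeness

def rank_candidates(heads, tails, relations, alpha=0.1):
    """Pair every head and tail with every relation, then sort by ascending energy."""
    scored = []
    for (hn, h), (rn, (r_c, r, M_r)), (tn, t) in product(
            heads.items(), relations.items(), tails.items()):
        scored.append((energy(h, t, r_c, r, M_r, alpha), (hn, rn, tn)))
    return sorted(scored)

# Toy embeddings (hypothetical values, 2-dimensional, identity mapping matrix).
identity = [[1.0, 0.0], [0.0, 1.0]]
heads = {"univariate function is differentiable": [1.0, 0.0]}
tails = {"univariate function is continuous": [0.0, 1.0]}
relations = {  # name -> (r_c, relation-level vector, M_r)
    "necessary but not sufficient condition": ([-1.0, 1.0], [-1.0, 1.0], identity),
    "unrelated condition": ([1.0, 1.0], [1.0, 1.0], identity),
}
ranked = rank_candidates(heads, tails, relations)
best_energy, best_triple = ranked[0]
```

With these toy values the first relation gives an energy of 0 and the second gives 4, so the first candidate triple is ranked first.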
As shown in fig. 4, the invention further provides a knowledge graph construction system for self-adaptive derivation of mathematical theorem, which comprises an analysis module, a construction module and a calculation module;
the analysis module is used for parsing the acquired text data and extracting the entities and relations in the text data, to obtain the embedded representation $\mathbf{h}_i$ of the head entity $h_i$, the embedded representation $\mathbf{t}_i$ of the tail entity $t_i$, and the embedded representation $\mathbf{r}_i$ of the relation $r_i$ in the text data;
the construction module is used for constructing the vector representations $(\mathbf{h}_i, \mathbf{r}_u, \mathbf{t}_j)$ of the candidate triples $(h_i, r_u, t_j)$ from the head entity embedded representation $\mathbf{h}_i$, the tail entity embedded representation $\mathbf{t}_j$, and the feature vector $\mathbf{r}_u$ corresponding to each relation $r_u$ of the calculus knowledge graph;
the calculation module is used for inputting the vector representations $(\mathbf{h}_i, \mathbf{r}_u, \mathbf{t}_j)$ of the candidate triples $(h_i, r_u, t_j)$ into the trained knowledge graph inference model, calculating the energy values of all candidate triples, and outputting the mathematical conclusion corresponding to the candidate triple $(h_i, r_u, t_j)$ with the optimal energy value as the answer;
wherein the energy value is used to characterize the probability that the head entity $h_i$ and the tail entity $t_i$ have the relation $r_u$.
According to the knowledge graph construction system for adaptive derivation of mathematical theorems, the text data is parsed and the deep semantic interaction between the entities and relations in the mathematical knowledge graph is mined to obtain the embedded representation $\mathbf{h}_i$ of the head entity $h_i$, the embedded representation $\mathbf{t}_i$ of the tail entity $t_i$, and the embedded representation $\mathbf{r}_i$ of the relation $r_i$. These are then recombined to construct the vector representations $(\mathbf{h}_i, \mathbf{r}_u, \mathbf{t}_j)$ of the candidate triples $(h_i, r_u, t_j)$, and the energy values of the candidate triples are calculated with the trained knowledge graph inference model. An inference path of the mathematical knowledge graph is established by means of TransR, so that the mathematical conclusion corresponding to the candidate triple $(h_i, r_u, t_j)$ with the optimal energy value is output as the answer. The optimal answer is thereby located, and mathematical learning efficiency can be improved.
In one or more embodiments of the present invention, the specific implementation of the parsing module for parsing the acquired text data is:
converting the text data into character vectors using a pre-trained language model;
extracting features from the converted character vectors using a bidirectional gated recurrent unit (BiGRU);
and outputting labeling results of the entities and the relations in the text data according to the extracted features by using a sequence labeling model.
The text data is converted into character vectors by the pre-trained language model, features are extracted from the converted character vectors, and the extracted features are finally labeled, so that the labeling results of the entities and relations in the text data are obtained accurately. The head entity embedded representation $\mathbf{h}_i$, the tail entity embedded representation $\mathbf{t}_i$ and the relation embedded representation $\mathbf{r}_i$ in the text data are thereby obtained accurately.
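The last step of the parsing pipeline, turning per-token labels into entities and relations, can be sketched as follows. The text does not specify a tag scheme, so the BIO tags (`B-HEAD`, `I-HEAD`, `B-REL`, `B-TAIL`, `O`) and the function name `decode_spans` are assumptions made for illustration. English word tokens stand in for the character vectors mentioned in the text, and the language model and BiGRU stages are not modeled here.

```python
def decode_spans(tokens, tags):
    """Group BIO tags produced by a sequence labeling model into (label, text) spans."""
    spans = []
    label, words = None, []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new span begins; close the one in progress, if any.
            if label is not None:
                spans.append((label, " ".join(words)))
            label, words = tag[2:], [token]
        elif tag.startswith("I-") and label == tag[2:]:
            # Continuation of the current span.
            words.append(token)
        else:
            # "O" or an inconsistent tag ends the current span.
            if label is not None:
                spans.append((label, " ".join(words)))
            label, words = None, []
    if label is not None:
        spans.append((label, " ".join(words)))
    return spans

tokens = ["derivable", "functions", "must", "be", "continuous"]
tags = ["B-HEAD", "I-HEAD", "B-REL", "I-REL", "B-TAIL"]
spans = decode_spans(tokens, tags)
```

For the sample sentence this yields a head span "derivable functions", a relation span "must be" and a tail span "continuous", which would then be mapped to their embedded representations.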
In one or more embodiments of the present invention, the training of the knowledge-graph inference model by the calculation module is specifically implemented as follows:
acquiring a pre-input entity set and a pre-input relation set in the calculus knowledge graph, and converting the entities and relations in the calculus knowledge graph into embedded representations using a TransE model, wherein the entity embedded representations in the calculus knowledge graph are denoted $\mathbf{h}_i$, $\mathbf{t}_i$, and the relation embedded representations in the calculus knowledge graph are denoted $\mathbf{r}'_u$;
classifying the entity pairs $(h, t)$ in the calculus knowledge graph according to the entity embedded representations $\mathbf{h}_i$, $\mathbf{t}_i$, so that the entity pairs in the same class have the same relation $r$;
embedding the relations $r$ into a d-dimensional space using CTransR, and learning a relation vector $\mathbf{r}_c$ and a mapping matrix $M_r$ for each relation $r$.
The entities and relations in the calculus knowledge graph are converted into embedded representations by the TransE model, and the entity pairs $(h, t)$ in the calculus knowledge graph are classified so that the pairs in each class share the same relation $r$. Each relation $r$ is then embedded into a d-dimensional space by CTransR, and a relation vector $\mathbf{r}_c$ and a mapping matrix $M_r$ are learned for each relation $r$, so that the energy values of the candidate triples can be calculated accurately from $\mathbf{r}_c$ and $M_r$.
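The classification step can be illustrated with a small clustering sketch. The text does not name the clustering method, so two things here are assumptions: each entity pair is represented by the offset of its TransE embeddings (head minus tail), as in the published CTransR description, and a plain k-means with fixed initial centroids is swapped in as the clustering technique. All vectors are toy values.

```python
def offset(h, t):
    """Represent an entity pair by the difference of its embeddings."""
    return [a - b for a, b in zip(h, t)]

def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def kmeans(points, centroids, iterations=10):
    """Plain k-means with caller-supplied initial centroids (deterministic)."""
    centroids = [list(c) for c in centroids]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        labels = [min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Four toy entity pairs (head embedding, tail embedding).
pairs = [([1.0, 0.0], [0.0, 0.0]), ([2.0, 1.0], [1.1, 1.0]),
         ([0.0, 1.0], [0.0, 0.0]), ([3.0, 3.0], [3.0, 2.1])]
offsets = [offset(h, t) for h, t in pairs]
labels, centroids = kmeans(offsets, [offsets[0], offsets[2]])
```

The first two pairs have offsets near (1, 0) and the last two near (0, 1), so they fall into two classes; in the scheme described above, each class would then receive its own relation vector $\mathbf{r}_c$.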
In one or more embodiments of the present invention, the calculation module calculates the energy values of all the candidate triples as follows:
$\mathbf{h}_{r,c} = \mathbf{h}M_r$
$\mathbf{t}_{r,c} = \mathbf{t}M_r$
where $\mathbf{h}$ is the embedded representation of the head entity $h$, $\mathbf{t}$ is the embedded representation of the tail entity $t$, $\mathbf{r}_c$ is the embedded representation of the relation $r$, and $\alpha$ is a constant.
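The energy function itself is not reproduced in this text. As a hedged reference point, the energy function of the published CTransR formulation uses exactly the symbols defined here ($\mathbf{h}_{r,c}$, $\mathbf{t}_{r,c}$, $\mathbf{r}_c$, $\alpha$). That the omitted formula has the same form is an assumption.

```latex
f_r(h, t) \;=\; \bigl\lVert \mathbf{h}_{r,c} + \mathbf{r}_c - \mathbf{t}_{r,c} \bigr\rVert_2^2
          \;+\; \alpha \,\bigl\lVert \mathbf{r}_c - \mathbf{r} \bigr\rVert_2^2
```

In this form the first term measures how well the projected head, translated by $\mathbf{r}_c$, lands on the projected tail, and the second term keeps the class-specific vector $\mathbf{r}_c$ close to the relation-level vector $\mathbf{r}$.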
Using the relation vector $\mathbf{r}_c$ and the mapping matrix $M_r$ of each relation $r$ together with the energy function, the energy values of all triples can be calculated accurately, so that the candidate triple $(h_i, r_u, t_j)$ with the optimal energy value is obtained, and then a new mathematical conclusion is obtained.
The invention also provides a computer-readable storage medium storing computer instructions which, when executed by a processor, implement the knowledge graph construction method for adaptive derivation of mathematical theorems.
The invention also provides a knowledge graph construction device for the self-adaptive derivation of the mathematical theorem, which comprises:
at least one processor and a storage medium, the storage medium communicatively coupled to the processor;
wherein the storage medium stores a computer program executable by the at least one processor, the computer program being executable by the at least one processor to enable the at least one processor to perform the mathematical theorem-oriented adaptive derivation knowledge graph construction method.
The foregoing description of the preferred embodiments of the invention is not intended to limit the invention to the precise form disclosed, and any such modifications, equivalents, and alternatives falling within the spirit and scope of the invention are intended to be included within the scope of the invention.
Claims (10)
1. The knowledge graph construction method for the self-adaptive derivation of the mathematical theorem is characterized by comprising the following steps of:
parsing the acquired text data and extracting the entities and relations in the text data, to obtain the embedded representation $\mathbf{h}_i$ of the head entity $h_i$, the embedded representation $\mathbf{t}_i$ of the tail entity $t_i$, and the embedded representation $\mathbf{r}_i$ of the relation $r_i$ in the text data;
constructing the vector representations $(\mathbf{h}_i, \mathbf{r}_u, \mathbf{t}_j)$ of the candidate triples $(h_i, r_u, t_j)$ from the head entity embedded representation $\mathbf{h}_i$, the tail entity embedded representation $\mathbf{t}_j$, and the feature vector $\mathbf{r}_u$ corresponding to each relation $r_u$ of the calculus knowledge graph;
inputting the vector representations $(\mathbf{h}_i, \mathbf{r}_u, \mathbf{t}_j)$ of all candidate triples $(h_i, r_u, t_j)$ into the trained knowledge graph inference model, calculating the energy values of all candidate triples, and outputting the mathematical conclusion corresponding to the candidate triple $(h_i, r_u, t_j)$ with the optimal energy value as the answer;
wherein the energy value is used to characterize the probability that the head entity $h_i$ and the tail entity $t_i$ have the relation $r_u$.
2. The knowledge graph construction method for adaptively deriving a mathematical theorem according to claim 1, wherein the parsing the obtained text data comprises the following steps:
converting the text data into character vectors;
extracting the characteristics of the converted character vector;
and outputting labeling results of the entities and the relations in the text data according to the extracted features.
3. The knowledge graph construction method for self-adaptive deduction of mathematical theorem according to claim 1, wherein the training of the knowledge graph reasoning model specifically comprises the following steps:
acquiring a pre-input entity set and a pre-input relation set in the calculus knowledge graph, and converting the entities and relations in the calculus knowledge graph into embedded representations using a TransE model, wherein the entity embedded representations in the calculus knowledge graph are denoted $\mathbf{h}_i$, $\mathbf{t}_i$, and the relation embedded representations in the calculus knowledge graph are denoted $\mathbf{r}'_u$;
classifying the entity pairs $(h, t)$ in the calculus knowledge graph according to the entity embedded representations $\mathbf{h}_i$, $\mathbf{t}_i$, so that the entity pairs in each class have the same relation $r$;
embedding the relations $r$ into a d-dimensional space using CTransR, and learning a relation vector $\mathbf{r}_c$ and a mapping matrix $M_r$ for each relation $r$.
4. The knowledge graph construction method for adaptively deriving a mathematical theorem according to claim 3, wherein said calculating the energy values of all the candidate triples comprises:
$\mathbf{h}_{r,c} = \mathbf{h}M_r$
$\mathbf{t}_{r,c} = \mathbf{t}M_r$
where $\mathbf{h}$ is the embedded representation of the head entity $h$, $\mathbf{t}$ is the embedded representation of the tail entity $t$, $\mathbf{r}_c$ is the embedded representation of the relation $r$, and $\alpha$ is a constant.
5. A knowledge graph construction system for self-adaptive deduction of mathematical theorem is characterized in that: the system comprises an analysis module, a construction module and a calculation module;
the analysis module is used for parsing the acquired text data and extracting the entities and relations in the text data, to obtain the embedded representation $\mathbf{h}_i$ of the head entity $h_i$, the embedded representation $\mathbf{t}_i$ of the tail entity $t_i$, and the embedded representation $\mathbf{r}_i$ of the relation $r_i$ in the text data;
the construction module is used for constructing the vector representations $(\mathbf{h}_i, \mathbf{r}_u, \mathbf{t}_j)$ of the candidate triples $(h_i, r_u, t_j)$ from the head entity embedded representation $\mathbf{h}_i$, the tail entity embedded representation $\mathbf{t}_j$, and the feature vector $\mathbf{r}_u$ corresponding to each relation $r_u$ of the calculus knowledge graph;
the calculation module is used for inputting the vector representations $(\mathbf{h}_i, \mathbf{r}_u, \mathbf{t}_j)$ of all candidate triples $(h_i, r_u, t_j)$ into the trained knowledge graph inference model, calculating the energy values of all candidate triples, and outputting the mathematical conclusion corresponding to the candidate triple $(h_i, r_u, t_j)$ with the optimal energy value as the answer;
wherein the energy value is used to characterize the probability that the head entity $h_i$ and the tail entity $t_i$ have the relation $r_u$.
6. The knowledge graph construction system for adaptive derivation according to mathematical theorem of claim 5, wherein the parsing module performs parsing on the obtained text data by:
converting the text data into character vectors;
extracting the characteristics of the converted character vector;
and outputting labeling results of the entities and the relations in the text data according to the extracted features.
7. The knowledge graph construction system for adaptive derivation based on mathematical theorem according to claim 5, wherein the training of the knowledge graph inference model by the calculation module is specifically implemented as follows:
acquiring a pre-input entity set and a pre-input relation set in the calculus knowledge graph, and converting the entities and relations in the calculus knowledge graph into embedded representations using a TransE model, wherein the entity embedded representations in the calculus knowledge graph are denoted $\mathbf{h}_i$, $\mathbf{t}_i$, and the relation embedded representations in the calculus knowledge graph are denoted $\mathbf{r}'_u$;
classifying the entity pairs $(h, t)$ in the calculus knowledge graph according to the entity embedded representations $\mathbf{h}_i$, $\mathbf{t}_i$, so that the entity pairs in the same class have the same relation $r$;
embedding the relations $r$ into a d-dimensional space using CTransR, and learning a relation vector $\mathbf{r}_c$ and a mapping matrix $M_r$ for each relation $r$.
8. The knowledge graph construction system for adaptive derivation of mathematical theorem according to claim 7, wherein the calculation module calculates the energy values of all the candidate triples as follows:
$\mathbf{h}_{r,c} = \mathbf{h}M_r$
$\mathbf{t}_{r,c} = \mathbf{t}M_r$
where $\mathbf{h}$ is the embedded representation of the head entity $h$, $\mathbf{t}$ is the embedded representation of the tail entity $t$, $\mathbf{r}_c$ is the embedded representation of the relation $r$, and $\alpha$ is a constant.
9. A computer readable storage medium, wherein the computer readable storage medium stores computer instructions for causing a processor to implement the mathematical theorem-oriented adaptive derivation knowledge-graph construction method of any one of claims 1-4 when executed.
10. The knowledge graph construction equipment facing the adaptive derivation of the mathematical theorem is characterized by comprising the following components:
at least one processor and a storage medium, the storage medium communicatively coupled to the processor;
wherein the storage medium has stored thereon a computer program executable by the at least one processor to enable the at least one processor to perform the mathematical theorem-oriented adaptive derivation knowledge-graph construction method of any one of claims 1-4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310344991.4A CN116340543A (en) | 2023-03-31 | 2023-03-31 | Knowledge graph construction method and system for mathematical theorem-oriented adaptive derivation |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116340543A (en) | 2023-06-27 |
Family
ID=86889314
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310344991.4A Pending CN116340543A (en) | 2023-03-31 | 2023-03-31 | Knowledge graph construction method and system for mathematical theorem-oriented adaptive derivation |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116340543A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111260064A (en) * | 2020-04-15 | 2020-06-09 | 中国人民解放军国防科技大学 | Knowledge inference method, system and medium based on knowledge graph of meta knowledge |
CN113807519A (en) * | 2021-08-30 | 2021-12-17 | 华中师范大学 | Knowledge graph construction method integrating teaching feedback and learned understanding |
WO2022033072A1 (en) * | 2020-08-12 | 2022-02-17 | 哈尔滨工业大学 | Knowledge graph-oriented representation learning training local training method |
CN114840679A (en) * | 2022-01-25 | 2022-08-02 | 华中师范大学 | Robot intelligent learning guiding method based on music theory knowledge graph reasoning and application |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||