CN116340543A - Knowledge graph construction method and system for mathematical theorem-oriented adaptive derivation - Google Patents


Publication number
CN116340543A
Authority
CN
China
Prior art keywords
knowledge graph
representation
entity
relation
embedded
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310344991.4A
Other languages
Chinese (zh)
Inventor
王何慧
鞠剑平
刘婷婷
唐剑隐
刘海
肖振华
林明玉
万飞
边帆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hubei Business College
Original Assignee
Hubei Business College
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hubei Business College
Priority to CN202310344991.4A
Publication of CN116340543A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36: Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367: Ontology
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33: Querying
    • G06F16/3331: Query processing
    • G06F16/334: Query execution
    • G06F16/3344: Query execution using natural language analysis
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35: Clustering; Classification
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/20: Natural language analysis
    • G06F40/279: Recognition of textual entities
    • G06F40/289: Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295: Named entity recognition
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/30: Semantic analysis
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Devices For Executing Special Programs (AREA)

Abstract

The invention relates to a knowledge graph construction method and system for mathematical theorem-oriented adaptive derivation. The method comprises: parsing acquired text data, extracting the entities and relations in the text data, and obtaining a head entity embedded representation, a tail entity embedded representation and a relation embedded representation; combining the head entity embedded representation and the tail entity embedded representation with the feature vector corresponding to each relation of the calculus knowledge graph to construct candidate triples; and calculating the energy values of all candidate triples with the trained knowledge graph reasoning model, and outputting the mathematical conclusion corresponding to the candidate triple with the optimal energy value as the answer. By parsing the text data, the deep semantic interaction between entities and relations in the mathematical knowledge graph is mined; candidate triples are then constructed and their energy values calculated, and the mathematical conclusion corresponding to the candidate triple with the optimal energy value is output as the answer, so that the optimal answer is obtained and the efficiency of learning mathematics can be improved.

Description

Knowledge graph construction method and system for mathematical theorem-oriented adaptive derivation
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a knowledge graph construction method and system for self-adaptive deduction of mathematical theorem.
Background
Calculus is an important foundational course that students of management-related majors in higher education institutions take early in their studies. It lies at the intersection of economics and mathematics and is widely applied in economics.
The main task of the calculus course is to give students a basic knowledge of calculus, cultivate their basic computational ability, strengthen their preliminary ability to handle economic problems by combining qualitative and quantitative methods, and lay the mathematical foundation needed for subsequent courses and for acquiring further mathematical knowledge. However, the basic concepts and theories of functions, limits and continuity, and the differentiation and integration of functions are often obscure, and students who are new to calculus often have difficulty grasping the logical relationships between its concepts. Therefore, how to build a calculus knowledge graph that helps students learn and master calculus has become a technical problem to be solved urgently.
Disclosure of Invention
The invention aims to solve the technical problem of providing a knowledge graph construction method and a system for self-adaptive deduction of mathematical theorem aiming at the defects of the prior art.
The technical scheme for solving the technical problems is as follows: a knowledge graph construction method for self-adaptive deduction of mathematical theorem comprises the following steps:
parsing the acquired text data, extracting the entities and relations in the text data, and obtaining the embedded representation h_i of the head entity h_i, the embedded representation t_i of the tail entity t_i, and the embedded representation r_i of the relation r_i in the text data;
combining the head entity embedded representation h_i and the tail entity embedded representation t_j with the feature vector r_u corresponding to each relation r_u of the calculus knowledge graph to construct the vector representation (h_i, r_u, t_j) of each candidate triple (h_i, r_u, t_j);
inputting the vector representations (h_i, r_u, t_j) of all candidate triples (h_i, r_u, t_j) into the trained knowledge graph reasoning model, calculating the energy values of all candidate triples, and outputting the mathematical conclusion corresponding to the candidate triple (h_i, r_u, t_j) with the optimal energy value as the answer;
wherein the energy value characterizes the probability that the head entity h_i and the tail entity t_j have the relation r_u.
The beneficial effects of the invention are as follows: in the knowledge graph construction method for mathematical theorem-oriented adaptive derivation, the text data is parsed and the deep semantic interaction between entities and relations in the mathematical knowledge graph is mined to obtain the head entity representation h_i, the tail entity representation t_j and the relation representation r_i, which are then recombined to construct candidate triples (h_i, r_u, t_j). The energy values of the candidate triples (h_i, r_u, t_j) are then calculated with the trained knowledge graph reasoning model, and an inference path of the mathematical knowledge graph is established by means of TransR, so that the mathematical conclusion corresponding to the candidate triple (h_i, r_u, t_j) with the optimal energy value is output as the answer. The optimal answer is thereby located, and the efficiency of learning mathematics can be improved.
Based on the technical scheme, the invention can also be improved as follows:
Further: parsing the acquired text data specifically comprises the following steps:
converting the text data into character vectors;
extracting features from the converted character vectors;
and outputting the labeling results of the entities and relations in the text data according to the extracted features.
The beneficial effects of the above further scheme are: the text data is converted into character vectors by a pre-trained language model, features are extracted from the converted character vectors, and the extracted features are then labeled, so that the labeling results of the entities and relations in the text data are obtained accurately, and the embedded representation h_i of the head entity h_i, the embedded representation t_i of the tail entity t_i, and the embedded representation r_i of the relation r_i are in turn obtained accurately.
Further: training the knowledge graph reasoning model specifically comprises the following steps:
acquiring a pre-input entity set and a pre-input relation set of the calculus knowledge graph, and converting the entities and relations in the calculus knowledge graph into embedded representations with a TransE model, wherein the entity embedded representations in the calculus knowledge graph are denoted h_i, t_i and the relation embedded representations are denoted r'_u;
classifying the entity pairs (h, t) in the calculus knowledge graph according to the entity embedded representations h_i, t_i, wherein the entity pairs in each class have the same relation r;
embedding each relation r into a d-dimensional space using CTransR, and learning a relation vector r_c and a mapping matrix M_r for each relation r.
The beneficial effects of the above further scheme are: the entities and relations in the calculus knowledge graph are converted into embedded representations by the TransE model, and the entity pairs (h, t) in the calculus knowledge graph are classified so that the entity pairs in each class share the same relation r; each relation r is then embedded into a d-dimensional space by CTransR, and a relation vector r_c and a mapping matrix M_r are learned for each relation r, so that the energy values of the candidate triples can be calculated accurately from the relation vector r_c and the mapping matrix M_r.
Further: calculating the energy values of all the candidate triples specifically comprises:
calculating by the energy value function
[formula image SMS_1 in the original; not reproduced in this text]
as follows:
[formula image SMS_2 in the original; not reproduced in this text]
[formula image SMS_3 in the original; not reproduced in this text]
h_{r,c} = h M_r
t_{r,c} = t M_r
wherein h is the embedded representation of the head entity h, t is the embedded representation of the tail entity t, r_c is the embedded representation of the relation r, and α is a constant.
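The formulas referenced as figures above are not reproduced in this text. For orientation only, the energy function of CTransR as it is commonly published in the literature, written with the symbols defined here, is shown below. Whether it matches the original formula images term by term is an assumption.

```latex
f_r(h, t) = \lVert \mathbf{h}_{r,c} + \mathbf{r}_c - \mathbf{t}_{r,c} \rVert_2^2
          + \alpha \lVert \mathbf{r}_c - \mathbf{r} \rVert_2^2,
\qquad
\mathbf{h}_{r,c} = \mathbf{h} M_r, \quad
\mathbf{t}_{r,c} = \mathbf{t} M_r
```

In this form the first term measures how well the cluster-specific relation vector translates the projected head onto the projected tail, and the second term, weighted by the constant α, keeps each cluster-specific vector close to the vector of the relation as a whole.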
The beneficial effects of the above further scheme are: using the relation vector r_c and mapping matrix M_r of each relation r in combination with the energy value function
[formula image SMS_4 in the original; not reproduced in this text]
the energy values of all the candidate triples can be calculated accurately, which makes it convenient to select the candidate triple (h_i, r_u, t_j) with the optimal energy value and thus obtain a new mathematical conclusion.
The invention also provides a knowledge graph construction system for mathematical theorem-oriented adaptive derivation, which comprises a parsing module, a construction module and a calculation module;
the parsing module is used for parsing the acquired text data, extracting the entities and relations in the text data, and obtaining the embedded representation h_i of the head entity h_i, the embedded representation t_i of the tail entity t_i, and the embedded representation r_i of the relation r_i in the text data;
the construction module is used for combining the head entity embedded representation h_i and the tail entity embedded representation t_j with the feature vector r_u corresponding to each relation r_u of the calculus knowledge graph to construct the vector representation (h_i, r_u, t_j) of each candidate triple (h_i, r_u, t_j);
the calculation module is used for inputting the vector representations (h_i, r_u, t_j) of all candidate triples (h_i, r_u, t_j) into the trained knowledge graph reasoning model, calculating the energy values of all candidate triples, and outputting the mathematical conclusion corresponding to the candidate triple (h_i, r_u, t_j) with the optimal energy value as the answer;
wherein the energy value characterizes the probability that the head entity h_i and the tail entity t_j have the relation r_u.
In the knowledge graph construction system for mathematical theorem-oriented adaptive derivation, the text data is parsed and the deep semantic interaction between entities and relations in the mathematical knowledge graph is mined to obtain the head entity h_i and its embedded representation h_i, the tail entity t_i and its embedded representation t_i, and the relation r_i and its embedded representation r_i, which are then recombined to construct candidate triples (h_i, r_u, t_j). The energy values of the candidate triples (h_i, r_u, t_j) are then calculated with the trained knowledge graph reasoning model, and an inference path of the mathematical knowledge graph is established by means of TransR, so that the mathematical conclusion corresponding to the candidate triple (h_i, r_u, t_j) with the optimal energy value is output as the answer. The optimal answer is thereby located, and the efficiency of learning mathematics can be improved.
Based on the technical scheme, the invention can also be improved as follows:
Further: the parsing module parses the acquired text data specifically as follows:
converting the text data into character vectors;
extracting features from the converted character vectors;
and outputting the labeling results of the entities and relations in the text data according to the extracted features.
The beneficial effects of the above further scheme are: the text data is converted into character vectors by a pre-trained language model, features are extracted from the converted character vectors, and the extracted features are then labeled, so that the labeling results of the entities and relations in the text data are obtained accurately, and the embedded representation h_i of the head entity h_i, the embedded representation t_i of the tail entity t_i, and the embedded representation r_i of the relation r_i are in turn obtained accurately.
Further: the calculation module trains the knowledge graph reasoning model specifically as follows:
acquiring a pre-input entity set and a pre-input relation set of the calculus knowledge graph, and converting the entities and relations in the calculus knowledge graph into embedded representations with a TransE model, wherein the entity embedded representations in the calculus knowledge graph are denoted h_i, t_i and the relation embedded representations are denoted r'_u;
classifying the entity pairs (h, t) in the calculus knowledge graph according to the entity embedded representations h_i, t_i, wherein the entity pairs in each class have the same relation r;
embedding each relation r into a d-dimensional space using CTransR, and learning a relation vector r_c and a mapping matrix M_r for each relation r.
The beneficial effects of the above further scheme are: the entities and relations in the calculus knowledge graph are converted into embedded representations by the TransE model, and the entity pairs (h, t) in the calculus knowledge graph are classified so that the entity pairs in each class share the same relation r; each relation r is then embedded into a d-dimensional space by CTransR, and a relation vector r_c and a mapping matrix M_r are learned for each relation r, so that the energy values of the candidate triples can be calculated accurately from the relation vector r_c and the mapping matrix M_r.
Further: the calculation module calculates the energy values of all the candidate triples specifically as follows:
calculating by the energy value function
[formula image SMS_5 in the original; not reproduced in this text]
as follows:
[formula image SMS_6 in the original; not reproduced in this text]
[formula image SMS_7 in the original; not reproduced in this text]
h_{r,c} = h M_r
t_{r,c} = t M_r
where h is the embedded representation of the head entity h, t is the embedded representation of the tail entity t, r is the relation between h and t, and α is a constant.
The beneficial effects of the above further scheme are: using the relation vector r_c and mapping matrix M_r of each relation r in combination with the energy value function
[formula image SMS_8 in the original; not reproduced in this text]
the energy values of all the candidate triples can be calculated accurately, which makes it convenient to select the candidate triple (h_i, r_u, t_j) with the optimal energy value and thus obtain a new mathematical conclusion.
The invention also provides a computer-readable storage medium storing computer instructions which, when executed by a processor, implement the knowledge graph construction method for mathematical theorem-oriented adaptive derivation.
The invention also provides a knowledge graph construction device for mathematical theorem-oriented adaptive derivation, comprising:
at least one processor and a storage medium communicatively coupled to the processor;
wherein the storage medium stores a computer program executable by the at least one processor, and the computer program, when executed, enables the at least one processor to perform the knowledge graph construction method for mathematical theorem-oriented adaptive derivation.
Drawings
Fig. 1 is a schematic diagram of an application scenario of a knowledge graph construction method for adaptive derivation of mathematical theorem according to an embodiment of the present invention;
FIG. 2 is a flow chart of a knowledge graph construction method for adaptive derivation of mathematical theorem according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a mathematical knowledge graph reasoning path according to an embodiment of the present invention;
fig. 4 is a schematic structural diagram of a knowledge graph construction system for adaptive derivation based on mathematical theorem according to an embodiment of the present invention.
Detailed Description
The principles and features of the present invention are described below with reference to the drawings. The examples are given only to illustrate the invention and are not to be construed as limiting its scope.
An application scenario of a knowledge graph construction method for self-adaptive deduction of mathematical theorem in the embodiment of the invention is shown in fig. 1.
As shown in fig. 2, a knowledge graph construction method for adaptively deriving a mathematical theorem includes the following steps:
S1: parsing the acquired text data, extracting the entities and relations in the text data, and obtaining the embedded representation h_i of the head entity h_i, the embedded representation t_i of the tail entity t_i, and the embedded representation r_i of the relation r_i in the text data;
In the embodiment of the invention, a text parser is used to parse the text data and extract the entities e_i and relations r_i in known propositions. For example, after parsing the proposition "necessary and sufficient condition for a univariate function to be differentiable", the head entity, tail entity and relation are "univariate function is differentiable", "univariate function is derivable" and "necessary and sufficient condition" respectively; after parsing "derivable implies continuous", the head entity, tail entity and relation are "univariate function is derivable", "univariate function is continuous" and "necessary condition" respectively.
After parsing is completed, the embedded representations are obtained by mapping into a vector space. Specifically, the entities e_i and relations r_i known to the mathematics learner are mapped to a low-dimensional real-valued vector space to obtain the corresponding embedded representations e_i and r_i. For example, the head entity "univariate function is differentiable", the tail entity "univariate function is derivable" and the relation "necessary and sufficient condition" of the first proposition are mapped to the vector space to obtain their embedded representations; likewise, the head entity "univariate function is derivable", the tail entity "univariate function is continuous" and the relation "necessary condition" of "derivable implies continuous" are mapped to the vector space to obtain their embedded representations.
Specifically, in one or more embodiments of the present invention, parsing the acquired text data specifically includes the following steps:
S11: converting the text data into character vectors using a pre-trained language model;
S12: extracting features from the converted character vectors using a bidirectional gated recurrent unit (BiGRU);
S13: outputting the labeling results of the entities and relations in the text data from the extracted features using a sequence labeling model.
The text data is converted into character vectors by the pre-trained language model, features are extracted from the converted character vectors, and the extracted features are then labeled, so that the labeling results of the entities and relations in the text data are obtained accurately, and the embedded representation h_i of the head entity h_i, the embedded representation t_i of the tail entity t_i, and the embedded representation r_i of the relation r_i are in turn obtained accurately.
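The output of the sequence labeling step still has to be grouped into entity and relation spans. The following is a minimal sketch of that decoding step only, assuming a BIO tagging scheme with the hypothetical labels HEAD, REL and TAIL; the source does not specify the tag set, and the pre-trained language model and recurrent feature extractor are not shown.

```python
def decode_bio(tokens, tags):
    """Group BIO-tagged tokens into (label, text) spans."""
    spans = []
    current_label, current_tokens = None, []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new span begins; close the previous one if it is open.
            if current_label is not None:
                spans.append((current_label, " ".join(current_tokens)))
            current_label, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_label == tag[2:]:
            # Continuation of the currently open span.
            current_tokens.append(token)
        else:
            # "O" or an inconsistent tag closes any open span.
            if current_label is not None:
                spans.append((current_label, " ".join(current_tokens)))
            current_label, current_tokens = None, []
    if current_label is not None:
        spans.append((current_label, " ".join(current_tokens)))
    return spans


# Hypothetical tokens and tags (assumed for illustration, not from the source).
tokens = ["derivable", "function", "must", "be", "continuous", "function"]
tags = ["B-HEAD", "I-HEAD", "B-REL", "I-REL", "B-TAIL", "I-TAIL"]
spans = decode_bio(tokens, tags)
print(spans)
```

Each decoded span would then be looked up or encoded to obtain its embedded representation.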
S2: combining the head entity embedded representation h_i and the tail entity embedded representation t_j with the feature vector r_u corresponding to each relation r_u of the calculus knowledge graph to construct the vector representation (h_i, r_u, t_j) of each candidate triple (h_i, r_u, t_j);
S3: inputting the vector representations (h_i, r_u, t_j) of all candidate triples (h_i, r_u, t_j) into the trained knowledge graph reasoning model, calculating the energy values of all candidate triples, and outputting the mathematical conclusion corresponding to the candidate triple (h_i, r_u, t_j) with the optimal energy value as the answer;
wherein the energy value characterizes the probability that the head entity h_i and the tail entity t_j have the relation r_u.
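Step S2 amounts to enumerating every combination of an extracted head, a graph relation and an extracted tail. A minimal sketch, using entity and relation names in place of embedding vectors; the function name is hypothetical and the source does not say whether any combinations are filtered out.

```python
from itertools import product


def build_candidates(heads, tails, relations):
    """Return every (head, relation, tail) combination as a candidate triple."""
    return [(h, r, t) for h, r, t in product(heads, relations, tails)]


heads = ["univariate function is differentiable", "univariate function is derivable"]
tails = ["univariate function is derivable", "univariate function is continuous"]
relations = [
    "irrelevant condition",
    "necessary and sufficient condition",
    "sufficient but not necessary condition",
]
candidates = build_candidates(heads, tails, relations)
print(len(candidates))
```

The number of candidates grows as the product of the three list sizes, so every candidate has to be scored by the reasoning model in step S3.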
In one or more embodiments of the present invention, training the knowledge graph reasoning model specifically includes the following steps:
S31: acquiring a pre-input entity set and a pre-input relation set of the calculus knowledge graph, and converting the entities and relations in the calculus knowledge graph into embedded representations with a TransE model, wherein the entity embedded representations in the calculus knowledge graph are denoted h_i, t_i and the relation embedded representations are denoted r'_u;
First, information is extracted from data such as curriculum standards, textbooks, examination outlines, teaching plans and test question sets to construct the mathematical knowledge graph, where the mathematics covers limits, differentiation, integration and the like. Then the entity set E and relation set R of the triples (head entity, relation, tail entity) in the mathematical knowledge graph are given m-dimensional embedded representations, yielding the entity embedded representations e_i and relation embedded representations r'_i of the mathematical knowledge graph.
Here, TransE is used for the embedding:
First, initial vectors e_i and r'_i are assigned to all entities and relations. The triples in the mathematical knowledge graph are then used for training and testing: following the common machine learning strategy, the dataset of the mathematical knowledge graph is divided into a training set and a test set, with 80% used as the training set and 20% as the test set. The training function for the training set is:
[formula image SMS_9 in the original; not reproduced in this text]
wherein γ > 0 is a constant; S is the set of correct triples; h, r', t denote the embedded representations of h, r, t respectively; S'_(h,r,t) is the set of erroneous triples constructed from (h, r, t), S'_(h,r,t) = {(h, r, t') | t' ∈ E} ∪ {(h', r, t) | h' ∈ E}; h', r', t' denote the embedded representations of h', r, t' respectively; and E is the set of all entities.
Training yields the entity embedded representations e_i.
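The training formula itself is an image that is not reproduced here. As a hedged illustration, the margin-based ranking loss normally associated with TransE can be sketched for a single correct triple and a single erroneous triple as follows; the function names and the use of the L2 distance are assumptions.

```python
import math


def distance(h, r, t):
    """L2 distance ||h + r - t|| between the translated head and the tail."""
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def margin_loss(positive, negative, gamma=1.0):
    """Hinge loss max(0, gamma + d(positive) - d(negative)) for one triple pair."""
    return max(0.0, gamma + distance(*positive) - distance(*negative))


h = [0.0, 0.0]
r = [1.0, 0.0]
t = [1.0, 0.0]          # h + r lands exactly on t
far_tail = [4.0, 4.0]   # corrupted tail far from h + r
near_tail = [1.0, 0.5]  # corrupted tail close to h + r

loss_easy = margin_loss((h, r, t), (h, r, far_tail))
loss_hard = margin_loss((h, r, t), (h, r, near_tail))
print(loss_easy, loss_hard)
```

An erroneous triple that is already farther away than the margin γ contributes no loss, while one that sits close to the correct tail is penalized and pushes the embeddings apart during training.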
S32: embedding representation h according to entities i ,t i Classifying the entity pairs (h, t) in the calculus knowledge graph, wherein each class has the same relation r;
here we replace the entity pair (h, t) with x=h-t, and the intermediate variable x is clustered with K-means, i.e. K samples (u 1 ,u 2 ,…,u k ) As an initial mean vector, training is performed such that E is minimized:
Figure SMS_10
and E is the mean clustering distance between the current intermediate variable x and k samples by taking k samples as the center.
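The clustering of the offsets x = h - t can be sketched with a plain K-means loop. This is an illustrative implementation only: seeding the centers with the first k points, the fixed iteration count and the toy embeddings are all assumptions.

```python
def offsets(pairs):
    """Intermediate variable x = h - t for each (h, t) embedding pair."""
    return [[hi - ti for hi, ti in zip(h, t)] for h, t in pairs]


def kmeans(points, k, iterations=20):
    """Cluster points with plain K-means; the first k points seed the centers."""
    centers = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest center (squared Euclidean distance).
        for i, p in enumerate(points):
            assignment[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
        # Move each center to the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers, assignment


# Toy (h, t) embedding pairs; the numbers are made up for illustration.
pairs = [
    ([1.0, 0.0], [0.0, 0.0]),  # x is about (1, 0)
    ([0.0, 0.0], [0.0, 5.0]),  # x is about (0, -5)
    ([2.0, 1.0], [1.0, 0.9]),  # x is about (1, 0.1)
    ([1.0, 0.0], [1.0, 5.2]),  # x is about (0, -5.2)
]
centers, assignment = kmeans(offsets(pairs), k=2)
print(assignment)
```

Pairs whose offsets fall into the same cluster are treated as sharing one relation pattern, which is what allows a separate relation vector to be learned per cluster in the next step.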
S33: embedding each relation r into a d-dimensional space using CTransR, and learning a relation vector r_c and a mapping matrix M_r for each relation r.
Considering that entities and relations are different types of objects, it is not appropriate to represent them in the same space, so TransR is used to embed the relation r into a d-dimensional space.
The entities and relations in the calculus knowledge graph are converted into embedded representations by the TransE model, and the entity pairs (h, t) in the calculus knowledge graph are classified so that the entity pairs in each class share the same relation r; each relation r is then embedded into a d-dimensional space by CTransR, and a relation vector r_c and a mapping matrix M_r are learned for each relation r, so that the energy values of the candidate triples can be calculated accurately from the relation vector r_c and the mapping matrix M_r.
In one or more embodiments of the present invention, calculating the energy values of all the candidate triples specifically comprises:
S34: calculating by the energy value function
[formula image SMS_11 in the original; not reproduced in this text]
as follows:
[formula image SMS_12 in the original; not reproduced in this text]
[formula image SMS_13 in the original; not reproduced in this text]
h_{r,c} = h M_r
t_{r,c} = t M_r
where h is the embedded representation of the head entity h, t is the embedded representation of the tail entity t, r is the relation between h and t, and α is a constant.
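To make the projection h_{r,c} = h M_r concrete, the sketch below scores a triple with the CTransR-style energy known from the literature, namely the squared distance between the translated, projected head and the projected tail plus an α-weighted penalty tying r_c to r. Because the formula images are not reproduced, this exact form is an assumption, as are the toy vectors and the value of alpha.

```python
def project(vector, matrix):
    """Row vector times matrix: maps an entity embedding into relation space."""
    return [
        sum(v * row[j] for v, row in zip(vector, matrix))
        for j in range(len(matrix[0]))
    ]


def energy(h, t, r_c, r, M_r, alpha=0.1):
    """Assumed energy: ||h M_r + r_c - t M_r||^2 + alpha * ||r_c - r||^2."""
    h_rc = project(h, M_r)
    t_rc = project(t, M_r)
    translation = sum((a + b - c) ** 2 for a, b, c in zip(h_rc, r_c, t_rc))
    cluster_penalty = sum((a - b) ** 2 for a, b in zip(r_c, r))
    return translation + alpha * cluster_penalty


# M_r maps 3-dimensional entity embeddings into a 2-dimensional relation space.
M_r = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
h = [0.0, 1.0, 7.0]
t = [1.0, 1.0, -3.0]
r_c = [1.0, 0.0]
r = [1.0, 0.0]

e_match = energy(h, t, r_c, r, M_r)
e_mismatch = energy(h, [5.0, 1.0, 0.0], r_c, r, M_r)
print(e_match, e_mismatch)
```

Under this form a triple whose projected head plus relation vector lands on the projected tail gets energy 0, and the energy grows as the tail moves away.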
The relation vector r_c and mapping matrix M_r of each relation are learned through the scoring function f_r(h, t). The triples in the mathematical knowledge graph are used for training and testing: following the common machine learning strategy, the dataset of the mathematical knowledge graph is divided into a training set and a test set, with 80% used as the training set and 20% as the test set. The training function is:
[formula image SMS_14 in the original; not reproduced in this text]
wherein γ > 0 is a constant; S is the set of correct triples; h_{r,c} = h M_r and t_{r,c} = t M_r, where M_r is the mapping matrix; h, r', t denote the embedded representations of h, r, t respectively; S'_(h,r,t) is the set of erroneous triples constructed from (h, r, t), S'_(h,r,t) = {(h, r, t') | t' ∈ E} ∪ {(h', r, t) | h' ∈ E}; h'_{r,c} = h' M_r and t'_{r,c} = t' M_r; h', r', t' denote the embedded representations of h', r, t' respectively; and E is the set of all entities.
Some relations are one-to-many and some are many-to-one. When the erroneous set S' is constructed, the head entity h is replaced as often as possible if the relation r is one-to-many, and the tail entity t is replaced as often as possible if the relation r is many-to-one. Training uses gradient descent and yields the relation vector r_c and the mapping matrix M_r.
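The head-or-tail replacement rule can be sketched as a biased coin flip per erroneous triple. The probabilities below are hypothetical, since the source only says "as much as possible" and gives no numbers; the entity names are shortened for illustration.

```python
import random

# Hypothetical probabilities of corrupting the head; not given in the source.
HEAD_REPLACE_PROB = {"one-to-many": 0.9, "many-to-one": 0.1}


def corrupt(triple, entities, relation_type, rng):
    """Return an erroneous triple by replacing either the head or the tail."""
    h, r, t = triple
    p_head = HEAD_REPLACE_PROB.get(relation_type, 0.5)
    if rng.random() < p_head:
        candidates = [e for e in entities if e != h]
        return (rng.choice(candidates), r, t)
    candidates = [e for e in entities if e != t]
    return (h, r, rng.choice(candidates))


rng = random.Random(0)
entities = ["differentiable", "derivable", "continuous", "integrable"]
triple = ("derivable", "necessary condition", "continuous")
negatives = [corrupt(triple, entities, "one-to-many", rng) for _ in range(1000)]
head_replaced = sum(1 for n in negatives if n[0] != triple[0])
print(head_replaced)
```

Biasing the replacement this way is commonly motivated by the need to avoid generating "erroneous" triples that are in fact true, which is more likely when the side with many valid entities is the one being replaced.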
The test set is used for testing, and the performance of the model in the test stage is evaluated with the metrics MR (mean rank), MRR (mean reciprocal rank) and Hits@k (the proportion of correct results ranked within the top k by energy value). MR is the average rank of the correct results in the energy value ranking; a smaller MR means the correct answers are ranked nearer the top and the model performs better. MRR is the average of the reciprocals of the ranks of the correct results in the energy value ranking; in contrast to MR, a larger MRR means the correct answers are ranked nearer the top and the model performs better. Hits@k is the proportion of correct results that fall within the top k of the energy value ranking; it is generally used to measure both entity prediction and relation prediction, and a larger Hits@k indicates a better model.
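The three metrics follow directly from the 1-based rank that each correct answer receives in the energy value ranking. A minimal sketch with made-up ranks:

```python
def ranking_metrics(ranks, k=10):
    """Compute MR, MRR and Hits@k from the 1-based ranks of the correct answers."""
    n = len(ranks)
    mr = sum(ranks) / n
    mrr = sum(1.0 / rank for rank in ranks) / n
    hits = sum(1 for rank in ranks if rank <= k) / n
    return mr, mrr, hits


# Hypothetical ranks of the correct answer for four test triples.
ranks = [1, 2, 4, 20]
mr, mrr, hits_at_3 = ranking_metrics(ranks, k=3)
print(mr, mrr, hits_at_3)
```

A single badly ranked answer (rank 20 here) dominates MR, whereas MRR and Hits@k are far less sensitive to such outliers, which is one reason the three are usually reported together.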
After testing is completed, the vector representations (h_i, r_u, t_j) of the candidate triples (h_i, r_u, t_j) are input to the trained knowledge graph reasoning module, and the energy values of all candidate triples are calculated through the energy function, so that whether each triple is correct can be judged and an interpretable probabilistic prediction can be given.
For example, after parsing "the condition for a univariate function to be differentiable", the head entity and the tail entity are "the univariate function is differentiable" and "the univariate function is derivable", whose entity embeddings are h_1 and t_1, respectively; after parsing "derivable implies continuous", the head entity and the tail entity are "the univariate function is derivable" and "the univariate function is continuous", whose entity embeddings are h_2 and t_2, respectively. To obtain more mathematical conclusions, the entities are recombined and matched one by one with the relation embeddings r_u of the existing mathematical knowledge graph, such as "irrelevant condition", "sufficient and necessary condition", and "sufficient but not necessary condition", to form candidate triples (h_i, r_u, t_j), for example (the univariate function is differentiable, irrelevant condition, the univariate function is continuous) and (the univariate function is differentiable, sufficient but not necessary condition, the univariate function is continuous). The candidate triples are input into the trained knowledge graph reasoning module, their energy values are calculated with the energy function, and the triples are sorted in ascending order of energy value; the mathematical conclusion corresponding to the triple (h_i, r_u, t_j) with the optimal energy value is the best answer, as shown in Fig. 3.
For example, when the two conclusions "the condition for a univariate function to be differentiable" and "derivable implies continuous" are input into the system, the triple with the optimal energy value is (the univariate function is differentiable, necessary but not sufficient condition, the univariate function is continuous), so the mathematical conclusion obtained by the learner is that a differentiable univariate function is continuous.
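The recombination and ranking described in this example can be sketched as follows. This is a minimal illustration, not the patented implementation: the two-dimensional embeddings are invented toy values, the function names are hypothetical, the energy is a plain translation-style score standing in for the trained model, and it assumes (as the ascending sort suggests) that a lower energy is better.

```python
from itertools import product

def toy_energy(h, r, t):
    """Stand-in energy: squared L2 norm of h + r - t (assumption: lower is better)."""
    return sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t))

def rank_candidates(heads, relations, tails, energy_fn=toy_energy):
    """Match every head and tail entity with every relation, then sort ascending by energy."""
    scored = []
    for (h_name, h), (r_name, r), (t_name, t) in product(
        heads.items(), relations.items(), tails.items()
    ):
        scored.append((energy_fn(h, r, t), (h_name, r_name, t_name)))
    scored.sort(key=lambda item: item[0])
    return scored

# Invented toy embeddings, for illustration only.
heads = {"univariate function is differentiable": [1.0, 0.0]}
tails = {"univariate function is continuous": [1.0, 1.0]}
relations = {
    "irrelevant condition": [-1.0, -1.0],
    "sufficient and necessary condition": [0.5, 0.5],
    "necessary but not sufficient condition": [0.0, 1.0],
}

ranked = rank_candidates(heads, relations, tails)
best_energy, best_triple = ranked[0]
print(best_triple, best_energy)
```

With real embeddings the number of candidates grows as heads × relations × tails, so in practice the candidate set may need pruning before scoring.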
Using the relation vector r_c and the mapping matrix M_r of each relation r, in combination with the energy function
Figure SMS_15
the energy values of all triples can be accurately calculated, so that the candidate triple (h_i, r_u, t_j) with the optimal energy value is obtained, and thereby the optimal feature vector in the text data.
As shown in Fig. 4, the invention further provides a knowledge graph construction system for adaptive derivation of mathematical theorems, which comprises a parsing module, a construction module and a calculation module;
the parsing module is used for parsing the acquired text data, extracting the entities and relations in the text data, and obtaining the embedded representation h_i of the head entity h_i, the embedded representation t_i of the tail entity t_i, and the embedded representation r_i of the relation r_i in the text data;
the construction module is used for combining the head entity embedded representation h_i and the tail entity embedded representation t_j with the feature vector r_u corresponding to each relation r_u of the calculus knowledge graph to construct the vector representations (h_i, r_u, t_j) of the candidate triples (h_i, r_u, t_j);
the calculation module is used for inputting the vector representations (h_i, r_u, t_j) of all candidate triples (h_i, r_u, t_j) into the trained knowledge graph inference model, calculating the energy values of all candidate triples, and outputting the mathematical conclusion corresponding to the candidate triple (h_i, r_u, t_j) with the optimal energy value as the answer;
wherein the energy value characterizes the probability that the head entity h_i and the tail entity t_i have the relation r_u.
According to the knowledge graph construction system for adaptive derivation of mathematical theorems, the text data are parsed and the deep semantic interaction between entities and relations in the mathematical knowledge graph is mined to obtain the embedded representation h_i of the head entity h_i, the embedded representation t_i of the tail entity t_i, and the embedded representation r_i of the relation r_i; these are then recombined to construct the vector representations (h_i, r_u, t_j) of the candidate triples (h_i, r_u, t_j); the energy values of the candidate triples are then calculated with the trained knowledge graph inference model, an inference path over the mathematical knowledge graph being established by means of TransR, so that the mathematical conclusion corresponding to the candidate triple (h_i, r_u, t_j) with the optimal energy value is output as the answer. The optimal answer is thereby located, and the efficiency of mathematics learning can be improved.
In one or more embodiments of the present invention, the specific implementation of the parsing module for parsing the acquired text data is:
converting the text data into character vectors using a pre-trained language model;
extracting features from the converted character vectors using a bidirectional gated recurrent unit (BiGRU);
and outputting labeling results of the entities and the relations in the text data according to the extracted features by using a sequence labeling model.
The text data are converted into character vectors by the pre-trained language model, features are extracted from the converted character vectors, and the extracted features are finally labeled, so that the labeling results of the entities and relations in the text data are obtained accurately, and thereby the head entity embedded representation h_i, the tail entity embedded representation t_i, and the relation embedded representation r_i in the text data.
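The last step, turning per-character labels into entities and relations, can be sketched as follows. The description does not name a tag scheme, so this sketch assumes a BIO scheme with hypothetical labels ENT and REL; the function name and the example tags are likewise illustrative and are not output of the patented model.

```python
def decode_bio(chars, tags):
    """Group characters into (label, text) spans from BIO tags.

    A "B-X" tag opens a span of type X, "I-X" continues an open span of
    the same type, and any other tag closes the open span.
    """
    spans = []
    label, buffer = None, []
    for ch, tag in zip(chars, tags):
        if tag.startswith("B-"):
            if label is not None:
                spans.append((label, "".join(buffer)))
            label, buffer = tag[2:], [ch]
        elif tag.startswith("I-") and label == tag[2:]:
            buffer.append(ch)
        else:
            if label is not None:
                spans.append((label, "".join(buffer)))
            label, buffer = None, []
    if label is not None:
        spans.append((label, "".join(buffer)))
    return spans

# "可导必连续" reads "derivable implies continuous"; the tags are hand-written.
chars = list("可导必连续")
tags = ["B-ENT", "I-ENT", "B-REL", "B-ENT", "I-ENT"]
spans = decode_bio(chars, tags)
print(spans)
```

The first and last ENT spans would then serve as head and tail entity, and the REL span as the relation whose embeddings are looked up.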
In one or more embodiments of the present invention, the training of the knowledge-graph inference model by the calculation module is specifically implemented as follows:
acquiring a pre-input entity set and a pre-input relation set of the calculus knowledge graph, and converting the entities and relations in the calculus knowledge graph into embedded representations using a TransE model, wherein the entity embedded representations in the calculus knowledge graph are denoted h_i, t_i, and the relation embedded representations in the calculus knowledge graph are denoted r'_u;
classifying the entity pairs (h, t) in the calculus knowledge graph according to the entity embedded representations h_i, t_i, wherein entity pairs of the same class have the same relation r;
embedding each relation r into a d-dimensional space using CTransR, and learning a relation vector r_c and a mapping matrix M_r for each relation r.
The entities and relations in the calculus knowledge graph are converted into embedded representations by the TransE model, the entity pairs (h, t) in the calculus knowledge graph are classified to obtain entity pairs of the same class having the same relation r, each relation r is then embedded into a d-dimensional space by CTransR, and a relation vector r_c and a mapping matrix M_r are learned for each relation r, so that the energy values of the candidate triples can be accurately calculated from the relation vector r_c and the mapping matrix M_r.
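The classification step can be illustrated with a small clustering sketch. It assumes the approach of the published CTransR model, in which the entity pairs of one relation are clustered by their TransE offset h - t; the description only says the pairs are classified according to the entity embeddings, so the use of offsets, the plain k-means routine, and all names and values below are assumptions.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def cluster_offsets(pairs, k, iterations=10):
    """Cluster (h, t) embedding pairs by their offset h - t with plain k-means.

    Requires k <= len(pairs). Initial centroids are the first k offsets, so
    the result is deterministic. Returns one cluster index per pair.
    """
    offsets = [[hi - ti for hi, ti in zip(h, t)] for h, t in pairs]
    centroids = [list(o) for o in offsets[:k]]
    labels = [0] * len(offsets)
    for _ in range(iterations):
        # Assign each offset to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sq_dist(o, centroids[c])) for o in offsets
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [o for o, lab in zip(offsets, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Invented toy TransE embeddings for four (head, tail) pairs.
pairs = [
    ([1.0, 0.0], [0.0, 0.0]),
    ([0.0, 1.0], [0.0, 0.0]),
    ([1.1, 0.1], [0.0, 0.1]),
    ([0.0, 2.0], [0.0, 1.1]),
]
labels = cluster_offsets(pairs, k=2)
print(labels)
```

Each resulting cluster would then receive its own relation vector r_c, while the mapping matrix M_r stays shared across the clusters of relation r.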
In one or more embodiments of the present invention, the calculation module calculates the energy values of all candidate triples as follows:
the energy values are calculated by the energy function
Figure SMS_16
according to:
Figure SMS_17
Figure SMS_18
h_{r,c} = h M_r
t_{r,c} = t M_r
wherein h is the embedded representation of the head entity h, t is the embedded representation of the tail entity t, r_c is the embedded representation of the relation r, and α is a constant.
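The formula images (Figure SMS_16 to Figure SMS_18) are not reproduced in this text. For orientation only, the energy function of the published CTransR model, written with the symbols defined above, has the following form; that the patent's figures match it term by term is an assumption. In the published model r_c is the cluster-specific relation vector and r is the original relation vector, and α weights how closely r_c must stay to r.

```latex
f_r(h, t) = \lVert h_{r,c} + r_c - t_{r,c} \rVert_2^2 + \alpha \lVert r_c - r \rVert_2^2,
\qquad h_{r,c} = h M_r, \quad t_{r,c} = t M_r
```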
Using the relation vector r_c and the mapping matrix M_r of each relation r, in combination with the energy function
Figure SMS_19
the energy values of all triples can be accurately calculated, so that the candidate triple (h_i, r_u, t_j) with the optimal energy value is obtained, and thereby a new mathematical conclusion.
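A minimal sketch of such an energy calculation is given below. It follows the energy function of the published CTransR model (squared distance between the projected, translated head and the projected tail, plus α times the squared distance between r_c and the original relation vector r) and the projections h_{r,c} = h M_r and t_{r,c} = t M_r stated above; whether this matches the formula in the patent figure exactly is an assumption, and all names and values are illustrative.

```python
def project(v, M):
    """Row vector v (length k) times matrix M (k rows, d columns), i.e. v M."""
    return [sum(v[i] * M[i][j] for i in range(len(v))) for j in range(len(M[0]))]

def sq_norm(v):
    """Squared L2 norm of a vector."""
    return sum(x * x for x in v)

def ctransr_energy(h, t, r_c, r, M_r, alpha):
    """Energy of a triple under the published CTransR form (lower is better)."""
    h_rc = project(h, M_r)  # h_{r,c} = h M_r
    t_rc = project(t, M_r)  # t_{r,c} = t M_r
    translation = sq_norm([a + b - c for a, b, c in zip(h_rc, r_c, t_rc)])
    closeness = alpha * sq_norm([a - b for a, b in zip(r_c, r)])
    return translation + closeness

# Invented toy values: identity mapping matrix, 2-d embeddings.
identity = [[1.0, 0.0], [0.0, 1.0]]
h, t = [1.0, 0.0], [0.0, 1.0]
good = ctransr_energy(h, t, r_c=[-1.0, 1.0], r=[-1.0, 1.0], M_r=identity, alpha=0.5)
bad = ctransr_energy(h, t, r_c=[0.0, 0.0], r=[-1.0, 1.0], M_r=identity, alpha=0.5)
print(good, bad)
```

Here the first relation vector translates the head exactly onto the tail, so its energy is zero, while the second leaves a residual and is additionally penalized for drifting from r.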
The invention also provides a computer readable storage medium, characterized in that the computer readable storage medium stores computer instructions which, when executed by a processor, implement the knowledge graph construction method for adaptive derivation of mathematical theorems.
The invention also provides a knowledge graph construction device for adaptive derivation of mathematical theorems, which comprises:
at least one processor and a storage medium, the storage medium being communicatively coupled to the processor;
wherein the storage medium stores a computer program executable by the at least one processor, the computer program being executed by the at least one processor to enable the at least one processor to perform the knowledge graph construction method for adaptive derivation of mathematical theorems.
The foregoing description of the preferred embodiments of the invention is not intended to limit the invention to the precise form disclosed, and any such modifications, equivalents, and alternatives falling within the spirit and scope of the invention are intended to be included within the scope of the invention.

Claims (10)

1. A knowledge graph construction method for adaptive derivation of mathematical theorems, characterized by comprising the following steps:
parsing the acquired text data, extracting the entities and relations in the text data, and obtaining the embedded representation h_i of the head entity h_i, the embedded representation t_i of the tail entity t_i, and the embedded representation r_i of the relation r_i in the text data;
combining the head entity embedded representation h_i and the tail entity embedded representation t_j with the feature vector r_u corresponding to each relation r_u of the calculus knowledge graph to construct the vector representations (h_i, r_u, t_j) of the candidate triples (h_i, r_u, t_j);
inputting the vector representations (h_i, r_u, t_j) of all candidate triples (h_i, r_u, t_j) into the trained knowledge graph inference model, calculating the energy values of all candidate triples, and outputting the mathematical conclusion corresponding to the candidate triple (h_i, r_u, t_j) with the optimal energy value as the answer;
wherein the energy value characterizes the probability that the head entity h_i and the tail entity t_i have the relation r_u.
2. The knowledge graph construction method for adaptively deriving a mathematical theorem according to claim 1, wherein the parsing the obtained text data comprises the following steps:
converting the text data into character vectors;
extracting the characteristics of the converted character vector;
and outputting labeling results of the entities and the relations in the text data according to the extracted features.
3. The knowledge graph construction method for self-adaptive deduction of mathematical theorem according to claim 1, wherein the training of the knowledge graph reasoning model specifically comprises the following steps:
acquiring a pre-input entity set and a pre-input relation set of the calculus knowledge graph, and converting the entities and relations in the calculus knowledge graph into embedded representations using a TransE model, wherein the entity embedded representations in the calculus knowledge graph are denoted h_i, t_i, and the relation embedded representations in the calculus knowledge graph are denoted r'_u;
classifying the entity pairs (h, t) in the calculus knowledge graph according to the entity embedded representations h_i, t_i, wherein entity pairs of the same class have the same relation r;
embedding each relation r into a d-dimensional space using CTransR, and learning a relation vector r_c and a mapping matrix M_r for each relation r.
4. The knowledge graph construction method for adaptive derivation of mathematical theorems according to claim 3, wherein said calculating the energy values of all candidate triples comprises:
calculating by the energy function
Figure QLYQS_1
according to:
Figure QLYQS_2
Figure QLYQS_3
h_{r,c} = h M_r
t_{r,c} = t M_r
wherein h is the embedded representation of the head entity h, t is the embedded representation of the tail entity t, r_c is the embedded representation of the relation r, and α is a constant.
5. A knowledge graph construction system for adaptive derivation of mathematical theorems, characterized in that the system comprises a parsing module, a construction module and a calculation module;
the parsing module is used for parsing the acquired text data, extracting the entities and relations in the text data, and obtaining the embedded representation h_i of the head entity h_i, the embedded representation t_i of the tail entity t_i, and the embedded representation r_i of the relation r_i in the text data;
the construction module is used for combining the head entity embedded representation h_i and the tail entity embedded representation t_j with the feature vector r_u corresponding to each relation r_u of the calculus knowledge graph to construct the vector representations (h_i, r_u, t_j) of the candidate triples (h_i, r_u, t_j);
the calculation module is used for inputting the vector representations (h_i, r_u, t_j) of all candidate triples (h_i, r_u, t_j) into the trained knowledge graph inference model, calculating the energy values of all candidate triples, and outputting the mathematical conclusion corresponding to the candidate triple (h_i, r_u, t_j) with the optimal energy value as the answer;
wherein the energy value characterizes the probability that the head entity h_i and the tail entity t_i have the relation r_u.
6. The knowledge graph construction system for adaptive derivation of mathematical theorems according to claim 5, wherein the parsing module parses the acquired text data by:
converting the text data into character vectors;
extracting the characteristics of the converted character vector;
and outputting labeling results of the entities and the relations in the text data according to the extracted features.
7. The knowledge graph construction system for adaptive derivation of mathematical theorems according to claim 5, wherein the training of the knowledge graph inference model by the calculation module is specifically implemented as follows:
acquiring a pre-input entity set and a pre-input relation set of the calculus knowledge graph, and converting the entities and relations in the calculus knowledge graph into embedded representations using a TransE model, wherein the entity embedded representations in the calculus knowledge graph are denoted h_i, t_i, and the relation embedded representations in the calculus knowledge graph are denoted r'_u;
classifying the entity pairs (h, t) in the calculus knowledge graph according to the entity embedded representations h_i, t_i, wherein entity pairs of the same class have the same relation r;
embedding each relation r into a d-dimensional space using CTransR, and learning a relation vector r_c and a mapping matrix M_r for each relation r.
8. The knowledge graph construction system for adaptive derivation of mathematical theorems according to claim 7, wherein the calculation module calculates the energy values of all candidate triples as follows:
calculating by the energy function
Figure QLYQS_4
according to:
Figure QLYQS_5
Figure QLYQS_6
h_{r,c} = h M_r
t_{r,c} = t M_r
wherein h is the embedded representation of the head entity h, t is the embedded representation of the tail entity t, r_c is the embedded representation of the relation r, and α is a constant.
9. A computer readable storage medium, wherein the computer readable storage medium stores computer instructions for causing a processor to implement the mathematical theorem-oriented adaptive derivation knowledge-graph construction method of any one of claims 1-4 when executed.
10. A knowledge graph construction device for adaptive derivation of mathematical theorems, characterized by comprising:
at least one processor and a storage medium, the storage medium being communicatively coupled to the processor;
wherein the storage medium has stored thereon a computer program executable by the at least one processor to enable the at least one processor to perform the mathematical theorem-oriented adaptive derivation knowledge-graph construction method of any one of claims 1-4.
CN202310344991.4A 2023-03-31 2023-03-31 Knowledge graph construction method and system for mathematical theorem-oriented adaptive derivation Pending CN116340543A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310344991.4A CN116340543A (en) 2023-03-31 2023-03-31 Knowledge graph construction method and system for mathematical theorem-oriented adaptive derivation

Publications (1)

Publication Number Publication Date
CN116340543A true CN116340543A (en) 2023-06-27

Family

ID=86889314

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310344991.4A Pending CN116340543A (en) 2023-03-31 2023-03-31 Knowledge graph construction method and system for mathematical theorem-oriented adaptive derivation

Country Status (1)

Country Link
CN (1) CN116340543A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111260064A (en) * 2020-04-15 2020-06-09 中国人民解放军国防科技大学 Knowledge inference method, system and medium based on knowledge graph of meta knowledge
CN113807519A (en) * 2021-08-30 2021-12-17 华中师范大学 Knowledge graph construction method integrating teaching feedback and learned understanding
WO2022033072A1 (en) * 2020-08-12 2022-02-17 哈尔滨工业大学 Knowledge graph-oriented representation learning training local training method
CN114840679A (en) * 2022-01-25 2022-08-02 华中师范大学 Robot intelligent learning guiding method based on music theory knowledge graph reasoning and application

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination