CN114297399A - Knowledge graph generation method, knowledge graph generation system, storage medium and electronic equipment - Google Patents

Knowledge graph generation method, knowledge graph generation system, storage medium and electronic equipment

Info

Publication number
CN114297399A
CN114297399A
Authority
CN
China
Prior art keywords
knowledge
learning data
feature vector
target
graph
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111403397.5A
Other languages
Chinese (zh)
Inventor
胡阳
张武旭
汪张龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
iFlytek Co Ltd
Original Assignee
iFlytek Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by iFlytek Co Ltd filed Critical iFlytek Co Ltd
Priority to CN202111403397.5A priority Critical patent/CN114297399A/en
Publication of CN114297399A publication Critical patent/CN114297399A/en
Pending legal-status Critical Current


Abstract

The invention provides a knowledge graph generation method, a knowledge graph generation system, a storage medium and electronic equipment. The method comprises the following steps: acquiring a learning data set of a target user within a preset time period, wherein the learning data set comprises at least one item of learning data; performing feature extraction processing on the learning data in the learning data set to obtain a target feature vector, wherein the target feature vector is used for representing the learning data learned by the target user; performing knowledge point analysis processing on the target feature vector with a trained neural network model so as to output the knowledge point weights corresponding to the learning data set; and updating the knowledge graph of the target user according to the knowledge point weights, and outputting the updated target knowledge graph. The invention can generate the knowledge graph through a neural network model, thereby meeting the individualized learning requirements of the target user and significantly reducing manual workload.

Description

Knowledge graph generation method, knowledge graph generation system, storage medium and electronic equipment
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a knowledge graph generation method, a knowledge graph generation system, a storage medium and electronic equipment.
Background
Conventional learning path planning and resource recommendation adopt a traditional knowledge map: teaching and research experts sort and summarize in advance the knowledge points students learn in each subject, and establish learning paths from one knowledge point to the next according to examination outlines and their own experience, thereby forming an overall knowledge map. For example, when a student answers a test question on a prerequisite knowledge point incorrectly, the system recommends similar questions and explanatory content for the related knowledge points, and at the same time provides test questions and specific content for the next knowledge point. However, with this existing way of generating and using knowledge maps, the overall planned learning path is relatively fixed and can hardly meet the different actual learning needs of different students; moreover, the knowledge point labels are produced by manually annotating all test questions in the question bank, which requires a large amount of manual work.
Disclosure of Invention
The invention provides a knowledge graph generation method, a knowledge graph generation system, a storage medium and electronic equipment, which are used for solving the problem in the prior art that the learning path given by a traditional knowledge graph is relatively fixed and cannot adapt to the individualized learning requirements of students.
In a first aspect, the present invention provides a method for generating a knowledge graph, the method comprising:
acquiring a learning data set of a target user in a preset time period, wherein the learning data set comprises at least one item of learning data;
performing feature extraction processing on learning data in the learning data set to obtain a target feature vector, wherein the target feature vector is used for representing the learning data learned by the target user;
adopting a trained neural network model to perform knowledge point analysis processing on the target feature vector so as to output the knowledge point weights corresponding to the learning data set;
and updating the knowledge graph of the target user according to the weight of the knowledge point, and outputting the updated target knowledge graph.
In an embodiment of the present invention, performing feature extraction processing on the learning data in the learning data set to obtain a target feature vector includes:
extracting text features from each item of learning data in the learning data set and inputting the text features into a pre-training model to obtain a first feature vector, wherein the first feature vector is used for representing information of each item of learning data;
performing attention processing on the first feature vector to obtain a second feature vector, wherein the second feature vector is used for representing the relevance between the items of learning data in the learning data set;
and acquiring the feature vector of the target knowledge graph and combining the feature vector with the second feature vector to obtain the target feature vector.
In an embodiment of the present invention, the acquiring the target knowledge-graph includes:
when the target knowledge graph is acquired for the target user for the first time, taking a preset original knowledge graph as the target knowledge graph;
and when the target knowledge graph is not acquired for the target user for the first time, taking the historically updated target knowledge graph as the target knowledge graph.
In an embodiment of the present invention, the extracting text features from each item of learning data in the learning data set and inputting the text features into a pre-training model to obtain a first feature vector includes:
and inputting the learning data contained in the learning data set into a trained pre-training model so as to extract text features from unlabeled text data forming the learning data and output the first feature vector with preset dimensionality.
In an embodiment of the present invention, the extracting text features from each item of learning data in the learning data set and inputting the text features into a pre-training model to obtain a first feature vector includes:
extracting text features from a preset number M of items of learning data and inputting the text features into the pre-training model;
outputting the first feature vector through preprocessing by the pre-training model, wherein the dimension of the first feature vector is M × N;
wherein N represents the preset dimension, and M and N are positive integers.
In an embodiment of the present invention, the performing attention processing on the first feature vector to obtain a second feature vector representing a correlation between learning data in the learning data set includes:
and inputting the first feature vector into an attention processing function, and performing normalization processing on the attention processing function according to the first feature vector and an adjusting factor to obtain a second feature vector representing the relevance between the learning data in the learning data set.
In an embodiment of the present invention, the obtaining the feature vector of the target knowledge-graph and merging the feature vector with the second feature vector to obtain the target feature vector includes:
acquiring a feature vector of the target knowledge graph;
merging the feature vector and the second feature vector to obtain the target feature vector, wherein the dimension of the target feature vector is M × (X + N);
wherein M represents the number of items of learning data, N represents the preset dimension of the pre-training model, X represents the dimension of the target knowledge graph, and X, M and N are positive integers.
In an embodiment of the present invention, the target knowledge-graph includes a plurality of knowledge points and a weight corresponding to each knowledge point.
In an embodiment of the present invention, the performing knowledge point analysis processing on the target feature vector by using the trained neural network model to output the knowledge point weight corresponding to the learning data set includes:
and inputting the target feature vector into a neural network model based on an encoder-decoder structure, wherein the neural network model is trained by taking learning data of a preset user in a preset time period as training sample data.
In an embodiment of the present invention, the performing knowledge point analysis processing on the target feature vector by using the trained neural network model to output the knowledge point weight corresponding to the learning data set includes:
performing, through an encoder layer of the neural network model, self-attention processing and addition & normalization processing on the input target feature vector, followed by feed-forward network processing and another addition & normalization processing, and outputting the result to the next encoder layer for the same processing until the last encoder layer outputs a vector;
performing, through a decoder layer of the neural network model, self-attention processing and addition & normalization processing on the vector output by the last encoder layer, followed by encoding-decoding attention processing and addition & normalization processing, then feed-forward network processing and another addition & normalization processing, and outputting the result to the next decoder layer for the same processing until the last decoder layer outputs the knowledge point weights;
wherein the neural network model comprises an encoder composed of one or more identical encoder layers and a decoder composed of one or more identical decoder layers.
In an embodiment of the present invention, the learning data set includes test question data and learning content data.
In a second aspect, the present invention also provides a knowledge-graph generating system, comprising:
the data acquisition module is used for acquiring a learning data set of a target user in a preset time period, wherein the learning data set comprises at least one item of learning data;
a target feature vector generation module, configured to perform feature extraction processing on learning data in the learning data set to obtain a target feature vector, where the target feature vector is used to represent the learning data learned by the target user;
the weight generation module is used for performing knowledge point analysis processing on the target feature vector by using a trained neural network model so as to output the knowledge point weights corresponding to the learning data set;
and the knowledge map updating module is used for updating the knowledge map of the target user according to the weight of the knowledge point and outputting the updated target knowledge map.
In a third aspect, the present invention also provides an electronic device, including a memory, a processor, and a computer program stored on the memory and capable of running on the processor, wherein the processor executes the program to implement any one of the methods for generating a knowledge graph as described in the first aspect.
In a fourth aspect, the present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements any one of the methods of knowledge-graph generation as described in the first aspect above.
The technical solution provided by the invention can acquire the learning data set of a target user, generate a target feature vector based on the learning data set, input the target feature vector into a neural network model to generate knowledge point weights, and update the target knowledge graph by using the knowledge point weights.
Drawings
In order to illustrate the technical solutions of the present invention or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show some embodiments of the present invention, and those skilled in the art can derive other embodiments from them without creative effort.
FIG. 1 is a schematic flow diagram of a method of knowledge-graph generation provided by the present invention;
FIG. 2 is a schematic diagram of the steps of updating a knowledge graph provided by the present invention;
FIG. 3 is a schematic diagram of the text feature extraction step provided by the present invention;
FIG. 4 is a schematic diagram of the attention processing steps provided by the present invention;
FIG. 5 is a schematic diagram of the step of generating target feature vectors provided by the present invention;
FIG. 6 is a schematic diagram of an original knowledge-graph manually formulated by an expert provided by the present invention;
FIG. 7 is a schematic diagram of an encoder-decoder based neural network model provided by the present invention;
FIG. 8 is a schematic diagram of an updated target knowledge-graph provided by the present invention;
FIG. 9 is a schematic diagram of the structure of a knowledge-graph generation system provided by the present invention;
fig. 10 is a schematic structural diagram of an electronic device provided by the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The terms "first," "second," and the like in the description and in the claims, and in the drawings described above, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It will be appreciated that the data so used may be interchanged under appropriate circumstances such that the embodiments described herein may be practiced otherwise than as specifically illustrated or described herein.
A Knowledge Graph is a graph-based method of representing and organizing knowledge whose core component is a semantic network, in which nodes represent domain concepts (entities) and edges represent semantic relations between the concepts. Knowledge is present in the knowledge graph in the form of triples (head entity, predicate, tail entity). However, existing knowledge graphs are generally constructed manually, or obtained by directly crawling manually edited knowledge graphs from network platforms, so the overall planned learning path is relatively fixed and subsequent manual updating requires a large amount of work.
The invention provides a knowledge graph generation method, a knowledge graph generation system, a storage medium and electronic equipment, which can acquire a learning data set of a target user, generate a target feature vector based on the learning data set, input the target feature vector into a neural network model to generate knowledge point weights, and update the target knowledge graph with the knowledge point weights, so as to meet the individualized learning requirements of different users. The knowledge graph obtained according to the technical solution of the invention can be applied to a personalized learning system. For example, a student can log in to the personalized learning system with an intelligent terminal device and complete various test questions recommended by the system, which constitute a test of the student's own ability level. The system then evaluates the student's personal ability level from the test results, identifies the student's weak knowledge points based on the evaluation, and updates the knowledge graph according to those weak points to provide a personalized learning path for subsequent use.
The knowledge-graph generating method, system, storage medium, and electronic device of the present invention are described below with reference to fig. 1-10.
Referring to fig. 1 and fig. 2, fig. 1 is a schematic flow chart of a knowledge graph generation method provided by the present invention, and fig. 2 is a schematic diagram of a step of updating a knowledge graph provided by the present invention. As shown in fig. 1-2, the method for generating a knowledge graph provided by the present invention comprises:
step 101, acquiring a learning data set of a target user in a preset time period, wherein the learning data set comprises at least one item of learning data.
Illustratively, the learning data set includes test question data and learning content data. The test question data may be test question data of each subject, such as mathematics test question data, English test question data, and the like, and the learning content data may be data on the knowledge point content of each subject, such as mathematics knowledge point data, English knowledge point data, and the like.
For example, the data of all exercise questions and the learned content data of the current user over a certain time period are acquired.
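As an illustration of the data involved, such a learning data set might be organized as follows. This is a minimal, hypothetical Python sketch; the field names (stem, answer, analysis) and the container layout are assumptions for illustration only and are not prescribed by the invention.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class LearningItem:
    stem: str      # question text
    answer: str    # the user's answer
    analysis: str  # answer analysis / explanation

@dataclass
class LearningDataSet:
    user_id: str
    period: str                                    # preset time period
    items: List[LearningItem] = field(default_factory=list)

# Example: the exercise questions a user worked within one month.
dataset = LearningDataSet(
    user_id="student_42",
    period="2021-10",
    items=[LearningItem("1+2=?", "3", "sum = addend + addend")],
)
```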
And 102, performing feature extraction processing on the learning data in the learning data set to obtain a target feature vector, wherein the target feature vector is used for representing the learning data learned by the target user.
And 103, performing knowledge point analysis processing on the target feature vector by using the trained neural network model so as to output the knowledge point weights corresponding to the learning data set.
And 104, updating the knowledge graph of the target user according to the knowledge point weights, and outputting the updated target knowledge graph.
The following will specifically describe the above steps 102 to 104.
In step 102, the performing feature extraction processing on the learning data in the learning data set to obtain the target feature vector includes:
step 1021, extracting text features from each item of learning data in the learning data set and inputting the text features into a pre-training model to obtain a first feature vector, where the first feature vector is used to represent information of each item of learning data.
And 1022, performing attention processing on the first feature vector to obtain a second feature vector, wherein the second feature vector is used for representing the relevance between the items of learning data.
And 1023, acquiring the feature vector of the target knowledge graph and combining the feature vector with the second feature vector to obtain the target feature vector, wherein the target feature vector is used for representing information of knowledge points.
In the knowledge graph generation method provided by the invention, a pre-training model is introduced to extract features from natural language, yielding a first feature vector. A self-attention mechanism is then used to obtain a second feature vector that represents the relevance among a preset number of test questions and/or learning contents. The second feature vector is merged with the feature vector of the target knowledge graph to obtain the target feature vector, which is input into the trained neural network; the neural network processes the target feature vector and outputs knowledge point weights, and the target knowledge graph is updated with these weights, so that the target user can learn along the learning path defined by the updated target knowledge graph and personalized requirements are met.
The following is a detailed description of the above step 1021.
Illustratively, in the knowledge graph generation method provided by the invention, text features may be extracted from a preset number M of test question data items, including the question stem, the answer, and the answer analysis, and the extracted text features may be input into the pre-training model to obtain the first feature vector.
For example, the data of all exercise questions and/or the learned content data of the current user over a certain time period are acquired and input into the pre-training model in chronological order. For example, assuming that a set of test papers includes 100 test questions, the text data and/or learning content data of the 100 test questions (i.e., M = 100) may be input at a time; conventionally, the knowledge points covered by each set of test papers would have to be manually marked on its pages.
It should be noted that the trained pre-training model provided by the invention does not require manually labeling the test questions; the text data of the current user's historical test questions can be input directly.
Illustratively, the pre-training models that may be adopted by the present invention include, but are not limited to, ELMo (Embeddings from Language Models), GPT (Generative Pre-Training), BERT (Bidirectional Encoder Representations from Transformers), and the like.
Illustratively, in one implementation of the invention, the pre-training model adopts the BERT model, a language representation model for pre-training. The BERT pre-training model has the following advantages:
First, a bidirectional Transformer (a Seq2Seq model consisting of an encoder and a decoder) can be pre-trained using MLM (Masked Language Model) to generate deep bidirectional language representations.
Second, after pre-training, good performance can be achieved on a variety of downstream tasks by merely adding an extra output layer for fine-tuning; no task-specific structural modifications to BERT are required.
As can be seen from the above, adopting the BERT pre-training model can greatly improve the effect of natural language processing.
Illustratively, in one implementation of the present invention, the first feature vector is output through preprocessing by the pre-training model, and the dimension of the first feature vector is M × N;
where M represents the number of items of learning data, N represents the preset dimension of the pre-training model, and M and N are positive integers.
Illustratively, in one implementation of the present invention, the step 1021 includes: and inputting the learning data contained in the learning data set into a pre-training model so as to extract text features from unlabeled text data forming the learning data and output a first feature vector with preset dimensionality.
For example, assuming that the preset dimension N of the BERT pre-training model is 1024 and the number M of input learning data items is 100 questions, the first feature vector output by the BERT pre-training model has dimension 100 × 1024.
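For illustration, one way to obtain such a first feature vector with an off-the-shelf BERT model is sketched below using the Hugging Face `transformers` library. The checkpoint name and the choice of the [CLS] embedding as the per-question feature are assumptions; the invention only requires a pre-training model that outputs a fixed dimension N.

```python
import torch
from transformers import AutoTokenizer, AutoModel

# bert-large has hidden size N = 1024, matching the example above.
tokenizer = AutoTokenizer.from_pretrained("bert-large-uncased")
model = AutoModel.from_pretrained("bert-large-uncased")
model.eval()

def first_feature_vector(texts):
    """Encode M question texts into an (M, N) matrix of text features."""
    enc = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        out = model(**enc)
    # Use the [CLS] token embedding as each question's feature vector.
    return out.last_hidden_state[:, 0, :]

feats = first_feature_vector(["1+2=?", "3*4=?"])  # M = 2 here
print(feats.shape)                                 # torch.Size([2, 1024])
```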
Exemplarily, in the technical solution disclosed by the invention, since the M vectors in the first feature vector are independent of each other, attention processing is required so that the M vectors (for example, those of the test questions) become associated with each other, as shown in fig. 3. In fig. 3, Start denotes a start token, Context 1-N denotes the question texts, Answer 1-N denotes the answers, Answer key 1-N denotes the answer analyses, Delim denotes the delimiter used for concatenation, and Extract denotes the operation of extracting the BERT text features.
The following is described in detail with respect to the foregoing step 1022.
Referring to fig. 4, fig. 4 is a schematic diagram of the attention processing steps provided by the present invention. Illustratively, the technical solution disclosed by the invention uses scaled dot-product attention to perform the similarity calculation. As shown in fig. 4, the scaled dot-product attention mechanism takes the dot product of the first and second input vectors, scales it, optionally applies a mask, performs a normalization operation, and finally takes the dot product of the normalized result with the third input vector.
Illustratively, in one implementation of the present invention, step 1022 includes:
inputting the first feature vector into an attention processing function, and performing normalization processing within the attention processing function according to the first feature vector and an adjustment factor, to obtain a second feature vector representing the relevance between the learning data in the learning data set.
The attention processing function may be described as mapping a query and a set of key-value pairs to an output, where the query, keys, values and output are all vectors. The output is calculated as a weighted sum of the values, where the weight assigned to each value is computed by a compatibility function of the query with the corresponding key. A single-head attention processing function is represented by:
$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{QK^{T}}{\sqrt{d_{k}}}\right)V$$
wherein the input values of Q, K and V are all the first feature vector, $d_{k}$ denotes the adjustment factor, and softmax denotes the normalization function.
When the input first feature vector is large, the softmax function may have an extremely small gradient and be difficult to train effectively; the embodiment of the invention therefore adds the adjustment factor $d_{k}$ to prevent the dot product of Q and K from becoming too large.
Illustratively, the softmax normalization function may be the extension of the logistic function to multi-class problems, i.e.
$$\mathrm{softmax}(z_{i}) = \frac{e^{z_{i}}}{\sum_{j} e^{z_{j}}}$$
For example, in one implementation of the present invention, the [100 × 1024]-dimensional vector formed by the text features of 100 test questions is subjected to attention processing to obtain a second feature vector capturing the relevance among the 100 questions; the dimension of the second feature vector is likewise [100 × 1024].
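A minimal sketch of this attention processing in plain NumPy follows; it implements exactly the scaled dot-product formula above, with Q, K and V all set to the first feature vector. The random input stands in for real BERT features and is illustrative only.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract max for stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)   # (M, M) pairwise similarity
    return softmax(scores) @ V        # (M, N) second feature vector

M, N = 100, 1024
first = np.random.randn(M, N).astype(np.float32)  # stand-in for BERT output
second = scaled_dot_product_attention(first, first, first)
print(second.shape)  # (100, 1024)
```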
The following is a detailed description of the above step 1023.
Referring to fig. 5, fig. 5 is a schematic diagram illustrating a step of generating a target feature vector according to the present invention. Illustratively, the step 1023 includes a step 10231 of obtaining the target knowledge-graph.
Illustratively, the obtaining the target knowledge-graph comprises:
when the target knowledge graph is acquired for the target user for the first time, taking a preset original knowledge graph (for example, an original knowledge graph manually made by an expert) as the target knowledge graph;
when the target knowledge graph is acquired for the target user for a non-first time, the historically updated target knowledge graph (for example, the version updated last time) is used as the target knowledge graph.
Illustratively, the default weight for each knowledge point of the original knowledge-graph is 0 (as shown in FIG. 6).
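The first-time / non-first-time branching of step 10231 reduces to a small lookup. The sketch below assumes a per-user store (a plain dict) as the persistence layer, which is an implementation choice for illustration, not part of the patent:

```python
def get_target_knowledge_graph(user_id, store, original_graph):
    """Return the historically updated graph for this user if one exists,
    otherwise fall back to the preset original knowledge graph."""
    return store.get(user_id, original_graph)
```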
It should be noted that the second feature vector is merged with the feature vector of the target knowledge graph to obtain the target feature vector, so that the knowledge point features of the target knowledge graph and the features of the learning data are fused together. This helps the knowledge point weights output by the neural network match the knowledge points of the knowledge graph.
Illustratively, fig. 6 is a schematic diagram of a knowledge graph manually created by an expert. As shown in fig. 6, such a knowledge graph comprises primary knowledge points, each primary knowledge point comprises secondary knowledge points, each secondary knowledge point comprises tertiary knowledge points, and each tertiary knowledge point has a corresponding weight.
Illustratively, the knowledge graph may be a knowledge graph of any discipline. For example, a primary knowledge point of a mathematics knowledge graph is "four arithmetic operations"; this primary knowledge point comprises secondary knowledge points, e.g., the four arithmetic operations comprise addition, subtraction, multiplication, division, and the order of mixed operations; and each secondary knowledge point contains tertiary knowledge points, e.g., the "addition" knowledge point contains the tertiary knowledge points "sum = addend + addend" and "addend = sum - the other addend".
It should be noted that the contents and the number of levels of the first-level knowledge points, the second-level knowledge points and the third-level knowledge points may be set according to the specific situation of each subject, and the present invention is not limited to the knowledge points of the mathematical subject "four arithmetic operations" shown in fig. 6 and the example of dividing the knowledge points into three levels.
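By way of illustration, the three-level hierarchy of fig. 6 with its default zero weights could be held in a nested mapping such as the following. The storage format and the English point names are assumptions, since the patent fixes only the hierarchy and the per-point weight.

```python
# Tertiary knowledge points map to their weights (default 0).
knowledge_graph = {
    "four arithmetic operations": {                # primary knowledge point
        "addition": {                              # secondary knowledge point
            "sum = addend + addend": 0.0,          # tertiary point -> weight
            "addend = sum - the other addend": 0.0,
        },
        "subtraction": {"difference = minuend - subtrahend": 0.0},
        "multiplication": {},
        "division": {},
        "order of mixed operations": {},
    },
}
```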
And 10232, merging the feature vector and the second feature vector to obtain the target feature vector, wherein the dimension of the target feature vector is M × (X + N).
Wherein M represents the number of items of learning data, X represents the dimension of the target knowledge graph, N represents the preset dimension of the pre-training model, and X, M and N are positive integers.
Exemplarily, assuming that the dimension X of the target knowledge graph is 2018, the number M of test questions is 100, and N is 1024, the feature vector of the target knowledge graph is merged with the second feature vector to obtain the target feature vector, whose dimension is 100 × (2018 + 1024), that is, 100 × 3042.
For example, the embodiment of the present invention may also convert the second feature vector into a high-dimensional feature, and then combine the feature vector and the second feature vector to obtain the target feature vector, where the target feature vector is used as an input of a neural network model.
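Concretely, the merging of step 10232 is a concatenation along the feature axis. The sketch below assumes the graph's feature vector is replicated for each of the M items before concatenation, which is one plausible reading of the merging step:

```python
import numpy as np

M, N, X = 100, 1024, 2018
second = np.random.randn(M, N)        # second feature vector (M x N)
graph_vec = np.random.randn(X)        # feature vector of the target graph

# Replicate the graph vector per item, then concatenate: M x (X + N).
target = np.concatenate([np.tile(graph_vec, (M, 1)), second], axis=1)
print(target.shape)                   # (100, 3042)
```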
The above step 103 is specifically described below.
Exemplarily, the aforementioned step 103 includes:
and carrying out knowledge point analysis processing on the target characteristic vector by adopting a trained neural network model so as to output the knowledge point weight corresponding to the learning data set. The neural network model is trained by taking learning data of preset users (such as users with obviously improved performance) in a preset time period as training sample data.
For example, the exercise history test question data of the user in the previous time period (which may be one year or one month) of the user with significantly improved performance in a preset time period, the learned content data sequence and the feature vector of the knowledge graph updated by the user last time are used as input layers, the knowledge points of the exercise test question data of the user in the next year (which may be one year or one month) of the indicated user are used as output results of the output layers, and mass data are used for model training to fit the whole neural network.
Illustratively, the trained neural network model includes an encoder and a decoder.
Referring to fig. 7, fig. 7 is a schematic diagram of the encoder-decoder based neural network model provided by the present invention. A neural network model containing an Encoder-Decoder is a general framework under which different algorithms can be used to solve different tasks. Illustratively, the encoder is composed of one or more identical encoder layers (encoder #1 and encoder #2 are shown in the figure), and the decoder is composed of one or more identical decoder layers (decoder #1 and decoder #2 are shown in the figure). The number of encoder layers and decoder layers is not limited by the present invention.
As shown in fig. 7, each encoder layer (e.g., encoder #1) illustratively includes a self-attention layer, an addition & normalization layer, a feed-forward network layer, and an addition & normalization layer, and each decoder layer (e.g., decoder #1) includes a self-attention layer, an addition & normalization layer, an encoding-decoding attention layer, an addition & normalization layer, a feed-forward network layer, and an addition & normalization layer.
In step 103, the performing knowledge point analysis processing on the target feature vector by using the trained neural network model to output the knowledge point weight corresponding to the learning data set includes:
Through an encoder layer of the neural network model, self-attention processing and addition & normalization processing are performed on the input target feature vector, followed by feed-forward network processing and another addition & normalization processing; the result is output to the next encoder layer, which performs the same operations, until the last encoder layer outputs a vector.
Through a decoder layer of the neural network model, self-attention processing and addition & normalization processing are performed on the vector output by the last encoder layer, followed by encoding-decoding attention processing and addition & normalization processing, then feed-forward network processing and another addition & normalization processing; the result is output to the next decoder layer, which performs the same operations, until the last decoder layer outputs the knowledge point weights.
The Encoder is used for processing the input (such as text, pictures or audio) and outputting it in vector form. The Decoder is used for processing the input vector and outputting it in the form of text and the like. A neural network based on the encoder-decoder structure takes a sequence as input and produces a sequence as output: the encoder converts a variable-length signal sequence into a fixed-length vector representation, and the decoder converts that fixed-length vector into a variable-length target signal sequence.
Illustratively, the present invention can select different encoders and decoders according to different tasks. For example, but not limited to, the neural network model proposed by the present invention can adopt an RNN (Recurrent Neural Network), LSTM (Long Short-Term Memory) or GRU (Gated Recurrent Unit) model structure, and can also adopt the Transformer model structure. The present invention therefore does not limit the encoder and decoder structures of the neural network model.
When the Transformer model structure is adopted, the input values of the Q, K and V parameters of the attention processing functions in the encoder and the decoder are the target feature vector. If the RNN model structure is adopted, no self-attention processing layer needs to be provided.
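As a hedged sketch, the encoder-decoder model of fig. 7 can be approximated with PyTorch's built-in nn.Transformer, whose encoder and decoder layers already contain the self-attention, addition & normalization, encoding-decoding attention and feed-forward sublayers described above. The layer counts, head count, sigmoid output, and the linear head mapping decoder outputs to per-knowledge-point weights are all assumptions, not specified by the patent:

```python
import torch
import torch.nn as nn

D_MODEL, N_POINTS = 3042, 2018   # target-vector dim, knowledge-point count

class KnowledgeWeightModel(nn.Module):
    def __init__(self):
        super().__init__()
        # 2 encoder + 2 decoder layers as in fig. 7; nhead must divide D_MODEL.
        self.transformer = nn.Transformer(
            d_model=D_MODEL, nhead=6,
            num_encoder_layers=2, num_decoder_layers=2,
            batch_first=True,
        )
        self.head = nn.Linear(D_MODEL, N_POINTS)

    def forward(self, target_features):
        # Feed the target feature vector as both source and target sequence.
        out = self.transformer(target_features, target_features)
        # Pool over the M items, then emit one weight per knowledge point.
        return torch.sigmoid(self.head(out.mean(dim=1)))

model = KnowledgeWeightModel()
x = torch.randn(1, 100, D_MODEL)  # one user, M = 100 items
weights = model(x)                # shape (1, 2018): knowledge point weights
```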
The above step 104 is described in detail below.
In the above step 104, the knowledge graph of the target user is updated according to the weight of the knowledge point, and the updated target knowledge graph is output.
Illustratively, fig. 8 is a schematic diagram of the updated target knowledge graph of the present invention. As shown in fig. 8, it comprises the tertiary knowledge points and the knowledge point weight corresponding to each tertiary knowledge point; the updated knowledge point weights correspond to the black circles of the "updated target knowledge graph" on the right side of fig. 2.
That is, the knowledge graph of the target user is updated according to the knowledge point weights output by the neural network (e.g., the middle column of the table in fig. 8), yielding the updated knowledge point weights (e.g., the right column of the table in fig. 8), and the updated target knowledge graph is output. The technical solution disclosed by the invention thus plans a personalized learning path by updating the target knowledge graph according to the target user's own learning characteristics; in other words, personalized learning path formulation and resource recommendation can ultimately be realized based on the detection and analysis of an individual's weak knowledge points. This plays an important role in reducing students' burden and improving their efficiency, and avoids blind, indiscriminate drilling on exercises.
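Step 104 then reduces to writing the emitted weights back into the graph. Below is a minimal sketch under the assumption of a flat mapping from tertiary knowledge-point identifiers to weights (the nested form of fig. 6 would be walked analogously):

```python
def update_knowledge_graph(graph, point_ids, weights):
    """Overwrite each knowledge point's weight with the model's output.
    `point_ids` fixes the order of the network's output vector."""
    for pid, w in zip(point_ids, weights):
        graph[pid] = float(w)
    return graph

graph = {"sum = addend + addend": 0.0, "addend = sum - the other addend": 0.0}
graph = update_knowledge_graph(
    graph, list(graph), [0.82, 0.15]  # illustrative weights from the model
)
print(graph)
```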
The invention also provides a knowledge graph generation system, and the knowledge graph generation system described below and the knowledge graph generation method described above can be correspondingly referred to each other.
Referring to fig. 9, fig. 9 is a schematic structural diagram of a knowledge-graph generating system provided by the present invention. As shown in fig. 9, the system 900 for generating a knowledge graph according to the present invention includes a data obtaining module 901, a target feature vector generating module 902, a weight generating module 903, and a knowledge graph updating module 904.
The data obtaining module 901 is configured to obtain a learning data set of a target user in a preset time period, where the learning data set includes at least one item of learning data. The target feature vector generation module 902 is configured to perform feature extraction processing on learning data in the learning data set to obtain a target feature vector, where the target feature vector is used to represent the learning data learned by the target user; the weight generation module 903 is configured to perform knowledge point analysis processing on the target feature vector by using a trained neural network model to output a knowledge point weight corresponding to the learning data set; the knowledge graph updating module 904 is configured to update the knowledge graph of the target user according to the weight of the knowledge point, and output the updated target knowledge graph.
Illustratively, the target feature vector generation module 902 is further configured to:
extracting text features from each item of learning data in the learning data set and inputting the text features into a pre-training model to obtain a first feature vector, wherein the first feature vector is used for representing information of each item of learning data;
performing attention processing on the first feature vector to obtain a second feature vector, wherein the second feature vector is used for representing the relevance between the items of learning data in the learning data set;
and acquiring the feature vector of the target knowledge graph and combining the feature vector with the second feature vector to obtain the target feature vector.
Illustratively, the target feature vector generation module 902 is further configured to:
when the target knowledge graph is acquired for the target user for the first time, taking a preset original knowledge graph as the target knowledge graph; and
and when the target knowledge graph is not acquired for the target user for the first time, taking the historically updated target knowledge graph as the target knowledge graph.
Illustratively, the target feature vector generation module 902 is further configured to:
and inputting the learning data contained in the learning data set into a trained pre-training model so as to extract text features from unlabeled text data forming the learning data and output the first feature vector with preset dimensionality.
Illustratively, the target feature vector generation module 902 is further configured to:
extracting text features from a preset number M of items of learning data and inputting the text features into the pre-training model;
outputting the first feature vector through preprocessing by the pre-training model, wherein the dimension of the first feature vector is M × N;
wherein N represents the preset dimension of the pre-training model, and M and N are positive integers.
Illustratively, the target feature vector generation module 902 is further configured to:
and inputting the first feature vector into an attention processing function, and performing normalization processing on the attention processing function according to the first feature vector and an adjusting factor to obtain a second feature vector representing the relevance between the learning data in the learning data set.
Illustratively, the target feature vector generation module 902 is further configured to:
acquiring a feature vector of the target knowledge graph;
merging the feature vector and the second feature vector to obtain the target feature vector, wherein the dimension of the target feature vector is M × (X + N);
wherein M represents the number of items of learning data, X represents the dimension of the target knowledge graph, N represents the preset dimension of the pre-training model, and X, M and N are positive integers.
Illustratively, the target knowledge-graph includes a plurality of knowledge points and a weight corresponding to each knowledge point.
Illustratively, the weight generating module 903 is further configured to:
and inputting the target characteristic vector into a neural network model based on an encoder-decoder structure, wherein the neural network model is trained by taking learning data of a user with obviously improved performance in a preset time period as training sample data.
Illustratively, the weight generating module 903 is further configured to:
performing, through an encoder layer of the neural network model, self-attention processing and addition & normalization processing on the input target feature vector, followed by feed-forward network processing and another addition & normalization processing, and outputting the result to the next encoder layer for the same processing until the last encoder layer outputs a vector;
performing, through a decoder layer of the neural network model, self-attention processing and addition & normalization processing on the vector output by the last encoder layer, followed by encoding-decoding attention processing and addition & normalization processing, then feed-forward network processing and another addition & normalization processing, and outputting the result to the next decoder layer for the same processing until the last decoder layer outputs the knowledge point weights;
wherein the neural network model comprises an encoder composed of one or more identical encoder layers and a decoder composed of one or more identical decoder layers.
Illustratively, the learning data set includes test question data and learning content data.
Other aspects of the knowledge-graph generating system provided by the invention are the same as or similar to the knowledge-graph generating method described above, and are not described in detail herein.
In another aspect, the present invention further provides an electronic device, which includes a memory, a processor and a computer program stored in the memory and capable of running on the processor, wherein the processor executes the computer program to implement any one of the above-described methods for generating a knowledge graph.
Fig. 10 illustrates a schematic structural diagram of an electronic device. As shown in fig. 10, the electronic device may include: a Processor 1010, a Communications Interface 1020, a Memory 1030, and a communication bus 1040, wherein the processor 1010, the communication interface 1020, and the memory 1030 communicate with each other via the communication bus 1040. The processor 1010 may invoke logic instructions in the memory 1030 to perform the knowledge graph generation method.
Illustratively, the logic instructions in the memory 1030 may be implemented in the form of software functional units and stored in a computer readable storage medium when sold or used as a stand-alone product. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
In yet another aspect, the present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program that, when executed by a processor, is implemented to perform any of the above-described methods of knowledge-graph generation.
The above-described embodiments are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only for illustrating the technical solutions of the present invention, but not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (14)

1. A method of knowledge-graph generation, the method comprising:
acquiring a learning data set of a target user in a preset time period, wherein the learning data set comprises at least one item of learning data;
performing feature extraction processing on the learning data in the learning data set to obtain a target feature vector, wherein the target feature vector is used for representing the learning data learned by the target user;
adopting a trained neural network model to perform knowledge point analysis processing on the target feature vector so as to output the knowledge point weights corresponding to the learning data set;
and updating the knowledge graph of the target user according to the weight of the knowledge point, and outputting the updated target knowledge graph.
2. The knowledge graph generation method according to claim 1, wherein the performing feature extraction processing on the learning data in the learning data set to obtain a target feature vector comprises:
extracting text features from each item of learning data in the learning data set, and inputting the text features into a pre-training model to obtain a first feature vector, wherein the first feature vector is used for representing information of each item of learning data;
performing attention processing on the first feature vector to obtain a second feature vector, wherein the second feature vector is used for representing the relevance between the items of learning data;
and acquiring the feature vector of the target knowledge graph and combining the feature vector with the second feature vector to obtain the target feature vector.
3. The method of generating a knowledge-graph of claim 2 wherein said obtaining the target knowledge-graph comprises:
when the target knowledge graph is acquired for the target user for the first time, taking a preset original knowledge graph as the target knowledge graph; and when the target knowledge graph is not acquired for the target user for the first time, taking the historically updated target knowledge graph as the target knowledge graph.
4. The method of generating a knowledge graph according to claim 3, wherein extracting a text feature from each item of learning data of the set of learning data and inputting the text feature into a pre-trained model to obtain a first feature vector comprises:
and inputting the learning data contained in the learning data set into a trained pre-training model so as to extract text features from unlabeled text data forming the learning data and output the first feature vector with preset dimensionality.
5. The method of generating a knowledge graph according to claim 4, wherein extracting text features from each item of learning data of the set of learning data and inputting the text features into a pre-trained model to obtain a first feature vector comprises:
extracting text features from a preset number M of items of learning data and inputting the text features into the pre-training model;
outputting the first feature vector through preprocessing by the pre-training model, wherein the dimension of the first feature vector is M × N;
wherein N represents the preset dimension of the pre-training model, and M and N are positive integers.
6. The method of generating a knowledge graph according to claim 2, wherein the attention processing the first feature vector to obtain a second feature vector representing a correlation between learning data in the set of learning data includes:
and inputting the first feature vector into an attention processing function, and performing normalization processing on the attention processing function according to the first feature vector and an adjusting factor to obtain a second feature vector representing the relevance between the learning data in the learning data set.
7. The method of generating a knowledge-graph of claim 2, wherein the obtaining and combining the feature vector of the target knowledge-graph with the second feature vector to obtain the target feature vector comprises:
acquiring a feature vector of the target knowledge graph;
merging the feature vector and the second feature vector to obtain the target feature vector, wherein the dimension of the target feature vector is M × (X + N);
wherein M represents the number of items of learning data, X represents the dimension of the target knowledge graph, N represents the preset dimension of the pre-training model, and X, M and N are positive integers.
8. The method of generating a knowledge-graph of claim 1 wherein the target knowledge-graph comprises a plurality of knowledge points and a weight corresponding to each knowledge point.
9. The method of generating a knowledge graph according to claim 1, wherein performing knowledge point analysis processing on the target feature vector by using a trained neural network model to output the knowledge point weights corresponding to the learning data set comprises:
inputting the target feature vector into a neural network model based on an encoder-decoder structure, wherein the neural network model is trained by taking learning data of a preset user within a preset time period as training sample data.
10. The method of generating a knowledge graph according to claim 1, wherein performing knowledge point analysis processing on the target feature vector by using a trained neural network model to output the knowledge point weights corresponding to the learning data set comprises:
performing, through an encoder layer of the neural network model, self-attention processing and addition & normalization processing on the input target feature vector, followed by feed-forward network processing and another addition & normalization processing, and outputting the result to the next encoder layer for the same processing until the last encoder layer outputs a vector;
performing, through a decoder layer of the neural network model, self-attention processing and addition & normalization processing on the vector output by the last encoder layer, followed by encoding-decoding attention processing and addition & normalization processing, then feed-forward network processing and another addition & normalization processing, and outputting the result to the next decoder layer for the same processing until the last decoder layer outputs the knowledge point weights;
wherein the neural network model comprises an encoder composed of one or more identical encoder layers and a decoder composed of one or more identical decoder layers.
11. The knowledge-graph generating method according to claim 1, wherein the learning data set includes test question data and learning content data.
12. A knowledge-graph generating system, comprising:
the data acquisition module is used for acquiring a learning data set of a target user in a preset time period, wherein the learning data set comprises at least one item of learning data;
a target feature vector generation module, configured to perform feature extraction processing on learning data in the learning data set to obtain a target feature vector, where the target feature vector is used to represent the learning data learned by the target user;
the weight generation module is used for performing knowledge point analysis processing on the target feature vector by using a trained neural network model so as to output the knowledge point weights corresponding to the learning data set;
and the knowledge map updating module is used for updating the knowledge map of the target user according to the weight of the knowledge point and outputting the updated target knowledge map.
13. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the method of knowledge-graph generation of any of claims 1 to 11 when executing the program.
14. A processor-readable storage medium, characterized in that the processor-readable storage medium stores a computer program for causing the processor to execute the knowledge-graph generating method according to any one of claims 1 to 11.
CN202111403397.5A 2021-11-24 2021-11-24 Knowledge graph generation method, knowledge graph generation system, storage medium and electronic equipment Pending CN114297399A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111403397.5A CN114297399A (en) 2021-11-24 2021-11-24 Knowledge graph generation method, knowledge graph generation system, storage medium and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111403397.5A CN114297399A (en) 2021-11-24 2021-11-24 Knowledge graph generation method, knowledge graph generation system, storage medium and electronic equipment

Publications (1)

Publication Number Publication Date
CN114297399A true CN114297399A (en) 2022-04-08

Family

ID=80966187

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111403397.5A Pending CN114297399A (en) 2021-11-24 2021-11-24 Knowledge graph generation method, knowledge graph generation system, storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN114297399A (en)


Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115329200A (en) * 2022-08-26 2022-11-11 国家开放大学 Teaching resource recommendation method based on knowledge graph and user similarity
CN115329200B (en) * 2022-08-26 2024-04-26 国家开放大学 Teaching resource recommendation method based on knowledge graph and user similarity
CN116932926A (en) * 2023-09-14 2023-10-24 深圳酷宅科技有限公司 Data analysis method and system applied to intelligent home control
CN116932926B (en) * 2023-09-14 2023-11-17 深圳酷宅科技有限公司 Data analysis method and system applied to intelligent home control
CN117172978A (en) * 2023-11-02 2023-12-05 北京国电通网络技术有限公司 Learning path information generation method, device, electronic equipment and medium
CN117172978B (en) * 2023-11-02 2024-02-02 北京国电通网络技术有限公司 Learning path information generation method, device, electronic equipment and medium

Similar Documents

Publication Publication Date Title
CN108628823B (en) Named entity recognition method combining attention mechanism and multi-task collaborative training
CN108363743B (en) Intelligent problem generation method and device and computer readable storage medium
CN107798140B (en) Dialog system construction method, semantic controlled response method and device
Zhang et al. Deep Learning+ Student Modeling+ Clustering: A Recipe for Effective Automatic Short Answer Grading.
CN111966812B (en) Automatic question answering method based on dynamic word vector and storage medium
CN108549658A (en) A kind of deep learning video answering method and system based on the upper attention mechanism of syntactic analysis tree
CN111831789A (en) Question-answer text matching method based on multilayer semantic feature extraction structure
CN108491515B (en) Sentence pair matching degree prediction method for campus psychological consultation
CN110796160A (en) Text classification method, device and storage medium
CN114297399A (en) Knowledge graph generation method, knowledge graph generation system, storage medium and electronic equipment
CN111125333B (en) Generation type knowledge question-answering method based on expression learning and multi-layer covering mechanism
US11727915B1 (en) Method and terminal for generating simulated voice of virtual teacher
CN111930914A (en) Question generation method and device, electronic equipment and computer-readable storage medium
CN113239169A (en) Artificial intelligence-based answer generation method, device, equipment and storage medium
CN113901191A (en) Question-answer model training method and device
CN111523328B (en) Intelligent customer service semantic processing method
Puscasiu et al. Automated image captioning
CN111145914B (en) Method and device for determining text entity of lung cancer clinical disease seed bank
CN110765241B (en) Super-outline detection method and device for recommendation questions, electronic equipment and storage medium
CN113011196B (en) Concept-enhanced representation and one-way attention-containing subjective question automatic scoring neural network model
CN114282592A (en) Deep learning-based industry text matching model method and device
Mandge et al. Revolutionize cosine answer matching technique for question answering system
Bai et al. Gated character-aware convolutional neural network for effective automated essay scoring
CN116822530A (en) Knowledge graph-based question-answer pair generation method
CN115796141A (en) Text data enhancement method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination