CN116257659A - Dynamic diagram embedding method and system of intelligent learning guiding system - Google Patents
- Publication number
- CN116257659A CN202310342163.7A
- Authority
- CN
- China
- Prior art keywords
- nodes
- matrix
- dynamic
- embedding
- node
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/901—Indexing; Data structures therefor; Storage structures
- G06F16/9024—Graphs; Linked lists
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/15—Correlation function computation including computation of convolution operations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/20—Education
- G06Q50/205—Education administration or guidance
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention relates to the technical field of intelligent tutoring, and in particular to a dynamic graph embedding method and system for an intelligent tutoring system. The method comprises the following steps: performing temporal expansion on the dynamic nodes to generate a time sequence for each dynamic node; generating a first embedding matrix based on the sequence of static nodes and the time sequences of the dynamic nodes; pooling the first embedding matrix through a transformation matrix and outputting a second embedding matrix; for each node in the second embedding matrix, obtaining the aggregation features of all nodes based on node-level attention features and relation-level attention features, and generating a final embedding matrix; and constructing an objective function based on the nodes in the final embedding matrix and predicting, through the objective function, the learner's results on completed exercises. While maintaining the prediction accuracy of the model, the invention reduces the computational scale of the model; random temporal pooling effectively simulates the continuous change of learners in real situations, so that learner performance becomes more stable over time.
Description
Technical Field
The invention relates to the technical field of intelligent tutoring, and in particular to a dynamic graph embedding method and system for an intelligent tutoring system.
Background
In recent years, the continuing fusion of artificial intelligence and online education has driven the rapid development of intelligent tutoring systems (ITS). An ITS plays the role of a virtual tutor and designs a personalized learning path for each learner. An ITS generates a large amount of learning behavior data every day, on which developers build many intelligent learning services, including cognitive diagnosis, knowledge tracing, and resource recommendation, to help learners improve their learning efficiency.
Modeling educational data as a graph can greatly enhance the performance of educational services. As a common data structure, the graph is widely used in many real-world scenarios, such as protein networks, information networks, and knowledge graphs. Real-world problems are modeled from a graph perspective, and the representation vectors of the nodes in the graph are then learned by Graph Embedding (GE) and fed to downstream machine learning tasks, thereby improving final efficiency and performance.
In the prior art, the graph model constructed from ITS data is referred to as an intelligent tutoring graph (ITG). Notably, the ITG is a heterogeneous graph: it contains various entities, such as resources, exercises, learners, classes, and knowledge concepts, as well as various relationships between them. The ITG is therefore generally handled with methods designed for heterogeneous graphs.
However, as learning progresses, a learner's knowledge state changes and affects the learner's interaction decisions; that is, the ITG is dynamic. Even so, it cannot be modeled well with traditional Dynamic Graph Embedding (DGE) methods. The prior art typically models a dynamic graph as a sequence of static snapshot graphs or as a sequence of neighborhood formations, or samples random walks in temporal order. These methods merely capture the sequential evolution of static structures across the snapshot sequence, whereas the evolution of an ITG differs from these processes in that it is characterized by locality, independence, and smoothness.
Locality: an ITG contains not only dynamic nodes, such as learners, but also many static nodes that do not change over time. To cope with the changing knowledge state, it is necessary to coordinate the invariance of static nodes with the local evolution of dynamic nodes, which conventional dynamic graph embedding methods do not address. Independence: the evolution processes of the nodes in the ITG are mutually independent; that is, each evolving learner has a unique timeline that describes the entire learning process from start to finish. Smoothness: the evolution of each learner in the ITG is stable and smooth rather than abrupt, consistent with the learner's gradual acquisition of knowledge, whereas the objects of traditional dynamic graph modeling may be nodes of various types that do not change smoothly over time.
In summary, conventional dynamic graph embedding methods cannot accommodate the locality, independence, and smoothness of such dynamic heterogeneous graphs, and therefore cannot model the ITG in educational scenarios.
Disclosure of Invention
The invention provides a dynamic graph embedding method and system for an intelligent tutoring system, which are intended to overcome the defects of the prior art.
The invention provides a dynamic graph data embedding method for an intelligent tutoring system, comprising the following steps:
performing temporal expansion on the dynamic nodes in the heterogeneous evolution network of the intelligent tutoring system to obtain an expanded graph data structure, which comprises an expanded node set and an expanded edge set, and generating a time sequence for each dynamic node;
randomly projecting all nodes, and generating a first embedding matrix based on the sequence of static nodes and the time sequences of the dynamic nodes;
dividing each time sequence into a plurality of consecutive intervals of equal length, and taking the average of the node embeddings in each interval as the embedding of that interval, thereby generating a transformation matrix; pooling the first embedding matrix through the transformation matrix to reduce the number of nodes, and outputting a second embedding matrix;
for each node in the second embedding matrix, obtaining the aggregation features of all nodes based on node-level attention features and relation-level attention features, and generating a final embedding matrix;
performing edge reconstruction and weight regression based on the nodes in the final embedding matrix, constructing an objective function, and predicting, through the objective function, the learner's results on completed exercises.
According to the dynamic graph data embedding method of the intelligent tutoring system provided by the invention, performing temporal expansion on the dynamic nodes in the heterogeneous evolution network of the intelligent tutoring system comprises:
creating an expansion node for each dynamic edge, reattaching the dynamic edge from the dynamic node it was connected to onto the corresponding expansion node, creating expansion edges to link adjacent expansion nodes, and thereby generating the time sequence of each dynamic node,
where $\mathcal{V}$ denotes the initial node set and $\mathcal{E}$ the initial edge set; the time sequence is expressed in terms of the dynamic node at the initial moment and the expansion nodes that evolve over time.
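The expansion step can be illustrated with a minimal Python sketch. It is not the patent's implementation: the input format (a list of `(dynamic_node, other_node, timestamp)` tuples), the function name `temporal_expand`, and the naming of expansion nodes as `(dynamic_node, k)` pairs are all assumptions made for illustration. The sketch also leaves out the dynamic node at the initial moment and builds only the chain of expansion nodes.

```python
from collections import defaultdict

def temporal_expand(dynamic_edges):
    """Expand each dynamic node into a time-ordered chain of expansion nodes.

    dynamic_edges: list of (dynamic_node, other_node, timestamp) tuples
                   (a hypothetical input format chosen for this sketch).
    Returns (sequences, rewired_edges, expansion_edges).
    """
    by_node = defaultdict(list)
    for dyn, other, t in dynamic_edges:
        by_node[dyn].append((t, other))

    sequences = {}        # dynamic node -> ordered list of its expansion nodes
    rewired_edges = []    # dynamic edges, now attached to expansion nodes
    expansion_edges = []  # links between adjacent expansion nodes

    for dyn, events in by_node.items():
        events.sort(key=lambda e: e[0])      # order interactions by time
        chain = []
        for k, (t, other) in enumerate(events):
            node = (dyn, k)                  # one expansion node per dynamic edge
            chain.append(node)
            rewired_edges.append((node, other, t))
            if k > 0:
                # connect this expansion node to its predecessor in time
                expansion_edges.append((chain[k - 1], node))
        sequences[dyn] = chain
    return sequences, rewired_edges, expansion_edges


edges = [("u1", "q2", 5), ("u1", "q1", 1), ("u2", "q1", 3)]
sequences, rewired, expansion = temporal_expand(edges)
```

Each learner keeps its own chain, which reflects the independence property described in the background: the timelines of `u1` and `u2` are never linked to each other.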
According to the dynamic graph data embedding method of the intelligent tutoring system provided by the invention, randomly projecting all nodes and generating a first embedding matrix based on the sequence of static nodes and the time sequences of the dynamic nodes comprises:
the node set $\mathcal{V}$ comprises a dynamic node set and a static node set; a static embedding is obtained from the static node set and a dynamic embedding from the dynamic node set, and the first embedding matrix is generated from the static embedding and the dynamic embedding.
According to the dynamic graph data embedding method of the intelligent tutoring system provided by the invention, for a given time sequence, $s$ is the number of nodes in each interval, with $s > 1$; the size of the first interval is preset as $s_1$, where $s_1$ is a discrete random variable obeying a uniform distribution, $s_1 \in \{1, \dots, s\}$;
once $s_1$ is determined, the total number of intervals is determined as well.
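As a hedged illustration of how the interval count follows from $s_1$, the sketch below splits a sequence of `m` positions into a first interval of `s1` nodes followed by intervals of `s` nodes. The treatment of the last interval (it may be shorter than `s`), the closed form `1 + ceil((m - s1) / s)`, and the function names are assumptions of this sketch, since the formula in the source text is not recoverable.

```python
import random

def split_intervals(m, s, s1=None, rng=random):
    """Split positions 0..m-1 (m >= 1) into consecutive half-open intervals.

    The first interval holds s1 nodes, with s1 drawn uniformly from
    {1, ..., s} when not given. Later intervals hold s nodes, except
    possibly a shorter last one (an assumption of this sketch).
    """
    if s1 is None:
        s1 = rng.randint(1, s)       # randint is inclusive on both ends
    s1 = min(s1, m)
    bounds = [(0, s1)]
    start = s1
    while start < m:
        end = min(start + s, m)
        bounds.append((start, end))
        start = end
    return bounds

def interval_count(m, s, s1):
    """Closed form under the same assumption: 1 + ceil((m - s1) / s)."""
    remaining = max(m - s1, 0)
    return 1 + (remaining + s - 1) // s


bounds = split_intervals(10, 3, s1=2)
```

With `m = 10`, `s = 3`, and `s1 = 2`, the intervals are `[0, 2)`, `[2, 5)`, `[5, 8)`, and `[8, 10)`, so four intervals in total.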
According to the dynamic graph data embedding method of the intelligent tutoring system provided by the invention, outputting the second embedding matrix comprises:
pooling the first embedding matrix through the transformation matrix to reduce the number of nodes, and outputting the second embedding matrix,
where the transformation matrix and the second embedding matrix are each indexed by the value $n$ taken by $s_1$, that is, they denote the transformation matrix and the second embedding matrix when $s_1 = n$.
According to the dynamic graph data embedding method of the intelligent tutoring system provided by the invention, computing node-level attention features and relation-level attention features for each node in the second embedding matrix, and obtaining the aggregation features of all nodes to generate a final embedding matrix, comprises:
the feature of node $v_i$ at layer $l$ is defined in terms of the attention of $v_i$ at layer $l$ under relation $r$, the node-level aggregated feature, and the model parameter $b_{r,l}$;
the node-level aggregated feature is a weighted sum of neighbor features, computed from the previous-layer feature of each neighbor $v_j$ in the set of neighbors under relation $r$;
in the attention weights, $\rho$ is the leaky rectified linear unit (LeakyReLU), the symbol $\|$ denotes vector concatenation, and $a_{r,l}$ is an attention vector that is a model parameter;
$H_l = A_l H_{l-1}$;
where $H_l$ is the aggregation feature of layer $l$ and $A_l$ is the aggregation matrix of layer $l$, whose entry in row $i$ and column $j$ is given by the attention terms above, all entries for which no such term exists being 0;
the aggregation features of all layers are concatenated to obtain the final embedding matrix $Z$:
$Z = [H_0, H_1, \dots, H_L]$;
where $H_0 = D$ and $z_i$ is the $i$-th row of $Z$.
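The layer recursion $H_l = A_l H_{l-1}$ and the concatenation $Z = [H_0, H_1, \dots, H_L]$ can be sketched in plain Python as follows. The attention function shown is a standard graph-attention form (softmax over LeakyReLU scores of concatenated features) for a single relation; it is an assumption standing in for the patent's formulas, which are not recoverable from the text, and it omits relation-level attention and the parameter $b_{r,l}$. All function names are hypothetical.

```python
import math

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def attention_row(i, neighbors, H, a):
    """Attention of node i over its neighbors under one relation.

    Scores are LeakyReLU(a . [h_i || h_j]), normalized with a softmax
    (an assumed, GAT-style form). `a` must have twice the feature length.
    """
    scores = [leaky_relu(sum(w * x for w, x in zip(a, H[i] + H[j])))
              for j in neighbors]
    top = max(scores)                        # subtract max for stability
    exps = [math.exp(sc - top) for sc in scores]
    total = sum(exps)
    return {j: e / total for j, e in zip(neighbors, exps)}

def aggregate(A, H):
    """One layer of aggregation: H_l = A_l H_{l-1}."""
    return [[sum(a * h for a, h in zip(row, col)) for col in zip(*H)]
            for row in A]

def final_embedding(D, A_layers):
    """Z = [H_0, H_1, ..., H_L] with H_0 = D, concatenated per node."""
    H, blocks = D, [D]
    for A in A_layers:
        H = aggregate(A, H)
        blocks.append(H)
    return [sum((B[i] for B in blocks), []) for i in range(len(D))]


D = [[1.0, 0.0], [0.0, 1.0]]
alpha = attention_row(0, [0, 1], D, [0.0, 0.0, 0.0, 0.0])
A1 = [[alpha[0], alpha[1]], [0.0, 1.0]]      # missing entries are 0
Z = final_embedding(D, [A1])
```

Concatenating every layer, rather than keeping only the last one, lets each row $z_i$ carry both the node's own features ($H_0$) and progressively wider neighborhood information.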
According to the dynamic graph data embedding method of the intelligent tutoring system provided by the invention, performing edge reconstruction and weight regression based on the nodes in the final embedding matrix, constructing an objective function, and predicting, through the objective function, the learner's results on completed exercises comprises:
measuring the correlation between nodes with a similarity measure, such that the more closely two nodes are related, the larger the inner product of their vectors; mapping the inner product to the range [0, 1] with a sigmoid function $\delta$ to obtain the first objective function,
in which $P_n(v)$ is the noise distribution and $M_e$ is the negative sampling rate;
performing weight regression on the edges, using a triple $o_{ij} = \langle v_i, v_j, y_{ij} \rangle$ to describe the relationship between the nodes and the predicted result:
$v_i$ denotes a learner, $v_j$ denotes an exercise, and $y_{ij}$ denotes the result: $y_{ij} = 1$ indicates that the learner answered correctly, and $y_{ij} = 0$ indicates that the learner answered incorrectly;
obtaining the second objective function over the set of all triples $o_{ij}$ in a given dataset,
where $p_{ij}$ is the predicted probability that the learner completes the exercise correctly.
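For orientation, a common way to build an edge-reconstruction objective from sigmoid-mapped inner products, a noise distribution, and a negative sampling rate is the negative-sampling loss sketched below. This form is an assumption, since the first objective function in the source is not recoverable; `sample_noise` is a hypothetical callable standing in for drawing a node index from $P_n(v)$, and `num_negative` plays the role of $M_e$.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def edge_reconstruction_loss(Z, edges, num_negative, sample_noise):
    """Negative-sampling loss over observed edges (assumed form).

    Z:            list of node embedding vectors (rows z_i)
    edges:        list of (i, j) index pairs of observed edges
    num_negative: negatives drawn per observed edge (role of M_e)
    sample_noise: callable returning a node index (role of P_n(v))
    """
    loss = 0.0
    for i, j in edges:
        # pull connected nodes together: large inner product -> small loss
        loss -= math.log(sigmoid(dot(Z[i], Z[j])))
        for _ in range(num_negative):
            n = sample_noise()
            # push sampled noise nodes away from node i
            loss -= math.log(sigmoid(-dot(Z[i], Z[n])))
    return loss


Z = [[1.0, 0.0], [1.0, 0.0], [-1.0, 0.0]]
loss = edge_reconstruction_loss(Z, [(0, 1)], 1, lambda: 2)
```

In this toy case nodes 0 and 1 are aligned and the sampled negative (node 2) points the opposite way, so both terms are small.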
According to the dynamic graph data embedding method of the intelligent tutoring system provided by the invention, in performing edge reconstruction and weight regression based on the nodes in the final embedding matrix, constructing an objective function, and predicting the learner's results on completed exercises through the objective function, the prediction probability $p_{ij}$ is computed as:
$p_{ij} = \mathrm{sigmoid}(W_2 \cdot \sigma(W_1 \cdot [z_i \,\|\, z_j] + b_1) + b_2)$;
where $W_1$, $W_2$, $b_1$, and $b_2$ are all model parameters and $\sigma$ is a nonlinear activation function; the final objective function is obtained from the first and second objective functions,
where $\lambda$ is the balance coefficient.
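The prediction formula can be sketched directly in Python. Three choices below are assumptions of this sketch rather than statements of the patent: ReLU is used for the unspecified activation $\sigma$, binary cross-entropy is used as the form of the second objective, and the two objectives are combined as `l1 + lam * l2`.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def predict(z_i, z_j, W1, b1, W2, b2):
    """p_ij = sigmoid(W2 . sigma(W1 . [z_i || z_j] + b1) + b2)."""
    x = z_i + z_j                                    # concatenation [z_i || z_j]
    hidden = [max(0.0, sum(w * v for w, v in zip(row, x)) + b)  # ReLU (assumed)
              for row, b in zip(W1, b1)]
    return sigmoid(sum(w * h for w, h in zip(W2, hidden)) + b2)

def weight_regression_loss(triples, Z, params):
    """Binary cross-entropy over triples (i, j, y_ij) (assumed form)."""
    loss = 0.0
    for i, j, y in triples:
        p = predict(Z[i], Z[j], *params)
        loss -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return loss

def total_loss(l1, l2, lam):
    """Combine both objectives with balance coefficient lam (assumed form)."""
    return l1 + lam * l2


Z = [[1.0], [2.0]]                        # z_0: learner, z_1: exercise
params = ([[1.0, 1.0]], [0.0], [1.0], -3.0)   # W1, b1, W2, b2
p = predict(Z[0], Z[1], *params)
l2 = weight_regression_loss([(0, 1, 1)], Z, params)
```

Here the hidden unit evaluates to 3 and the output bias cancels it, so `p` is 0.5 and the loss for a correct answer is `log 2`.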
In another aspect, the invention also provides a dynamic graph embedding system for an intelligent tutoring system, comprising:
a temporal expansion module, configured to perform temporal expansion on the dynamic nodes in the heterogeneous evolution network of the intelligent tutoring system, obtain an expanded graph data structure comprising an expanded node set and an expanded edge set, and generate a time sequence for each dynamic node;
the temporal expansion module being further configured to randomly project all nodes and generate a first embedding matrix based on the sequence of static nodes and the time sequences of the dynamic nodes;
a random temporal pooling module, configured to divide each time sequence into a plurality of consecutive intervals of equal length and take the average of the node embeddings in each interval as the embedding of that interval, thereby generating a transformation matrix, and to pool the first embedding matrix through the transformation matrix to reduce the number of nodes and output a second embedding matrix;
a heterogeneous aggregation network module, configured to obtain, for each node in the second embedding matrix, the aggregation features of all nodes based on node-level attention features and relation-level attention features, and to generate a final embedding matrix;
and a loss calculation module, configured to perform edge reconstruction and weight regression based on the nodes in the final embedding matrix, construct an objective function, and predict, through the objective function, the learner's results on completed exercises.
The invention also provides a non-transitory computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the dynamic graph embedding method of the intelligent tutoring system according to any of the above.
The dynamic graph embedding method and system of the intelligent tutoring system provided by the invention have at least the following technical effects:
(1) On the task of automatically tagging test questions with knowledge points, the representation vectors obtained by the embedding method provided by the invention are far superior to other methods in recall and accuracy. Temporal expansion of the nodes adapts better to the dynamic evolution of learners over time, and the dynamic connections between nodes are captured more effectively as they change smoothly over time.
(2) On the task of learner performance prediction, the method provided by the invention obtains representations with stronger predictive power, and learner performance at each stage can be effectively improved through temporal expansion. The heterogeneous aggregation network takes into account the links between different nodes and the influence of those links on the nodes.
(3) While maintaining the prediction accuracy of the model, the computational scale of the model is greatly reduced. Random temporal pooling effectively simulates the continuous change of learners in real situations, so that learner performance becomes more stable over time. The number of nodes is also effectively reduced, which saves time and memory overhead and provides a better experience for users.
Drawings
To illustrate the technical solutions of the invention or of the prior art more clearly, the drawings used in the description of the embodiments or of the prior art are briefly introduced below. The drawings described below show some embodiments of the invention, and a person skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a schematic flow chart of the dynamic graph embedding method of the intelligent tutoring system provided by the invention;
FIG. 2 is a second schematic flow chart of the dynamic graph embedding method of the intelligent tutoring system provided by the invention.
Detailed Description
To make the objects, technical solutions, and advantages of the invention clearer, the technical solutions of the invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, embodiments of the invention. All other embodiments obtained by those skilled in the art on the basis of these embodiments without inventive effort fall within the scope of the invention.
It should be noted that current graph embedding methods can be divided into three types according to graph type: conventional graph embedding, heterogeneous graph embedding, and dynamic graph embedding.
The most intuitive conventional graph embedding method is matrix factorization, which uses the adjacency matrix to map a graph into a vector space explicitly. It yields a deterministic solution, but it is difficult to apply to large-scale graphs because its complexity is proportional to the cube of the number of nodes. Random-walk-based methods, such as DeepWalk and Node2vec, sample paths on a given graph according to certain criteria and achieve good performance in some cases. With the further fusion of deep learning and graph models, graph neural networks (GNNs), such as graph convolutional networks (GCNs), graph attention networks (GATs), and GraphSAGE, have become an increasingly important means of graph embedding; they generate higher-level features by aggregating neighbor information.
Heterogeneous graphs, also known as heterogeneous information networks (HINs), contain many types of nodes whose features may not match because of differing dimensions or representations. GE methods for homogeneous graphs therefore cannot be widely applied to HINs. To solve this problem, researchers have proposed meta-path-based approaches such as metapath2vec (MP2vec).
Because of the sequential nature of dynamic graphs, DGE focuses on preserving the structural and temporal semantic information in the network. It provides a new perspective for dynamic data analysis and makes graph embedding more practical in the real world. Early approaches often relied on recurrent neural networks (RNNs) for embedding and ignored structural attributes. Factorization-based methods find low-rank decompositions of time-varying similarity measures to generate time-varying node embeddings; GNN-based methods use RNNs to capture the dynamic changes in a graph sequence and thereby optimize the GCN parameters.
None of the prior-art methods can model an intelligent tutoring graph (ITG) in educational scenarios: the ITG is characterized by locality, independence, and smoothness, and neither traditional heterogeneous methods nor traditional dynamic methods can effectively learn the dynamic evolution of a learner in the ITG.
In one embodiment, as shown in FIG. 1, the invention provides a dynamic graph embedding method for an intelligent tutoring system, comprising the following steps:
performing temporal expansion on the dynamic nodes in the heterogeneous evolution network of the intelligent tutoring system to obtain an expanded graph data structure, which comprises an expanded node set and an expanded edge set, and generating a time sequence for each dynamic node;
randomly projecting all nodes, and generating a first embedding matrix based on the sequence of static nodes and the time sequences of the dynamic nodes;
dividing each time sequence into a plurality of consecutive intervals of equal length, and taking the average of the node embeddings in each interval as the embedding of that interval, thereby generating a transformation matrix; pooling the first embedding matrix through the transformation matrix to reduce the number of nodes, and outputting a second embedding matrix;
for each node in the second embedding matrix, obtaining the aggregation features of all nodes based on node-level attention features and relation-level attention features, and generating a final embedding matrix;
performing edge reconstruction and weight regression based on the nodes in the final embedding matrix, constructing an objective function, and predicting, through the objective function, the learner's results on completed exercises.
It should be noted that the heterogeneous evolution network (HEN) is defined as a heterogeneous information network $\mathcal{G} = (\mathcal{V}, \mathcal{E})$, where $\mathcal{V}$ denotes the node set and $\mathcal{E}$ denotes the edge set;
the information network $\mathcal{G}$ has a mapping from nodes to node types and a mapping from edges to edge types, defined over the set of node types and the set of edge types respectively;
the node set is divided into two parts, a dynamic node set and a static node set; for any edge with an endpoint in the dynamic node set, a mapping function into a set of times is defined;
the dynamic graph embedding method of the intelligent tutoring system is essentially a temporally expanded graph neural network and mainly comprises three steps: temporal expansion, random temporal pooling, and heterogeneous aggregation;
in one embodiment, temporal expansion (TE) is performed on the original graph data so that the nodes and edges of the original graph data structure are expanded, comprising:
for a given heterogeneous information network $\mathcal{G}$, TE is defined as a transformation of $\mathcal{G}$ into the expanded graph $\mathcal{G}^+ = (\mathcal{V}^+, \mathcal{E}^+)$, where $\mathcal{V}^+$ and $\mathcal{E}^+$ denote the expanded node set and the expanded edge set respectively;
an expansion node is created for each dynamic edge and represents a transitional state in the evolution; the dynamic edge is reattached from the dynamic node to its expansion node, and expansion edges are constructed to connect adjacent expansion nodes to each other, which expresses the continuity of nodes and edges as they evolve over time.
A static embedding is obtained from the static node set and a dynamic embedding from the dynamic node set, and the first embedding matrix $E$ is generated from the static embedding and the dynamic embedding:
the static embedding holds the embeddings of the static nodes and the dynamic embedding holds the embeddings of the time sequences; $E$ is a parameter of the model and is gradually optimized by gradient descent;
because temporal expansion creates a large number of new nodes, each of which must be represented by a unique vector, the model acquires a large number of parameters. Training parameters at this scale is computationally expensive: it adds time overhead to training, places higher demands on hardware, and requires extra memory to store the many variables in the heterogeneous aggregation network;
in one embodiment, to reduce the number of nodes, reduce the number of computations, avoid overfitting, and expand perception, further comprising a random pooling based downsampling method:
sequence the time sequenceDividing the time period into a plurality of continuous intervals with equal length, taking the average value of nodes in each interval as the embedding of the interval, and taking each interval as the representation of the average state of the time period;
the training process is greatly reduced in scale and smoothed in sequence through a random time sequence pooling operation, so that the training process is continuous, and main information is reserved at the same time, and the training process comprises the following steps:
D=PE
the acquired first embedding matrix E is transformed into the second embedding matrix D by the transformation matrix P, where N denotes the total number of nodes and P can be expressed by the following formula:
Preferably, since m/s cannot be guaranteed to be an integer, there may be an interval in which the number of nodes is less than s; the size of the first interval is preset to $s_1$, where $s_1$ is a discrete random variable obeying a uniform distribution, $s_1 \in \{1, \dots, s\}$;
Once $s_1$ is determined, the total number of intervals is determined;
since $s_1$ is a random variable, $s_1$ is regenerated for each time sequence before each gradient descent step during training; different values of $s_1$ are therefore taken, so that each node can eventually learn a unique representation; conversely, if $s_1$ were always fixed to a specific value, the time sequence expansion would be meaningless, because the nodes within each interval would converge to the same representation;
wherein ,representation s 1 Transformation matrix at=n, ++>Representation s 1 A second embedding matrix when n,
so long as itFor a matrix of full rank, each node can learn a unique representation, obviously +.>Can be treated as an upper triangular matrix, i.e. full rank.
The random time sequence pooling operation described above is applied only to the extension nodes of dynamic nodes, while static nodes remain unchanged; since extension nodes typically far outnumber static nodes, random time sequence pooling can reduce the total number of nodes severalfold;
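The pooling step D = PE can be sketched for a single time sequence as follows. This is an illustration under stated assumptions, not the patent's own formula for P: the helper names `interval_bounds`, `pooling_matrix`, and `matmul` are hypothetical, the sequence length is called `m`, and each row of P is assumed to hold equal weights over one interval so that the product is an interval-wise mean. Under this reading, a first interval of size `s1` followed by intervals of size `s` gives `1 + ceil((m - s1) / s)` intervals; that count is derived here from the description rather than copied from the source.

```python
import math
import random

def interval_bounds(m, s, s1):
    """Split positions 0..m-1 into a first interval of size s1, then intervals of size s."""
    bounds, start, size = [], 0, min(s1, m)
    while start < m:
        end = min(start + size, m)     # the last interval may hold fewer than s nodes
        bounds.append((start, end))
        start, size = end, s
    return bounds

def pooling_matrix(m, s, s1):
    """Row k averages the nodes of interval k, so D = P E is an interval-wise mean."""
    bounds = interval_bounds(m, s, s1)
    P = [[0.0] * m for _ in bounds]
    for k, (a, b) in enumerate(bounds):
        for j in range(a, b):
            P[k][j] = 1.0 / (b - a)
    return P

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

m, s = 7, 3
E = [[float(t), float(t * t)] for t in range(m)]  # toy first embedding of one sequence
s1 = random.randint(1, s)      # uniform on {1, ..., s}; redrawn before each gradient step
P = pooling_matrix(m, s, s1)
D = matmul(P, E)               # second embedding: one row per interval
```

Because `s1` is redrawn before every gradient step, the interval boundaries shift from step to step, which is what lets each extension node receive a distinct gradient signal over time. Rows for static nodes would simply be identity rows, since the text states that static nodes are left unchanged.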
in one embodiment, after the random time sequence pooling operation is performed, a heterogeneous aggregation operation is performed, including:
it should be noted that the attributes of a node comprise two parts: the features of the node itself and the features of the environment in which the node is located; the latter can be constructed by feature aggregation. In a heterogeneous graph, the diversity of nodes and edges makes aggregation challenging: any node may be connected to several different types of edges, resulting in various relationships;
thus, attention at both the node level and the relationship level is considered when performing the aggregation operation. Let the feature of node $v_i$ at layer $l$ satisfy the following formula:
where the coefficient is the attention of $v_i$ at layer $l$ under relationship $r$, and the accompanying term is the node-level aggregated feature;
thus, if a relationship does not belong to this set, the corresponding attention is taken to be 0. $b_{k,l}$ corresponds only to a relationship, not to a specific node, and relationships are directional; for example, the attention of a knowledge concept to an exercise may differ from the attention of an exercise to a knowledge concept;
further:
where the first quantity is the weighted sum of neighbor features, computed under each relation; the second is the feature of $v_j$ at the previous layer; and the third is the set of neighbors under relationship $r$;
ρ is the leaky rectified linear unit (LeakyReLU); the symbol || denotes vector concatenation; and $a_{r,l}$ is an attention vector belonging to the model parameters.
$H^{l} = A^{l} H^{l-1}$;
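where $H^{l}$ is the aggregated feature matrix of layer $l$, $A^{l}$ is the aggregation matrix of layer $l$, and the entry in row $i$, column $j$ of $A^{l}$ is determined by the attention terms where a connection exists, with all other, non-existent entries equal to 0;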
the aggregated features of all layers are concatenated to obtain the final representation matrix, i.e., the final embedding matrix Z, so that it contains information from every level:
$Z = [H^{0}, H^{1}, \dots, H^{L}]$;
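The layer-wise aggregation and the final concatenation can be sketched as below. The exact attention formulas are not recoverable from the text, so this sketch assumes a common graph-attention form: node-level attention is a softmax over LeakyReLU scores of the concatenated vectors, and relation-level attention is a softmax over one scalar parameter per relation, restricted to the relations a node actually has. The names `aggregation_matrix`, `embed`, `attn_vecs`, and `rel_params` are hypothetical, as are the toy relation names.

```python
import math

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def softmax(xs):
    top = max(xs)
    exps = [math.exp(x - top) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def aggregation_matrix(H, neighbors, attn_vec, rel_param):
    """Entry (i, j) sums relation attention times node attention; absent pairs stay 0."""
    n = len(H)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        rels = [r for r in neighbors if neighbors[r].get(i)]
        if not rels:
            continue  # isolated node: its row stays zero in this sketch
        betas = softmax([rel_param[r] for r in rels])    # relation-level attention
        for r, beta in zip(rels, betas):
            nbrs = neighbors[r][i]
            scores = [leaky_relu(sum(w * x for w, x in zip(attn_vec[r], H[i] + H[j])))
                      for j in nbrs]                     # H[i] + H[j] is [h_i || h_j]
            for j, alpha in zip(nbrs, softmax(scores)):  # node-level attention
                A[i][j] += beta * alpha
    return A

def embed(H0, neighbors, attn_vecs, rel_params):
    """Apply H^l = A^l H^(l-1) per layer, then concatenate all layers row by row."""
    layers = [H0]
    for attn_vec, rel_param in zip(attn_vecs, rel_params):
        A = aggregation_matrix(layers[-1], neighbors, attn_vec, rel_param)
        layers.append(matmul(A, layers[-1]))
    return [sum((H[i] for H in layers), []) for i in range(len(H0))]

# Toy graph: node 0 is a learner state, nodes 1 and 2 are exercises (assumption).
H0 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
neighbors = {"answers": {0: [1, 2]}, "answered_by": {1: [0], 2: [0]}}
attn_vecs = [{"answers": [0.1, 0.2, 0.3, 0.4], "answered_by": [0.0, 0.0, 0.0, 0.0]}]
rel_params = [{"answers": 0.5, "answered_by": -0.5}]
Z = embed(H0, neighbors, attn_vecs, rel_params)
```

Keeping separate entries for "answers" and "answered_by" reflects the directional nature of relationships noted above: the two directions carry their own attention parameters.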
in one embodiment, after heterogeneous aggregation, outputting the objective function includes:
the objective comprises a first objective function for edge reconstruction, used to reconstruct edges during graph learning, and a second objective function for weight regression, used to fit the weights of the edges.
The correlation between nodes is measured by a similarity measure: the larger the inner product of two node vectors, the higher the probability that an edge exists. Since the inner product ranges over all real numbers, it is mapped to [0, 1] by a sigmoid function δ, and the resulting first objective function is:
where $P_n(v)$ is the noise distribution and $M_e$ is the negative sampling rate.
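Since the formula for the first objective function is not reproduced in the text, the following is only a sketch of a standard negative-sampling loss that matches the ingredients named above: a sigmoid of the inner product, a noise distribution, and a number of negative samples per edge. It assumes a uniform noise distribution and averages over edges; the function name `edge_reconstruction_loss` and the parameter `num_negatives` (playing the role of $M_e$) are hypothetical.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def edge_reconstruction_loss(Z, edges, num_negatives, rng):
    """Negative-sampling loss: pull linked nodes together, push sampled pairs apart."""
    total = 0.0
    n = len(Z)
    for i, j in edges:
        total -= math.log(sigmoid(dot(Z[i], Z[j])))        # observed edge
        for _ in range(num_negatives):
            k = rng.randrange(n)                           # uniform noise (assumption)
            total -= math.log(sigmoid(-dot(Z[i], Z[k])))   # sampled noise node
    return total / len(edges)

Z_toy = [[2.0, 0.0], [2.0, 0.0], [-1.0, 0.5]]
loss_without_negatives = edge_reconstruction_loss(Z_toy, [(0, 1)], 0, random.Random(0))
loss_with_negatives = edge_reconstruction_loss(Z_toy, [(0, 1)], 3, random.Random(0))
```

A real noise distribution would usually exclude true neighbors or weight nodes by degree; the uniform choice here is only for brevity.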
Edge weight regression is then performed to obtain the second objective function, which fits the weights of the edges, as follows:
a triple $o_{ij} = \langle v_i, v_j, y_{ij} \rangle$ describes the relationship between nodes and the predicted outcome: $v_i$ represents a learner, $v_j$ represents an exercise, and $y_{ij}$ represents the predicted result, where $y_{ij} = 1$ indicates that the learner answered correctly and $y_{ij} = 0$ indicates that the learner answered incorrectly; over the set consisting of all triples $o_{ij}$ in the given dataset, the second objective function is obtained as:
where $p_{ij}$ is the probability that the learner correctly completes the exercise;
the second objective function is the cross-entropy loss of the binary prediction problem;
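A binary cross-entropy over the triples can be sketched as follows. The text identifies the second objective function as a cross-entropy loss but does not show whether it is summed or averaged; this sketch assumes the mean. The function name `weight_regression_loss`, the `predict` callback, and the clamping constant `eps` are assumptions made for illustration.

```python
import math

def weight_regression_loss(triples, predict, eps=1e-12):
    """triples: iterable of (i, j, y) with y in {0, 1}; predict(i, j) returns p_ij."""
    total, count = 0.0, 0
    for i, j, y in triples:
        p = min(max(predict(i, j), eps), 1.0 - eps)   # clamp so the logarithm stays finite
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
        count += 1
    return total / count

# Toy check: a predictor that always outputs 0.5 gives a loss of log 2.
toy_triples = [(0, 0, 1), (0, 1, 0)]
uninformed_loss = weight_regression_loss(toy_triples, lambda i, j: 0.5)
```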
preferably, an intuitive way to generate the prediction probability $p_{ij}$ is as follows:
the vectors of the two nodes are concatenated, and the prediction probability is obtained directly with a multi-layer perceptron (MLP) predictor:
$p_{ij} = \mathrm{sigmoid}(W_2 \cdot \sigma(W_1 \cdot [z_i \| z_j] + b_1) + b_2)$;
$W_1$, $W_2$, $b_1$, $b_2$ are parameters of the model, and σ is a nonlinear activation function (we use tanh);
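The predictor above can be written out directly. The sketch below assumes a single hidden layer with `W1` stored as a list of rows, `W2` as a single weight vector, and `b2` as a scalar; these shapes and the function name `mlp_predict` are illustrative choices, not details given in the text.

```python
import math

def mlp_predict(z_i, z_j, W1, b1, W2, b2):
    """p_ij = sigmoid(W2 . tanh(W1 . [z_i || z_j] + b1) + b2) with one hidden layer."""
    x = z_i + z_j                                   # list concatenation: [z_i || z_j]
    hidden = [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    logit = sum(w * h for w, h in zip(W2, hidden)) + b2
    return 1.0 / (1.0 + math.exp(-logit))

z_learner, z_exercise = [0.5, -0.2], [0.1, 0.3]
W1 = [[0.2, -0.1, 0.4, 0.3], [0.0, 0.5, -0.3, 0.1], [0.1, 0.1, 0.1, 0.1]]
b1 = [0.0, 0.1, -0.1]
W2 = [0.7, -0.4, 0.2]
b2 = 0.05
p = mlp_predict(z_learner, z_exercise, W1, b1, W2, b2)
```

Because the two vectors are concatenated rather than multiplied, the predictor is not symmetric in its arguments, which suits the learner-to-exercise direction of the triples.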
the final objective function is obtained as follows:
λ is a balance coefficient;
to reduce the variance of parameter updates, stabilize convergence, and reduce storage overhead, batch training is used: the objective is built from a specified number of sampled edges and triples at each step, so that all edges and triples take part in training after several rounds of gradient descent.
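One way to realize this batching is to shuffle the edges and triples once per pass and walk through them in fixed-size slices, which guarantees that every item is used after a known number of steps. The sketch below assumes this sampling-without-replacement reading. It also assumes that the final objective adds the two terms with λ weighting the weight-regression term; the text does not show which term λ multiplies, so `combined_loss` is a guess at the form, and both function names are hypothetical.

```python
import random

def minibatches(items, batch_size, rng):
    """Yield shuffled slices so every item appears exactly once per pass."""
    order = list(items)
    rng.shuffle(order)
    for start in range(0, len(order), batch_size):
        yield order[start:start + batch_size]

def combined_loss(edge_loss, regression_loss, lam):
    """Assumed form: edge reconstruction term plus lam times the weight regression term."""
    return edge_loss + lam * regression_loss

rng = random.Random(0)
edge_ids = range(10)
batches = list(minibatches(edge_ids, 4, rng))   # 3 steps cover all 10 edges
example_total = combined_loss(1.0, 2.0, 0.5)
```

In a training loop, one batch of edges and one batch of triples would be drawn per gradient step, the two losses evaluated on those batches, and the combined value minimized.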
In another aspect, the invention further provides a dynamic graph embedding system for the intelligent learning guiding system; the system described below and the method described above may be referred to in correspondence with each other. The system specifically comprises a time sequence expansion module, a random time sequence pooling module, a heterogeneous aggregation network module, and a loss calculation module:
the time sequence expansion module is used for performing time sequence expansion on dynamic nodes in a heterogeneous evolution network of the intelligent learning guiding system, acquiring an expanded graph data structure comprising an expanded node set and an expanded edge set, and generating a time sequence of the dynamic nodes;
the time sequence expansion module is also used for randomly projecting all nodes and generating a first embedding matrix based on the sequence of the static nodes and the time sequence of the dynamic nodes;
the random time sequence pooling module is used for dividing the time sequence into a plurality of consecutive intervals of equal length and taking the mean of the nodes in each interval as the embedding of that interval to generate a transformation matrix; the first embedding matrix is pooled through the transformation matrix to reduce the number of nodes, and a second embedding matrix is output;
the heterogeneous aggregation network module is used for acquiring, for each node in the second embedding matrix, the aggregated features of all nodes based on node-level attention features and relationship-level attention features, and generating a final embedding matrix;
the loss calculation module is used for performing edge reconstruction and weight regression based on the nodes in the final embedding matrix, constructing an objective function, and predicting the learner's training result through the objective function.
In another aspect, the present invention also provides a computer program product comprising a computer program stored on a non-transitory computer readable storage medium, the computer program comprising program instructions which, when executed by a computer, are capable of performing the steps of the method for embedding a dynamic diagram of an intelligent learning guide system provided by the above methods.
In yet another aspect, the present invention further provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, is implemented to perform the steps of the method for dynamic graph embedding of the intelligent guide system provided above.
The apparatus embodiments described above are merely illustrative, wherein the elements illustrated as separate elements may or may not be physically separate, and the elements shown as elements may or may not be physical elements, may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art will understand and implement the present invention without undue burden.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.
Claims (10)
1. A dynamic graph data embedding method for an intelligent learning guiding system, characterized by comprising the following steps:
performing time sequence expansion on dynamic nodes in a heterogeneous evolution network of the intelligent learning guiding system to obtain an expanded graph data structure, which comprises an expanded node set and an expanded edge set, and generating a time sequence of the dynamic nodes;
randomly projecting all nodes, and generating a first embedding matrix based on the sequence of the static nodes and the time sequence of the dynamic nodes;
dividing the time sequence into a plurality of consecutive intervals of equal length, and taking the mean of the nodes in each interval as the embedding of that interval to generate a transformation matrix; pooling the first embedding matrix through the transformation matrix to reduce the number of nodes, and outputting a second embedding matrix;
for each node in the second embedding matrix, acquiring the aggregated features of all nodes based on node-level attention features and relationship-level attention features, and generating a final embedding matrix;
and performing edge reconstruction and weight regression based on the nodes in the final embedding matrix, constructing an objective function, and predicting the learner's training result through the objective function.
2. The method for embedding dynamic graph data in an intelligent learning guiding system according to claim 1, wherein the step of performing time sequence expansion on dynamic nodes in a heterogeneous evolution network of the intelligent learning guiding system comprises the steps of:
creating an extension node for each dynamic edge, reconnecting the dynamic edge from the dynamic node to the corresponding extension node, creating extension edges to link adjacent extension nodes, and generating the time sequence of the dynamic nodes.
3. The method for embedding dynamic graph data in an intelligent learning system according to claim 2, wherein the step of randomly projecting all nodes and generating a first embedding matrix based on a sequence of static nodes and a time sequence of dynamic nodes comprises:
the node set comprises a dynamic node set and a static node set; a static embedding is obtained from the static node set and a dynamic embedding is obtained from the dynamic node set; and the first embedding matrix is formed from the static embedding and the dynamic embedding:
4. The method for embedding dynamic graph data in an intelligent learning guiding system according to claim 3, wherein, for a given time sequence, $s$ is the number of nodes in each interval and $s > 1$; the size of the first interval is preset to $s_1$, where $s_1$ is a discrete random variable obeying a uniform distribution, $s_1 \in \{1, \dots, s\}$;
once $s_1$ is determined, the total number of intervals is:
5. The method of claim 4, wherein outputting the second embedding matrix comprises:
the first embedding matrix is pooled through the transformation matrix to reduce the number of nodes, and the second embedding matrix is output, comprising:
6. The method for embedding dynamic graph data in an intelligent learning guiding system according to claim 5, wherein for each node in the second embedding matrix, calculating a node level attention feature and a relationship level attention feature, and obtaining the aggregate features of all nodes to generate a final embedding matrix, comprises:
let the feature of node $v_i$ at layer $l$ satisfy the following formula,
where the coefficient is the attention of $v_i$ at layer $l$ under relation $r$, the accompanying term is the node-level aggregated feature, and $b_{r,l}$ is a parameter of the model, with:
where the first quantity is the weighted sum of neighbor features, computed under each relation; the second is the feature of $v_j$ at the previous layer; and the third is the set of neighbors under relationship $r$;
ρ is the leaky rectified linear unit (LeakyReLU), the symbol || denotes vector concatenation, and $a_{r,l}$ is an attention vector belonging to the model parameters;
$H^{l} = A^{l} H^{l-1}$;
where $H^{l}$ is the aggregated feature matrix of layer $l$, $A^{l}$ is the aggregation matrix of layer $l$, and the entry in row $i$, column $j$ of $A^{l}$ is determined by the attention terms where a connection exists, with all other, non-existent entries equal to 0;
the aggregated features of all layers are concatenated to obtain the final embedding matrix Z:
$Z = [H^{0}, H^{1}, \dots, H^{L}]$;
where $H^{0} = D$ and $z_i$ is the $i$-th row of $Z$.
7. The method for embedding dynamic graph data in an intelligent learning guide system according to claim 6, wherein performing edge reconstruction and weight regression based on nodes in the final embedding matrix to construct an objective function, and predicting a result of completion of training by a learner through the objective function, comprises:
the correlation between nodes is measured by a similarity measure, where a larger inner product of the corresponding vectors indicates a higher probability that an edge exists; the inner product is mapped to the range [0, 1] by a sigmoid function δ to obtain the first objective function:
where $P_n(v)$ is the noise distribution and $M_e$ is the negative sampling rate;
weight regression of the edges is performed, using a triple $o_{ij} = \langle v_i, v_j, y_{ij} \rangle$ to describe the relationship between nodes and the predicted outcome:
$v_i$ represents a learner, $v_j$ represents an exercise, and $y_{ij}$ represents the predicted result: $y_{ij} = 1$ indicates that the learner answered correctly, and $y_{ij} = 0$ indicates that the learner answered incorrectly;
the second objective function is obtained over the set consisting of all triples $o_{ij}$ in the given dataset:
where $p_{ij}$ is the probability that the learner correctly completes the exercise.
8. The method for embedding dynamic graph data in an intelligent learning guiding system according to claim 7, wherein performing edge reconstruction and weight regression based on the nodes in the final embedding matrix to construct an objective function, and predicting the learner's result through the objective function, comprises computing the prediction probability $p_{ij}$ as:
$p_{ij} = \mathrm{sigmoid}(W_2 \cdot \sigma(W_1 \cdot [z_i \| z_j] + b_1) + b_2)$;
where $W_1$, $W_2$, $b_1$, $b_2$ are parameters of the model and σ is a nonlinear activation function; the final objective function is obtained as follows:
where λ is the equilibrium coefficient.
9. A dynamic diagram embedding system of an intelligent learning guiding system, comprising:
the time sequence expansion module is used for performing time sequence expansion on dynamic nodes in a heterogeneous evolution network of the intelligent learning guiding system, acquiring an expanded graph data structure comprising an expanded node set and an expanded edge set, and generating a time sequence of the dynamic nodes;
the time sequence expansion module is also used for randomly projecting all nodes and generating a first embedding matrix based on the sequence of the static nodes and the time sequence of the dynamic nodes;
the random time sequence pooling module is used for dividing the time sequence into a plurality of consecutive intervals of equal length and taking the mean of the nodes in each interval as the embedding of that interval to generate a transformation matrix; the first embedding matrix is pooled through the transformation matrix to reduce the number of nodes, and a second embedding matrix is output;
the heterogeneous aggregation network module is used for acquiring, for each node in the second embedding matrix, the aggregated features of all nodes based on node-level attention features and relationship-level attention features, and generating a final embedding matrix;
and the loss calculation module is used for performing edge reconstruction and weight regression based on the nodes in the final embedding matrix, constructing an objective function, and predicting the learner's training result through the objective function.
10. A non-transitory computer readable storage medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the steps of the method for embedding dynamic graph data in an intelligent learning guiding system according to any one of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310342163.7A CN116257659A (en) | 2023-03-31 | 2023-03-31 | Dynamic diagram embedding method and system of intelligent learning guiding system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116257659A true CN116257659A (en) | 2023-06-13 |
Family
ID=86682639
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310342163.7A Pending CN116257659A (en) | 2023-03-31 | 2023-03-31 | Dynamic diagram embedding method and system of intelligent learning guiding system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116257659A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||