CN115115862A - High-order relation knowledge distillation method and system based on heterogeneous graph neural network - Google Patents

High-order relation knowledge distillation method and system based on heterogeneous graph neural network

Info

Publication number
CN115115862A
CN115115862A
Authority
CN
China
Prior art keywords
model
knowledge
student
teacher
heterogeneous
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210553500.2A
Other languages
Chinese (zh)
Inventor
刘静
郝沁汾
叶笑春
范东睿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Computing Technology of CAS
Original Assignee
Institute of Computing Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Computing Technology of CAS filed Critical Institute of Computing Technology of CAS
Priority to CN202210553500.2A priority Critical patent/CN115115862A/en
Publication of CN115115862A publication Critical patent/CN115115862A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/762 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a high-order relation knowledge distillation method and system based on a heterogeneous graph neural network. Specifically, node-level knowledge distillation encodes the semantics of individual nodes of a pre-trained heterogeneous teacher model, and relation-level knowledge distillation models the semantic relationships among different types of nodes of the pre-trained heterogeneous teacher model. By integrating node-level knowledge distillation and relation-level knowledge distillation, the high-order relation knowledge distillation method becomes a practical and general training method applicable to any heterogeneous graph neural network; it not only improves the performance and generalization ability of the heterogeneous student model, but also ensures that both node-level and relation-level knowledge is extracted from the heterogeneous graph neural network.

Description

High-order relation knowledge distillation method and system based on heterogeneous graph neural network
Technical Field
The invention relates to the field of graph data mining, in particular to the field of heterogeneous graph data mining, and more particularly relates to a high-order relation knowledge distillation method and system based on a heterogeneous graph neural network.
Background
Heterogeneous graphs are ubiquitous in academia and industry, a large number of heterogeneous graph neural networks (HGNNs) have been proposed in recent years, and learning node representations in heterogeneous graphs is a hot topic of current research. Compared with homogeneous graphs, heterogeneous graph modeling has the advantage of integrating richer information. However, how to embed the rich structural and semantic information of a heterogeneous graph into low-dimensional node representations remains a serious challenge.
In recent years, to address the heterogeneity of nodes and edges in heterogeneous graphs, researchers have proposed many HGNN-based methods, mainly divided into meta-path-based methods and edge-relation-based methods. To capture the heterogeneity of edges, edge-relation-based methods directly use relation-specific matrices to process the edge relations between various node types in different metric spaces; examples include the heterogeneous graph neural network models RGCN, HGT and HGConv. However, edge-relation-based methods can only capture local structural information of the heterogeneous graph. To encode the rich semantic information in a heterogeneous graph, meta-path-based methods were proposed. A meta-path is an effective semantic mining tool that can capture more complex and richer high-order semantic information among nodes in a heterogeneous graph; HAN is the pioneering work among meta-path-based methods.
Although existing HGNNs achieve good performance, their representation ability is limited by: (1) imprecise data labeling. HGNN training is generally semi-supervised, so its performance depends heavily on a large amount of high-quality labeled data; ambiguous data labeling therefore becomes a bottleneck for HGNN modeling. (2) The difficulty of modeling semantic relationships between different types of nodes. Although meta-paths are used for high-order semantic modeling in heterogeneous graphs, selecting meta-paths in different domains remains challenging because it requires sufficient domain knowledge.
In recent years, knowledge distillation (KD) techniques in deep learning have shown clear advantages in improving model performance, and some works have attempted to combine knowledge distillation with graph neural networks. However, these methods are designed for homogeneous graph neural networks, where every node or edge in the processed data is of the same type.
Disclosure of Invention
The invention aims to overcome two defects faced by HGNNs in the prior art, namely inaccurate data annotation and the difficulty of modeling semantic relationships, and provides a high-order relation knowledge distillation method based on a heterogeneous graph neural network, comprising the following steps:
step S1, obtaining a heterogeneous graph neural network model whose knowledge is to be distilled as the teacher model and a heterogeneous graph neural network model that is to receive the knowledge as the student model, and obtaining the output-layer model predictions and the intermediate graph-convolution-layer heterogeneous node embeddings of the teacher model and the student model;
step S2, extracting the first-order node-level soft-label knowledge of the teacher model through node-level knowledge distillation based on the model predictions of the teacher model and the student model;
step S3, extracting the second-order relation-level heterogeneous semantic knowledge of the teacher model through relation-level knowledge distillation based on the intermediate graph-convolution-layer embeddings of the teacher model and the student model;
and step S4, integrating the first-order node-level soft-label knowledge and the second-order relation-level heterogeneous semantic knowledge into high-order relation knowledge, training the student model based on the high-order relation knowledge, and using the trained student model for a specified task.
In the above high-order relation knowledge distillation method based on a heterogeneous graph neural network, step S1 comprises:
acquiring a heterogeneous data set D containing n training-set samples, the feature dimension of each sample being d; constructing a teacher model T and a student model S with the same configuration, each comprising 5 layers: an input layer, a first graph-convolution layer, a second graph-convolution layer, an MLP linear transformation layer and a Softmax output layer; the neural network parameters of the teacher and the student are W_t and W_s respectively, and the activation function used by the convolution layers is ReLU, f(x) = max(x, 0);
the intermediate graph-convolution-layer heterogeneous node embeddings of the teacher model and the student model comprise: the input sample feature is h_0 and the convolution-layer embedding is h, so that h_t = ReLU(W_t * h_0) and h_s = ReLU(W_s * h_0); the output of the MLP linear transformation layer is z, and the linear-transformation-layer outputs of the teacher and student models are z_t and z_s respectively;
the model predictions of the teacher model and the student model comprise: the output of the Softmax layer is p, so that p_t = Softmax(z_t) and p_s = Softmax(z_s).
In the above high-order relation knowledge distillation method based on a heterogeneous graph neural network, step S2 comprises:
using the teacher and student model predictions p_t, p_s, transferring the soft-label knowledge of the teacher model to the student model with a node-level knowledge distillation method, and obtaining the first-order node-level distillation loss L_NKD as the first-order node-level soft-label knowledge:
L_NKD = (1 - α) * L_CE + α * L_KD
where L_CE is the basic cross-entropy loss between the student predictions and the ground-truth labels, L_KD = D(Softmax(z_t/τ), Softmax(z_s/τ)) is the distillation loss, α is a hyperparameter balancing the cross-entropy loss and the distillation loss, D(·) is a KL-divergence metric, and Softmax(z/τ) denotes the softmax probability output scaled by the temperature coefficient τ.
In the above high-order relation knowledge distillation method based on a heterogeneous graph neural network, step S3 comprises:
using the intermediate graph-convolution-layer embeddings h_t, h_s of the teacher and the student, transferring the high-order semantic relation knowledge of the teacher model to the student model with a relation-level knowledge distillation method;
the correlation matrices MetaCorr of the teacher and student network models are:
MetaCorr_t(i, j) = K(h_t^i, h_t^j),  MetaCorr_s(i, j) = K(h_s^i, h_s^j),  i, j ∈ {1, ..., k}
where k is the total number of heterogeneous node types in the heterogeneous data set D, i and j denote nodes of different types, and K(·, ·) is a Gaussian kernel function;
the intermediate-layer embeddings are nonlinearly transformed and a shared attention vector q is applied to obtain the attention values of the student model:
e_i = q^T · σ(W_s · h_i + b_s)
where W_s is a weight matrix, b_s is a bias vector and σ(·) denotes the nonlinear transformation; the attention values are normalized through a softmax function to obtain the final attention coefficients:
α_i = exp(e_i) / Σ_j exp(e_j)
the second-order relation-level knowledge distillation loss L_RKD is obtained as the second-order relation-level heterogeneous semantic knowledge:
L_RKD = D(MetaCorr_t, MetaCorr_s)
where D is the mean square error.
In the above high-order relation knowledge distillation method based on a heterogeneous graph neural network, step S4 comprises:
integrating L_NKD and L_RKD into the final total loss L of the high-order relation knowledge distillation scheme, as the high-order relation knowledge, so as to train the student model end to end:
L = L_NKD + β * L_RKD
where β is a hyperparameter balancing L_NKD and L_RKD.
In the above method, a training-set sample comprises a movie name, a director, actors and a movie category, and the specified task comprises inputting the movie name and/or director and/or actors to be classified into the student model to obtain the movie category to which the movie belongs.
The invention also provides a high-order relation knowledge distillation system based on a heterogeneous graph neural network, comprising:
a model acquisition module, for obtaining a heterogeneous graph neural network model whose knowledge is to be distilled as the teacher model and a heterogeneous graph neural network model that is to receive the knowledge as the student model, and for obtaining the output-layer model predictions and the intermediate graph-convolution-layer heterogeneous node embeddings of the teacher model and the student model;
a first knowledge extraction module, for extracting the first-order node-level soft-label knowledge of the teacher model through node-level knowledge distillation based on the model predictions of the teacher model and the student model;
a second knowledge extraction module, for extracting the second-order relation-level heterogeneous semantic knowledge of the teacher model through relation-level knowledge distillation based on the intermediate graph-convolution-layer embeddings of the teacher model and the student model;
a training module, for integrating the first-order node-level soft-label knowledge and the second-order relation-level heterogeneous semantic knowledge into high-order relation knowledge, training the student model based on the high-order relation knowledge, and using the trained student model for a specified task;
the model acquisition module is configured to:
acquire a heterogeneous data set D containing n training-set samples, the feature dimension of each sample being d; construct a teacher model T and a student model S with the same configuration, each comprising 5 layers: an input layer, a first graph-convolution layer, a second graph-convolution layer, an MLP linear transformation layer and a Softmax output layer; the neural network parameters of the teacher and the student are W_t and W_s respectively, and the activation function used by the convolution layers is ReLU, f(x) = max(x, 0);
the intermediate graph-convolution-layer heterogeneous node embeddings of the teacher model and the student model comprise: the input sample feature is h_0 and the convolution-layer embedding is h, so that h_t = ReLU(W_t * h_0) and h_s = ReLU(W_s * h_0); the output of the MLP linear transformation layer is z, and the linear-transformation-layer outputs of the teacher and student models are z_t and z_s respectively;
the model predictions of the teacher model and the student model comprise: the output of the Softmax layer is p, so that p_t = Softmax(z_t) and p_s = Softmax(z_s);
the first knowledge extraction module is configured to:
use the teacher and student model predictions p_t, p_s, transfer the soft-label knowledge of the teacher model to the student model with a node-level knowledge distillation method, and obtain the first-order node-level distillation loss L_NKD as the first-order node-level soft-label knowledge:
L_NKD = (1 - α) * L_CE + α * L_KD
where L_CE is the basic cross-entropy loss between the student predictions and the ground-truth labels, L_KD = D(Softmax(z_t/τ), Softmax(z_s/τ)) is the distillation loss, α is a hyperparameter balancing the cross-entropy loss and the distillation loss, D(·) is a KL-divergence metric, and Softmax(z/τ) denotes the softmax probability output scaled by the temperature coefficient τ;
the second knowledge extraction module is configured to:
use the intermediate graph-convolution-layer embeddings h_t, h_s of the teacher and the student, and transfer the high-order semantic relation knowledge of the teacher model to the student model with a relation-level knowledge distillation method;
the correlation matrices MetaCorr of the teacher and student network models are:
MetaCorr_t(i, j) = K(h_t^i, h_t^j),  MetaCorr_s(i, j) = K(h_s^i, h_s^j),  i, j ∈ {1, ..., k}
where k is the total number of heterogeneous node types in the heterogeneous data set D, i and j denote nodes of different types, and K(·, ·) is a Gaussian kernel function;
the intermediate-layer embeddings are nonlinearly transformed and a shared attention vector q is applied to obtain the attention values of the student model:
e_i = q^T · σ(W_s · h_i + b_s)
where W_s is a weight matrix, b_s is a bias vector and σ(·) denotes the nonlinear transformation; the attention values are normalized through a softmax function to obtain the final attention coefficients:
α_i = exp(e_i) / Σ_j exp(e_j)
the second-order relation-level knowledge distillation loss L_RKD is obtained as the second-order relation-level heterogeneous semantic knowledge:
L_RKD = D(MetaCorr_t, MetaCorr_s)
where D is the mean square error;
the training module is configured to:
integrate L_NKD and L_RKD into the final total loss L of the high-order relation knowledge distillation scheme, as the high-order relation knowledge, so as to train the student model end to end:
L = L_NKD + β * L_RKD
where β is a hyperparameter balancing L_NKD and L_RKD.
In the above high-order relation knowledge distillation system based on a heterogeneous graph neural network, a training-set sample comprises a movie name, a director, actors and a movie category, and the specified task comprises inputting the movie name and/or director and/or actors to be classified into the student model to obtain the movie category to which the movie belongs.
The invention also provides a storage medium for storing a program that executes any one of the above high-order relation knowledge distillation methods based on a heterogeneous graph neural network.
The invention also provides a client for use with any one of the above high-order relation knowledge distillation systems based on a heterogeneous graph neural network.
The embodiment of the invention provides a high-order relation knowledge distillation method that applies knowledge distillation to heterogeneous graph neural networks for the first time, filling the gap of extracting knowledge from heterogeneous graph models. The scheme combines first-order node-level knowledge distillation with second-order relation-level knowledge distillation and can be flexibly applied to any HGNN model. With this scheme, the student model can fully exploit the soft-label knowledge and the high-order heterogeneous relation knowledge hidden in the HGNN, so its generalization ability is improved and its performance is significantly better than that of the corresponding teacher model.
Drawings
FIG. 1 is a schematic flow chart of the high-order relation knowledge distillation method based on a heterogeneous graph neural network according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of the high-order relation knowledge distillation method based on a heterogeneous graph neural network according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of the high-order relation knowledge distillation system based on a heterogeneous graph neural network according to an embodiment of the present invention.
Detailed Description
The invention provides a high-order relation knowledge distillation method and system based on a heterogeneous graph neural network. The specific technical solution is as follows, illustrated with classical heterogeneous data sets such as IMDB (containing three types of heterogeneous nodes: movie, director and actor), ACM (containing three types of heterogeneous nodes: paper, author and field) and DBLP (containing four types of heterogeneous nodes: paper, conference, author and keyword):
according to the first aspect of the invention, aiming at the problem of inaccurate labeling of data labels, a first-order node-level knowledge distillation (NKD) method is introduced, soft labels of target nodes (such as movies in movie data) are transmitted to students, and general supervision information is provided for downstream tasks (such as node classification). The method comprises the following steps:
step S1: respectively constructing the heterogeneous graph neural network models of the teacher and the student, and obtaining the output-layer model predictions of the teacher and the student and the intermediate graph-convolution-layer heterogeneous node embeddings;
step S2: using the model predictions of the teacher and student networks obtained in step 1, transferring the first-order node-level soft-label knowledge of the pre-trained teacher model to the student model with node-level knowledge distillation;
step S3: using the intermediate graph-convolution-layer embeddings of the teacher and student networks obtained in step 1, transferring the second-order relation-level high-order heterogeneous semantic knowledge of the pre-trained teacher model to the student model with relation-level knowledge distillation (RKD);
step S4: integrating the node-level knowledge and the relation-level knowledge of steps 2 and 3 into the final high-order relation knowledge distillation scheme, training the student model, and obtaining the trained student model by minimizing the loss until the student network converges, so that the student model can be used for different downstream tasks. On the ACM data set, the downstream tasks on the fields to which papers belong include classification, clustering and visualization; on IMDB, the downstream tasks on movies include classification, clustering and visualization; on DBLP, the downstream tasks on the authors' research fields include classification, clustering and visualization; a minimal sketch of such downstream use is given below.
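The following is a minimal sketch, under stated assumptions, of how a trained student model's outputs could feed the downstream tasks listed above: node classification from the Softmax predictions and clustering of the intermediate embeddings. The random tensors stand in for the student's actual outputs, and the use of scikit-learn's KMeans is an illustrative choice, not something mandated by the patent.

```python
import torch
from sklearn.cluster import KMeans

# stand-ins for the outputs of a trained student model on 100 movie nodes:
# p_s would be its Softmax predictions, h_s its intermediate node embeddings
p_s = torch.softmax(torch.randn(100, 3), dim=-1)
h_s = torch.randn(100, 32)

# downstream task 1: node classification (e.g. the movie category of each node)
predicted_class = p_s.argmax(dim=-1)

# downstream task 2: clustering of the learned node embeddings
clusters = KMeans(n_clusters=3, n_init=10).fit_predict(h_s.numpy())

print(predicted_class[:5], clusters[:5])
```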
In an embodiment of the present invention, step S1 further includes: inputting the heterogeneous data set and constructing the heterogeneous graph neural network models of the teacher and the student, where the data set and the models are set as follows:
a heterogeneous data set D (classical heterogeneous data such as IMDB, ACM or DBLP) is prepared, containing n training-set samples with a feature dimension of d per sample; reference teacher and student models T and S with the same configuration are constructed, each comprising 5 layers: an input layer, a first graph-convolution layer, a second graph-convolution layer, an MLP linear transformation layer and a Softmax output layer; the neural network parameters of the teacher and the student are denoted W_t and W_s respectively, and the activation function used by the convolution layers is ReLU, of the form f(x) = max(x, 0).
In one embodiment of the present invention, step S1 further includes: computing the output-layer model predictions of the teacher and the student and the intermediate graph-convolution-layer heterogeneous node embeddings, specifically:
the input sample feature is denoted h_0 and the convolution-layer embedding is h, so that h_t = ReLU(W_t * h_0) and h_s = ReLU(W_s * h_0); the output of the MLP linear transformation layer is denoted z, the linear-transformation-layer outputs of the teacher and student models are z_t and z_s respectively, the output of the Softmax layer is denoted p, and p_t = Softmax(z_t), p_s = Softmax(z_s).
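As a minimal PyTorch sketch of the five-layer configuration just described, the model below stacks an input layer, two graph-convolution layers, an MLP linear transformation layer and a Softmax output layer, and returns the intermediate embedding h, the logits z and the prediction p. The dense adjacency multiplication, the hidden sizes and the class names are illustrative assumptions; a real HGNN (RGCN, HAN, HGT, HGConv) would replace the simplified convolutions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ReferenceModel(nn.Module):
    """Five-layer reference model: input -> conv1 -> conv2 -> MLP -> Softmax.
    The 'graph convolution' is simplified to adj @ x @ W followed by ReLU."""
    def __init__(self, in_dim, hid_dim, num_classes):
        super().__init__()
        self.conv1 = nn.Linear(in_dim, hid_dim)     # first graph-convolution layer
        self.conv2 = nn.Linear(hid_dim, hid_dim)    # second graph-convolution layer
        self.mlp = nn.Linear(hid_dim, num_classes)  # MLP linear transformation layer

    def forward(self, x, adj):
        h = F.relu(self.conv1(adj @ x))   # h = ReLU(W * h0), aggregated over neighbors
        h = F.relu(self.conv2(adj @ h))   # intermediate heterogeneous node embedding h
        z = self.mlp(h)                   # logits z of the linear transformation layer
        p = F.softmax(z, dim=-1)          # prediction p of the Softmax output layer
        return h, z, p

# teacher T and student S share the same configuration
n, d, num_classes = 100, 16, 3
x = torch.randn(n, d)        # n training samples with d-dimensional features
adj = torch.eye(n)           # placeholder adjacency (self-loops only)
teacher = ReferenceModel(d, 32, num_classes)
student = ReferenceModel(d, 32, num_classes)
h_t, z_t, p_t = teacher(x, adj)
h_s, z_s, p_s = student(x, adj)
```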
In one embodiment of the present invention, step S2 includes: using the teacher and student model predictions p_t, p_s, transferring the soft-label knowledge of the teacher model to the student model with a node-level knowledge distillation method to obtain the first-order node-level distillation loss L_NKD, whose loss function is
L_NKD = (1 - α) * L_CE + α * L_KD
where L_CE is the basic cross-entropy loss between the student predictions and the ground-truth labels of the nodes i, L_KD = D(Softmax(z_t/τ), Softmax(z_s/τ)) is the distillation loss, α is a hyperparameter balancing the cross-entropy loss and the distillation loss, and D(·) is a KL-divergence metric; Softmax(z/τ) is the softmax probability output scaled by the temperature coefficient τ, and a larger hyperparameter τ makes the probability distribution over the classes smoother, encouraging the student model to learn more smoothed information.
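A minimal sketch of this node-level distillation loss, assuming the standard Hinton-style formulation that the description suggests (cross-entropy on ground-truth labels plus a temperature-scaled KL term between teacher and student outputs); the τ² rescaling of the KL term is a common convention and an assumption here, as are the default values of α and τ.

```python
import torch
import torch.nn.functional as F

def nkd_loss(z_t, z_s, labels, alpha=0.5, tau=2.0):
    """First-order node-level distillation loss L_NKD = (1 - alpha) * L_CE + alpha * L_KD."""
    # basic cross-entropy loss between student logits and ground-truth node labels
    l_ce = F.cross_entropy(z_s, labels)
    # temperature-scaled softmax outputs; a larger tau gives a smoother class distribution
    p_t_soft = F.softmax(z_t / tau, dim=-1)
    log_p_s_soft = F.log_softmax(z_s / tau, dim=-1)
    # KL divergence D(p_t || p_s); the tau**2 factor keeps gradients comparable (assumption)
    l_kd = F.kl_div(log_p_s_soft, p_t_soft, reduction="batchmean") * tau ** 2
    return (1 - alpha) * l_ce + alpha * l_kd

# toy usage with teacher/student logits for 100 nodes and 3 classes
z_t = torch.randn(100, 3)                        # teacher logits (frozen in practice)
z_s = torch.randn(100, 3, requires_grad=True)    # student logits
labels = torch.randint(0, 3, (100,))
print(nkd_loss(z_t, z_s, labels))
```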
Step S3 includes: using the intermediate graph-convolution-layer embeddings h_t, h_s of the teacher and the student, transferring the high-order semantic relation knowledge of the teacher model to the student model with a relation-level knowledge distillation method.
In an embodiment of the present invention, step S3 further includes: so that the student can fully extract the high-order semantic information hidden in the HGNN from the teacher, a MetaCorr correlation matrix is designed to encode the relation-level knowledge between different types of nodes from the pre-trained teacher model; the MetaCorr matrices of the teacher and student network models are computed as
MetaCorr_t(i, j) = K(h_t^i, h_t^j),  MetaCorr_s(i, j) = K(h_s^i, h_s^j),  i, j ∈ {1, ..., k}
where k is the total number of heterogeneous node types in the corresponding heterogeneous data set and i, j denote nodes of different types; K(·, ·) is a Gaussian kernel function measuring the similarity between two node embeddings, and a larger distance between two node representations yields a smaller kernel value. The Gaussian RBF kernel is used because it is flexible and powerful in capturing complex nonlinear relationships between nodes. To avoid the curse of dimensionality, a second-order Taylor expansion of the kernel is adopted.
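The sketch below shows a Gaussian RBF kernel between two embeddings and one plausible second-order Taylor approximation of its exponential; the kernel width σ, the expansion point and the exact approximation used in the patent are not given in the text, so this is an assumption for illustration only.

```python
import torch

def rbf_kernel(x, y, sigma=1.0):
    """Exact Gaussian RBF kernel between two embedding vectors."""
    return torch.exp(-((x - y) ** 2).sum() / (2 * sigma ** 2))

def rbf_kernel_taylor2(x, y, sigma=1.0):
    """Second-order Taylor approximation exp(t) ~ 1 + t + t**2 / 2,
    with t = -||x - y||^2 / (2 * sigma^2) (one possible reading of the expansion)."""
    t = -((x - y) ** 2).sum() / (2 * sigma ** 2)
    return 1 + t + 0.5 * t ** 2

x, y = torch.randn(32), torch.randn(32)
print(rbf_kernel(x, y), rbf_kernel_taylor2(x, y))
```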
Meanwhile, a type-aware attention layer is introduced after the convolution layer to automatically learn the importance of different node types. First, the intermediate-layer embeddings are nonlinearly transformed, and then a shared attention vector q is applied to obtain the attention values of the student model:
e_i = q^T · σ(W_s · h_i + b_s)
where W_s is a weight matrix, b_s is a bias vector and σ(·) denotes the nonlinear transformation. The attention values are then normalized through a softmax function to obtain the final attention coefficients:
α_i = exp(e_i) / Σ_j exp(e_j)
Obviously, a higher α indicates a more critical node type, and α is adjusted dynamically during model training. Finally, the second-order relation-level knowledge distillation loss L_RKD is obtained, whose loss function is
L_RKD = D(MetaCorr_t, MetaCorr_s)
where D is the mean square error (MSE) loss.
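Putting the relation-level pieces together, the sketch below pools embeddings per node type, builds a MetaCorr matrix with the Gaussian kernel, computes type-level attention coefficients from the student embeddings, and takes an MSE loss between teacher and student matrices. The per-type mean pooling, the tanh nonlinearity, and the way the attention coefficients weight the loss are assumptions, since the text does not spell out how they enter L_RKD.

```python
import torch
import torch.nn.functional as F

def type_centers(h, type_index, num_types):
    """Mean embedding of each node type (an assumed pooling; shape (num_types, dim))."""
    return torch.stack([h[type_index == t].mean(dim=0) for t in range(num_types)])

def metacorr(h, type_index, num_types, sigma=1.0):
    """k x k correlation matrix between node types via a Gaussian RBF kernel."""
    c = type_centers(h, type_index, num_types)
    diff = c.unsqueeze(0) - c.unsqueeze(1)                      # (k, k, dim)
    return torch.exp(-diff.pow(2).sum(-1) / (2 * sigma ** 2))   # (k, k)

def type_attention(h, type_index, num_types, W, b, q):
    """Type-level attention: nonlinear transform, shared vector q, softmax over types."""
    c = type_centers(h, type_index, num_types)
    e = torch.tanh(c @ W + b) @ q                               # (k,)
    return F.softmax(e, dim=0)

def rkd_loss(h_t, h_s, type_index, num_types, W, b, q):
    """Second-order relation-level loss: MSE between attention-weighted MetaCorr matrices."""
    corr_t = metacorr(h_t, type_index, num_types)
    corr_s = metacorr(h_s, type_index, num_types)
    att = type_attention(h_s, type_index, num_types, W, b, q).unsqueeze(1)
    return F.mse_loss(att * corr_s, att * corr_t)

# toy usage: 100 nodes of 3 types with 32-dimensional teacher/student embeddings
h_t = torch.randn(100, 32)
h_s = torch.randn(100, 32, requires_grad=True)
type_index = torch.randint(0, 3, (100,))
W, b, q = torch.randn(32, 32), torch.zeros(32), torch.randn(32)
print(rkd_loss(h_t, h_s, type_index, 3, W, b, q))
```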
In one embodiment of the present invention, step S4 includes: integrating the node-level knowledge distillation loss L_NKD and the relation-level knowledge distillation loss L_RKD into the final total loss L of the high-order relation knowledge distillation scheme, whose loss function is
L = L_NKD + β * L_RKD
where β is a hyperparameter balancing first-order node-level knowledge distillation and second-order relation-level knowledge distillation.
With the total loss L, the student model can be trained end to end; the loss L is minimized until the student network converges, and the trained student model can then be used for different downstream tasks.
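A minimal end-to-end training sketch combining the two losses into L = L_NKD + β·L_RKD. It reuses the ReferenceModel, nkd_loss and rkd_loss sketches above, together with x, adj, labels, type_index, W, b and q defined there; the optimizer, learning rate, epoch count and the frozen treatment of the attention parameters are illustrative assumptions.

```python
import torch

beta = 1.0
optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)

teacher.eval()                       # the pre-trained teacher is kept frozen
for epoch in range(200):
    student.train()
    with torch.no_grad():
        h_t, z_t, p_t = teacher(x, adj)
    h_s, z_s, p_s = student(x, adj)
    loss = nkd_loss(z_t, z_s, labels) \
        + beta * rkd_loss(h_t, h_s, type_index, 3, W, b, q)
    optimizer.zero_grad()
    loss.backward()                  # minimize L until the student network converges
    optimizer.step()
```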
According to a second aspect of the present invention, there is provided a computer-readable storage medium in which one or more computer programs are stored, which, when executed, implement the high-order relation knowledge distillation method based on a heterogeneous graph neural network of the present invention.
According to a third aspect of the present invention, there is provided a computing system comprising: a storage device and one or more processors; wherein the storage device is configured to store one or more computer programs which, when executed by the processor, implement the high-order relation knowledge distillation method based on a heterogeneous graph neural network of the present invention.
In order to make the aforementioned features and effects of the present invention more comprehensible, embodiments accompanied with figures are described in detail below.
It can be seen from the background that the performance of existing HGNN models is limited by: (1) inaccurate data annotation; (2) inadequate high-order relational semantic modeling. Inspired by the successful application of knowledge distillation in deep learning, where it has shown clear advantages in improving model performance, some works have attempted to combine knowledge distillation with graph neural networks. However, these methods are designed for homogeneous graph neural networks, where every node or edge in the processed data is of the same type.
Aiming at these two problems faced by HGNNs, the inventors designed a high-order relation knowledge distillation method for heterogeneous graph neural networks to improve the performance of the student heterogeneous graph neural network model. In summary, the method of the present invention is shown in FIG. 1. Step S1: based on the constructed heterogeneous graph neural network models of the teacher and the student, respectively obtain the output-layer model predictions of the teacher and the student and the intermediate graph-convolution-layer heterogeneous node embeddings. Step S2: based on the obtained model predictions of the teacher and student networks, transfer the first-order node-level soft-label knowledge of the pre-trained teacher model to the student model with node-level knowledge distillation. Step S3: based on the intermediate graph-convolution-layer embeddings of the teacher and the student, transfer the second-order relation-level high-order heterogeneous semantic knowledge of the pre-trained teacher model to the student model with relation-level knowledge distillation. Step S4: finally, integrate the node-level knowledge and the relation-level knowledge into the final high-order relation knowledge, train the student model, and obtain the trained student model by minimizing the student loss, so that the student model can be used for different downstream tasks.
The invention is described in detail below with reference to the accompanying drawings. FIG. 2 shows the high-order relation knowledge distillation method based on a heterogeneous graph neural network provided by the invention, and FIG. 3 shows the high-order relation knowledge distillation system based on a heterogeneous graph neural network formed by a teacher model and a student model in an embodiment of the invention. The method comprises the following 4 steps:
Step S1: respectively construct the heterogeneous graph neural network models of the teacher and the student, and obtain the output-layer model predictions of the teacher and the student and the intermediate graph-convolution-layer heterogeneous node embeddings.
According to one embodiment of the invention, the heterogeneous data set D is input and T and S are constructed: a heterogeneous data set D is prepared, containing n training-set samples with a feature dimension of d per sample; reference teacher and student models T and S with the same configuration are constructed (see FIG. 3), each comprising 5 layers: an input layer, a first graph-convolution layer, a second graph-convolution layer, an MLP linear transformation layer and a Softmax output layer; the neural network parameters of the teacher and the student are denoted W_t and W_s respectively, and the activation function used by the convolution layers is ReLU, of the form f(x) = max(x, 0).
The output-layer model prediction p of T and S and the intermediate graph-convolution-layer heterogeneous node embedding h are computed as follows: the input sample feature is denoted h_0 and the convolution-layer embedding is h, so that h_t = ReLU(W_t * h_0) and h_s = ReLU(W_s * h_0); the output of the MLP linear transformation layer is denoted z, the linear-transformation-layer outputs of the teacher and student models are z_t and z_s respectively, the output of the Softmax layer is denoted p, and p_t = Softmax(z_t), p_s = Softmax(z_s).
Step S2: using p_t, p_s obtained in step 1, transfer the first-order node-level soft-label knowledge of the pre-trained T to S with node-level knowledge distillation to obtain the first-order node-level distillation loss L_NKD, whose loss function is
L_NKD = (1 - α) * L_CE + α * L_KD
where L_CE is the basic cross-entropy loss between the student predictions and the ground-truth labels, L_KD = D(Softmax(z_t/τ), Softmax(z_s/τ)) is the distillation loss, α is a hyperparameter balancing the cross-entropy loss and the distillation loss, and D(·) is a KL-divergence metric; Softmax(z/τ) is the softmax probability output scaled by the temperature coefficient τ, and a larger τ makes the probability distribution over the classes smoother, encouraging the student model to learn more smoothed information.
Step S3: using the intermediate graph-convolution-layer embeddings h of the T and S networks obtained in step 1, transfer the second-order relation-level high-order heterogeneous semantic knowledge of the pre-trained T to the S model with relation-level knowledge distillation.
So that the student can fully extract the high-order semantic information hidden in the heterogeneous graph neural network from the teacher, a MetaCorr correlation matrix is designed; the pre-trained T encodes the relation-level knowledge between different types of nodes, and the MetaCorr matrices of the T and S network models are computed as
MetaCorr_t(i, j) = K(h_t^i, h_t^j),  MetaCorr_s(i, j) = K(h_s^i, h_s^j),  i, j ∈ {1, ..., k}
where k is the total number of heterogeneous node types in the corresponding heterogeneous data set and i, j denote nodes of different types; K(·, ·) is a Gaussian kernel function measuring the similarity between two node embeddings, and a larger distance between two node representations yields a smaller kernel value. The Gaussian RBF kernel is used because it is flexible and powerful in capturing complex nonlinear relationships between nodes. To avoid the curse of dimensionality, a second-order Taylor expansion of the kernel is adopted.
Meanwhile, a type-aware attention layer is introduced after the convolution layer to automatically learn the importance of different node types. First, the intermediate-layer embeddings are nonlinearly transformed, and then a shared attention vector q is applied to obtain the attention values of the student model:
e_i = q^T · σ(W_s · h_i + b_s)
where W_s is a weight matrix, b_s is a bias vector and σ(·) denotes the nonlinear transformation. The attention values are then normalized through a softmax function to obtain the final attention coefficients:
α_i = exp(e_i) / Σ_j exp(e_j)
Obviously, a higher α indicates a more critical node type, and α is adjusted dynamically during model training. Finally, the second-order relation-level knowledge distillation loss L_RKD is obtained, whose loss function is
L_RKD = D(MetaCorr_t, MetaCorr_s)
where D is the mean square error.
Step S4: integrate the node-level knowledge L_NKD of step 2 and the relation-level knowledge L_RKD of step 3 into the final total loss L of the high-order relation knowledge distillation scheme, whose loss function is
L = L_NKD + β * L_RKD
where β is a hyperparameter balancing first-order node-level knowledge distillation and second-order relation-level knowledge distillation.
With the total loss L, the S model can be trained end to end; the trained student model is obtained by minimizing the loss L until S converges, and it can then be used for different downstream tasks.
To illustrate the effectiveness of the above scheme of the embodiments of the present invention, experiments on several classical heterogeneous graph data sets are described below:
1. Data sets
The experiments involve 3 reference data sets, including 2 citation-network data sets (ACM and DBLP) and 1 movie-network data set (IMDB), described in Table 1 below:
Table 1: the 3 heterogeneous graph data sets adopted in this scheme
The meta-path column gives the meta-path types of the corresponding data set, expressed by the node types the meta-path passes through.
2. Reference models
To verify the effectiveness of the distillation scheme of the invention, the experiments are carried out on the classical heterogeneous graph neural network models RGCN, HAN, HGT and HGConv.
3. Experimental results
The designed high-order relation knowledge distillation algorithm is applied to the RGCN, HAN, HGT and HGConv heterogeneous graph neural network models, and node classification is carried out on the three data sets ACM, IMDB and DBLP, with Micro-F1 as the classification metric. The experimental results are shown in Table 2:
Table 2: node classification results of the scheme on the heterogeneous data sets with various heterogeneous graph neural networks
From Table 2 it can be seen that, with the high-order relation knowledge distillation scheme of the invention, the performance of all the heterogeneous graph neural networks is improved significantly and consistently, with gains ranging from 0.5% to 9.6%.
The following are system examples corresponding to the above method examples, and this embodiment can be implemented in cooperation with the above embodiments. The related technical details mentioned in the above embodiments are still valid in this embodiment, and are not described herein again in order to reduce repetition. Accordingly, the related-art details mentioned in the present embodiment can also be applied to the above-described embodiments.
The invention also provides a high-order relation knowledge distillation system based on a heterogeneous graph neural network, comprising:
a model acquisition module, for obtaining a heterogeneous graph neural network model whose knowledge is to be distilled as the teacher model and a heterogeneous graph neural network model that is to receive the knowledge as the student model, and for obtaining the output-layer model predictions and the intermediate graph-convolution-layer heterogeneous node embeddings of the teacher model and the student model;
a first knowledge extraction module, for extracting the first-order node-level soft-label knowledge of the teacher model through node-level knowledge distillation based on the model predictions of the teacher model and the student model;
a second knowledge extraction module, for extracting the second-order relation-level heterogeneous semantic knowledge of the teacher model through relation-level knowledge distillation based on the intermediate graph-convolution-layer embeddings of the teacher model and the student model;
a training module, for integrating the first-order node-level soft-label knowledge and the second-order relation-level heterogeneous semantic knowledge into high-order relation knowledge, training the student model based on the high-order relation knowledge, and using the trained student model for a specified task;
the model acquisition module is configured to:
acquire a heterogeneous data set D containing n training-set samples, the feature dimension of each sample being d; construct a teacher model T and a student model S with the same configuration, each comprising 5 layers: an input layer, a first graph-convolution layer, a second graph-convolution layer, an MLP linear transformation layer and a Softmax output layer; the neural network parameters of the teacher and the student are W_t and W_s respectively, and the activation function used by the convolution layers is ReLU, f(x) = max(x, 0);
the intermediate graph-convolution-layer heterogeneous node embeddings of the teacher model and the student model comprise: the input sample feature is h_0 and the convolution-layer embedding is h, so that h_t = ReLU(W_t * h_0) and h_s = ReLU(W_s * h_0); the output of the MLP linear transformation layer is z, and the linear-transformation-layer outputs of the teacher and student models are z_t and z_s respectively;
the model predictions of the teacher model and the student model comprise: the output of the Softmax layer is p, so that p_t = Softmax(z_t) and p_s = Softmax(z_s);
the first knowledge extraction module is configured to:
use the teacher and student model predictions p_t, p_s, transfer the soft-label knowledge of the teacher model to the student model with a node-level knowledge distillation method, and obtain the first-order node-level distillation loss L_NKD as the first-order node-level soft-label knowledge:
L_NKD = (1 - α) * L_CE + α * L_KD
where L_CE is the basic cross-entropy loss between the student predictions and the ground-truth labels, L_KD = D(Softmax(z_t/τ), Softmax(z_s/τ)) is the distillation loss, α is a hyperparameter balancing the cross-entropy loss and the distillation loss, D(·) is a KL-divergence metric, and Softmax(z/τ) denotes the softmax probability output scaled by the temperature coefficient τ;
the second knowledge extraction module is configured to:
use the intermediate graph-convolution-layer embeddings h_t, h_s of the teacher and the student, and transfer the high-order semantic relation knowledge of the teacher model to the student model with a relation-level knowledge distillation method;
the correlation matrices MetaCorr of the teacher and student network models are:
MetaCorr_t(i, j) = K(h_t^i, h_t^j),  MetaCorr_s(i, j) = K(h_s^i, h_s^j),  i, j ∈ {1, ..., k}
where k is the total number of heterogeneous node types in the heterogeneous data set D, i and j denote nodes of different types, and K(·, ·) is a Gaussian kernel function;
the intermediate-layer embeddings are nonlinearly transformed and a shared attention vector q is applied to obtain the attention values of the student model:
e_i = q^T · σ(W_s · h_i + b_s)
where W_s is a weight matrix, b_s is a bias vector and σ(·) denotes the nonlinear transformation; the attention values are normalized through a softmax function to obtain the final attention coefficients:
α_i = exp(e_i) / Σ_j exp(e_j)
the second-order relation-level knowledge distillation loss L_RKD is obtained as the second-order relation-level heterogeneous semantic knowledge:
L_RKD = D(MetaCorr_t, MetaCorr_s)
where D is the mean square error;
the training module is configured to:
integrate L_NKD and L_RKD into the final total loss L of the high-order relation knowledge distillation scheme, as the high-order relation knowledge, so as to train the student model end to end:
L = L_NKD + β * L_RKD
where β is a hyperparameter balancing L_NKD and L_RKD.
In the above high-order relation knowledge distillation system based on a heterogeneous graph neural network, a training-set sample comprises a movie name, a director, actors and a movie category, and the specified task comprises inputting the movie name and/or director and/or actors to be classified into the student model to obtain the movie category to which the movie belongs.
The invention also provides a storage medium for storing a program that executes any one of the above high-order relation knowledge distillation methods based on a heterogeneous graph neural network.
The invention also provides a client for use with any one of the above high-order relation knowledge distillation systems based on a heterogeneous graph neural network.

Claims (10)

1. A high-order relation knowledge distillation method based on a heterogeneous graph neural network, characterized by comprising the following steps:
step S1, obtaining a heterogeneous graph neural network model whose knowledge is to be distilled as the teacher model and a heterogeneous graph neural network model that is to receive the knowledge as the student model, and obtaining the output-layer model predictions and the intermediate graph-convolution-layer heterogeneous node embeddings of the teacher model and the student model;
step S2, extracting the first-order node-level soft-label knowledge of the teacher model through node-level knowledge distillation based on the model predictions of the teacher model and the student model;
step S3, extracting the second-order relation-level heterogeneous semantic knowledge of the teacher model through relation-level knowledge distillation based on the intermediate graph-convolution-layer embeddings of the teacher model and the student model;
and step S4, integrating the first-order node-level soft-label knowledge and the second-order relation-level heterogeneous semantic knowledge into high-order relation knowledge, training the student model based on the high-order relation knowledge, and using the trained student model for a specified task.
2. The method of claim 1, wherein step S1 comprises:
acquiring a heterogeneous data set D containing n training-set samples, the feature dimension of each sample being d; constructing a teacher model T and a student model S with the same configuration, each comprising 5 layers: an input layer, a first graph-convolution layer, a second graph-convolution layer, an MLP linear transformation layer and a Softmax output layer; the neural network parameters of the teacher and the student are W_t and W_s respectively, and the activation function used by the convolution layers is ReLU, f(x) = max(x, 0);
the intermediate graph-convolution-layer heterogeneous node embeddings of the teacher model and the student model comprise: the input sample feature is h_0 and the convolution-layer embedding is h, so that h_t = ReLU(W_t * h_0) and h_s = ReLU(W_s * h_0); the output of the MLP linear transformation layer is z, and the linear-transformation-layer outputs of the teacher and student models are z_t and z_s respectively;
the model predictions of the teacher model and the student model comprise: the output of the Softmax layer is p, so that p_t = Softmax(z_t) and p_s = Softmax(z_s).
3. The method of claim 2, wherein step S2 comprises:
using the teacher and student model predictions p_t, p_s, transferring the soft-label knowledge of the teacher model to the student model with a node-level knowledge distillation method, and obtaining the first-order node-level distillation loss L_NKD as the first-order node-level soft-label knowledge:
L_NKD = (1 - α) * L_CE + α * L_KD
where L_CE is the basic cross-entropy loss between the student predictions and the ground-truth labels, L_KD = D(Softmax(z_t/τ), Softmax(z_s/τ)) is the distillation loss, α is a hyperparameter balancing the cross-entropy loss and the distillation loss, D(·) is a KL-divergence metric, and Softmax(z/τ) denotes the softmax probability output scaled by the temperature coefficient τ.
4. The method of claim 3, wherein step S3 comprises:
using the intermediate graph-convolution-layer embeddings h_t, h_s of the teacher and the student, transferring the high-order semantic relation knowledge of the teacher model to the student model with a relation-level knowledge distillation method;
the correlation matrices MetaCorr of the teacher and student network models are:
MetaCorr_t(i, j) = K(h_t^i, h_t^j),  MetaCorr_s(i, j) = K(h_s^i, h_s^j),  i, j ∈ {1, ..., k}
where k is the total number of heterogeneous node types in the heterogeneous data set D, i and j denote nodes of different types, and K(·, ·) is a Gaussian kernel function;
the intermediate-layer embeddings are nonlinearly transformed and a shared attention vector q is applied to obtain the attention values of the student model:
e_i = q^T · σ(W_s · h_i + b_s)
where W_s is a weight matrix, b_s is a bias vector and σ(·) denotes the nonlinear transformation; the attention values are normalized through a softmax function to obtain the final attention coefficients:
α_i = exp(e_i) / Σ_j exp(e_j)
the second-order relation-level knowledge distillation loss L_RKD is obtained as the second-order relation-level heterogeneous semantic knowledge:
L_RKD = D(MetaCorr_t, MetaCorr_s)
where D is the mean square error.
5. The method of claim 4, wherein step S4 comprises:
integrating L_NKD and L_RKD into the final total loss L of the high-order relation knowledge distillation scheme, as the high-order relation knowledge, so as to train the student model end to end:
L = L_NKD + β * L_RKD
where β is a hyperparameter balancing L_NKD and L_RKD.
6. The method according to any one of claims 2 to 4, wherein the training-set samples comprise movie names, directors, actors and movie categories, and the specified task comprises inputting the movie name and/or director and/or actors to be classified into the student model to obtain the movie category to which the movie belongs.
7. A high-order relation knowledge distillation system based on a heterogeneous graph neural network, comprising:
a model acquisition module, for obtaining a heterogeneous graph neural network model whose knowledge is to be distilled as the teacher model and a heterogeneous graph neural network model that is to receive the knowledge as the student model, and for obtaining the output-layer model predictions and the intermediate graph-convolution-layer heterogeneous node embeddings of the teacher model and the student model;
a first knowledge extraction module, for extracting the first-order node-level soft-label knowledge of the teacher model through node-level knowledge distillation based on the model predictions of the teacher model and the student model;
a second knowledge extraction module, for extracting the second-order relation-level heterogeneous semantic knowledge of the teacher model through relation-level knowledge distillation based on the intermediate graph-convolution-layer embeddings of the teacher model and the student model;
a training module, for integrating the first-order node-level soft-label knowledge and the second-order relation-level heterogeneous semantic knowledge into high-order relation knowledge, training the student model based on the high-order relation knowledge, and using the trained student model for a specified task;
the model acquisition module is configured to:
acquire a heterogeneous data set D containing n training-set samples, the feature dimension of each sample being d; construct a teacher model T and a student model S with the same configuration, each comprising 5 layers: an input layer, a first graph-convolution layer, a second graph-convolution layer, an MLP linear transformation layer and a Softmax output layer; the neural network parameters of the teacher and the student are W_t and W_s respectively, and the activation function used by the convolution layers is ReLU, f(x) = max(x, 0);
the intermediate graph-convolution-layer heterogeneous node embeddings of the teacher model and the student model comprise: the input sample feature is h_0 and the convolution-layer embedding is h, so that h_t = ReLU(W_t * h_0) and h_s = ReLU(W_s * h_0); the output of the MLP linear transformation layer is z, and the linear-transformation-layer outputs of the teacher and student models are z_t and z_s respectively;
the model predictions of the teacher model and the student model comprise: the output of the Softmax layer is p, so that p_t = Softmax(z_t) and p_s = Softmax(z_s);
the first knowledge extraction module is configured to:
use the teacher and student model predictions p_t, p_s, transfer the soft-label knowledge of the teacher model to the student model with a node-level knowledge distillation method, and obtain the first-order node-level distillation loss L_NKD as the first-order node-level soft-label knowledge:
L_NKD = (1 - α) * L_CE + α * L_KD
where L_CE is the basic cross-entropy loss between the student predictions and the ground-truth labels, L_KD = D(Softmax(z_t/τ), Softmax(z_s/τ)) is the distillation loss, α is a hyperparameter balancing the cross-entropy loss and the distillation loss, D(·) is a KL-divergence metric, and Softmax(z/τ) denotes the softmax probability output scaled by the temperature coefficient τ;
the second knowledge extraction module is configured to:
use the intermediate convolution-layer embedded representations h_t, h_s of the teacher and the student to transfer the high-order semantic relation knowledge in the teacher model to the student model by relation-level knowledge distillation;
the correlation matrices MetaCorr of the teacher and student network models are:
MetaCorr_t(i, j) = G(h_t^i, h_t^j), MetaCorr_s(i, j) = G(h_s^i, h_s^j), with i, j ∈ {1, …, k}
wherein k is the total number of heterogeneous node types in the heterogeneous data set D, i, j denote nodes of different types, and G(·, ·) is a Gaussian kernel function;
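One plausible reading of the correlation matrix, sketched below: the embeddings of each heterogeneous node type are averaged and a Gaussian kernel is applied to every pair of type centres, giving a k × k matrix. The type-averaging step and the bandwidth σ are assumptions for illustration.

```python
def meta_correlation(h, node_type, num_types, sigma=1.0):
    """k x k MetaCorr: Gaussian kernel between mean embeddings of node types (sketch)."""
    # mean embedding per heterogeneous node type (assumes every type is present)
    centers = torch.stack([h[node_type == t].mean(dim=0) for t in range(num_types)])
    # pairwise squared Euclidean distances between type centres
    dist2 = torch.cdist(centers, centers, p=2) ** 2
    # Gaussian kernel G(h_i, h_j) = exp(-||h_i - h_j||^2 / (2 * sigma^2)), bandwidth sigma is assumed
    return torch.exp(-dist2 / (2.0 * sigma ** 2))
```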
the intermediate-layer embedding is non-linearly transformed and a shared attention vector q is then applied to obtain the attention value of the student model:
e_s = q^T · tanh(W_s · h_s + b_s)
wherein W_s is a weight matrix of the student model and b_s is a bias vector;
the attention values are normalized by a softmax function to obtain the final attention coefficients:
a_s = softmax(e_s)
and the second-order relation knowledge distillation loss L_RKD is obtained as the second-order relation-level heterogeneous semantic knowledge:
L_RKD = Σ a_s · D(MetaCorr_t, MetaCorr_s)
wherein D is the mean square error;
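A sketch of the relation-level loss under the same assumptions: the student's intermediate embedding is passed through a tanh transform, the shared attention vector q yields softmax-normalized coefficients per node type, and these weight a mean-squared error between the teacher and student correlation matrices. The per-type averaging and the exact weighting scheme are assumptions, not quoted from the claim.

```python
class RelationLevelKD(nn.Module):
    """Sketch of the relation-level distillation loss L_RKD (assumptions in comments)."""
    def __init__(self, hid_dim: int):
        super().__init__()
        self.W_s = nn.Linear(hid_dim, hid_dim)       # weight matrix W_s and bias b_s of the student branch
        self.q = nn.Parameter(torch.randn(hid_dim))  # shared attention vector q

    def forward(self, corr_t, corr_s, h_s, node_type, num_types):
        # attention value per node type: mean over nodes of q^T tanh(W_s h + b_s)  (averaging is an assumption)
        e = torch.stack([
            torch.tanh(self.W_s(h_s[node_type == t])).matmul(self.q).mean()
            for t in range(num_types)
        ])
        attn = F.softmax(e, dim=0)                   # normalized attention coefficients
        # attention-weighted mean square error between teacher and student correlation rows
        per_type_mse = ((corr_t.detach() - corr_s) ** 2).mean(dim=1)
        return (attn * per_type_mse).sum()           # L_RKD
```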
the training module is used for:
integrating L_NKD and L_RKD into the final total loss L of the high-order relation knowledge distillation scheme, as the high-order relation knowledge, so as to train the student model end to end:
L = L_NKD + β·L_RKD
wherein β is a hyperparameter balancing L_NKD and L_RKD.
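A sketch of one end-to-end training step combining the two losses as L = L_NKD + β·L_RKD, reusing the helper sketches above; the optimizer, the frozen teacher, and all hyperparameter values are assumptions.

```python
def train_step(teacher, student, relation_kd, optimizer,
               features, labels, node_type, num_types,
               alpha=0.5, tau=2.0, beta=1.0, sigma=1.0):
    """One end-to-end student update under L = L_NKD + beta * L_RKD (sketch)."""
    teacher.eval()
    with torch.no_grad():
        h_t, z_t, _ = teacher(features)      # frozen teacher embeddings and logits
    h_s, z_s, _ = student(features)          # student embeddings and logits

    l_nkd = node_level_kd_loss(z_t, z_s, labels, alpha=alpha, tau=tau)
    corr_t = meta_correlation(h_t, node_type, num_types, sigma=sigma)
    corr_s = meta_correlation(h_s, node_type, num_types, sigma=sigma)
    l_rkd = relation_kd(corr_t, corr_s, h_s, node_type, num_types)

    loss = l_nkd + beta * l_rkd              # total high-order relation knowledge loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return float(loss)
```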
8. The system of claim 7, wherein the training set samples comprise movie names, directors, actors, and movie categories, and the specified task comprises inputting the movie names and/or directors and/or actors to be classified into the student model to obtain the movie category to which they belong.
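For illustration only, a hypothetical inference call corresponding to the movie-classification task of claim 8; the label set, the movie-node feature matrix `features`, and the mapping from movie/director/actor attributes to node features are all assumptions.

```python
# hypothetical inference on movie nodes with the trained student (sketch)
movie_classes = ["action", "comedy", "drama", "documentary"]   # assumed label set
student.eval()
with torch.no_grad():
    _, _, probs = student(features)   # `features`: assumed feature matrix of the movie nodes to classify
predicted = probs.argmax(dim=-1)
print([movie_classes[c] for c in predicted.tolist()])
```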
9. A storage medium storing a program for executing the high-order relation knowledge distillation method based on a heterogeneous graph neural network according to any one of claims 1 to 7.
10. A client for the high-order relation knowledge distillation system based on a heterogeneous graph neural network according to claim 8 or 9.
CN202210553500.2A 2022-05-20 2022-05-20 High-order relation knowledge distillation method and system based on heterogeneous graph neural network Pending CN115115862A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210553500.2A CN115115862A (en) 2022-05-20 2022-05-20 High-order relation knowledge distillation method and system based on heterogeneous graph neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210553500.2A CN115115862A (en) 2022-05-20 2022-05-20 High-order relation knowledge distillation method and system based on heterogeneous graph neural network

Publications (1)

Publication Number Publication Date
CN115115862A true CN115115862A (en) 2022-09-27

Family

ID=83326995

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210553500.2A Pending CN115115862A (en) 2022-05-20 2022-05-20 High-order relation knowledge distillation method and system based on heterogeneous graph neural network

Country Status (1)

Country Link
CN (1) CN115115862A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115761654A (en) * 2022-11-11 2023-03-07 中南大学 Map-oriented neural network accelerated MLP (Multi-level Path) construction method and vehicle re-identification method
CN115907001A (en) * 2022-11-11 2023-04-04 中南大学 Knowledge distillation-based federal diagram learning method and automatic driving method
CN115907001B (en) * 2022-11-11 2023-07-04 中南大学 Knowledge distillation-based federal graph learning method and automatic driving method
CN117253611A (en) * 2023-09-25 2023-12-19 四川大学 Intelligent early cancer screening method and system based on multi-modal knowledge distillation
CN117253611B (en) * 2023-09-25 2024-04-30 四川大学 Intelligent early cancer screening method and system based on multi-modal knowledge distillation
CN117952024A (en) * 2024-03-26 2024-04-30 中国人民解放军国防科技大学 Construction method and application of prior model of heterogeneous data fusion solid engine

Similar Documents

Publication Publication Date Title
Logeswaran et al. Sentence ordering and coherence modeling using recurrent neural networks
CN115115862A (en) High-order relation knowledge distillation method and system based on heterogeneous graph neural network
CN111738003B (en) Named entity recognition model training method, named entity recognition method and medium
Zhang et al. One-shot learning for question-answering in gaokao history challenge
CN114565808A (en) Double-action contrast learning method for unsupervised visual representation
CN115310520A (en) Multi-feature-fused depth knowledge tracking method and exercise recommendation method
CN114880307A (en) Structured modeling method for knowledge in open education field
CN108647295B (en) Image labeling method based on depth collaborative hash
Liu et al. Resume parsing based on multi-label classification using neural network models
Kung et al. Intelligent pig‐raising knowledge question‐answering system based on neural network schemes
Xu et al. Multi-guiding long short-term memory for video captioning
Li et al. Jointly learning knowledge embedding and neighborhood consensus with relational knowledge distillation for entity alignment
CN115934883A (en) Entity relation joint extraction method based on semantic enhancement and multi-feature fusion
Zhang et al. MULTIFORM: few-shot knowledge graph completion via multi-modal contexts
CN115630223A (en) Service recommendation method and system based on multi-model fusion
CN116266268A (en) Semantic analysis method and device based on contrast learning and semantic perception
CN111680163A (en) Knowledge graph visualization method for electric power scientific and technological achievements
CN113934922A (en) Intelligent recommendation method, device, equipment and computer storage medium
Zhang et al. Bi-directional capsule network model for chinese biomedical community question answering
Lee et al. Asynchronous edge learning using cloned knowledge distillation
CN117473083B (en) Aspect-level emotion classification model based on prompt knowledge and hybrid neural network
Li et al. Study on recommendation of personalised learning resources based on deep reinforcement learning
Fei et al. A Multi-teacher Knowledge Distillation Framework for Distantly Supervised Relation Extraction with Flexible Temperature
Zhao et al. LoCSGN: Logic-Contrast Semantic Graph Network for Machine Reading Comprehension
Xiang et al. Document similarity detection based on multi-feature semantic fusion and concept graph

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination