CN114154070A - MOOC recommendation method based on graph convolution neural network - Google Patents

MOOC recommendation method based on graph convolution neural network

Publication number
CN114154070A
Authority
CN
China
Prior art keywords
user
project
data
node
recommendation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111489426.4A
Other languages
Chinese (zh)
Inventor
王曙燕
郭睿涵
孙家泽
王小银
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian University of Posts and Telecommunications
Original Assignee
Xian University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian University of Posts and Telecommunications
Priority to CN202111489426.4A
Publication of CN114154070A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

The invention discloses a MOOC recommendation method based on a graph convolutional neural network. It addresses the application of recommender systems to MOOC course recommendation and belongs to the field of deep-learning recommender systems. The method first applies data augmentation to the bipartite graph of user-item interactions on a MOOC platform to obtain two sub-views, then uses a graph convolutional neural network to extract node representations on the sub-views and on the original bipartite graph. It next constructs a self-supervised auxiliary task, combines the supervised recommendation task and the self-supervised task into a multi-task learning scheme, and finally produces recommendations for the target user. The method alleviates data sparsity in online education, improves the model's recommendation accuracy on long-tail items and its robustness to noisy data, and provides learners with personalized learning recommendations, thereby supporting personalized education.

Description

MOOC recommendation method based on graph convolution neural network
Technical Field
The invention belongs to the field of deep-learning recommender systems. It relates in particular to MOOC recommendation based on graph neural networks, and provides a MOOC recommendation method based on a graph convolutional neural network.
Background
In the data age, personalized recommendation technology is widely applied across the internet but is still rarely applied in education. With the rapid growth of MOOC education data, analyzing and mining this massive data to extract valuable information and provide learners with personalized learning recommendations is an important route to personalized education. A recommendation method for online education that gives student users efficient, personalized recommendations is therefore both necessary and meaningful.
Recommender-system models fall into three main types: shallow models, neural models, and models based on graph neural networks. Among the last type, models based on graph convolutional networks work end to end, integrate multi-hop neighbors into node representation learning, and achieve state-of-the-art recommendation performance. However, these models still have limitations: user-item interaction data is sparse, exposure is unbalanced between long-tail items and high-frequency items, and interaction records contain noise. The present method adds an auxiliary self-supervised learning task to the conventional supervised recommendation task. This improves the model's recommendation accuracy on long-tail items in a MOOC (massive open online course) data set and its robustness to noisy data, and strengthens the discriminative power of the learned node representations.
Disclosure of Invention
The invention provides a MOOC recommendation method based on a graph convolutional neural network. It constructs a self-supervised auxiliary task using the idea of contrastive learning, and combines this contrastive self-supervised task with the supervised learning task of the recommender system to form a multi-task learning paradigm. Contrastive learning trains the encoder by maximizing the agreement between two augmented views of the same instance; it exploits the unlabeled data space by modifying the input data, and thereby achieves significant improvements on downstream tasks.
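As a rough illustration of this multi-task objective, the following Python sketch combines a pairwise ranking loss (BPR, the supervised loss named in step seven), an InfoNCE-style contrastive loss over two views, and an L2 penalty. The function names, the cosine similarity, the temperature of 0.2 and the default weights are illustrative assumptions and are not taken from the invention; embeddings are assumed to be non-zero vectors.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity; an assumed choice for the similarity function s."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def bpr_loss(triples, z_user, z_item):
    """Sum of -log(sigmoid(y_ui - y_uj)) over (user, positive, negative) triples."""
    total = 0.0
    for u, i, j in triples:
        diff = dot(z_user[u], z_item[i]) - dot(z_user[u], z_item[j])
        total += math.log1p(math.exp(-diff))  # equals -log(sigmoid(diff))
    return total


def info_nce(view1, view2, tau=0.2):
    """Contrastive loss: row u of view1 should match row u of view2 and no other row."""
    total = 0.0
    for u, z_u in enumerate(view1):
        logits = [cosine(z_u, z_v) / tau for z_v in view2]
        top = max(logits)  # subtract the maximum for numerical stability
        log_denominator = top + math.log(sum(math.exp(l - top) for l in logits))
        total += log_denominator - logits[u]
    return total


def joint_loss(l_bpr, l_ssl, params, lambda1=0.1, lambda2=1e-4):
    """Total loss: ranking loss + lambda1 * contrastive loss + lambda2 * squared L2 norm."""
    l2 = sum(p * p for p in params)
    return l_bpr + lambda1 * l_ssl + lambda2 * l2
```

In this sketch the contrastive term is small when each node's two views agree and large when they are mismatched, which is the behaviour the auxiliary task is meant to encourage.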
The invention comprises the following steps:
Step one: acquiring the user-item interaction data of each user in a MOOC data set, the interaction data comprising user IDs, item IDs and item interaction records;
Step two: preprocessing the MOOC data set by ID remapping, data screening and missing-data filling to obtain a user-item interaction data set $X_{data}$, and converting $X_{data}$ into a bipartite graph $G$ of user-item interactions;
Step three: denoting the data augmentation operation by $s$, and applying two completely independent data augmentation operations to the bipartite graph $G$ to form two views $s_1(G)$ and $s_2(G)$;
Step four: encoding the nodes of the user-item bipartite graph $G$ with a graph convolutional neural network, where $n$ is the layer index, $Z^{(n)}$ is the node representation at layer $n$, $Z^{(n-1)}$ is the node representation at the previous layer, and $H$ is the neighborhood aggregation function:
$$Z^{(n)} = H(Z^{(n-1)}, G)$$
Step five: applying the graph convolution operation to the two views $s_1(G)$ and $s_2(G)$ obtained by augmenting the user-item bipartite graph $G$, to obtain the node representations $Z_1^{(n)}$ and $Z_2^{(n)}$:
$$Z_1^{(n)} = H(Z_1^{(n-1)}, s_1(G))$$
$$Z_2^{(n)} = H(Z_2^{(n-1)}, s_2(G))$$
Step six: encoding the nodes of the user-item bipartite graph $G$ with the graph convolutional neural network to obtain the user node representation $Z_u$ and the item node representation $Z_i$, and defining the predicted score $\hat{y}_{ui}$ as the dot product of $Z_u$ and $Z_i$:
$$\hat{y}_{ui} = Z_u^{\top} Z_i$$
Step seven: building the model by jointly modeling users and items and jointly optimizing the objective function, the total loss of the joint optimization being:
$$\mathcal{L} = \mathcal{L}_{BPR} + \lambda_1 \mathcal{L}_{ssl} + \lambda_2 \lVert \Theta \rVert_2^2$$
where $\mathcal{L}_{BPR}$ is the Bayesian personalized ranking (BPR) loss commonly used in recommender systems, whose goal is to make the score difference between positive and negative samples as large as possible; $\mathcal{L}_{ssl}$ is the information noise-contrastive estimation (InfoNCE) loss, whose optimization goal is to maximize the similarity between the representations that the same node learns under different views through the graph convolutional neural network and to minimize the similarity between the representations of different nodes; $\Theta$ denotes the trainable parameters of the model; $\lVert \Theta \rVert_2^2$ is the L2 regularization term that prevents overfitting; and $\lambda_1$, $\lambda_2$ are hyperparameters;
Step eight: for any target user, predicting whether the user will interact with a target item from the dot product of the target user's node representation and the target item's node representation.
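The encoding and prediction steps above (steps two, four, six and eight) can be sketched in plain Python as follows. The sketch is only one possible reading: the symmetric-normalized neighbor sum used for the aggregation function H and the layer-mean readout are assumptions borrowed from LightGCN-style models, and all function names are hypothetical.

```python
import math


def build_bipartite(interactions, n_users, n_items):
    """Adjacency sets of the user-item graph G; item k is stored as node n_users + k."""
    neighbors = [set() for _ in range(n_users + n_items)]
    for u, i in interactions:
        neighbors[u].add(n_users + i)
        neighbors[n_users + i].add(u)
    return neighbors


def propagate(z, neighbors):
    """One layer Z(n) = H(Z(n-1), G), with H assumed to be a normalized neighbor sum."""
    dim = len(z[0])
    out = []
    for v, nbrs in enumerate(neighbors):
        acc = [0.0] * dim
        for w in nbrs:
            norm = math.sqrt(len(nbrs) * len(neighbors[w]))
            for d in range(dim):
                acc[d] += z[w][d] / norm
        out.append(acc)
    return out


def encode(z0, neighbors, n_layers):
    """Stack n_layers propagation steps and average all layers (assumed readout)."""
    layers = [z0]
    for _ in range(n_layers):
        layers.append(propagate(layers[-1], neighbors))
    dim = len(z0[0])
    return [[sum(layer[v][d] for layer in layers) / len(layers) for d in range(dim)]
            for v in range(len(z0))]


def recommend(user, z, neighbors, n_users, k):
    """Rank items the user has not interacted with by dot-product score; return top k."""
    scored = []
    for item_node in range(n_users, len(z)):
        if item_node in neighbors[user]:
            continue  # skip items already present in the user's history
        score = sum(a * b for a, b in zip(z[user], z[item_node]))
        scored.append((score, item_node - n_users))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [item for _, item in scored[:k]]
```

A call such as `recommend(0, encode(z0, neighbors, 2), neighbors, n_users, 5)` would return up to five item indices for user 0, where `z0` holds the initial embeddings (trained in practice, random or one-hot in a toy run).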
The invention has the beneficial effects that:
The method adds a self-supervised auxiliary task to a recommendation model based on a graph convolutional neural network and trains the network with the supervision signals constructed by that auxiliary task, so that the network learns representations that are valuable to the downstream task. This alleviates data sparsity and improves the model's recommendation accuracy on long-tail items and its robustness to noisy data.
Drawings
FIG. 1 is a schematic flow chart of the recommendation method of the present invention.
Detailed description of the invention
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to specific examples.
Step one: acquiring user IDs, item IDs and item interaction records from the data set;
Step two: preprocessing the original data by ID remapping, data screening and missing-value filling. The complex item IDs in the original data set are mapped to simple numeric IDs; k-core filtering retains the users and items that have more than K interaction records; and the data are then converted into a model-friendly format, namely an intermediate key-value data structure implemented in Python. Each key corresponds to an input feature, so that features can be indexed by name when writing the algorithm, and each value is the tensor used in computation and parameter updates;
Step three: constructing the graph convolutional neural network model and learning the user representation, the neighborhood aggregation and the user's final representation being given by:
$$z_u^{(n)} = f_{combine}\left(z_u^{(n-1)},\ f_{aggregate}\left(\{ z_i^{(n-1)} \mid i \in \mathcal{N}_u \}\right)\right)$$
$$z_u = f_{readout}\left(\left[ z_u^{(n)} \mid n = 0, \ldots, N \right]\right)$$
where $z_u$ is the final representation of the user, $z_u^{(n)}$ is the node representation of user $u$ at layer $n$, $f_{combine}$ is the combination function, $f_{aggregate}$ is the neighborhood aggregation function, $f_{readout}$ is the readout function, $z_u^{(n-1)}$ is the node representation of the user at the previous layer, $z_i^{(n-1)}$ is the node representation of an item at the previous layer, $i$ is an item the user has interacted with, $\mathcal{N}_u$ is the set of items the target user has interacted with, and $n$ indexes the layers of the graph convolutional neural network, from 0 to the total number of layers $N$;
Step four: applying two completely independent data augmentation operations to the user-item bipartite graph $G$ to form two views $s_1(G)$ and $s_2(G)$, where $s$ is one of two methods, random node dropout or random edge dropout, given respectively by:
$$s_1(G) = (M' \odot \mathcal{V},\ \mathcal{E}), \quad s_2(G) = (M'' \odot \mathcal{V},\ \mathcal{E})$$
$$s_1(G) = (\mathcal{V},\ M_1 \odot \mathcal{E}), \quad s_2(G) = (\mathcal{V},\ M_2 \odot \mathcal{E})$$
where $M', M'' \in \{0, 1\}^{|\mathcal{V}|}$ are two mask vectors on the node set, $M_1, M_2 \in \{0, 1\}^{|\mathcal{E}|}$ are two mask vectors on the edge set, $\odot$ denotes the element-wise product of two vectors, $\mathcal{V}$ is the node set, and $\mathcal{E}$ is the edge set;
Step five: applying the graph convolution operation to the two views $s_1(G)$ and $s_2(G)$ obtained by augmenting the user-item bipartite graph $G$, to obtain the node representations $Z_1^{(n)}$ and $Z_2^{(n)}$:
$$Z_1^{(n)} = H(Z_1^{(n-1)}, s_1(G))$$
$$Z_2^{(n)} = H(Z_2^{(n-1)}, s_2(G))$$
Step six: encoding the nodes of the user-item bipartite graph $G$ with the graph convolutional neural network to obtain the user node representation $Z_u$ and the item node representation $Z_i$, and defining the predicted score $\hat{y}_{ui}$ as the dot product of $Z_u$ and $Z_i$:
$$\hat{y}_{ui} = Z_u^{\top} Z_i$$
Step seven: building the model by jointly modeling users and items and jointly optimizing the objective function, the total loss of the joint optimization being:
$$\mathcal{L} = \mathcal{L}_{BPR} + \lambda_1 \mathcal{L}_{ssl} + \lambda_2 \lVert \Theta \rVert_2^2$$
where $\mathcal{L}_{BPR}$ is the Bayesian personalized ranking (BPR) loss commonly used in recommender systems:
$$\mathcal{L}_{BPR} = \sum_{(u,i,j) \in O} -\log \sigma\left(\hat{y}_{ui} - \hat{y}_{uj}\right)$$
where $O$ is the set of records in the user-item interaction data set, $u$ is a user, $i$ is an item the user has interacted with, $j$ is an item the user has not interacted with, $\hat{y}_{ui}$ is the score of the positive sample, $\hat{y}_{uj}$ is the score of the negative sample, and $\sigma$ is the sigmoid function. The goal of $\mathcal{L}_{BPR}$ is to make the score difference between positive and negative samples as large as possible.
$\mathcal{L}_{ssl}^{user}$ is the user-side information noise-contrastive estimation (InfoNCE) loss:
$$\mathcal{L}_{ssl}^{user} = \sum_{u \in \mathcal{U}} -\log \frac{\exp\left(s(z'_u, z''_u)/\tau\right)}{\sum_{v \in \mathcal{U}} \exp\left(s(z'_u, z''_v)/\tau\right)}$$
where $\mathcal{U}$ is the set of users, $s(\cdot)$ is the function that computes the similarity between user vectors, and $\tau$ is a temperature parameter. Applying graph convolution to the two views $s_1(G)$ and $s_2(G)$ gives the node representations $z'_u$ and $z''_u$ of user $u$, that is, the representations that the same user $u$ learns under different views. The representation that a user $v$ ($u \neq v$) learns in view $s_2(G)$ is denoted $z''_v$, so $z'_u$ and $z''_v$ are representations that different user nodes learn under different views. The item-side InfoNCE loss $\mathcal{L}_{ssl}^{item}$ has the same form as the user-side loss.
The loss $\mathcal{L}_{ssl}$ is:
$$\mathcal{L}_{ssl} = \mathcal{L}_{ssl}^{user} + \mathcal{L}_{ssl}^{item}$$
The optimization goal of $\mathcal{L}_{ssl}$ is to maximize the similarity between the representations that the same node learns under different views through the graph convolutional neural network, and to minimize the similarity between the representations that different nodes learn through the graph convolutional neural network;
Step eight: for any target user, predicting whether the user will interact with a target item from the dot product of the target user's node representation and the target item's node representation.
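The data-side steps of the embodiment (ID remapping and k-core filtering in step two, node and edge dropout in step four) can be sketched as below. This is a minimal sketch under stated assumptions: the filter is applied iteratively until stable, the drop probability `rho` is a hypothetical parameter, and the function names are illustrative rather than part of the invention.

```python
import random


def k_core_filter(interactions, k):
    """Repeatedly keep only users and items with more than k interaction records."""
    pairs = set(interactions)
    while True:
        user_count, item_count = {}, {}
        for u, i in pairs:
            user_count[u] = user_count.get(u, 0) + 1
            item_count[i] = item_count.get(i, 0) + 1
        kept = {(u, i) for u, i in pairs if user_count[u] > k and item_count[i] > k}
        if kept == pairs:
            return sorted(kept)
        pairs = kept


def remap_ids(interactions):
    """Map raw user and item IDs to consecutive integers starting at 0."""
    user_ids = {u: n for n, u in enumerate(sorted({u for u, _ in interactions}))}
    item_ids = {i: n for n, i in enumerate(sorted({i for _, i in interactions}))}
    return [(user_ids[u], item_ids[i]) for u, i in interactions], user_ids, item_ids


def edge_dropout(edges, rho, rng):
    """Edge mask M applied to E: drop each edge independently with probability rho."""
    return [edge for edge in edges if rng.random() >= rho]


def node_dropout(edges, n_users, n_items, rho, rng):
    """Node mask M applied to V: drop each node with probability rho, plus its edges."""
    keep_user = [rng.random() >= rho for _ in range(n_users)]
    keep_item = [rng.random() >= rho for _ in range(n_items)]
    return [(u, i) for u, i in edges if keep_user[u] and keep_item[i]]
```

Calling the same dropout function twice with one random generator, for example `edge_dropout(edges, 0.1, rng)` and again `edge_dropout(edges, 0.1, rng)`, yields two independently sampled views of the same graph.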
The experimental results obtained by this method and several other recommendation methods on the MOOC data set are shown in Table 1:
Table 1. Experimental results
Method            Recall@5   NDCG@5
DMF               0.3326     0.2897
GC-MC             0.3485     0.302
NGCF              0.3586     0.3097
LightGCN          0.3584     0.311
Proposed method   0.3863     0.3337
The data set is an online learning-record data set of users of the XuetangX (学堂在线) MOOC platform. In Table 1, the first column lists common recommendation methods. The second column is the evaluation metric Recall@5, the recall over the top 5 recommended items; recall is the proportion of actual positive samples that are correctly predicted, and a higher recall indicates a better recommendation effect. The third column is the evaluation metric NDCG@5, the normalized discounted cumulative gain over the top 5 items; NDCG@5 is higher when highly relevant results appear nearer the top of the list, and a higher value indicates a better recommendation effect.
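For reference, the two metrics can be computed per user as in the following Python sketch, which assumes binary relevance (an item is either in the user's held-out positive set or not). The function names are illustrative; the reported figures would be averages of such per-user values.

```python
import math


def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant items that appear in the top-k ranked list."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant)


def ndcg_at_k(ranked, relevant, k):
    """Binary-relevance NDCG: discounted gain of hits divided by the ideal gain."""
    dcg = sum(1.0 / math.log2(pos + 2)
              for pos, item in enumerate(ranked[:k]) if item in relevant)
    ideal = sum(1.0 / math.log2(pos + 2) for pos in range(min(k, len(relevant))))
    return dcg / ideal if ideal > 0 else 0.0
```

Because the discount grows with position, a hit at rank 1 contributes more to NDCG@5 than the same hit at rank 5, while Recall@5 treats both equally.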
The MOOC data set is screened by the number of interactions per item to obtain a data set of long-tail items that have few interactions. The experimental results obtained by this method and other recommendation methods on the long-tail item data set are shown in Table 2:
Table 2. Experimental results for long-tail item recommendation
Method            Recall@5
GC-MC             0.2835
NGCF              0.3042
LightGCN          0.3511
Proposed method   0.3706
In Table 2, the first column lists common recommendation methods and the second column is the evaluation metric Recall@5; a higher Recall@5 indicates a better recommendation effect on long-tail items. As the results in Table 2 show, the method improves the model's recommendation accuracy on long-tail items.
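The screening used for the long-tail experiment could look like the sketch below. The text does not state the threshold that separates long-tail items from high-frequency ones, so `max_count` is a hypothetical parameter and the function name is illustrative.

```python
from collections import Counter


def long_tail_subset(interactions, max_count):
    """Keep only interactions whose item has at most max_count interaction records."""
    counts = Counter(item for _, item in interactions)
    return [(user, item) for user, item in interactions if counts[item] <= max_count]
```

Evaluating Recall@5 on the subset returned by `long_tail_subset` isolates how well a model ranks rarely interacted courses.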
A certain amount of randomly generated user-item interaction records is added to the MOOC data set to obtain a data set containing noise. The experimental results obtained by this method and other recommendation methods on the noisy data set are shown in Table 3:
Table 3. Experimental results for model robustness
Method            Recall@5   NDCG@5
DMF               0.3005     0.2634
GC-MC             0.3122     0.2745
NGCF              0.3266     0.2842
LightGCN          0.3344     0.2876
Proposed method   0.3494     0.3022
In Table 3, the first column lists common recommendation methods, the second column is the evaluation metric Recall@5, and the third column is the evaluation metric NDCG@5; for both metrics a higher value indicates a better recommendation effect. As the results in Table 3 show, the method improves the model's robustness to noisy data.
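One way to build such a noisy data set is sketched below. The amount of noise is not specified in the text, so `ratio` (noise records as a fraction of the distinct original records) and the fixed seed are assumptions; IDs are assumed to be already remapped to the ranges 0..n_users-1 and 0..n_items-1.

```python
import random


def add_noise(interactions, n_users, n_items, ratio, seed=0):
    """Append random user-item pairs that do not occur in the original records."""
    rng = random.Random(seed)
    existing = set(interactions)
    # Never ask for more new pairs than there are unused (user, item) slots.
    target = min(int(len(existing) * ratio), n_users * n_items - len(existing))
    noisy = set()
    while len(noisy) < target:
        pair = (rng.randrange(n_users), rng.randrange(n_items))
        if pair not in existing:
            noisy.add(pair)
    return list(interactions) + sorted(noisy)
```

Training on the output of `add_noise` and evaluating on clean held-out records is one way to probe robustness in the sense of Table 3.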

Claims (1)

1. A MOOC recommendation method based on a graph convolutional neural network, characterized by comprising the following steps:
Step one: acquiring the user-item interaction data of each user in a MOOC data set, the interaction data comprising user IDs, item IDs and item interaction records;
Step two: preprocessing the MOOC data set by ID remapping, data screening and missing-data filling to obtain a user-item interaction data set $X_{data}$, and converting $X_{data}$ into a bipartite graph $G$ of user-item interactions;
Step three: denoting the data augmentation operation by $s$, and applying two completely independent data augmentation operations to the bipartite graph $G$ to form two views $s_1(G)$ and $s_2(G)$;
Step four: encoding the nodes of the user-item bipartite graph $G$ with a graph convolutional neural network, where $n$ is the layer index, $Z^{(n)}$ is the node representation at layer $n$, $Z^{(n-1)}$ is the node representation at the previous layer, and $H$ is the neighborhood aggregation function:
$$Z^{(n)} = H(Z^{(n-1)}, G)$$
Step five: applying the graph convolution operation to the two views $s_1(G)$ and $s_2(G)$ obtained by augmenting the user-item bipartite graph $G$, to obtain the node representations $Z_1^{(n)}$ and $Z_2^{(n)}$:
$$Z_1^{(n)} = H(Z_1^{(n-1)}, s_1(G))$$
$$Z_2^{(n)} = H(Z_2^{(n-1)}, s_2(G))$$
Step six: encoding the nodes of the user-item bipartite graph $G$ with the graph convolutional neural network to obtain the user node representation $Z_u$ and the item node representation $Z_i$, and defining the predicted score $\hat{y}_{ui}$ as the dot product of $Z_u$ and $Z_i$:
$$\hat{y}_{ui} = Z_u^{\top} Z_i$$
Step seven: building the model by jointly modeling users and items and jointly optimizing the objective function, the total loss of the joint optimization being:
$$\mathcal{L} = \mathcal{L}_{BPR} + \lambda_1 \mathcal{L}_{ssl} + \lambda_2 \lVert \Theta \rVert_2^2$$
where $\mathcal{L}_{BPR}$ is the Bayesian personalized ranking (BPR) loss commonly used in recommender systems, whose goal is to make the score difference between positive and negative samples as large as possible; $\mathcal{L}_{ssl}$ is the information noise-contrastive estimation (InfoNCE) loss, whose optimization goal is to maximize the similarity between the representations that the same node learns under different views through the graph convolutional neural network and to minimize the similarity between the representations of different nodes; $\Theta$ denotes the trainable parameters of the model; $\lVert \Theta \rVert_2^2$ is the L2 regularization term that prevents overfitting; and $\lambda_1$, $\lambda_2$ are hyperparameters;
Step eight: for any target user, predicting whether the user will interact with a target item from the dot product of the target user's node representation and the target item's node representation.
CN202111489426.4A 2021-12-07 2021-12-07 MOOC recommendation method based on graph convolution neural network Pending CN114154070A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111489426.4A CN114154070A (en) 2021-12-07 2021-12-07 MOOC recommendation method based on graph convolution neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111489426.4A CN114154070A (en) 2021-12-07 2021-12-07 MOOC recommendation method based on graph convolution neural network

Publications (1)

Publication Number Publication Date
CN114154070A (en) 2022-03-08

Family

ID=80453277

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111489426.4A Pending CN114154070A (en) 2021-12-07 2021-12-07 MOOC recommendation method based on graph convolution neural network

Country Status (1)

Country Link
CN (1) CN114154070A (en)

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112925977A (en) * 2021-02-26 2021-06-08 中国科学技术大学 Recommendation method based on self-supervision graph representation learning

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114880582A (en) * 2022-04-11 2022-08-09 中国科学院信息工程研究所 User item recommendation method
CN114880582B (en) * 2022-04-11 2024-09-13 中国科学院信息工程研究所 User project recommendation method
CN115329211A (en) * 2022-08-01 2022-11-11 山东省计算中心(国家超级计算济南中心) Personalized interest recommendation method based on self-supervision learning and graph neural network
CN115329211B (en) * 2022-08-01 2023-06-06 山东省计算中心(国家超级计算济南中心) Personalized interest recommendation method based on self-supervision learning and graph neural network
CN117688248A (en) * 2024-02-01 2024-03-12 安徽教育网络出版有限公司 Online course recommendation method and system based on convolutional neural network
CN117688248B (en) * 2024-02-01 2024-04-26 安徽教育网络出版有限公司 Online course recommendation method and system based on convolutional neural network

Similar Documents

Publication Publication Date Title
CN112905900B (en) Collaborative filtering recommendation method based on graph convolution attention mechanism
CN108920641B (en) Information fusion personalized recommendation method
CN114154070A (en) MOOC recommendation method based on graph convolution neural network
CN109389151B (en) Knowledge graph processing method and device based on semi-supervised embedded representation model
CN110866145A (en) Co-preference assisted deep single-class collaborative filtering recommendation method
CN113704438B (en) Conversation recommendation method of abnormal picture based on layered attention mechanism
Alfarhood et al. DeepHCF: a deep learning based hybrid collaborative filtering approach for recommendation systems
CN116401542A (en) Multi-intention multi-behavior decoupling recommendation method and device
CN116071128A (en) Multitask recommendation method based on multi-behavioral feature extraction and self-supervision learning
CN114757271B (en) Social network node classification method and system based on multichannel graph convolutional network
CN114817712A (en) Project recommendation method based on multitask learning and knowledge graph enhancement
CN117349494A (en) Graph classification method, system, medium and equipment for space graph convolution neural network
Hao et al. Deep graph clustering with enhanced feature representations for community detection
CN117556148B (en) Personalized cross-domain recommendation method based on network data driving
CN114997476A (en) Commodity prediction method fusing commodity incidence relation
Zheng et al. Deep tabular data modeling with dual-route structure-adaptive graph networks
CN112965968B (en) Heterogeneous data pattern matching method based on attention mechanism
Liu et al. TCD-CF: Triple cross-domain collaborative filtering recommendation
CN112818256B (en) Recommendation method based on neural collaborative filtering
CN117892815A (en) Graph comparison recommendation method based on knowledge graph
CN115905617B (en) Video scoring prediction method based on deep neural network and double regularization
CN116955647A (en) Recommendation algorithm based on knowledge graph and neural network
CN117235375A (en) User multi-behavior recommendation method based on graphic neural network and element learning
Chu et al. Towards a deep learning autoencoder algorithm for collaborative filtering recommendation
He et al. Image quality assessment based on adaptive multiple Skyline query

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination