CN116932887A - Image recommendation system and method based on multi-modal image convolution - Google Patents

Image recommendation system and method based on multi-modal image convolution Download PDF

Info

Publication number
CN116932887A
CN116932887A (application number CN202310669701.3A)
Authority
CN
China
Prior art keywords
layer
representing
aggregation
image
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310669701.3A
Other languages
Chinese (zh)
Inventor
朱东杰
谭景元
丁卓
张立斌
鲁宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changjiang Shidai Communication Co ltd
Harbin Institute of Technology Weihai
Original Assignee
Changjiang Shidai Communication Co ltd
Harbin Institute of Technology Weihai
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Changjiang Shidai Communication Co ltd, Harbin Institute of Technology Weihai filed Critical Changjiang Shidai Communication Co ltd
Priority to CN202310669701.3A priority Critical patent/CN116932887A/en
Publication of CN116932887A publication Critical patent/CN116932887A/en
Pending legal-status Critical Current

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9535Search customisation based on user profiles and personalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/40Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F16/48Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/951Indexing; Web crawling techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Multimedia (AREA)
  • Library & Information Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an image recommendation system and method based on multi-modal image convolution, belonging to the technical field of computers. By comprehensively using graph convolution to aggregate multi-modal features, recommendation of film and television works matching user preferences is realized on a graph-convolution architecture. First, information such as the evaluation records of the same user on film and television works and the related posters of the works is obtained through a crawler algorithm; the data set is preprocessed, the data are enhanced with the MixGen method, and the data set is expanded; the image data modality and the text data modality are expressed in vector form by linear transformation and similar methods; information of the different modalities is extracted, and vector representations of the text modality and the image data modality are obtained respectively; intra-layer and inter-layer node aggregation is carried out on the same modality by graph convolution, and the fine-grained intention of a user toward a film is extracted; inter-layer aggregation is used to relate fine-grained and coarse-grained user intentions, and super-node combinations are established for the processing of the different modalities; the features of all modalities obtained by aggregation are passed through an attention-mechanism layer, the interaction among the different modalities is enhanced, and finally a recommendation list of film and television works is obtained. The method solves the problems that existing multi-modal recommendation systems struggle to model user preference in a specific modality and that it is difficult for data of different modalities to interact.

Description

Image recommendation system and method based on multi-modal image convolution
Technical Field
The invention discloses an image recommendation system and method based on multi-modal image convolution, and belongs to the technical field of computers.
Background
A recommendation system is a technology widely used on the internet and in other fields: by analyzing a user's behavior and interests, it predicts items the user may like and thereby provides personalized recommendations. Accurately recommending film and television works that match users' interests has become an important task. However, the sheer number of film and television works brings complicated content and meta-information, which makes it difficult for a conventional recommendation system to capture user preferences well.
Existing multimodal recommendation systems rely primarily on user behavior data (e.g., viewing history and ratings) and content information (e.g., actors and genre) of film and television works. However, this information tends to be high-dimensional, sparse and heterogeneous, which makes it very difficult to build an effective user preference model. Furthermore, fusing multimodal information often involves a large amount of parameter tuning, which leaves conventional multimodal recommendation systems poor at modeling user preferences in a particular modality.
Disclosure of Invention
The invention solves the problem that existing multi-modal recommendation systems struggle to model user preference in a specific modality, and provides an image recommendation system and method based on multi-modal image convolution.
The invention discloses an image recommendation system and method based on multi-modal image convolution, which are realized by the following technical scheme:
step one, obtaining information such as evaluation records of the same user on the film and television works, related posters of the film and television works and the like through a crawler algorithm.
And step two, preprocessing a data set, enhancing the data by using a MixGen method, and expanding the data set.
And thirdly, extracting information of different modes, and representing the image data mode and the text data mode into a vector form by using methods such as linear transformation.
And fourthly, carrying out intra-layer and inter-layer node aggregation on the same mode by utilizing graph convolution, and extracting the fine granularity intention of a user on the film.
And fifthly, establishing super nodes for different modes, and establishing interlayer aggregation to establish a relationship between fine granularity and coarse granularity user intention.
And step six, the characteristics of all modes obtained through aggregation are enhanced through a self-attention mechanism layer, interaction among different modes is enhanced, and finally a film and television work recommendation list is obtained.
The invention has the most outstanding characteristics and remarkable beneficial effects that:
According to the image recommendation system and method based on multi-modal image convolution, information such as the evaluation records of the same user on film and television works and the related posters of the works is obtained through a crawler algorithm; the data set is preprocessed, the data are enhanced with the MixGen method, and the data set is expanded; the image data modality and the text data modality are expressed in vector form by linear transformation and similar methods; information of the different modalities is extracted, and vector representations of the text modality and the image data modality are obtained respectively; intra-layer and inter-layer node aggregation is carried out on the same modality by graph convolution, and the fine-grained intention of a user toward a film is extracted; inter-layer aggregation is used to relate fine-grained and coarse-grained user intentions, and super-node combinations are established for the processing of the different modalities; the features of all modalities obtained by aggregation are passed through an attention-mechanism layer, the interaction among the different modalities is enhanced, and finally a recommendation list of film and television works is obtained. The method solves the problems that existing multi-modal recommendation systems struggle to model user preference in a specific modality and that it is difficult for data of different modalities to interact.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a diagram of an overall model architecture of the multi-modal recommendation system of the present invention;
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
It will be appreciated by those skilled in the art that certain well-known structures in the drawings and descriptions thereof may be omitted.
In order to better explain the present embodiment, the technical solutions in the present embodiment will be clearly and completely described below with reference to the drawings in the present embodiment.
The description of the present embodiment is given with reference to fig. 1; the image recommendation method based on multi-modal graph convolution provided in the present embodiment specifically includes the following steps:
step one, obtaining information such as evaluation records of the same user on the film and television works, related posters of the film and television works and the like through a crawler algorithm.
And secondly, in order to expand the data set when training the model, data enhancement is carried out on the multi-mode data. In the process of data enhancement, in order to preserve the features of images and texts as much as possible, a MixGen data enhancement method is used, whose expression is as follows:
I_k = γ · I_i + (1 − γ) · I_j

T_k = T_i ⊕ T_j

where ⊕ denotes a concat connection, γ denotes a parameter between 0 and 1 and takes the value 0.5, and I_k and T_k denote the image and text data produced by the augmentation. As the formulas show, this preserves the content of both the image and the text as far as possible.
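The augmentation above can be sketched as follows; this is a minimal illustration of the MixGen-style mixing described, with toy array shapes and captions that are purely illustrative:

```python
import numpy as np

def mixgen(image_i, image_j, text_i, text_j, gamma=0.5):
    """MixGen-style augmentation: linearly interpolate the two images
    and concatenate the two texts (gamma = 0.5, as in the description)."""
    image_k = gamma * image_i + (1.0 - gamma) * image_j
    text_k = text_i + " " + text_j  # concat connection
    return image_k, text_k

# Toy example: two 2x2 single-channel "posters" and two captions.
img_a = np.zeros((2, 2))
img_b = np.ones((2, 2))
img_mix, txt_mix = mixgen(img_a, img_b, "a dark poster", "a bright poster")
```

With γ = 0.5, the mixed image is the elementwise average of the two inputs, and the mixed text is the concatenation of the two captions.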
Step three, features of the image and the text are extracted. For the image, the input image is divided into blocks of equal size, giving a sequence of N blocks {x_1, x_2, …, x_N}, where x_i denotes a block of size p²·C, p denotes the side length of a square block and C denotes the number of channels of the block. A linear transformation is applied to the block sequence to obtain a feature-vector sequence {v_1, v_2, …, v_N}; a position code p_i is added to each feature vector, giving the input vector sequence {x′_1, x′_2, …, x′_N}, where each x′_i is a feature vector of dimension d. For the text, the input text is segmented with WordPiece to obtain the word sequence {y_1, y_2, …, y_N}, where y_i denotes a text token; a [CLS] tag is added at the beginning of the word sequence as the start symbol for the classification task, and a [SEP] tag is added at the end of the sentence to mark its end. Position-coding information q_i is added to each word of the sequence, and the sequence is mapped into a word-vector sequence {y′_1, y′_2, …, y′_N}, where each y′_i is a feature vector of dimension d.
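The block-splitting and linear-projection step for the image can be sketched as below; the projection matrix `W_proj` and the position codes `pos` are random placeholders standing in for learned parameters, and the shapes are illustrative:

```python
import numpy as np

def patchify(image, p):
    """Split an H x W x C image into non-overlapping p x p blocks and
    flatten each block to a vector of length p*p*C."""
    H, W, C = image.shape
    patches = []
    for r in range(0, H, p):
        for c in range(0, W, p):
            patches.append(image[r:r+p, c:c+p, :].reshape(-1))
    return np.stack(patches)          # shape (N, p*p*C)

def embed(patches, W_proj, pos):
    """Linearly transform each block, then add its position code."""
    return patches @ W_proj + pos     # shape (N, d)

rng = np.random.default_rng(0)
img = rng.random((4, 4, 3))           # H = W = 4, C = 3
x = patchify(img, p=2)                # N = 4 blocks, each 2*2*3 = 12 values
W_proj = rng.random((12, 8))          # project to dimension d = 8
pos = rng.random((4, 8))              # one position code per block
x_prime = embed(x, W_proj, pos)       # the input vector sequence
```

The text side is analogous: a tokenized sequence is mapped through an embedding table and position codes q_i are added, yielding d-dimensional word vectors.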
Step four, intra-layer and inter-layer node aggregation is carried out on the same modality by graph convolution, and the fine-grained intention of a user toward a film is extracted. The processing of the different modal features is shown in fig. 2: the image features and the text features are respectively input into different graph convolutional neural networks, and a collaborative interaction graph G = {X, A} is constructed, where X denotes the extracted text and image features and A is the adjacency matrix. Each layer of nodes represents different features within the same modality, and whether interactions exist among the features is represented by the matrix A, which is defined as follows:
A_ij = 1 if v_j ∈ N(v_i), and A_ij = 0 otherwise,

where N(v_i) denotes the neighbor set of node v_i. Intra-layer aggregation is realized through an aggregation function, which learns the topological structure of each node's neighborhood and the distribution of node features within that neighborhood. The aggregation functions of the text and image modalities are expressed as:

H_t^(l+1) = σ( D̃_t^(−1/2) · Ã_t^(l) · D̃_t^(−1/2) · H_t^(l) · W_t^(l) )

H_v^(l+1) = σ( D̃_v^(−1/2) · Ã_v^(l) · D̃_v^(−1/2) · H_v^(l) · W_v^(l) )

where Ã_t^(l) denotes the adjacency matrix of the layer-l text features, D̃_t denotes its degree matrix, and H_t^(l+1) denotes the result of aggregating the layer-l text features; Ã_v^(l), D̃_v and H_v^(l+1) denote the corresponding adjacency matrix, degree matrix and aggregation result for the layer-l image features.
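A minimal sketch of one intra-layer aggregation step, assuming the standard normalized graph-convolution form with self-loops and ReLU as the nonlinearity σ; the toy interaction graph, features and weights are illustrative, not the patent's data:

```python
import numpy as np

def gcn_layer(A, H, W):
    """One intra-layer aggregation step in standard graph-convolution
    form: H' = ReLU(D^{-1/2} (A + I) D^{-1/2} H W)."""
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    d = A_hat.sum(axis=1)                     # node degrees
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return np.maximum(0.0, D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W)

# Tiny 3-node interaction graph with 2-dimensional node features.
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
H = np.eye(3)[:, :2]                          # 3 nodes, d = 2
W = np.eye(2)                                 # identity weights for clarity
H_next = gcn_layer(A, H, W)                   # aggregated layer-(l+1) features
```

The same layer would be applied separately to the text graph and the image graph, each with its own adjacency and weight matrices.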
Step five, super nodes are established for the different modalities, and inter-layer aggregation is established to relate fine-grained and coarse-grained user intentions. The established super nodes are represented as:

S_t^(l+1) = { s_t,1^(l+1), …, s_t,K^(l+1) },   S_v^(l+1) = { s_v,1^(l+1), …, s_v,K^(l+1) }

where S_t^(l+1) denotes the set of super nodes of the (l+1)-th graph-convolution layer in the aggregation of text features, s_t,k^(l+1) denotes the k-th super node of the (l+1)-th layer in the aggregation of text features, the dimension of each super node is d, and K^(l+1) denotes the number of super nodes at layer (l+1); S_v^(l+1) denotes the set of (l+1)-th-layer super nodes in the aggregation of image features, and s_v,k^(l+1) denotes its k-th super node, each also of dimension d. An inner product is then taken between each super node and the nodes aggregated by each graph-convolution layer:

r_i,k,t = ⟨ h_t,i^(l), s_t,k^(l+1) ⟩,   r_i,k,v = ⟨ h_v,i^(l), s_v,k^(l+1) ⟩

where h_t,i^(l) denotes the i-th node of the l-th graph-convolution layer in the aggregation of text features, r_i,k,t denotes the affinity score between the k-th super node and the i-th node in text processing, h_v,i^(l) denotes the i-th node of the l-th layer in the aggregation of image features, and r_i,k,v denotes the corresponding affinity score in image processing. The scores are then normalized to compute the weight assigned to each node:

w_i,k,t = exp(r_i,k,t) / Σ_k′ exp(r_i,k′,t),   w_i,k,v = exp(r_i,k,v) / Σ_k′ exp(r_i,k′,v)

where w_i,k,t denotes the weight with which the i-th node established for the text features is assigned to the k-th established super node, and w_i,k,v denotes the corresponding weight for the image features; these assignment weights construct the relationship between the lower-level graph and the higher-level graph. The high-level adjacency matrix is obtained from the low-level adjacency matrix as:

A_t^(l+1) = W_t^(l)ᵀ · A_t^(l) · W_t^(l)

where A_t denotes the adjacency matrix at layer 0 in aggregating the text features and W_t^(l) denotes the weight matrix.
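The super-node assignment and adjacency lifting can be sketched as below, a DiffPool-style coarsening that matches the affinity-score and softmax description above; all tensors are random placeholders and the shapes are assumptions for illustration:

```python
import numpy as np

def coarsen(H, S_nodes, A):
    """Assign each layer-l node to super nodes via a softmax over inner
    products, then lift the adjacency: A_next = W^T A W."""
    r = H @ S_nodes.T                        # (N, K) affinity scores
    e = np.exp(r - r.max(axis=1, keepdims=True))
    Wgt = e / e.sum(axis=1, keepdims=True)   # (N, K) assignment weights
    A_next = Wgt.T @ A @ Wgt                 # (K, K) high-level adjacency
    H_next = Wgt.T @ H                       # (K, d) super-node features
    return Wgt, A_next, H_next

rng = np.random.default_rng(1)
H = rng.random((5, 4))                       # N = 5 nodes, dimension d = 4
S = rng.random((2, 4))                       # K = 2 super nodes, same d
A = (rng.random((5, 5)) > 0.5).astype(float)
Wgt, A_next, H_next = coarsen(H, S, A)
```

Each row of `Wgt` sums to one, so every fine-grained node distributes its full weight across the coarse-grained super nodes.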
Step six, the features of all modalities obtained by aggregation are passed through a self-attention layer, the interaction among the different modalities is enhanced, and finally a recommendation list of film and television works is obtained. The nodes of each layer in each modality are obtained through continuous iteration, represented as:

{ h_t,i^(1), h_t,i^(2), …, h_t,i^(L) },   { h_v,i^(1), h_v,i^(2), …, h_v,i^(L) }

where h_t,i^(1) denotes the i-th node of the text features at the first layer of the graph convolution, and h_t,i^(l) denotes the i-th node of the text features at the l-th layer. The acquired nodes of each layer pass through a self-attention layer, whose expression is:

Attention(Q_s, K_s, V_s) = softmax( Q_s K_sᵀ / √d ) V_s

where K_s, Q_s and V_s denote the feature vectors of each node, and K_s = Q_s = V_s. The nodes output by the self-attention layer are then spliced:

h_i,t = h_i,t^(1) ∥ h_i,t^(2) ∥ … ∥ h_i,t^(L),   h_i,v = h_i,v^(1) ∥ h_i,v^(2) ∥ … ∥ h_i,v^(L)
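A minimal sketch of the self-attention layer with K = Q = V, assuming the usual scaled dot-product form implied by the description; the input node matrix is a random placeholder:

```python
import numpy as np

def self_attention(X):
    """Self-attention with K = Q = V = X:
    softmax(Q K^T / sqrt(d)) V."""
    d = X.shape[1]
    scores = X @ X.T / np.sqrt(d)                      # pairwise similarities
    e = np.exp(scores - scores.max(axis=1, keepdims=True))
    attn = e / e.sum(axis=1, keepdims=True)            # row-wise softmax
    return attn @ X                                    # re-weighted nodes

rng = np.random.default_rng(2)
nodes = rng.random((6, 8))          # 6 aggregated nodes, dimension d = 8
out = self_attention(nodes)
```

Because K, Q and V are all the node features themselves, each output node is a similarity-weighted mixture of all nodes, which is what lets the different modalities interact.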
the method comprises the steps of (1) connecting a symbol, splicing the characteristics finally obtained by aggregation of all modes, and constructing a vector which can be learned by different modes, wherein the expression is as follows:
H i =h i,t +h i,v
u i =u i,t +u i,v
Finally, the recommendations are personalized and ranked with Bayesian personalized ranking (BPR), for which triples of a user, an observed item and an unobserved item are constructed, in the form:

τ = { (U, H_p, H_q) ∣ A_i,p = 1, A_i,q = 0 }

where U is a learnable vector representing a user, H_p denotes an observed item and H_q denotes an unobserved item. To mitigate unnecessary overlap, a cross-entropy loss is introduced into the model loss function:

L_CE = − Σ_i [ y_i · log(ŷ_i) + (1 − y_i) · log(1 − ŷ_i) ]

For training on τ, the following loss function is used:

L_BPR = Σ_(U,H_p,H_q)∈τ − ln σ( Uᵀ H_p − Uᵀ H_q )

The loss function of the final model training is:

L = L_BPR + α · L_CE

where α is an adjustable parameter.
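The ranking objective can be sketched as below, assuming the standard BPR form −log σ(u·h_p − u·h_q) over (user, observed item, unobserved item) triples; the vectors are random placeholders:

```python
import numpy as np

def bpr_loss(U, H_pos, H_neg):
    """Bayesian personalized ranking loss over a batch of triples:
    -mean log sigmoid(u . h_p - u . h_q)."""
    diff = np.sum(U * H_pos, axis=1) - np.sum(U * H_neg, axis=1)
    return float(-np.mean(np.log(1.0 / (1.0 + np.exp(-diff)))))

rng = np.random.default_rng(3)
U = rng.random((4, 8))         # 4 user vectors (learnable)
H_p = rng.random((4, 8))       # observed items
H_q = rng.random((4, 8))       # unobserved items
loss = bpr_loss(U, H_p, H_q)
```

The loss shrinks as the score of each observed item pulls ahead of its unobserved counterpart, which is exactly the pairwise ordering BPR optimizes.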
The present invention is capable of other and further embodiments and its several details are capable of modification and variation in light of the present invention, as will be apparent to those skilled in the art, without departing from the spirit and scope of the invention as defined in the appended claims.

Claims (6)

1. An image recommendation system and method based on multi-modal image convolution, characterized by comprising the following steps:
Step one, information such as the evaluation records of the same user on film and television works and the related posters of the works is obtained through a crawler algorithm.
Step two, the data set is preprocessed, the data are enhanced with the MixGen method, and the data set is expanded.
Step three, information of the different modalities is extracted, and vector representations of the text modality and the image data modality are obtained respectively.
Step four, intra-layer and inter-layer node aggregation is carried out on the same modality by graph convolution, and the fine-grained intention of a user toward a film is extracted.
Step five, inter-layer aggregation is used to relate fine-grained and coarse-grained user intentions, and super-node combinations are established for the processing of the different modalities.
Step six, the features of all modalities obtained by aggregation are passed through an attention-mechanism layer, the interaction among the different modalities is enhanced, and finally a recommendation list of film and television works is obtained.
2. The image recommendation system and method based on multi-modal image convolution according to claim 1, wherein in step three the image data modality and the text data modality are obtained and expressed in vector form by linear transformation and similar methods.
3. The method of claim 2, wherein step two uses the MixGen data enhancement method. In order to preserve the features of the images and texts as far as possible during data enhancement, MixGen is used, whose expression is as follows:

I_k = γ · I_i + (1 − γ) · I_j

T_k = T_i ⊕ T_j

where ⊕ denotes a concat connection, γ denotes a parameter between 0 and 1 and takes the value 0.5, and I_k and T_k denote the image and text data produced by the augmentation. As the formulas show, this preserves the content of both the image and the text as far as possible.
4. The method of claim 3, wherein in step four intra-layer and inter-layer node aggregation is performed on the same modality by graph convolution: the features of the images and the texts are respectively input into different graph convolutional neural networks, and a collaborative interaction graph G = {X, A} is constructed, where X denotes the extracted text and image features and A is the adjacency matrix. Each layer of nodes represents different features within the same modality, and whether interactions exist among the features is represented by the matrix A, defined as:

A_ij = 1 if v_j ∈ N(v_i), and A_ij = 0 otherwise,

where N(v_i) denotes the neighbor set of node v_i. Intra-layer aggregation is realized through an aggregation function, which learns the topological structure of each node's neighborhood and the distribution of node features within that neighborhood. The aggregation functions of the text and image modalities are expressed as:

H_t^(l+1) = σ( D̃_t^(−1/2) · Ã_t^(l) · D̃_t^(−1/2) · H_t^(l) · W_t^(l) )

H_v^(l+1) = σ( D̃_v^(−1/2) · Ã_v^(l) · D̃_v^(−1/2) · H_v^(l) · W_v^(l) )

where Ã_t^(l) denotes the adjacency matrix of the layer-l text features, D̃_t denotes its degree matrix, and H_t^(l+1) denotes the result of aggregating the layer-l text features; Ã_v^(l), D̃_v and H_v^(l+1) denote the corresponding adjacency matrix, degree matrix and aggregation result for the layer-l image features.
5. The method of claim 4, wherein in step five super nodes are established for the different modalities and inter-layer aggregation is established to relate fine-grained and coarse-grained user intentions, the established super nodes being represented as:

S_t^(l+1) = { s_t,1^(l+1), …, s_t,K^(l+1) },   S_v^(l+1) = { s_v,1^(l+1), …, s_v,K^(l+1) }

where S_t^(l+1) denotes the set of super nodes of the (l+1)-th graph-convolution layer in the aggregation of text features, s_t,k^(l+1) denotes the k-th super node of the (l+1)-th layer in the aggregation of text features, the dimension of each super node is d, and K^(l+1) denotes the number of super nodes at layer (l+1); S_v^(l+1) denotes the set of (l+1)-th-layer super nodes in the aggregation of image features, and s_v,k^(l+1) denotes its k-th super node, each also of dimension d. An inner product is then taken between each super node and the nodes aggregated by each graph-convolution layer:

r_i,k,t = ⟨ h_t,i^(l), s_t,k^(l+1) ⟩,   r_i,k,v = ⟨ h_v,i^(l), s_v,k^(l+1) ⟩

where h_t,i^(l) denotes the i-th node of the l-th graph-convolution layer in the aggregation of text features, r_i,k,t denotes the affinity score between the k-th super node and the i-th node in text processing, h_v,i^(l) denotes the i-th node of the l-th layer in the aggregation of image features, and r_i,k,v denotes the corresponding affinity score in image processing. The scores are then normalized to compute the weight assigned to each node:

w_i,k,t = exp(r_i,k,t) / Σ_k′ exp(r_i,k′,t),   w_i,k,v = exp(r_i,k,v) / Σ_k′ exp(r_i,k′,v)

where w_i,k,t denotes the weight with which the i-th node established for the text features is assigned to the k-th established super node, and w_i,k,v denotes the corresponding weight for the image features; these assignment weights construct the relationship between the lower-level graph and the higher-level graph. The high-level adjacency matrix is obtained from the low-level adjacency matrix as:

A_t^(l+1) = W_t^(l)ᵀ · A_t^(l) · W_t^(l)

where A_t denotes the adjacency matrix at layer 0 in aggregating the text features and W_t^(l) denotes the weight matrix.
6. The method of claim 5, wherein in step six the interaction between the different modalities is enhanced by a self-attention mechanism. The nodes of each layer in each modality are obtained through continuous iteration, represented as:

{ h_t,i^(1), h_t,i^(2), …, h_t,i^(L) },   { h_v,i^(1), h_v,i^(2), …, h_v,i^(L) }

where h_t,i^(1) denotes the i-th node of the text features at the first layer of the graph convolution, and h_t,i^(l) denotes the i-th node of the text features at the l-th layer. The acquired nodes of each layer pass through a self-attention layer, whose expression is:

Attention(Q, K, V) = softmax( Q Kᵀ / √d ) V

where K, Q and V denote the feature vectors of each node, and K = Q = V. The nodes output by the self-attention layer are then spliced:

h_i,t = h_i,t^(1) ∥ h_i,t^(2) ∥ … ∥ h_i,t^(L),   h_i,v = h_i,v^(1) ∥ h_i,v^(2) ∥ … ∥ h_i,v^(L)

where ∥ denotes the concatenation symbol. The features finally obtained by aggregation in each modality are spliced, and learnable vectors for the different modalities are constructed as:

H_i = h_i,t + h_i,v

u_i = u_i,t + u_i,v

Finally, the recommendations are personalized and ranked with Bayesian personalized ranking, for which triples of a user, observed items and unobserved items are constructed, in the form:

τ = { (U, H_p, H_q) ∣ A_i,p = 1, A_i,q = 0 }.
CN202310669701.3A 2023-06-07 2023-06-07 Image recommendation system and method based on multi-modal image convolution Pending CN116932887A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310669701.3A CN116932887A (en) 2023-06-07 2023-06-07 Image recommendation system and method based on multi-modal image convolution


Publications (1)

Publication Number Publication Date
CN116932887A true CN116932887A (en) 2023-10-24

Family

ID=88374548

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310669701.3A Pending CN116932887A (en) 2023-06-07 2023-06-07 Image recommendation system and method based on multi-modal image convolution

Country Status (1)

Country Link
CN (1) CN116932887A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111382309A (en) * 2020-03-10 2020-07-07 深圳大学 Short video recommendation method based on graph model, intelligent terminal and storage medium
CN112948708A (en) * 2021-03-05 2021-06-11 清华大学深圳国际研究生院 Short video recommendation method
CN114676315A (en) * 2022-01-28 2022-06-28 齐鲁工业大学 Method and system for constructing attribute fusion interaction recommendation model based on enhanced graph convolution
US20220207587A1 (en) * 2020-12-30 2022-06-30 Beijing Wodong Tianjun Information Technology Co., Ltd. System and method for product recommendation based on multimodal fashion knowledge graph
WO2023024017A1 (en) * 2021-08-26 2023-03-02 Ebay Inc. Multi-modal hypergraph-based click prediction
CN115952307A (en) * 2022-12-30 2023-04-11 合肥工业大学 Recommendation method based on multimodal graph contrast learning, electronic device and storage medium
CN115964560A (en) * 2022-12-07 2023-04-14 南京擎盾信息科技有限公司 Information recommendation method and equipment based on multi-mode pre-training model
CN116186301A (en) * 2022-12-30 2023-05-30 合肥工业大学 Multi-mode hierarchical graph-based multimedia recommendation method, electronic equipment and storage medium


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Han Tengyue et al.: "Multimodal sequential recommendation algorithm based on contrastive learning", Journal of Computer Applications, vol. 42, no. 6, pages 1683-1688 *

Similar Documents

Publication Publication Date Title
CN111382309B (en) Short video recommendation method based on graph model, intelligent terminal and storage medium
CN111581510B (en) Shared content processing method, device, computer equipment and storage medium
CN108920641B (en) Information fusion personalized recommendation method
Zhou et al. Predicting movie box-office revenues using deep neural networks
CN108537624B (en) Deep learning-based travel service recommendation method
Katarya et al. Capsmf: a novel product recommender system using deep learning based text analysis model
WO2021139415A1 (en) Data processing method and apparatus, computer readable storage medium, and electronic device
CN112287170B (en) Short video classification method and device based on multi-mode joint learning
CN111143705A (en) Recommendation method based on graph convolution network
US20220253722A1 (en) Recommendation system with adaptive thresholds for neighborhood selection
CN111949885A (en) Personalized recommendation method for scenic spots
CN115964560A (en) Information recommendation method and equipment based on multi-mode pre-training model
Song et al. Coarse-to-fine: A dual-view attention network for click-through rate prediction
Wang et al. An enhanced multi-modal recommendation based on alternate training with knowledge graph representation
Wang et al. Deep Meta-learning in Recommendation Systems: A Survey
Chakder et al. Graph network based approaches for multi-modal movie recommendation system
US20240037133A1 (en) Method and apparatus for recommending cold start object, computer device, and storage medium
Sangeetha et al. Predicting personalized recommendations using GNN
CN115269984A (en) Professional information recommendation method and system
CN115391555A (en) User-perceived knowledge map recommendation system and method
CN116932887A (en) Image recommendation system and method based on multi-modal image convolution
RahmatAbadi et al. Leveraging deep learning techniques on collaborative filtering recommender systems
CN115238191A (en) Object recommendation method and device
Lu Design of a music recommendation model on the basis of multilayer attention representation
Low et al. Recent developments in recommender systems

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination