CN116561446B - Multi-mode project recommendation method, system and device and storage medium - Google Patents

Multi-mode project recommendation method, system and device and storage medium

Info

Publication number
CN116561446B
Authority
CN
China
Prior art keywords
user
project
mode
item
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310834248.7A
Other languages
Chinese (zh)
Other versions
CN116561446A (en
Inventor
蔡娟娟
任亚琦
李传珍
张洋
杜怀昌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Communication University of China
Original Assignee
Communication University of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Communication University of China filed Critical Communication University of China
Priority to CN202310834248.7A priority Critical patent/CN116561446B/en
Publication of CN116561446A publication Critical patent/CN116561446A/en
Application granted granted Critical
Publication of CN116561446B publication Critical patent/CN116561446B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9536Search customisation based on social or collaborative filtering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/73Querying
    • G06F16/735Filtering based on additional data, e.g. user or group profiles
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/901Indexing; Data structures therefor; Storage structures
    • G06F16/9024Graphs; Linked lists
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/951Indexing; Web crawling techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/042Knowledge-based neural networks; Logical representations of neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • G06N3/0442Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/047Probabilistic or stochastic networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/0985Hyperparameter optimisation; Meta-learning; Learning-to-learn
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/442Monitoring of processes or resources, e.g. detecting the failure of a recording device, monitoring the downstream bandwidth, the number of times a movie has been viewed, the storage space available from the internal hard disk
    • H04N21/44213Monitoring of end-user related data
    • H04N21/44222Analytics of user selections, e.g. selection of programs or purchase activity
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/45Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
    • H04N21/4508Management of client data or end-user data
    • H04N21/4532Management of client data or end-user data involving end-user characteristics, e.g. viewer profile, preferences
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/45Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
    • H04N21/466Learning process for intelligent management, e.g. learning user preferences for recommending movies
    • H04N21/4662Learning process for intelligent management, e.g. learning user preferences for recommending movies characterized by learning algorithms
    • H04N21/4663Learning process for intelligent management, e.g. learning user preferences for recommending movies characterized by learning algorithms involving probabilistic networks, e.g. Bayesian networks
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/45Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
    • H04N21/466Learning process for intelligent management, e.g. learning user preferences for recommending movies
    • H04N21/4662Learning process for intelligent management, e.g. learning user preferences for recommending movies characterized by learning algorithms
    • H04N21/4666Learning process for intelligent management, e.g. learning user preferences for recommending movies characterized by learning algorithms using neural networks, e.g. processing the feedback provided by the user
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/45Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
    • H04N21/466Learning process for intelligent management, e.g. learning user preferences for recommending movies
    • H04N21/4667Processing of monitored end-user data, e.g. trend analysis based on the log file of viewer selections
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/45Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
    • H04N21/466Learning process for intelligent management, e.g. learning user preferences for recommending movies
    • H04N21/4668Learning process for intelligent management, e.g. learning user preferences for recommending movies for recommending content, e.g. movies
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Probability & Statistics with Applications (AREA)
  • Social Psychology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a multi-modal item recommendation method, system, device and storage medium, belonging to the technical field of computers. Personalized item recommendation is realized by constructing a recommendation model based on the fusion of multi-relation item heterogeneous graphs and multi-modal preferences. For the multi-relation item heterogeneous graphs, the multi-relation heterogeneous features of items under different modalities are obtained on the basis of a graph attention network. For the user-item bipartite graphs, the user modal preference information, injected with high-order interaction information, is captured from the user-item bipartite graph of each modality; attention fusion is then applied to the user's multiple modal preferences, which enhances the learning of the user's multi-modal preference representation and ultimately improves the accuracy of the recommendation results. By using multi-modal information and personalized recommendation algorithm technology, the invention addresses the problems of data sparsity and insufficient utilization of multi-modal information in existing recommendation systems, so that recommendation results are more accurate and interpretable, and the operation effect and the user experience are improved.

Description

Multi-mode project recommendation method, system and device and storage medium
Technical Field
The invention belongs to the technical field of computers, and particularly relates to a multi-modal item recommendation method, system, device and storage medium.
Background
With the development of the Internet and the arrival of the age of information explosion, the transition from information scarcity to information overload has been completed. Personalized recommendation systems relieve the pressure of information overload and help users obtain genuinely useful information from massive data. Introducing multi-modal information into a recommendation system as auxiliary information can effectively alleviate the data sparsity problem suffered by collaborative filtering algorithms and improve the recommendation effect, and this has become a research hotspot in the current recommendation system field. In addition, various graph structures exist in multi-modal recommendation scenarios, and researchers can use graph neural network technology to capture and learn the complex relations in them, thereby improving the accuracy and interpretability of the recommendation system.
Thanks to their excellent nonlinear fitting capability, deep neural networks can learn not only deep latent feature representations of users and items but also complex nonlinear interaction features between users and items, from which user preferences can be analyzed; they have therefore received wide attention from researchers. Traditional deep learning methods such as convolutional neural networks (Convolutional Neural Networks, CNN) and recurrent neural networks (Recurrent Neural Network, RNN) perform well in extracting features from Euclidean-space data. Therefore, in multi-modal recommendation based on conventional neural networks, the conventional neural network is generally used as a feature extractor to extract multi-modal features of users and items, and the user preference representation is then constructed from the users' historical interaction information, the item attribute information and the multi-modal features. However, conventional neural networks cannot fully mine the rich non-Euclidean graph data in multi-modal recommendation scenarios.
The emergence of graph neural networks provides a new approach to extracting features from graph data and solves the problem that features of non-Euclidean data could not be extracted, so graph neural networks are widely applied in the field of multi-modal recommendation. A graph neural network is essentially a connectionist model that captures the dependencies between nodes in a graph through message passing between those nodes. With a graph neural network, a recommendation model can more flexibly use the interaction behavior information of users and items to model their vector representations and can fully mine the graph data in multi-modal recommendation scenarios, thereby improving the expressive power and interpretability of the model.
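The message-passing idea described above can be illustrated with a minimal sketch. The function name `message_passing_step`, the mean aggregator and the toy data are assumptions chosen for illustration; they are not taken from the invention, which does not prescribe a particular aggregator at this point.

```python
from collections import defaultdict

def message_passing_step(features, edges):
    """One round of mean-aggregation message passing (illustrative assumption).

    features: dict mapping node -> list of floats
    edges: iterable of (src, dst) pairs; messages flow from src to dst
    Returns a new dict in which each node's vector is the mean of its own
    vector and the vectors of its incoming neighbors.
    """
    incoming = defaultdict(list)
    for src, dst in edges:
        incoming[dst].append(features[src])
    updated = {}
    for node, own in features.items():
        messages = incoming[node] + [own]  # self-loop keeps the node's own signal
        updated[node] = [
            sum(m[d] for m in messages) / len(messages) for d in range(len(own))
        ]
    return updated

# Toy user-item graph: user u1 interacted with items i1 and i2.
features = {"u1": [1.0, 0.0], "i1": [0.0, 1.0], "i2": [0.0, 3.0]}
edges = [("i1", "u1"), ("i2", "u1"), ("u1", "i1"), ("u1", "i2")]
hop1 = message_passing_step(features, edges)
```

After one step, the vector of `u1` already mixes in information from both items it interacted with, which is the dependency-capturing behavior the paragraph refers to.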
Existing research on multi-modal recommendation based on graph neural networks can be divided into two stages. Early research focused on designing complex user-item aggregation strategies for the user-item bipartite graphs of different modalities to help model user-item interactions. Building on that work, later research constructs several kinds of graph-structured data to further mine relations among, and model features of, items and users under different modalities.
The embedded representations of items and users are decisive for the recommendation effect of a multi-modal recommendation model. Early research on multi-modal recommendation algorithms based on graph neural networks captured the high-order interaction information between users and items only through the user-item bipartite graphs of different modalities, and ignored the other graph relations in multi-modal recommendation scenarios, such as item-item relations and user-user relations, so that the multi-modal feature representations of items and users were incomplete. To address this problem, later research carries out multi-modal recommendation based on multi-graph neural networks by simultaneously using different types of graph data, such as item graphs, user-item bipartite graphs and user graphs, and uses this multi-graph data to mine latent information between items and users and to improve the learning of multi-modal embedded representations of items and users.
Analysis of existing multi-modal recommendation algorithms based on graph neural networks reveals two problems. First, in terms of item representation modeling, existing research, out of concern for the noise introduced by complex edge relations, uses only a single type of item relation when mining item relations, so the multi-edge heterogeneous information among items is not fully mined and utilized. Second, in terms of user representation modeling, most existing research fuses the user's multi-modal feature representations by concatenation or summation in order to reduce information loss, ignoring the fact that different modal feature representations should carry different weights. Because of these two problems, the recommendation results of prior-art multi-modal recommendation algorithms are inaccurate.
Disclosure of Invention
In view of the current state of multi-modal recommendation algorithms based on graph neural networks, the invention provides a multi-modal item recommendation method, system, device and storage medium, which are used for overcoming at least one technical problem in the prior art.
In order to achieve the above object, the present invention provides a multi-modal item recommendation method, including:
constructing graphs from user-item interaction data and item multi-modal features to obtain multi-relation item heterogeneous graphs and user-item bipartite graphs under different modalities;
performing graph feature extraction on the multi-relation item heterogeneous graphs and the user-item bipartite graphs under the different modalities to obtain an item multi-modal representation feature set and a user single-modal preference feature set;
fusing the user single-modal preference feature sets by means of an attention mechanism to obtain a user multi-modal preference feature set;
inputting the item multi-modal representation feature set and the user multi-modal preference feature set into a preset prediction model, and performing Top-K recommendation for a user through calculation of a prediction function, to obtain a preliminary MRIH-MPF model;
training the preliminary MRIH-MPF model with a pre-acquired user-item interaction data training set and an item multi-modal training set to obtain the MRIH-MPF model;
and generating a target item recommendation result for a target user through the MRIH-MPF model according to acquired target user-item interaction data and target item multi-modal data.
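The fusion and prediction steps above can be sketched as follows. This is a minimal illustration under stated assumptions: a single-layer softmax attention with a hypothetical attention vector `query`, and an inner-product prediction function. The text at this point does not specify the attention design or the prediction function, so the names `fuse_user_preferences` and `top_k_items` and all values are illustrative only.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_user_preferences(modal_prefs, query):
    """Attention-weighted sum of a user's single-modal preference vectors.

    modal_prefs: dict mapping modality name -> preference vector
    query: attention vector (hypothetical stand-in for a learned parameter)
    Returns (fused vector, dict of attention weights per modality).
    """
    names = sorted(modal_prefs)
    scores = [sum(q * x for q, x in zip(query, modal_prefs[n])) for n in names]
    weights = softmax(scores)
    fused = [
        sum(w * modal_prefs[n][d] for w, n in zip(weights, names))
        for d in range(len(query))
    ]
    return fused, dict(zip(names, weights))

def top_k_items(user_vec, item_vecs, k):
    """Rank items by inner product with the user vector (assumed predictor)."""
    scored = {
        item: sum(u * v for u, v in zip(user_vec, vec))
        for item, vec in item_vecs.items()
    }
    return sorted(scored, key=lambda item: (-scored[item], item))[:k]

modal_prefs = {"image": [1.0, 0.0], "text": [0.0, 1.0]}
fused, weights = fuse_user_preferences(modal_prefs, query=[2.0, 0.0])
items = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
recommended = top_k_items(fused, items, k=2)
```

Unlike plain concatenation or summation, the weights here differ per modality, so a modality that matches the attention vector more closely contributes more to the fused preference.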
In order to solve the above problems, the present invention further provides a multi-modal item recommendation system, the system comprising:
a graph construction module, used for constructing graphs from user-item interaction data and item multi-modal features to obtain multi-relation item heterogeneous graphs and user-item bipartite graphs under different modalities;
a feature extraction module, used for performing graph feature extraction on the multi-relation item heterogeneous graphs and the user-item bipartite graphs under the different modalities to obtain an item multi-modal representation feature set and a user single-modal preference feature set;
a feature fusion module, used for fusing the user single-modal preference feature sets by means of an attention mechanism to obtain a user multi-modal preference feature set;
a prediction module, used for inputting the item multi-modal representation feature set and the user multi-modal preference feature set into a preset prediction model, and performing Top-K recommendation for a user through calculation of a prediction function, to obtain a preliminary MRIH-MPF model;
a model training module, used for training the preliminary MRIH-MPF model with a pre-acquired user-item interaction data training set and an item multi-modal training set to obtain the MRIH-MPF model;
and a recommendation module, used for inputting acquired target user-item interaction data and target item multi-modal data into the MRIH-MPF model to generate a target item recommendation result for the target user.
In order to solve the above problems, the present invention also provides an electronic device including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the steps of the multi-modal item recommendation method described above.
In order to solve the above problems, the present invention further provides a computer-readable storage medium in which at least one instruction is stored, the at least one instruction, when executed by a processor in an electronic device, implementing the multi-modal item recommendation method described above.
According to the multi-modal item recommendation method, system, device and storage medium provided by the invention, personalized item recommendation is realized by constructing an MRIH-MPF model, that is, a recommendation model based on the fusion of multi-relation item heterogeneous graphs and multi-modal preferences (MRIH-MPF is short for Multi-relation Item Heterogeneous Graph and Multi-modal Preference Fusion Recommendation). Based on the constructed multi-relation item heterogeneous graphs and user-item bipartite graphs, the multi-relation heterogeneous features of items under different modalities are obtained from the multi-relation item heterogeneous graphs. From the user-item bipartite graphs, the user modal preference information, injected with high-order interaction information, can be captured in each modality through a multi-layer graph convolution network; attention fusion is then applied to the user's multi-modal preferences, which enhances the learning of the user's multi-modal preference representation and ultimately improves the accuracy of the recommendation results. By using multi-modal information and personalized recommendation algorithm technology, the invention addresses the problems of data sparsity and insufficient utilization of multi-modal information in existing recommendation systems, so that recommendation results are more accurate and interpretable, and the operation effect and the user experience are improved.
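How a multi-layer graph convolution injects high-order interaction information into user preferences can be sketched as below. The propagation rule used here (symmetric degree normalization without feature transforms, averaged over layers, in the style of LightGCN) is a swapped-in assumption for illustration; the text does not state the exact convolution used by the MRIH-MPF model, and the name `propagate` is hypothetical.

```python
import math
from collections import defaultdict

def propagate(embeddings, interactions, num_layers):
    """Multi-layer neighborhood aggregation on a user-item bipartite graph.

    embeddings: dict mapping node -> initial vector (users and items)
    interactions: list of (user, item) pairs
    Returns the mean of the layer-0..layer-L vectors for every node.
    """
    neighbors = defaultdict(set)
    for user, item in interactions:
        neighbors[user].add(item)
        neighbors[item].add(user)
    layers = [embeddings]
    for _ in range(num_layers):
        prev, nxt = layers[-1], {}
        for node, vec in prev.items():
            agg = [0.0] * len(vec)
            for nb in neighbors[node]:
                # Symmetric normalization by the degrees of both endpoints.
                norm = 1.0 / math.sqrt(len(neighbors[node]) * len(neighbors[nb]))
                for d, val in enumerate(prev[nb]):
                    agg[d] += norm * val
            nxt[node] = agg
        layers.append(nxt)
    return {
        node: [sum(layer[node][d] for layer in layers) / len(layers)
               for d in range(len(vec))]
        for node, vec in embeddings.items()
    }

embeddings = {"u1": [1.0, 0.0], "u2": [0.0, 1.0], "i1": [0.0, 0.0], "i2": [0.0, 0.0]}
interactions = [("u1", "i1"), ("u2", "i1"), ("u2", "i2")]
one_layer = propagate(embeddings, interactions, num_layers=1)
two_layers = propagate(embeddings, interactions, num_layers=2)
```

With one layer, `u1` only sees its directly connected item. With two layers, the signal of `u2` reaches `u1` through the shared item `i1`, which is the kind of high-order connectivity the paragraph describes.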
In addition, the MRIH-MPF model constructed by the invention can be combined with Web development technology to provide personalized multi-modal recommendation of movies or short videos for IPTV (interactive network television), thereby improving the recommendation accuracy for IPTV movies or short videos.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of a multi-modal item recommendation method according to an embodiment of the present invention;
FIG. 2 is a block diagram of an MRIH-MPF model according to an embodiment of the present invention;
FIG. 3 is a flowchart of the construction of a multi-relation item heterogeneous graph according to an embodiment of the present invention;
FIG. 4 is a diagram of a user-item bipartite graph and its high-order connectivity according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of item feature extraction on the text-modality heterogeneous graph according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of a dual-layer attention module according to an embodiment of the present invention;
FIG. 7 is a flowchart of a movie poster crawling process according to an embodiment of the present invention;
FIG. 8 is a block diagram of a recommendation system according to an embodiment of the present invention;
FIG. 9 is a schematic diagram of a multi-modal item recommendation system according to an embodiment of the present invention;
FIG. 10 is a schematic diagram of the internal structure of an electronic device for implementing a multi-modal item recommendation method according to an embodiment of the present invention.
The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
In view of the problems in the prior art, the invention provides a multi-modal item recommendation method, system, device and storage medium, the main purpose of which is to solve the problems of data sparsity and insufficient utilization of multi-modal information in prior-art recommendation systems.
Fig. 1 is a flowchart of a multi-modal item recommendation method according to an embodiment of the invention. The method may be performed by a system, which may be implemented in software and/or hardware.
FIG. 1 depicts a multi-modal item recommendation method in its entirety. As shown in FIG. 1, in the present embodiment, the multi-modal item recommendation method includes steps S110 to S160.
S110, constructing graphs from user-item interaction data and item multi-modal features to obtain multi-relation item heterogeneous graphs and user-item bipartite graphs under different modalities.
Specifically, in the embodiment of the present invention, the items referred to in the user-item interaction data and the item multi-modal features may be movies, short videos, or any other items that can be recommended on the basis of acquired user interaction data and user preferences. The invention mainly performs deep learning modeling of the item multi-modal representations and the user multi-modal preference representations by mining the edge relations in several kinds of graph-structured data; once trained, the model is used in the item recommendation process. It is therefore necessary to construct several kinds of graph-structured data, specifically the multi-relation item heterogeneous graphs and the user-item bipartite graphs under different modalities.
As an optional embodiment of the invention, constructing graphs from the user-item interaction data and the item multi-modal features to obtain the multi-relation item heterogeneous graphs and the user-item bipartite graphs under different modalities comprises the following steps:
extracting the interaction information from the user-item interaction data, extracting the multi-modal information from the item multi-modal features, and computing item edge relations and sampling neighbor nodes from the extracted interaction information and multi-modal information, so as to obtain an item graph under each relation;
for the item graphs under each relation, fusing the modality-specific similar-semantics item graph with the co-occurrence collaborative item graph to obtain the multi-relation item heterogeneous graph under each modality;
taking the interest-preference embedding of the user nodes as the construction target, building a corresponding modality-level user-item bipartite graph for each modality from the user-item interaction data.
Specifically, the invention takes as input the user-item interaction information contained in the user-item interaction data and the multi-modal information of the items, and constructs, for each modality, a multi-relation item heterogeneous graph containing modality-specific similar-semantics information and co-occurrence collaborative information. The construction flow is shown in FIG. 3. First, data processing extracts the user-item interaction information and the item multi-modal features. Next, item edge relations are computed and neighbor nodes are sampled from the extracted interaction information and multi-modal information, yielding an item graph under each relation. Finally, the similar-semantics item graph and the co-occurrence collaborative item graph of each modality are fused to obtain the item heterogeneous graph under that modality.
Let $G_I^m = (V^m, E^m)$ be the item heterogeneous graph under a single modality $m$, where $V^m$ denotes all item nodes under that modality, $m$ ranges over the image modality $v$ and the text modality $t$, and $E^m$ denotes the edge relations between pairs of nodes under that modality. The item heterogeneous graph under a single modality contains two types of edges: co-occurrence collaborative edges between items and similar-semantics edges under that modality, denoted $E_c$ and $E_s^m$ respectively. The whole item heterogeneous graph is obtained by co-occurrence and similarity computation followed by Top-K sampling of neighbor item nodes. Table 1 shows the multi-relation item heterogeneous graph construction algorithm.
TABLE 1
The co-occurrence collaborative edge relation between items is obtained by counting item co-occurrences in the user-item interaction data and applying Top-K sampling. If two items frequently appear together in the interaction lists of users, their co-occurrence relation carries modality-level collaborative information: when one of them is clicked, the probability that the other is clicked increases markedly. In the embodiment of the invention, the number of co-occurrences between every pair of items is computed from the user-item interaction data. Top-K sampling by co-occurrence count is then applied to each item, removing noisy items with low co-occurrence counts and retaining the K items with the highest co-occurrence as collaborative similar items. Finally, a collaborative edge is formed between each neighbor item and the target item, directed from the neighbor item node to the target item node.
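The co-occurrence sampling described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the algorithm of Table 1: the function name cooccurrence_edges, the dictionary input format, and the tie-breaking by item id are all choices made for the example.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_edges(interactions, k):
    """Return (neighbor, target, count) edges for each item's Top-K co-occurring items.

    interactions: dict mapping a user id to an iterable of item ids
    (item ids are assumed to be mutually comparable, e.g. all strings).
    """
    counts = defaultdict(lambda: defaultdict(int))
    for items in interactions.values():
        # Each unordered item pair is counted once per user, in both directions.
        for a, b in combinations(sorted(set(items)), 2):
            counts[a][b] += 1
            counts[b][a] += 1

    edges = []
    for target in sorted(counts):
        # Highest co-occurrence first; ties broken by item id (an assumption).
        ranked = sorted(counts[target].items(), key=lambda kv: (-kv[1], kv[0]))
        for neighbor, count in ranked[:k]:
            # The edge is directed from the neighbor item to the target item.
            edges.append((neighbor, target, count))
    return edges
```

Calling cooccurrence_edges(interactions, k=10) on a dictionary of per-user item lists yields the directed collaborative edges; low-count pairs are dropped simply because they fall outside the first K entries of each ranking.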
The similar-semantics edge relation between items is obtained by computing the similarity between single-modality item features and applying Top-K sampling. The item features under the different modalities are obtained through pre-training and are therefore also called item pre-training features; the item multi-modal features comprise these per-modality features, for example the item image pre-training features and item text pre-training features in FIG. 2. Items with high similarity under a given modality share similar semantic information, and the similar-semantics edges are drawn between such items. Items with semantically similar modality content are more likely to interest users with similar modality preferences, which helps to extract the preference information of users for items with similar content in that modality. In the embodiment of the invention, the text parts of an item are merged and word-segmented, and a Sentence-BERT pre-trained model (a siamese network built on pre-trained BERT that yields semantically meaningful sentence embeddings) is used to obtain the text-modality pre-training features of the item; the image file of the item is fed into a ResNet-50 model pre-trained on an image dataset to obtain the visual-modality pre-training features. The feature similarity between every pair of items under a modality is then computed from these pre-training features, Top-K similarity sampling is applied to each item, and the K items with the highest semantic similarity to the current item under that modality are retained.
Finally, each sampled item forms a single-modality similar-semantics edge with the current item, directed from the neighbor item node to the current item node.
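A small Python sketch of the similarity-based Top-K sampling follows. The text does not name the similarity measure, so cosine similarity is an assumption here, as are the function names and the representation of pre-training features as plain lists of floats (in practice they would come from the pre-trained text and image encoders).

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def semantic_edges(features, k):
    """Return (neighbor, target) edges to each item's K most similar items.

    features: dict mapping an item id to its pre-training feature vector
    in one modality (hypothetical input format).
    """
    edges = []
    for target in sorted(features):
        scored = [
            (cosine(features[target], features[other]), other)
            for other in features
            if other != target
        ]
        # Most similar first; ties broken by item id for determinism.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        # Edges point from the neighbor item to the current item.
        edges.extend((other, target) for _, other in scored[:k])
    return edges
```

The same function would be run once per modality, on the text features and on the image features, to obtain the two sets of similar-semantics edges.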
The user-item bipartite graph $G_{UI} = (U \cup I, E)$ is constructed from the historical user-item interaction data (taken from the user-item interaction data), as shown in the left sub-graph of FIG. 4, where $u \in U$ and $i \in I$ denote the user nodes and the item nodes they have interacted with, and an edge $(u, i) \in E$ indicates that an interaction exists between that user-item pair. When the interest-preference embedding of a user node is taken as the construction target, the left sub-graph of FIG. 4 can be expanded into a tree with the target user node $u$ as its root. High-order connectivity then refers to any path in the tree that reaches $u$ with length greater than 1; such high-order paths carry rich high-order collaborative information. For example, a second-order path $u \leftarrow i \leftarrow u'$ indicates behavioral similarity between $u$ and $u'$, because both users have interacted with item $i$. A third-order path $u \leftarrow i \leftarrow u' \leftarrow i'$ indicates that $u$ may like $i'$, because the similar user $u'$ has interacted with $i'$. Viewed over the whole tree rooted at $u$, the user is likely to be more interested in an item $i_a$ than in an item $i_b$ when two paths connect $\langle i_a, u \rangle$ but only one path connects $\langle i_b, u \rangle$. By introducing a graph neural network into the recommendation system and using its iterative propagation mechanism, the high-order interaction information in the historical interaction data can be extracted effectively and used to improve the recommendation performance of the model.
In multi-modal recommendation based on graph neural networks, the preference information of the same user differs across modalities. To inject high-order connectivity information into the extraction of user preference information for each modality, the embodiment of the invention constructs a corresponding modality-level user-item bipartite graph for each modality from the user's historical interaction information (that is, the historical user-item interactions taken from the user-item interaction data). In the embodiment of the invention, the modality set $M$ comprises vision $V$ and text $T$. The same item node carries different modality content in the bipartite graphs of the two modalities, and the same user node likewise carries different modality preference information.
In the embodiment of the invention, the user's historical interaction information is organized as two-tuples $\langle u, i \rangle$, where $u$ and $i$ are respectively a user node and an item node in the user-item bipartite graph, and each two-tuple represents an edge between them. Because the relation between the two is bidirectional, the graph is constructed as an undirected graph. The two graphs built for the different modalities are each fed into a graph convolution network. Using the connectivity of the graph neural network, the receptive field of each node is enlarged by increasing the number of graph convolution layers, so that high-order interaction information between users and items under each modality is captured effectively to model the user's single-modality preference representation.
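To make the notion of high-order connectivity concrete, the following Python sketch builds the undirected bipartite adjacency from interaction two-tuples and counts the third-order paths from a user to items the user has not yet interacted with. The function names and the toy identifiers are hypothetical; path counting is shown only as an illustration of why an item reachable by more paths is a stronger candidate, not as a step of the claimed method.

```python
from collections import defaultdict


def build_bipartite(pairs):
    """Build undirected adjacency (both directions) from (user, item) two-tuples."""
    user_items = defaultdict(set)
    item_users = defaultdict(set)
    for user, item in pairs:
        user_items[user].add(item)
        item_users[item].add(user)
    return user_items, item_users


def third_order_paths(user, user_items, item_users):
    """Count paths user <- item <- other user <- new item for items unseen by the user."""
    counts = defaultdict(int)
    seen = set(user_items[user])
    for item in seen:
        for other in item_users[item]:
            if other == user:
                continue
            for new_item in user_items[other]:
                if new_item not in seen:
                    counts[new_item] += 1
    return dict(counts)
```

A graph convolution network obtains the same information implicitly: stacking three propagation layers lets signals from these third-order neighbors reach the user node.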
S120: extract graph features from the multi-relation item heterogeneous graphs and the user-item bipartite graphs under different modalities to obtain an item multi-modal representation feature set and a user single-modality preference feature set.
Specifically, a multi-relation item heterogeneous graph module can be constructed. This module feeds the text and image pre-training features of the items into the multi-relation item heterogeneous graph of the corresponding modality constructed in step S110 and extracts the multi-relation features of the items, so as to fully mine the information hidden among the different relations of the items. In this embodiment, the module consists of two parts: aggregation of node features within each relation of the multi-relation item heterogeneous graph under a given modality, and aggregation of node features across the different relations of that graph. Intra-relation aggregation followed by inter-relation aggregation completes the information propagation and aggregation on the heterogeneous graph, yielding the item heterogeneous graph features for each modality and hence the item multi-relation feature set under the multiple modalities.
A user single-modality preference extraction module can also be created. It takes as input the multi-relation item features of the different modalities produced by the multi-relation item heterogeneous graph feature extraction module, together with the user ID information (taken from the user-item interaction data and available from the user-item bipartite graph). The item multi-relation features are fed into the user-item bipartite graph of the corresponding modality as the initial representations of the item nodes, and the user ID information is initialized from a normal distribution to obtain the initial representations of the user nodes. Graph convolution then extracts the user's single-modality preferences and produces single-modality item representations enriched with high-order connectivity information, giving the user single-modality preference feature set.
As an optional embodiment of the invention, extracting graph features from the multi-relation item heterogeneous graphs and the user-item bipartite graphs under different modalities to obtain the item multi-modal representation feature set and the user single-modality preference feature set comprises:
performing intra-relation aggregation and then inter-relation aggregation on the multi-relation item heterogeneous graph of each modality to complete the information propagation and aggregation on the heterogeneous graph, obtaining the item heterogeneous graph features for each modality and the item multi-relation feature set under the multiple modalities;
feeding the item multi-relation features of each modality into the user-item bipartite graph of the corresponding modality as the initial representations of the item nodes, initializing the user ID information from a normal distribution to obtain the initial representations of the user nodes, and extracting the user single-modality preferences through graph convolution to obtain a user single-modality preference feature set and an item single-modality representation feature set that carry high-order interaction information;
performing element-wise summation over the item single-modality representation feature set to obtain the item multi-modal representation feature set; wherein the user ID information is derived from the user-item interaction data.
Specifically, taking feature extraction on the text-modality multi-relation heterogeneous graph as an example, the aggregation process is as shown in FIGS. 1-4; the visual modality and other modalities are handled in the same way. Intra-relation node aggregation aggregates the information of neighbor item nodes under a single relation of the modality into the current node. To reduce the influence of noisy nodes introduced by heterogeneous graph sampling on item feature modeling, this embodiment introduces a gated attention mechanism into the item graph of each edge relation, which soft-clips the edges between the current node and its neighbors and aggregates the neighbor information selectively. First, the features of the current node and of a neighbor node are concatenated and linearly transformed to compute a gating score, which soft-clips the edge between the two nodes. Next, the inner products between the current node and each of its neighbors are computed to obtain the importance of each neighbor's information to the current node. The gating score and the attention score are multiplied to obtain an aggregation weight coefficient, and the weights of all neighbors are normalized to obtain the aggregated feature of the current node under a single relation. Taking the text semantic relation as an example, the aggregation formula is as follows:
where $\hat{h}_c$ denotes the central item node feature obtained by aggregating the neighbor node vectors under the current relation, $h_c$ denotes the central item node feature before aggregation under the current relation, $h_i$ denotes the feature vector of the $i$-th neighbor of the central node under the current relation, and $\alpha_i$ denotes the weight assigned to that neighbor's feature in the final aggregation, computed as follows:
where $\alpha_i$ is obtained by multiplying the gating score $g_i$ by the inter-node attention score $a_i$ and normalizing. The gating score $g_i$ is obtained by concatenating the current item node with the neighbor node, applying a linear transformation, and passing the result through an activation function; it soft-clips the edges between different nodes to a certain extent. The inter-node attention score $a_i$ is the inner product between the current item node and the neighbor node and captures the neighbor information that matters most to the current node.
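The gated attention aggregation described above can be illustrated with the following Python sketch. Several details are assumptions, because the text does not fix them: the gate uses a sigmoid activation over a single linear layer with weights gate_w, the normalization over neighbors is a softmax, and the central node's own feature is added to the weighted neighbor sum. All names are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def intra_relation_aggregate(center, neighbors, gate_w, gate_b=0.0):
    """Aggregate neighbor features into the center node under one relation.

    center: feature list of length d; neighbors: list of feature lists of length d;
    gate_w: gate weights of length 2 * d (assumed single linear layer).
    Returns (aggregated feature, normalized neighbor weights).
    """
    if not neighbors:
        return list(center), []
    raw = []
    for nb in neighbors:
        gate = sigmoid(dot(gate_w, center + nb) + gate_b)  # soft clipping of the edge
        attention = dot(center, nb)                         # inner-product attention score
        raw.append(gate * attention)
    # Softmax normalization over all neighbors (assumed form of the normalization).
    top = max(raw)
    exps = [math.exp(r - top) for r in raw]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Assumed combination: center feature plus weighted sum of neighbor features.
    aggregated = list(center)
    for w, nb in zip(weights, neighbors):
        for d, value in enumerate(nb):
            aggregated[d] += w * value
    return aggregated, weights
```

In a trained model gate_w and gate_b would be learned parameters; here they are passed in so that the effect of the gate on each edge can be inspected directly.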
Inter-relation node aggregation introduces an attention mechanism to combine the item node embeddings obtained under the different edge relations. Let $\beta_r$ be the aggregation weight of edge relation $r$ and $w_r$ the importance score of that relation. The final item multi-relation feature $h^m$ is expressed as follows:
where $h^m$ is the finally obtained item multi-relation feature under the corresponding modality $m$, and $h^{m,r}$ is the item node feature corresponding to relation $r$ under the current modality.
The created user single-modality preference extraction module mainly uses Light-GCN as its graph convolution layer to aggregate high-order interaction information from neighbor nodes on the modality-level user-item bipartite graph. Light-GCN consists of a light graph convolution layer and a layer combination. The light graph convolution layer discards the two standard operations of feature transformation and nonlinear activation used in conventional graph convolution, which greatly reduces the number of parameters. The layer combination drops the self-connection of conventional graph convolution and instead forms the final graph embedding as a weighted combination of the embeddings of every layer, which effectively alleviates over-smoothing. The number of GCN layers determines the size of each vertex's aggregation neighborhood: through iterative propagation over multiple GCN layers, information from the first-order up to the multi-order neighbors of a target node is aggregated into that node. First-order interaction information represents the items the user has historically interacted with, which are the most relevant to the user's interest preferences; second-order and higher-order interaction information represents the high-order connectivity in the bipartite graph. Modeling the single-modality preference representation of each user node comprises three parts: embedding initialization of the user and item nodes of the bipartite graph, information propagation through a single convolution layer, and information aggregation across multiple convolution layers.
First, the multi-relation item heterogeneous graph feature $h_i^m$ is used to initialize the initial-layer item feature $e_i^{m,(0)}$ in the user-item bipartite graph, and the user ID information is initialized from a normal distribution to obtain the initial user preference feature $e_u^{m,(0)}$ in the user-item bipartite graph of each modality. The initialized node embeddings are then propagated through a single convolution layer on the user-item bipartite graph $G_{UI}^m$.
And finally, aggregating the high-order interaction information in the graph into preference representation embedding of the target user node by utilizing iterative aggregation of Light-GCN to obtain single-mode user interest preference representations output by different convolution layers. The specific single-mode preference feature extraction formula is as follows:
$$e_u^{(l+1)} = \sum_{i \in N_u} \frac{1}{\sqrt{|N_u|}\sqrt{|N_i|}}\, e_i^{(l)}, \qquad e_i^{(l+1)} = \sum_{u \in N_i} \frac{1}{\sqrt{|N_i|}\sqrt{|N_u|}}\, e_u^{(l)}$$
where $e_u^{(l)}$ and $e_i^{(l)}$ denote the user node representation and the item node representation at the $l$-th convolution layer, $\frac{1}{\sqrt{|N_u|}\sqrt{|N_i|}}$ is the standard normalization term, and $|N_u|$ and $|N_i|$ are the numbers of neighbors of user node $u$ and of its neighbor item node $i$. Because of the normalization applied during graph convolution, the longer the path to the target node, the more the information propagated to the target node is attenuated.
In the multi-layer information aggregation, the node representations produced by each convolution layer during the iterative graph convolution are aggregated to suppress over-smoothing, finally giving the user single-modality preference representation and the item single-modality node representation. Inspired by Light-GCN, the embodiment of the invention combines each node's per-layer features through weighting coefficients to form the final single-modality representation of each user and item. The formula is as follows:
$$e_u = \sum_{l=0}^{L} \alpha_l\, e_u^{(l)}, \qquad e_i = \sum_{l=0}^{L} \alpha_l\, e_i^{(l)}$$
where $L$ denotes the number of graph convolution layers, and the layer combination weight $\alpha_l$ denotes the importance of a node's $l$-th layer features. The weighted summation encodes the high-order interaction information into the single-modality user and item representations. The same operation is applied to the bipartite graph of each modality, and after propagation on the bipartite graphs of the different modalities, the user and item representations $e_u^m$ and $e_i^m$ of each modality are obtained, where $m$ covers the two modalities $v$ and $t$.
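The propagation and layer combination just described can be sketched in plain Python as follows. This is an illustrative re-implementation of Light-GCN style propagation on one modality's bipartite graph, not the patented implementation: the function name lightgcn, the dictionary-based data layout, and the default of uniform layer weights are assumptions.

```python
import math


def lightgcn(user_items, user_emb, item_emb, num_layers, layer_weights=None):
    """Propagate embeddings on a user-item bipartite graph and combine the layers.

    user_items: dict user -> set of interacted items.
    user_emb / item_emb: dict id -> list of floats (layer-0 embeddings); every
    user and item that appears in user_items must have an embedding.
    layer_weights: num_layers + 1 combination weights (uniform if omitted).
    """
    item_users = {}
    for user, items in user_items.items():
        for item in items:
            item_users.setdefault(item, set()).add(user)
    if layer_weights is None:
        layer_weights = [1.0 / (num_layers + 1)] * (num_layers + 1)
    dim = len(next(iter(user_emb.values())))

    cur_u, cur_i = user_emb, item_emb
    out_u = {u: [layer_weights[0] * x for x in e] for u, e in user_emb.items()}
    out_i = {i: [layer_weights[0] * x for x in e] for i, e in item_emb.items()}
    for layer in range(1, num_layers + 1):
        nxt_u = {u: [0.0] * dim for u in cur_u}
        nxt_i = {i: [0.0] * dim for i in cur_i}
        for user, items in user_items.items():
            for item in items:
                # Symmetric normalization by the neighbor counts of both endpoints.
                norm = 1.0 / math.sqrt(len(items) * len(item_users[item]))
                for d in range(dim):
                    nxt_u[user][d] += norm * cur_i[item][d]
                    nxt_i[item][d] += norm * cur_u[user][d]
        cur_u, cur_i = nxt_u, nxt_i
        # Weighted layer combination accumulates each layer's output.
        for u in out_u:
            for d in range(dim):
                out_u[u][d] += layer_weights[layer] * cur_u[u][d]
        for i in out_i:
            for d in range(dim):
                out_i[i][d] += layer_weights[layer] * cur_i[i][d]
    return out_u, out_i
```

Note that no weight matrix or activation function appears in the loop, which is exactly the simplification that distinguishes this propagation from conventional graph convolution. The function would be run once per modality with that modality's initial item features.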
S130: fuse the user single-modality preference feature set using an attention mechanism to obtain the user multi-modal preference feature set.
Specifically, a user multi-modal preference attention fusion module can be created that takes the extracted user single-modality preference representations as input and fuses the user single-modality preference feature set through an attention mechanism to obtain the user multi-modal preference feature set.
As an optional embodiment of the present invention, fusing the user single-modality preference feature set to obtain the user multi-modal preference feature set comprises:
constructing a double-layer attention mechanism comprising a single-user modality preference fusion attention layer and a similar-user multi-modal consistency preference fusion attention layer;
performing feature crossing on the user single-modality preference features in the user single-modality preference feature set, and feeding each user's multi-modal cross preference features obtained by feature crossing, together with the user single-modality preference features, into the single-user modality preference fusion attention layer for personalized single-user multi-modal preference fusion, to obtain a first user multi-modal preference feature set;
taking each user in the user single-modality preference feature set in turn as the current user, obtaining the K users most similar to the current user by sampling co-occurring users by Top-K co-occurrence count, assigning different attention weights through a soft attention mechanism to the vectors of the first user multi-modal preference features of the similar users taken from the first user multi-modal preference feature set, obtaining the aggregation attention coefficient of each similar user's preference feature through weight normalization, and multiplying each similar user's value vector element-wise by its aggregation attention coefficient to obtain a second user multi-modal preference feature set;
for each user, summing the first user multi-modal preference feature set and the second user multi-modal preference feature set element-wise to complete the residual connection and obtain the user multi-modal preference feature set.
Specifically, to model the differences among a user's preferences across modalities, this embodiment divides user multi-modal preference modeling into two operations, single-user modality preference fusion and similar-user multi-modal consistency preference fusion, and combines them through a double-layer attention structure to complete the attention fusion of the user's multi-modal preferences. The user multi-modal preference attention fusion module can be built as the double-layer attention module shown in FIG. 6.
For single-user modality preference fusion, this embodiment multiplies the user's single-modality preference features from the user single-modality preference feature set to realize feature crossing, that is, the cross preference feature is the product of the user's visual preference feature and text preference feature. Feature crossing combines the user's preferences from different modalities through a nonlinear transformation and yields more complete multi-modal cross preference information for each user. After the cross preference features are extracted, the user's single-modality preference features and the cross preference features are fed into the first attention layer (the single-user modality preference fusion attention layer) for personalized single-user multi-modal preference fusion. Because element-wise summation retains most of the information of the different modality features, this embodiment integrates the user's different modality preferences in a personalized way through element-wise attention-weighted summation. The integration formula is as follows:
where $p_u$ is the initial multi-modal preference representation of the current target user $u$ output by the first attention layer, and $\alpha_u^m$ is the attention coefficient of user $u$ for the single-modality preference representation $e_u^m$, with $m$ ranging over $v$, $t$ and $c$, that is, the visual modality, the text modality, and the cross-preference modality; $e_u^v$ is the current user's preference representation in the visual modality, $e_u^t$ is the current user's preference representation in the text modality, and $e_u^c$ is the cross-preference representation of the visual and text modalities. Applying this operation to every user yields the initial multi-modal preference representations of all users.
For similar-user consistency preference fusion, the principle of user preference consistency holds that users who frequently interact with the same items have closer multi-modal preferences. The personalized multi-modal fusion pattern of each user is therefore hidden in such user co-occurrence relations. The similar-user consistency preference fusion first finds the users with high co-occurrence counts with the target user, then combines a soft attention mechanism with a residual connection to fuse the multi-modal preferences of the similar users in a personalized way. Because co-occurrence counts differ from one similar user to another, this embodiment takes each user in turn as the current user, samples the co-occurring users by Top-K co-occurrence count, and performs personalized multi-modal preference aggregation over the K users most similar to the current user. The aggregation uses a soft attention mechanism to assign different attention weights to the multi-modal preference vectors of the similar users, thereby exploring the influence of similar users' preferences on the target user's preferences.
The importance of each similar user is weighted according to its similarity, so that users whose preferences are more similar receive larger weights and influence the target user's preference representation vector more strongly. The aggregation formula is as follows:
$$\beta_i = \frac{\exp\left(q^{\top} k_i\right)}{\sum_{j=1}^{K} \exp\left(q^{\top} k_j\right)}, \qquad \tilde{p}_u = p_u + \sum_{i=1}^{K} \beta_i\, v_i$$
Before aggregation with the soft attention mechanism, the key, value, and query vectors are initialized. The query vector $q$ is the initial multi-modal preference vector $p_u$ obtained for the current target user by single-user multi-modal preference fusion, and the key vector $k_i$ and value vector $v_i$ are both the initial multi-modal preference vector $p_{u_i}$ obtained for the $i$-th similar user of the current user by single-user multi-modal preference fusion. The initialized query vector $q$ is then combined by inner product with the key vector $k_i$ of each similar user to obtain a weighting score, and the scores are normalized with a softmax operation to obtain the aggregation attention coefficient $\beta_i$ of each similar user's preference representation. Finally, the value vector $v_i$ of each similar user is multiplied element-wise by $\beta_i$, and the result is summed element-wise with the multi-modal preference $p_u$ of the current target user to complete the residual connection, giving the final multi-modal preference representation vector $\tilde{p}_u$ of the current user, which is output to the preset prediction model.
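The two attention layers can be sketched together in Python. The sketch assumes that feature crossing is an element-wise product and that the first-layer attention coefficients are supplied from outside (the text does not state how they are computed); the second layer follows the query, key, value description above. Function names and the dictionary of coefficients are hypothetical.

```python
import math


def fuse_modalities(pref_v, pref_t, coeffs):
    """First layer: weighted sum of visual, text and cross preference features.

    coeffs: dict with keys "v", "t", "c" holding the attention coefficients
    (assumed to be given; their computation is not specified here).
    """
    cross = [a * b for a, b in zip(pref_v, pref_t)]  # assumed element-wise crossing
    return [
        coeffs["v"] * v + coeffs["t"] * t + coeffs["c"] * c
        for v, t, c in zip(pref_v, pref_t, cross)
    ]


def fuse_similar_users(target, similar):
    """Second layer: soft attention over the Top-K similar users plus a residual.

    target: initial multi-modal preference vector of the current user (query).
    similar: initial preference vectors of the similar users (keys and values).
    """
    if not similar:
        return list(target)
    scores = [sum(q * k for q, k in zip(target, vec)) for vec in similar]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # softmax with max-shift for stability
    total = sum(exps)
    coeffs = [e / total for e in exps]
    fused = list(target)  # residual connection starts from the target's own preference
    for coeff, vec in zip(coeffs, similar):
        for d, value in enumerate(vec):
            fused[d] += coeff * value
    return fused
```

A typical use would first call fuse_modalities for every user, then call fuse_similar_users for each user with the first-layer outputs of that user's K most frequently co-occurring users.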
S140: feed the item multi-modal representation feature set and the user multi-modal preference feature set into a preset prediction model, and perform Top-K recommendation for the user through prediction-function computation to obtain a primary model of the MRIH-MPF model.
Specifically, the framework of the primary model of the MRIH-MPF model is shown in FIG. 2. The model first constructs graphs from the user-item interaction data and the item multi-modal features (step S110) to obtain the multi-relation item heterogeneous graphs and the user-item bipartite graphs under different modalities. Then, on the constructed multi-graph network, multi-relation item heterogeneous graph feature extraction and user single-modality preference feature extraction (step S120), followed by user multi-modal preference attention fusion (step S130), produce the final user multi-modal preference representation and item multi-modal representation used for recommendation, giving the model framework shown in FIG. 2.
As an optional embodiment of the present invention, feeding the item multi-relation feature set under the multiple modalities and the user multi-modal preference feature set into the preset prediction model, and performing Top-K recommendation for the user through prediction-function computation to obtain the primary model of the MRIH-MPF model comprises:
representing the items and users as follows according to the item multi-relation feature set under the multiple modalities and the user multi-modal preference feature set:
$$e_u^{*} = \tilde{p}_u, \qquad e_i^{*} = e_i^{v} + e_i^{t}$$
where $e_u^{*}$ and $e_i^{*}$ are respectively the final representation of user $u$ and the final representation of item $i$, $\tilde{p}_u$ is the user multi-modal preference given by the user multi-modal preference feature set, and $e_i^{v}$ and $e_i^{t}$ are the item single-modality features output by the user-item bipartite graphs of the corresponding modalities;
feeding the final representation of user $u$ and the final representation of item $i$ into the preset prediction model, and computing a prediction score with the preset inner-product function to obtain the degree of interest of user $u$ in item $i$, where the calculation formula is:
$$\hat{y}_{ui} = {e_u^{*}}^{\top} e_i^{*}$$
where $\hat{y}_{ui}$ is the degree of interest of user $u$ in item $i$ output by the preset prediction model;
sorting the items in descending order of the degree of interest of user $u$ in each item $i$ to complete Top-K recommendation for the user, thereby obtaining the primary model of the MRIH-MPF model.
Specifically, given the item multi-relation feature set obtained through the above steps and the user multi-modal preference feature set obtained through user multi-modal preference fusion, the final item and user representations are expressed as follows:
$$e_u^{*} = \tilde{p}_u, \qquad e_i^{*} = e_i^{v} + e_i^{t}$$
where $e_u^{*}$ and $e_i^{*}$ are respectively the final representation of user $u$ and the final representation of item $i$, $\tilde{p}_u$ is the user multi-modal preference given by the user multi-modal preference feature set, and $e_i^{v}$ and $e_i^{t}$ are the item single-modality features output by the user-item bipartite graphs of the corresponding modalities. Finally, the final representations of the user and the item are fed into the preset prediction model to compute the prediction score. The calculation formula is as follows:
$$\hat{y}_{ui} = {e_u^{*}}^{\top} e_i^{*}$$
In this embodiment, the inner-product function is used as the prediction function. The inner product is a common way to compute similarity; here the inner product of the user multi-modal preference representation and the item multi-modal representation measures the correlation between the two. Finally, the model output $\hat{y}_{ui}$, the degree of interest of user $u$ in item $i$, is sorted in descending order, and the Top-K items are recommended to the user.
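The scoring and ranking step can be sketched in Python as follows. The function names and the dictionary layout are hypothetical; the item representation is formed as the element-wise sum of its single-modality features, and the score is the inner product with the user's multi-modal preference vector, as described above.

```python
import heapq


def item_representation(visual, text):
    """Final item representation: element-wise sum of the single-modality features."""
    return [a + b for a, b in zip(visual, text)]


def recommend(user_vec, item_vecs, k):
    """Score every item by inner product with the user vector and return the Top-K.

    user_vec: final multi-modal preference vector of one user.
    item_vecs: dict item id -> final item representation.
    Returns a list of (item id, score) pairs in descending score order.
    """
    scores = {
        item: sum(u * x for u, x in zip(user_vec, vec))
        for item, vec in item_vecs.items()
    }
    return heapq.nlargest(k, scores.items(), key=lambda pair: pair[1])
```

For large catalogs the exhaustive scoring loop would normally be replaced by a matrix product or an approximate nearest-neighbor index, but the ranking criterion stays the same.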
S150: train the primary model of the MRIH-MPF model on a pre-acquired user-item interaction training set and item multi-modal training set to obtain the MRIH-MPF model.
Specifically, the items in the user-item interaction training set and the item multi-modal training set are of the same kind as the items in the user-item interaction data and item multi-modal features of step S110. For example, when the MRIH-MPF model is used for movie recommendation, the training data are user-movie interaction data.
In the invention, multi-relation item heterogeneous graphs, user-item bipartite graphs, and a double-layer attention mechanism are introduced into graph neural network based multi-modal recommendation, inter-item edge relation mining and user multi-modal preference representation modeling are studied, and the MRIH-MPF model is designed and implemented. Starting from the program data of an IPTV platform and fusing multi-modal information from third-party network platforms, an IPTV multi-modal movie dataset can be constructed; then, based on the MRIH-MPF model and combined with Web development technology using the Django framework (an open-source Web application framework), personalized IPTV multi-modal movie recommendation is realized.
Model training is the core of model construction in the present invention. First, the parameters required by the model are determined and initialized in a specified manner; the loss function is determined according to the nature of the problem, and a suitable optimizer is selected to optimize the model. In this embodiment, BPR loss is used as the loss function. BPR (Bayesian Personalized Ranking) is a common algorithm for handling implicit feedback in collaborative filtering recommendation. Recommendation scenarios contain a large amount of implicit feedback data, that is, data in which the user gives no explicit rating to an item. This raises a negative-sampling problem: a sampled negative item is not necessarily one the user dislikes. To address this problem, unlike algorithms such as SVD (singular value decomposition), which optimize the recommendation model by pointwise learning, BPR models user-item triplets $(u, i, j)$, where $u$ denotes a user, and $i$ and $j$ denote an item the user has interacted with and an item the user has not interacted with, respectively. BPR optimizes the model's recommendation results toward convergence on the assumption that user $u$'s preference for the interacted item $i$ is greater than that for the non-interacted item $j$. The loss function of the present invention is therefore defined as:

$$\mathcal{L} = \sum_{(u,i,j)} -\ln \sigma\left(\hat{y}_{ui} - \hat{y}_{uj}\right) + \lambda\left(\lVert \Theta_u \rVert_2^2 + \lVert \Theta_i \rVert_2^2\right)$$

wherein $\sigma$ is the activation function (the sigmoid in standard BPR) and $\lambda$ is the regularization parameter; $\hat{y}_{ui}$ and $\hat{y}_{uj}$ denote the calculated matching degree of user $u$ with item $i$ and with item $j$, respectively; and $\Theta_u$ and $\Theta_i$ denote the model parameters related to the user embeddings and the item embeddings, respectively.
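A BPR-style loss of the kind described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: the sigmoid is assumed as the activation function, the regularization term is assumed to be a squared L2 norm over a flat list of parameters, and the names `log_sigmoid` and `bpr_loss` are hypothetical.

```python
import math
from typing import Sequence, Tuple


def log_sigmoid(x: float) -> float:
    """Numerically stable ln(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def bpr_loss(score_pairs: Sequence[Tuple[float, float]],
             params: Sequence[float],
             reg: float) -> float:
    """BPR loss over sampled triplets.

    score_pairs holds, for each (user, interacted item, non-interacted item)
    triplet, the pair (score of interacted item, score of non-interacted item).
    params is a flat list of embedding parameters used for L2 regularization
    (an assumption about how the regularization term is formed).
    """
    ranking_term = -sum(log_sigmoid(pos - neg) for pos, neg in score_pairs)
    l2_term = reg * sum(p * p for p in params)
    return ranking_term + l2_term


if __name__ == "__main__":
    # The loss shrinks as the interacted item is scored further above the other.
    print(bpr_loss([(2.0, 0.0)], params=[0.1, -0.2], reg=0.01))
    print(bpr_loss([(0.0, 2.0)], params=[0.1, -0.2], reg=0.01))
```

Because the loss depends only on score differences, it optimizes the relative ordering of items for a user rather than absolute score values.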
The parameters are optimized during model training with the Adam algorithm, an efficient stochastic optimization method that requires only first-order gradients. The algorithm uses estimates of the first and second moments of the gradient to dynamically compute individual adaptive learning rates for different parameters.
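For reference, a single Adam update can be sketched as below. The hyperparameter defaults (`lr`, `beta1`, `beta2`, `eps`) are the commonly used values and are assumptions here; the patent does not state which values were used, and the function name `adam_step` is hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def adam_step(params: Sequence[float], grads: Sequence[float],
              m: Sequence[float], v: Sequence[float], t: int,
              lr: float = 1e-3, beta1: float = 0.9, beta2: float = 0.999,
              eps: float = 1e-8) -> Tuple[List[float], List[float], List[float]]:
    """Apply one Adam update; t is the 1-based step count."""
    new_params: List[float] = []
    new_m: List[float] = []
    new_v: List[float] = []
    for p, g, m_i, v_i in zip(params, grads, m, v):
        m_i = beta1 * m_i + (1.0 - beta1) * g        # first-moment estimate
        v_i = beta2 * v_i + (1.0 - beta2) * g * g    # second-moment estimate
        m_hat = m_i / (1.0 - beta1 ** t)             # bias correction
        v_hat = v_i / (1.0 - beta2 ** t)
        # Each parameter gets its own effective step size lr / sqrt(v_hat).
        new_params.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_params, new_m, new_v


if __name__ == "__main__":
    # Minimize f(x) = x^2 (gradient 2x) for a few steps as a toy example.
    x, m, v = [1.0], [0.0], [0.0]
    for step in range(1, 11):
        x, m, v = adam_step(x, [2.0 * x[0]], m, v, step, lr=0.01)
    print(x[0])
```

In practice the optimizer of a deep learning framework would be used; the sketch only shows how the two moment estimates produce a per-parameter step size.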
To prevent overfitting during training, this embodiment controls the number of training epochs by early stopping. First, the total number of training epochs is set; to keep the training time from becoming too long, an early-stopping step count is also set. In each epoch, the recommendation quality of the model is evaluated on the validation set with the corresponding metrics, and while the model keeps improving, the best-performing model parameters are saved. When the metric on the validation set declines for that number of consecutive epochs, or the total number of training epochs is reached, training stops, and the best parameters are applied to the test set to obtain the model's test-set recommendation results.
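The early-stopping procedure can be sketched as a small training loop. The callbacks `train_one_epoch` and `evaluate`, and the parameter names `max_epochs` and `patience`, are hypothetical placeholders. The sketch also assumes that "declining" means failing to exceed the best validation metric seen so far, which is one common reading of the rule.

```python
from typing import Any, Callable, Tuple


def train_with_early_stopping(train_one_epoch: Callable[[int], Any],
                              evaluate: Callable[[Any], float],
                              max_epochs: int,
                              patience: int) -> Tuple[Any, float, int]:
    """Return (best model state, best validation metric, epochs actually run)."""
    best_metric = float("-inf")
    best_state = None
    epochs_without_gain = 0
    epochs_run = 0
    for epoch in range(1, max_epochs + 1):
        state = train_one_epoch(epoch)      # one pass over the training set
        metric = evaluate(state)            # validation metric, higher is better
        epochs_run = epoch
        if metric > best_metric:
            # Improvement: remember these parameters and reset the counter.
            best_metric, best_state, epochs_without_gain = metric, state, 0
        else:
            epochs_without_gain += 1
            if epochs_without_gain >= patience:
                break                       # stop before max_epochs is reached
    return best_state, best_metric, epochs_run


if __name__ == "__main__":
    # Toy run: the "state" is just the epoch number and metrics are precomputed.
    metrics = [0.10, 0.30, 0.20, 0.25, 0.20, 0.50]
    print(train_with_early_stopping(lambda e: e, lambda s: metrics[s - 1],
                                    max_epochs=6, patience=3))
```

The returned best state is what would then be applied to the test set.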
As an optional embodiment of the invention, in the process of performing model training on a primary model of the MRIH-MPF model through a pre-acquired user project interaction data training set and a project multi-modal training set to obtain the MRIH-MPF model:
the MRIH-MPF model is used as an IPTV film multi-mode recommendation model;
the user project interaction data training set is user behavior data, program data and user data acquired through an IPTV system;
the project multi-modal training set is an IPTV film multi-modal data set constructed by available multi-modal resources obtained through crawling on the public network platform.
Specifically, in this embodiment, the IPTV dataset consists mainly of two parts: an IPTV movie interaction dataset and an IPTV movie multi-modal dataset. The interaction dataset comprises user behavior data, program data, user data and the like collected by the IPTV system. The IPTV movie multi-modal dataset is constructed in this embodiment by crawling available multi-modal resources from public network platforms. The specific acquisition flow of the user-item interaction data training set is shown in fig. 7.
In this embodiment, a search keyword is first generated for each movie from the IPTV movie text data, in the form [movie title, movie director, release time]. The search keyword of each movie is then entered into the Douban movie search bar by a crawler tool; the corresponding movie is opened from the search results, and the first movie cover on its poster page is saved as the poster image. Some movies are not retrieved because the release times recorded on the IPTV platform and on Douban differ slightly; for these movies, this embodiment shortens the keyword by removing the release year and crawls again, so that poster information is finally obtained for all movies for subsequent use.
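The keyword fallback described above (search with the full keyword first, then retry without the release year) can be sketched as follows. No real crawler or site API is used: `search` is a hypothetical callback standing in for the crawler tool, and the function names and sample data are assumptions made for illustration.

```python
from typing import Callable, List, Optional


def build_keywords(title: str, director: str, release_time: str) -> List[str]:
    """Return the full keyword followed by the reduced keyword without the year."""
    full = " ".join([title, director, release_time])
    reduced = " ".join([title, director])
    return [full, reduced]


def find_poster(title: str, director: str, release_time: str,
                search: Callable[[str], Optional[str]]) -> Optional[str]:
    """Try each keyword in turn and return the first poster location found."""
    for query in build_keywords(title, director, release_time):
        poster = search(query)
        if poster is not None:
            return poster
    return None  # still not found after the fallback


if __name__ == "__main__":
    # A fake search index keyed by query string replaces the real crawler here.
    fake_index = {"Movie A Director B": "posters/movie_a.jpg"}
    print(find_poster("Movie A", "Director B", "2020", fake_index.get))
```

Trying the more specific keyword first keeps false matches low, while the reduced keyword recovers movies whose release dates differ between the two sources.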
The raw multi-modal data is fed into the pre-trained model of the corresponding modality to extract that modality's pre-trained features. For text data, this embodiment merges and tokenizes the movie title, movie synopsis and movie tags, and obtains the text-modality pre-trained features through a Sentence-BERT pre-trained model; for image data, the visual-modality pre-trained features are obtained by inputting the movie posters into a ResNet-50 model pre-trained on an image dataset. Finally, the extracted pre-trained features of each modality are stored to obtain the final IPTV multi-modal data for model training.
Because the IPTV source data involves a large number of users and is very large in scale, this embodiment selects historical data of the IPTV platform within a certain time range to reduce the time needed to construct the dataset, and then determines the range of IPTV programs covered by the dataset. The data selected in this embodiment are single-episode on-demand programs on the IPTV platform. When screening the data source, this embodiment uses the Hive data warehouse, a big-data technology, and filters out the on-demand data of the specified program types within the specified time period through Hive SQL. Construction of the IPTV interaction dataset mainly involves two techniques: data cleaning and negative sampling. Data cleaning mainly cleans the interaction data and the program text data. Negative sampling is the process of drawing negative samples from the IPTV user interaction data.
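The negative-sampling step can be illustrated with the sketch below. The patent does not specify the sampling strategy, so uniform random sampling from the items a user has not interacted with is an assumption, as are the function name `sample_negatives` and the data layout.

```python
import random
from typing import Dict, List, Set, Tuple


def sample_negatives(interactions: Dict[str, Set[str]],
                     all_items: List[str],
                     num_neg: int,
                     seed: int = 0) -> List[Tuple[str, str, str]]:
    """Build (user, interacted item, sampled non-interacted item) triplets."""
    rng = random.Random(seed)  # fixed seed keeps the dataset reproducible
    triplets: List[Tuple[str, str, str]] = []
    for user, positives in interactions.items():
        candidates = [item for item in all_items if item not in positives]
        if not candidates:
            continue  # the user has interacted with every item
        for pos in sorted(positives):
            for _ in range(num_neg):
                triplets.append((user, pos, rng.choice(candidates)))
    return triplets


if __name__ == "__main__":
    logs = {"u1": {"a", "b"}, "u2": {"c"}}
    for triplet in sample_negatives(logs, ["a", "b", "c", "d"], num_neg=2):
        print(triplet)
```

The resulting triplets have the shape expected by a pairwise ranking loss such as BPR.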
As an optional embodiment of the invention, after model training is performed on the primary model of the MRIH-MPF model through the pre-acquired user project interaction data training set and the project multi-mode training set, the method further comprises the steps of:
performing accuracy test on the MRIH-MPF model through a preset test set;
after the MRIH-MPF model passes the accuracy test, the MRIH-MPF model is stored into a callable interface, a recommended system instance is developed by utilizing a Web development technology, and the MRIH-MPF model is used as a main algorithm for realizing recommendation at the back end of a Web site.
Specifically, the accuracy of the model is tested on a test set. When the MRIH-MPF model is used for movie recommendation, a recommendation system as shown in fig. 8 can be built in combination with Web development technology. To ensure recommendation accuracy, the system evaluates the model in a Top-K recommendation scenario, selects appropriate evaluation metrics, and performs comparative evaluation from the following two perspectives:
(1) Under the public data set, comparing the recommendation model with a baseline model to prove the effectiveness of a recommendation system;
(2) And under the IPTV data set, comparing the recommendation model with a baseline model to prove the practicability of the recommendation system.
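The text does not name the evaluation metrics used in the Top-K scenario. As an illustration only, the sketch below assumes Recall@K and NDCG@K with binary relevance, two metrics commonly used for Top-K recommendation; the function names are hypothetical.

```python
import math
from typing import List, Set


def recall_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant items that appear in the first k recommendations."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant)


def ndcg_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """Normalized discounted cumulative gain with binary relevance."""
    dcg = sum(1.0 / math.log2(pos + 2)
              for pos, item in enumerate(ranked[:k]) if item in relevant)
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(pos + 2) for pos in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


if __name__ == "__main__":
    ranking = ["a", "b", "c", "d"]   # model output, best first
    held_out = {"a", "c"}            # items the user interacted with in the test set
    print(recall_at_k(ranking, held_out, k=3), ndcg_at_k(ranking, held_out, k=3))
```

Averaging these values over all test users gives one number per metric for the comparison with a baseline model.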
A visual display module can additionally be provided in the system. This module packages the evaluated model as a callable interface; a recommendation system instance is developed with Web development technology, with the MRIH-MPF model serving as the main recommendation algorithm at the back end of the Web site, thereby realizing and visualizing a personalized multi-modal recommendation system for IPTV movie programs.
S160, generating a target item recommendation result for the target user through an MRIH-MPF model according to the acquired target user item interaction data and the target item multi-modal data.
Specifically, the target user is the user for whom recommendations are to be generated, and the target item is an item recommended to that user. For example, if a movie is recommended to user A, user A is the target user and the movie is the target item.
As shown in fig. 9, the present invention provides a multi-modal item recommendation system 200, which may be installed in an electronic device. According to the functions implemented, the multi-modal item recommendation system 200 may include a composition module 210, a feature extraction module 220, a feature fusion module 230, a prediction module 240, a model training module 250, and a recommendation module 260. A module in the present invention, which may also be referred to as a unit, refers to a series of computer program segments that are stored in the memory of the electronic device, can be executed by the processor of the electronic device, and perform a fixed function.
In the present embodiment, the functions concerning the respective modules/units are as follows:
the composition module 210 is configured to construct graphs from user-item interaction data and item multi-modal features to obtain multi-relation item heterogeneous graphs and user-item bipartite graphs under different modalities;
the feature extraction module 220 is configured to perform graph feature extraction on the multi-relation item heterogeneous graphs and the user-item bipartite graphs under different modalities to obtain an item multi-modal representation feature set and a user single-modal preference feature set;
the feature fusion module 230 is configured to fuse the user single-modal preference feature set by means of an attention mechanism to obtain a user multi-modal preference feature set;
the prediction module 240 is configured to input the item multi-modal representation feature set and the user multi-modal preference feature set into a preset prediction model, and perform Top-K recommendation for the user through prediction function calculation to obtain a primary model of the MRIH-MPF model;
the model training module 250 is configured to perform model training on the primary model of the MRIH-MPF model with a pre-acquired user-item interaction data training set and an item multi-modal training set to obtain the MRIH-MPF model;
and the recommendation module 260 is configured to input acquired target user-item interaction data and target item multi-modal data into the MRIH-MPF model to generate a target item recommendation result for the target user.
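The graph convolution performed on a user-item bipartite graph by the feature extraction module can be sketched as follows. The patent does not spell out the propagation rule in this section, so symmetric degree normalization and averaging over layers (in the style of LightGCN) are assumptions, as are the function name `propagate` and the dictionary-based data layout. Initial item embeddings are simply passed in, standing for the features obtained from the heterogeneous graph.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]
Embeddings = Dict[str, Vector]


def propagate(edges: List[Tuple[str, str]], user_emb: Embeddings,
              item_emb: Embeddings, num_layers: int) -> Tuple[Embeddings, Embeddings]:
    """Propagate embeddings over unique (user, item) edges and average all layers."""
    user_deg: Dict[str, int] = {}
    item_deg: Dict[str, int] = {}
    for u, i in edges:
        user_deg[u] = user_deg.get(u, 0) + 1
        item_deg[i] = item_deg.get(i, 0) + 1

    cur_u = {u: list(vec) for u, vec in user_emb.items()}
    cur_i = {i: list(vec) for i, vec in item_emb.items()}
    sum_u = {u: list(vec) for u, vec in user_emb.items()}
    sum_i = {i: list(vec) for i, vec in item_emb.items()}

    for _ in range(num_layers):
        next_u = {u: [0.0] * len(vec) for u, vec in cur_u.items()}
        next_i = {i: [0.0] * len(vec) for i, vec in cur_i.items()}
        for u, i in edges:
            # Symmetric degree normalization (an assumption, LightGCN-style).
            weight = 1.0 / math.sqrt(user_deg[u] * item_deg[i])
            for d, value in enumerate(cur_i[i]):
                next_u[u][d] += weight * value   # user aggregates item neighbors
            for d, value in enumerate(cur_u[u]):
                next_i[i][d] += weight * value   # item aggregates user neighbors
        cur_u, cur_i = next_u, next_i
        for u, vec in cur_u.items():
            sum_u[u] = [a + b for a, b in zip(sum_u[u], vec)]
        for i, vec in cur_i.items():
            sum_i[i] = [a + b for a, b in zip(sum_i[i], vec)]

    scale = 1.0 / (num_layers + 1)  # average the input layer and every propagated layer
    final_u = {u: [x * scale for x in vec] for u, vec in sum_u.items()}
    final_i = {i: [x * scale for x in vec] for i, vec in sum_i.items()}
    return final_u, final_i


if __name__ == "__main__":
    users, items = propagate([("u1", "i1"), ("u1", "i2")],
                             {"u1": [0.0]}, {"i1": [2.0], "i2": [4.0]}, num_layers=1)
    print(users, items)
```

Stacking more layers lets a user embedding absorb information from items several hops away, which is what the text calls high-order interaction information. One such propagation would be run per modality.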
The multi-modal item recommendation system 200 realizes personalized item recommendation by constructing an MRIH-MPF model (a recommendation model based on the fusion of multi-relation item heterogeneous graphs and multi-modal preferences; MRIH-MPF is short for Multi-relation Item Heterogeneous Graph and Multi-modal Preference Fusion Recommendation). Based on the constructed multi-relation item heterogeneous graphs and user-item bipartite graphs, multi-relation heterogeneous item features under different modalities are obtained from the multi-relation item heterogeneous graphs. From the user-item bipartite graphs of the different modalities, user modal preference information injected with high-order interaction information is captured through a multi-layer graph convolution network; attention fusion is then performed on the user's multi-modal preferences, which strengthens the learning of user multi-modal preference representations and ultimately improves the accuracy of the recommendation results. By using multi-modal information and personalized recommendation algorithms, the recommendation system and method address the data sparsity and insufficient use of multi-modal information found in existing recommendation systems, make the recommendation results more accurate and interpretable, and improve the operating effect and the user experience.
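The attention fusion of a user's per-modality preferences can be illustrated with a deliberately simplified sketch. It shows a single soft-attention layer only, not the two-layer mechanism recited in the claims. The scoring function (a dot product with a query vector), the `query` parameter, and the modality names are assumptions made for the example.

```python
import math
from typing import Dict, List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Normalize raw attention scores into weights that sum to one."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_modalities(prefs: Dict[str, Sequence[float]],
                    query: Sequence[float]) -> List[float]:
    """Weight each single-modal preference vector by attention and sum them."""
    names = sorted(prefs)  # fixed order keeps the result deterministic
    scores = [sum(q * x for q, x in zip(query, prefs[name])) for name in names]
    weights = softmax(scores)
    return [sum(w * prefs[name][d] for w, name in zip(weights, names))
            for d in range(len(query))]


if __name__ == "__main__":
    # Hypothetical text and visual preference vectors for one user.
    user_prefs = {"text": [1.0, 0.0], "visual": [0.0, 1.0]}
    print(fuse_modalities(user_prefs, query=[0.0, 0.0]))   # equal weights
    print(fuse_modalities(user_prefs, query=[10.0, 0.0]))  # text dominates
```

In a trained model the query would be a learned parameter, so the weight given to each modality can differ from user to user.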
As shown in fig. 10, the present invention provides an electronic device 3 for implementing the multi-modal item recommendation method.
The electronic device 3 may comprise a processor 30, a memory 31 and a bus, and may further comprise a computer program stored in the memory 31 and executable on said processor 30, such as a multimodal item recommendation program 32. The memory 31 may also include both internal storage units and external storage devices of the multimodal item recommendation system. The memory 31 may be used not only for storing code installed in application software and various types of data such as a multi-modal item recommendation program, but also for temporarily storing data that has been output or is to be output.
The memory 31 includes at least one type of readable storage medium, including flash memory, a mobile hard disk, a multimedia card, a card memory (e.g., SD or DX memory, etc.), a magnetic memory, a magnetic disk, an optical disk, etc. The memory 31 may in some embodiments be an internal storage unit of the electronic device 3, such as a removable hard disk of the electronic device 3. The memory 31 may in other embodiments also be an external storage device of the electronic device 3, such as a plug-in mobile hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card) or the like, which are provided on the electronic device 3. Further, the memory 31 may also include both an internal storage unit and an external storage device of the electronic device 3. The memory 31 may be used not only for storing application software installed in the electronic device 3 and various types of data, such as multi-modal item recommendation method codes, but also for temporarily storing data that has been output or is to be output.
The processor 30 may be comprised of integrated circuits in some embodiments, for example, a single packaged integrated circuit, or may be comprised of multiple integrated circuits packaged with the same or different functions, including one or more central processing units (Central Processing unit, CPU), microprocessors, digital processing chips, graphics processors, combinations of various control chips, and the like. The processor 30 is a Control Unit (Control Unit) of the electronic device, connects respective components of the entire electronic device using various interfaces and lines, and executes various functions of the electronic device 3 and processes data by running or executing programs or modules (e.g., multi-modal item recommendation programs, etc.) stored in the memory 31, and calling data stored in the memory 31.
The bus may be a peripheral component interconnect standard (peripheral component interconnect, PCI) bus or an extended industry standard architecture (extended industry standard architecture, EISA) bus, among others. The bus may be classified as an address bus, a data bus, a control bus, etc. The bus is arranged to enable a connection communication between the memory 31 and at least one processor 30 or the like.
Fig. 10 shows only an electronic device with components, it being understood by a person skilled in the art that the structure shown in fig. 10 does not constitute a limitation of the electronic device 3, and may comprise fewer or more components than shown, or may combine certain components, or may be arranged in different components.
For example, although not shown, the electronic device 3 may further include a power source (such as a battery) for supplying power to the respective components, and preferably, the power source may be logically connected to the at least one processor 30 through a power management system, so as to implement functions of charge management, discharge management, and power consumption management through the power management system. The power supply may also include one or more of any of a direct current or alternating current power supply, a recharging system, a power failure detection circuit, a power converter or inverter, a power status indicator, and the like. The electronic device 3 may further include various sensors, bluetooth modules, wi-Fi modules, etc., which will not be described herein.
Further, the electronic device 3 may also comprise a network interface, optionally comprising a wired interface and/or a wireless interface (e.g. WI-FI interface, bluetooth interface, etc.), typically used for establishing a communication connection between the electronic device 3 and other electronic devices.
The electronic device 3 may optionally further comprise a user interface, which may be a Display, an input unit, such as a Keyboard (Keyboard), or a standard wired interface, a wireless interface. Alternatively, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch, or the like. The display may also be referred to as a display screen or display unit, as appropriate, for displaying information processed in the electronic device 3 and for displaying a visual user interface.
It should be understood that the embodiments described are for illustrative purposes only and are not limited to this configuration within the scope of the application.
The multimodal item recommendation program 32 stored in the memory 31 in the electronic device 3 is a combination of instructions that, when executed in the processor 30, may implement:
S110, constructing graphs from user-item interaction data and item multi-modal features to obtain multi-relation item heterogeneous graphs and user-item bipartite graphs under different modalities;
S120, performing graph feature extraction on the multi-relation item heterogeneous graphs and the user-item bipartite graphs under different modalities to obtain an item multi-modal representation feature set and a user single-modal preference feature set;
S130, fusing the user single-modal preference feature set by means of an attention mechanism to obtain a user multi-modal preference feature set;
S140, inputting the item multi-modal representation feature set and the user multi-modal preference feature set into a preset prediction model, and performing Top-K recommendation for the user through prediction function calculation to obtain a primary model of the MRIH-MPF model;
S150, performing model training on the primary model of the MRIH-MPF model with a pre-acquired user-item interaction data training set and an item multi-modal training set to obtain the MRIH-MPF model;
S160, generating a target item recommendation result for the target user through the MRIH-MPF model according to acquired target user-item interaction data and target item multi-modal data.
Specifically, the specific implementation method of the above instructions by the processor 30 may refer to the description of the relevant steps in the corresponding embodiment of fig. 1, which is not repeated herein.
Further, the modules/units integrated by the electronic device 3 may be stored in a computer readable storage medium if implemented in the form of software functional units and sold or used as a stand alone product. The computer readable medium may include: any entity or system capable of carrying the computer program code, a recording medium, a U disk, a removable hard disk, a magnetic disk, an optical disk, a computer Memory, a Read-Only Memory (ROM).
Embodiments of the present invention also provide a computer readable storage medium, which may be non-volatile or volatile, storing a computer program which when executed by a processor implements:
S110, constructing graphs from user-item interaction data and item multi-modal features to obtain multi-relation item heterogeneous graphs and user-item bipartite graphs under different modalities;
S120, performing graph feature extraction on the multi-relation item heterogeneous graphs and the user-item bipartite graphs under different modalities to obtain an item multi-modal representation feature set and a user single-modal preference feature set;
S130, fusing the user single-modal preference feature set by means of an attention mechanism to obtain a user multi-modal preference feature set;
S140, inputting the item multi-modal representation feature set and the user multi-modal preference feature set into a preset prediction model, and performing Top-K recommendation for the user through prediction function calculation to obtain a primary model of the MRIH-MPF model;
S150, performing model training on the primary model of the MRIH-MPF model with a pre-acquired user-item interaction data training set and an item multi-modal training set to obtain the MRIH-MPF model;
S160, generating a target item recommendation result for the target user through the MRIH-MPF model according to acquired target user-item interaction data and target item multi-modal data.
In particular, the specific implementation method of the computer program when executed by the processor may refer to descriptions of related steps in the multi-modal item recommendation method of the embodiment, which are not described herein in detail.
In the several embodiments provided by the present invention, it should be understood that the disclosed apparatus, system and method may be implemented in other manners. For example, the system embodiments described above are merely illustrative, e.g., the division of the modules is merely a logical function division, and other manners of division may be implemented in practice.
The modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical units, may be located in one place, or may be distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional modules in the embodiments of the present invention may be integrated in one processing unit, each unit may exist physically alone, or two or more units may be integrated in one unit. The integrated units may be implemented in the form of hardware, or in the form of hardware plus software functional modules.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof.
The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.
Furthermore, it is evident that the word "comprising" does not exclude other elements or steps, and that the singular does not exclude a plurality. Multiple units or systems recited in the system claims may also be implemented by one unit or system through software or hardware. Terms such as first and second are used to denote names and do not indicate any particular order.
Finally, it should be noted that the above-mentioned embodiments are merely for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications and equivalents may be made to the technical solution of the present invention without departing from the spirit and scope of the technical solution of the present invention.

Claims (8)

1. A multi-mode project recommending method is characterized by comprising the following steps:
constructing a graph through user project interaction data and project multi-mode features to obtain multi-relationship project different graphs and user-project bipartite graphs under different modes; the method comprises the following steps of: extracting interaction information of the user project interaction data, extracting multi-modal information of the project multi-modal characteristics, and calculating the side relationship of the project and sampling neighbor nodes through the extracted interaction information and the multi-modal information so as to obtain a project graph under each relationship; fusing the modal similar semantic item graphs and the co-occurrence collaborative item graphs under different modalities aiming at the item graphs under each relation to obtain multi-relation item iso-graphs under different modalities; embedding interest preference of user nodes as a construction target according to the user project interaction data, and constructing a corresponding user-project bipartite graph carrying high-order communication information for each mode;
drawing characteristics of the multi-relation project iso-graph and the user-project bipartite graph under different modes are extracted, and a project multi-mode representation characteristic set and a user single-mode preference characteristic set are obtained; the method comprises the following steps of: respectively and sequentially carrying out intra-relationship aggregation and inter-relationship aggregation on the multi-relationship project heterogeneous graphs under each mode in the multi-relationship project heterogeneous graphs under different modes to finish information propagation aggregation of the heterogeneous graphs, and acquiring project heterogeneous graph characteristics corresponding to different modes to obtain a project multi-relationship characteristic set under the multi-modes; inputting the project relation features in the project multi-relation feature set under the multi-mode into a user-project bipartite graph under the corresponding mode to be used as project node initial representation in the bipartite graph, initializing user ID information by adopting normal distribution to obtain user node initial representation in the bipartite graph, and extracting user single-mode preference through graph convolution operation to obtain a user single-mode preference feature set and a project single-mode representation feature set with high-order interaction information; element-level summation is carried out on the item single-mode representation feature set to obtain an item multi-mode representation feature set; wherein the user ID information is from the user item interaction data;
Fusing the user single-mode preference feature sets by adopting an attention mechanism technology to obtain a user multi-mode preference feature set;
inputting the item multi-modal representation feature set and the user multi-modal preference feature set into a preset prediction model, and performing Top-K recommendation on a user through prediction function calculation to obtain a primary model of an MRIH-MPF model;
performing model training on a primary model of the MRIH-MPF model through a pre-acquired user project interaction data training set and a project multi-mode training set to obtain the MRIH-MPF model;
and generating a target item recommendation result for the target user through the MRIH-MPF model according to the acquired target user item interaction data and the target item multi-modal data.
2. The method of claim 1, wherein the fusing the user single-mode preference feature set to obtain the user multi-mode preference feature set by using an attention mechanism technology comprises:
constructing a double-layer attention mechanism, wherein the double-layer attention mechanism comprises a single-user mode preference fusion attention layer and a similar-user multi-mode consistency preference fusion attention layer;
Performing feature cross processing on the user single-mode preference features in the user single-mode preference feature set, inputting the multi-mode cross preference features of each user obtained by feature cross and the user single-mode preference features into the single-user mode preference fusion attention layer to perform personalized single-user multi-mode preference fusion, and obtaining a first user multi-mode preference feature set;
taking each user in the user single-mode preference feature set as a current user respectively, acquiring K similar users most similar to the current user through a co-occurrence method of sampling Top-K co-occurrence times, distributing different attention weights to vectors of first user multi-mode preference features corresponding to the similar users through a soft attention mechanism from the user first multi-mode preference feature set, acquiring an aggregate attention coefficient of the preference features of each similar user through a weight normalization technology, and multiplying a value vector acquired through each similar user by an element level of the aggregate attention coefficient to acquire a second user multi-mode preference feature set;
and for each user, carrying out element summation on the first user multi-mode preference feature set and the second user multi-mode preference feature set to finish residual linking so as to obtain a user multi-mode preference feature set.
3. The multi-modal item recommendation method according to claim 1, wherein the inputting the item multi-modal representation feature set and the user multi-modal preference feature set into a preset prediction model, and performing Top-K recommendation on the user through a prediction function calculation, and obtaining a primary model of an MRIH-MPF model comprises:
and obtaining the representation of the item and the user according to the item multi-relation feature set in the multi-mode and the user multi-mode preference feature set as follows:
$$e_u = p_u, \qquad e_i = \sum_{m} e_i^{m}$$; wherein,
$e_u$ and $e_i$ are the final representation of user $u$ and the final representation of item $i$, respectively; $p_u$ is the user multi-modal preference from the user multi-modal preference feature set; and $e_i^{m}$ is the item single-modal feature output by the user-item bipartite graph under the corresponding modality $m$;
inputting the final representation of the user $u$ and the final representation of the item $i$ into a preset prediction model, and calculating a predicted value with the calculation formula of a preset inner product function to obtain the interest degree value of user $u$ for item $i$; wherein the calculation formula is:
$$\hat{y}_{ui} = e_u^{\top} e_i$$; wherein,
$\hat{y}_{ui}$ is the interest degree value of user $u$ for item $i$;
sorting the items in descending order of the interest degree value of user $u$ for item $i$, and completing Top-K recommendation for the user to obtain the primary model of the MRIH-MPF model.
4. The multi-modal item recommendation method according to claim 1, wherein, in the process of training the primary model of the MRIH-MPF model with the pre-acquired user-item interaction data training set and item multi-modal training set to obtain the MRIH-MPF model:
the MRIH-MPF model serves as an IPTV film multi-modal recommendation model;
the user-item interaction data training set consists of user behavior data, program data, and user data collected through an IPTV system;
the item multi-modal training set is an IPTV film multi-modal data set constructed from available multi-modal resources crawled from public network platforms.
5. The multi-modal item recommendation method of claim 4, further comprising, after training the primary model of the MRIH-MPF model with the pre-acquired user-item interaction data training set and item multi-modal training set to obtain the MRIH-MPF model:
performing an accuracy test on the MRIH-MPF model using a preset test set;
after the MRIH-MPF model passes the accuracy test, saving the MRIH-MPF model as a callable interface, developing a recommender system instance using Web development technology, and using the MRIH-MPF model as the main recommendation algorithm at the back end of the Web site.
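The claim does not name the metric used for the accuracy test. As one possible illustration, the sketch below computes Recall@K, a metric commonly used for Top-K recommendation. The choice of metric, the function name, and the sample data are all assumptions and are not taken from the patent.

```python
from typing import Sequence, Set


def recall_at_k(recommended: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the user's held-out relevant items found in the top k."""
    if not relevant:
        return 0.0
    hits = len(set(recommended[:k]) & relevant)
    return hits / len(relevant)


# Hypothetical ranked list for one user and that user's held-out test items.
ranked = ["a", "b", "c"]
held_out = {"a", "c", "d"}
score = recall_at_k(ranked, held_out, k=2)
print(round(score, 4))
```

With k=2 only "a" is recovered out of three held-out items, so the score is one third. A model would pass the test when such a score, averaged over test users, meets whatever threshold the deployment sets.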
6. A multi-modal item recommendation system, the system comprising:
a graph construction module, configured to construct graphs from user-item interaction data and item multi-modal features to obtain multi-relation item heterogeneous graphs and user-item bipartite graphs under different modalities; wherein the graph construction module comprises:
an information extraction unit, configured to extract interaction information from the user-item interaction data and multi-modal information from the item multi-modal features, and to calculate edge relationships between items and sample neighbor nodes using the extracted interaction information and multi-modal information, thereby obtaining an item graph under each relation;
a fusion unit, configured to fuse, from the item graphs under each relation, the modality-similarity semantic item graph and the co-occurrence collaborative item graph under each modality to obtain the multi-relation item heterogeneous graphs under different modalities;
a graph construction unit, configured to construct, for each modality and according to the user-item interaction data, a corresponding modality-level user-item bipartite graph carrying high-order connectivity information, with the embedding of user-node interest preferences as the construction target;
a feature extraction module, configured to extract graph features from the multi-relation item heterogeneous graphs and the user-item bipartite graphs under different modalities to obtain an item multi-modal representation feature set and a user single-modal preference feature set; wherein the feature extraction module comprises:
an aggregation unit, configured to perform intra-relation aggregation and then inter-relation aggregation on the multi-relation item heterogeneous graph under each modality, so as to complete information propagation and aggregation over the heterogeneous graph, obtain item heterogeneous graph features corresponding to the different modalities, and thereby obtain an item multi-relation feature set under multiple modalities;
a preference extraction unit, configured to input the item relation features in the item multi-relation feature set under multiple modalities into the user-item bipartite graph under the corresponding modality as the initial representations of the item nodes in the bipartite graph, initialize user ID information with a normal distribution to obtain the initial representations of the user nodes in the bipartite graph, and extract user single-modal preferences through graph convolution operations to obtain a user single-modal preference feature set and an item single-modal representation feature set carrying high-order interaction information;
a summing unit, configured to perform element-wise summation on the item single-modal representation feature set to obtain the item multi-modal representation feature set; wherein the user ID information is derived from the user-item interaction data;
a feature fusion module, configured to fuse the user single-modal preference feature sets using an attention mechanism to obtain a user multi-modal preference feature set;
a prediction module, configured to input the item multi-modal representation feature set and the user multi-modal preference feature set into a preset prediction model and perform Top-K recommendation for the user through prediction function calculation to obtain a primary model of the MRIH-MPF model;
a model training module, configured to train the primary model of the MRIH-MPF model with a pre-acquired user-item interaction data training set and an item multi-modal training set to obtain the MRIH-MPF model;
and a recommendation module, configured to input acquired target user-item interaction data and target item multi-modal data into the MRIH-MPF model to generate a target item recommendation result for the target user.
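Two of the units above lend themselves to a short sketch: the summing unit (element-wise sum of an item's single-modal features) and the feature fusion module (attention over a user's single-modal preferences). The claim only states that an attention mechanism is used, so the softmax-over-dot-product scoring, the query vector (which would be learned in practice), the modality names, and all function names below are assumptions made for illustration.

```python
import math
from typing import Dict, List, Sequence


def sum_modalities(features: Dict[str, Sequence[float]]) -> List[float]:
    """Element-wise sum of one item's single-modal feature vectors."""
    vectors = list(features.values())
    return [sum(values) for values in zip(*vectors)]


def attention_fuse(prefs: Dict[str, Sequence[float]],
                   query: Sequence[float]) -> List[float]:
    """Fuse one user's single-modal preferences with softmax attention weights."""
    names = list(prefs)
    # Score each modality by the dot product with the (hypothetical) query vector.
    scores = [sum(q * x for q, x in zip(query, prefs[n])) for n in names]
    # Numerically stable softmax over the modality scores.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of the single-modal preference vectors.
    dim = len(query)
    return [sum(w * prefs[n][d] for w, n in zip(weights, names))
            for d in range(dim)]


# Hypothetical two-modality example.
item_repr = sum_modalities({"visual": [1.0, 2.0], "text": [3.0, 4.0]})
user_repr = attention_fuse({"visual": [1.0, 0.0], "text": [0.0, 1.0]},
                           query=[0.0, 0.0])
print(item_repr)  # [4.0, 6.0]
print(user_repr)  # [0.5, 0.5]
```

With an all-zero query both modalities receive equal weight, so the fused preference is the plain average; a non-zero query shifts weight toward the modality that aligns with it.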
7. An electronic device, the electronic device comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor, the instructions, when executed, enabling the at least one processor to perform the steps of the multi-modal item recommendation method of any one of claims 1 to 5.
8. A computer-readable storage medium storing at least one instruction, wherein the at least one instruction, when executed by a processor in an electronic device, implements the multi-modal item recommendation method of any one of claims 1 to 5.
CN202310834248.7A 2023-07-10 2023-07-10 Multi-mode project recommendation method, system and device and storage medium Active CN116561446B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310834248.7A CN116561446B (en) 2023-07-10 2023-07-10 Multi-mode project recommendation method, system and device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310834248.7A CN116561446B (en) 2023-07-10 2023-07-10 Multi-mode project recommendation method, system and device and storage medium

Publications (2)

Publication Number Publication Date
CN116561446A CN116561446A (en) 2023-08-08
CN116561446B true CN116561446B (en) 2023-10-20

Family

ID=87486546

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310834248.7A Active CN116561446B (en) 2023-07-10 2023-07-10 Multi-mode project recommendation method, system and device and storage medium

Country Status (1)

Country Link
CN (1) CN116561446B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111382309A (en) * 2020-03-10 2020-07-07 Shenzhen University Short video recommendation method based on graph model, intelligent terminal and storage medium
CN113191154A (en) * 2021-03-04 2021-07-30 Zhejiang Normal University Semantic analysis method, system and storage medium based on multi-modal graph neural network
CN114090848A (en) * 2021-10-25 2022-02-25 Alibaba (China) Co., Ltd. Data recommendation and classification method, feature fusion model and electronic equipment
CN114357201A (en) * 2022-03-10 2022-04-15 Communication University of China Audio-visual recommendation method and system based on information perception
CN114969532A (en) * 2022-06-01 2022-08-30 Central South University Multi-modal traffic recommendation method based on heterogeneous graph neural network
CN115878904A (en) * 2023-02-22 2023-03-31 Shenzhen Haotong Technology Co., Ltd. Intellectual property personalized recommendation method, system and medium based on deep learning

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11100400B2 (en) * 2018-02-15 2021-08-24 Adobe Inc. Generating visually-aware item recommendations using a personalized preference ranking network
US20230106416A1 (en) * 2021-10-05 2023-04-06 Microsoft Technology Licensing, Llc Graph-based labeling of heterogenous digital content items

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
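BKGNN-TI: A Bilinear Knowledge-Aware Graph Neural Network Fusing Text Information for Recommendation; Yang Zhang et al.; International Journal of Computational Intelligence Systems; pp. 1-20 *
Is the suggested food your desired?: Multi-modal recipe recommendation with demand-based knowledge graph; Zhenfeng Lei et al.; Expert Systems With Applications; pp. 1-14 *
Application of personalized recommendation systems in IPTV systems (个性化推荐系统在IPTV系统中的应用); Yang Jianfeng et al.; Radio and Television Network (广播电视网络); pp. 103-107 *
Research on drug interaction prediction algorithms based on multi-modal graph neural networks (基于多模态图神经网络的药物相互作用预测算法研究); Zhang Yuanyuan; China Master's Theses Full-text Database, Medicine and Health Sciences; Chapter 3 *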

Also Published As

Publication number Publication date
CN116561446A (en) 2023-08-08

Similar Documents

Publication Publication Date Title
CN111382309B (en) Short video recommendation method based on graph model, intelligent terminal and storage medium
CN111581510B (en) Shared content processing method, device, computer equipment and storage medium
Tang et al. Tri-clustered tensor completion for social-aware image tag refinement
CN112836120A (en) Multi-mode knowledge graph-based movie recommendation method, system and terminal
CN112163165A (en) Information recommendation method, device, equipment and computer readable storage medium
US20220171760A1 (en) Data processing method and apparatus, computer-readable storage medium, and electronic device
CN112733027B (en) Hybrid recommendation method based on local and global representation model joint learning
US20230316379A1 (en) Deep learning based visual compatibility prediction for bundle recommendations
Li et al. Product innovation concept generation based on deep learning and Kansei engineering
CN113297370B (en) End-to-end multi-modal question-answering method and system based on multi-interaction attention
Bai et al. Discriminative latent semantic graph for video captioning
Guo et al. Attention based consistent semantic learning for micro-video scene recognition
CN113326384A (en) Construction method of interpretable recommendation model based on knowledge graph
CN115964560A (en) Information recommendation method and equipment based on multi-mode pre-training model
Zhu et al. Deep learning for video-text retrieval: a review
CN115640449A (en) Media object recommendation method and device, computer equipment and storage medium
Liu et al. Scanning, attention, and reasoning multimodal content for sentiment analysis
CN116561446B (en) Multi-mode project recommendation method, system and device and storage medium
Ye et al. Human action recognition method based on Motion Excitation and Temporal Aggregation module
Ren et al. A co-attention based multi-modal fusion network for review helpfulness prediction
CN116932862A (en) Cold start object recommendation method, cold start object recommendation device, computer equipment and storage medium
Deng et al. Similitude attentive relation network for click-through rate prediction
CN117252665B (en) Service recommendation method and device, electronic equipment and storage medium
Anand et al. KEMM: A Knowledge Enhanced Multitask Model for Travel Recommendation
Cui et al. Transformer-based difference fusion network for RGB-D salient object detection

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant