CN116431914A - Cross-domain recommendation method and system based on personalized preference transfer model - Google Patents

Cross-domain recommendation method and system based on personalized preference transfer model

Info

Publication number
CN116431914A
CN116431914A (application CN202310389375.0A)
Authority
CN
China
Prior art keywords
user
item
target
domain
users
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310389375.0A
Other languages
Chinese (zh)
Inventor
许嘉
王歆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangxi University
Original Assignee
Guangxi University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangxi University filed Critical Guangxi University
Priority claimed from CN202310389375.0A
Publication of CN116431914A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9535Search customisation based on user profiles and personalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/088Non-supervised learning, e.g. competitive learning
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Databases & Information Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a cross-domain recommendation method and system based on a personalized preference transfer model. The method constructs a personalized preference transfer model that first builds a multi-relation graph from user-item interactions, basic information, and the corresponding comments, and computes user and item embeddings in the source domain and the target domain from the information in the graph. It then identifies, in the source domain, users similar to each target user according to the learned user representations, and learns a personalized preference transfer function from the individual features of the target user and the common features of the similar users. Finally, user preferences are transferred with the learned personalized preference transfer function, and the predicted ratings of the transferred user embedding on items in the target domain are computed, realizing personalized recommendation for source-domain users in the target domain. User-item interactions, basic information, and the corresponding comments are collected and fed into the personalized preference transfer model to produce personalized recommendations. The invention provides better recommendation performance for cross-domain recommendation.

Description

Cross-domain recommendation method and system based on personalized preference transfer model
Technical Field
The invention relates to the field of recommendation systems, in particular to a cross-domain recommendation method and system based on a personalized preference transfer model.
Background
Recommendation systems are widely deployed on the Internet; by recommending personalized items to users, they greatly help alleviate the information overload problem users face. Conventional recommendation systems apply collaborative filtering (CF) and matrix factorization (MF) techniques to historical user-item interactions (e.g., ratings, purchases, or clicks) to generate recommendations, and their accuracy is largely determined by the number of user-item interactions. However, for new users the interaction data is very sparse, which leads to the so-called user cold-start problem.
To alleviate the user cold-start problem, many recent efforts have achieved good results. These works either learn the preferences of cold-start users from auxiliary information (e.g., user profiles and item attributes) or apply meta-learning to adapt a globally shared prior (or preference) to each cold-start user's personalized prior based on that user's sparse interaction data. Cross-domain recommendation (CDR) is another important technical route for alleviating the user cold-start problem; it aims to transfer useful knowledge from a source domain (or auxiliary domain) to a target domain and has recently received increasing attention from academia and industry. CDR is more challenging than approaches based on auxiliary information or meta-learning because it must reasonably solve two core issues, namely "what to transfer" and "how to transfer".
To address the first core problem, classical machine learning methods such as factorization machines (FM) and latent semantic analysis (LSA) were first used to learn useful knowledge in the auxiliary domain (such as user preferences), which is then transferred to the target domain to enhance the prediction of target-domain user preferences. Thanks to the rapid development of deep learning, many deep-learning-based methods have been proposed in recent years to solve the first core problem; these methods are more powerful than classical machine learning methods and have further advanced CDR. To solve the second core problem, many works use information about overlapping users (or items) to establish a connection between the two domains; when there is no overlapping object between the two domains, the connection is established by capturing tag correlations or by active learning.
Although current research has made significant progress, it still suffers from the following shortcomings:
First, user-user and item-item relationships are neglected when computing the representations of users and items. For the first core problem of CDR, namely "what to transfer", the task of most concern is learning representations of users and items in each domain, because the quality of this information plays a vital role in the subsequent stages. Existing work computes representations of users and items based only on explicitly available user-item interactions, while implicit user-user relationships (determined from user profiles and comment information) and item-item relationships (obtained by analyzing item descriptions) are typically ignored. Recent studies have shown that user-user and item-item relationships are very helpful for capturing user and item features comprehensively and accurately, especially for users and items with few interactions. For example, FIG. 1 illustrates three heterogeneous relationships in the movie domain, namely user-item, user-user, and item-item relationships. As shown in FIG. 1, a cold-start user $u_i$ has no interactions with any movie; if only the user-item relationship is considered, the personalized recommendation for this user will be poor. However, when user-user and item-item relationships are considered, user $u_i$'s rating of movie $m_j$ can be predicted much better. Specifically, users $u_2$ and $u_4$, who have preferences similar to $u_i$, gave the movies $m_1$ ("Titanic") and $m_2$ ("About Time") the highest rating of 5, and $m_j$ ("Your Name") is very similar to $m_1$ and $m_2$ in terms of their labels (e.g., romance). In summary, neglecting user-user and item-item relationships prevents most recent studies from modeling the representations of users and items accurately enough, which negatively affects the effectiveness of personalized recommendation in the target domain.
Second, when transferring user preferences, the individual features and the common features of users are not considered simultaneously. For the second core problem of CDR, namely "how to transfer", one of the main tasks is to transfer user preferences between the two connected domains. To transfer user preferences, some research assumes that all users' preferences have the same relationship between the source and target domains and learns a common user preference transfer function for all users, as shown in FIG. 2(a). However, because of users' complex personal characteristics, such a general transfer function may degrade CDR performance. To overcome this drawback, Zhu et al. proposed personalized preference transfer functions generated from each user's individual interactions in the source domain (i.e., $f_1, \ldots, f_n$), as shown in FIG. 2(b). However, the effectiveness of this approach depends largely on the number of user-item interactions, which are very scarce in user cold-start scenarios. In this case, according to user-based collaborative filtering theory, common features extracted from similar users can be applied to improve the learned personalized transfer function. Taking FIG. 2(c) as an example, user $u_j$ has few historical interactions with items, so the individual features encoded from $u_j$'s interactions are inadequate. In this case, besides $u_j$'s individual features, a group of users with interests similar to $u_j$ (e.g., $G_1=\{u_1, u_2, \ldots\}$) can help improve the learned personalized transfer function $f_j$. Because existing CDR methods cannot use both the individual and the common features of a user (collectively referred to herein as transferable features) to learn the user's personalized preference transfer function, the recommendations they produce are less than satisfactory.
Accordingly, there is a need for an effective method to solve the above-mentioned problems.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: in view of the technical problems in the prior art, the invention provides a cross-domain recommendation method and system based on a personalized preference transfer model. To address the first problem, high-quality representations of the users and items to be transferred are first computed based on the heterogeneous explicit and implicit relationships between users and items. Then, the individual features of a user and the common features of similar users are considered simultaneously, so that a more effective personalized preference transfer function between the two domains is learned for the user, effectively alleviating the second problem. Finally, based on each user's learned personalized preference transfer function, the user embedding transferred to the target domain is computed, realizing personalized recommendation for each user in the target domain and providing better recommendation performance for cross-domain recommendation.
In order to solve the technical problems, the technical scheme provided by the invention is as follows:
a cross-domain recommendation method based on a personalized preference transfer model comprises the following steps:
s1) constructing a personalized preference transfer model, wherein the personalized preference transfer model firstly constructs a multi-relation graph according to interaction, basic information and corresponding comments of users and items, and calculates embedding of all users and items in a source domain and a target domain respectively according to information of the multi-relation graph; then according to the learned user representation, identifying similar users for each target user in a source domain by adopting a soft clustering method, and learning the personalized preference transfer function of the target user by using the individual characteristics of the target user and the common characteristics of the similar users; finally, calculating user embedding transferred from the source domain to the target domain according to the learned personalized preference transfer function of each user, and calculating the prediction score of the transferred user embedding on the items in the target domain according to the transferred user embedding so as to realize personalized recommendation of the source domain user in the target domain;
s2) acquiring interaction, basic information and corresponding comments of the users and the items, and inputting a personalized preference transfer model to obtain personalized recommendation of each user.
Further, in step S1, constructing the multi-relation graph from user-item interactions, basic information, and the corresponding comments, and computing the embeddings of all users and items in the source domain and the target domain from the information in the multi-relation graph, specifically includes:
S101) converting the basic information and comments related to each user into a user document vector and the basic information and comments related to each item into an item document vector, computing the similarity probability between each pair of user document vectors, and computing the similarity probability between each pair of item document vectors; then collecting all users and items as nodes of the multi-relation graph, generating the corresponding user-user heterogeneous edges in the multi-relation graph according to the similarity probability between each pair of users, generating the corresponding item-item heterogeneous edges according to the similarity probability between each pair of items, and generating the corresponding user-item edges according to the historical interactions between users and items;
s102) according to an embedding strategy facing heterogeneous relations, taking a user-user relation or an item-item relation in the multi-relation diagram as a similar relation, taking the user-item relation in the multi-relation diagram as an interactive relation, and measuring the distance d between two nodes in a potential vector space by using Euclidean distance for each similar relation sr (n i ,n j ) And calculate d sr (n i ,n j ) Loss function at minimum
Figure BDA0004175299180000031
All interaction relationships are also modeled as translations d between nodes in the potential vector space using an explicit translation-based approach ir (n p ,n q ) And calculate d ir (n p ,n q ) Loss function at minimum->
Figure BDA0004175299180000032
Finally, the loss function is->
Figure BDA0004175299180000033
And loss function->
Figure BDA0004175299180000034
Together minimized, an embedding matrix for the user or item is obtained.
Further, in step S101, generating the corresponding user-user heterogeneous edges in the multi-relation graph according to the similarity probability between each pair of users, generating the corresponding item-item heterogeneous edges according to the similarity probability between each pair of items, and generating the corresponding user-item edges according to the historical interactions between users and items, includes:
if the similarity probability between a pair of users or between a pair of items is greater than a preset threshold, generating the corresponding user-user or item-item heterogeneous edge in the multi-relation graph and using that similarity probability as the weight of the edge;
if there is a historical interaction between a user and an item, generating the corresponding user-item edge in the multi-relation graph, and using the quotient of the user's historical rating of the item and the maximum value of the preset rating matrix as the weight of the user-item edge.
Further, in step S1, identifying similar users for each target user in the source domain with a soft clustering method according to the learned user representations, and learning the target user's personalized preference transfer function from the individual features of the target user and the common features of the similar users, specifically includes:
S201) grouping all users in the source domain into K clusters, computing the probability of assigning each user embedding to the k-th cluster, then establishing a target distribution Y to guide the unsupervised clustering loss function $\mathcal{L}_{cl}$ and further update the cluster centers; multiplying the target user's assignment probability for each cluster by that cluster's center and summing to obtain the common feature of the target user;
S202) obtaining the target user's list of interacted items in the source domain, computing the attention score of each item in the list with an attention network, normalizing the scores to obtain the weight of each item, and summing the embeddings of the items in the target user's interaction list weighted by the corresponding weights to obtain the individual feature of the target user;
S203) concatenating the common feature and the individual feature of the target user to obtain the target user's transferable feature, learning the target user's personalized preference transfer function from the source domain to the target domain from the transferable feature with a preset neural network, and generating the target user's transferred representation in the target domain according to the personalized preference transfer function.
Further, in step S201, establishing the target distribution Y to guide the unsupervised clustering loss function $\mathcal{L}_{cl}$ and further updating the cluster centers includes:

constructing a soft-assignment matrix $S \in \mathbb{R}^{N \times K}$ that records the soft-assignment information of the K clusters of the source domain; the target distribution Y is defined as

$$Y_{i,k} = \frac{S_{i,k}^{2} / \sum_{i} S_{i,k}}{\sum_{k'} \left( S_{i,k'}^{2} / \sum_{i} S_{i,k'} \right)}$$

wherein $\sum_{i} S_{i,k}$ is the soft cluster frequency of the k-th cluster center $o_k$, and $S_{i,k}$ denotes the probability of assigning user $u_i$ to the k-th cluster $o_k$;

defining the clustering loss function $\mathcal{L}_{cl}$ with the KL divergence between the soft-assignment matrix S and the target distribution Y, so that the soft assignment approaches the target distribution:

$$\mathcal{L}_{cl} = \mathrm{KL}(Y \,\|\, S) = \sum_{i} \sum_{k} Y_{i,k} \log \frac{Y_{i,k}}{S_{i,k}}$$

in each iteration, the set of cluster centers $O = [o_1, o_2, \ldots, o_K]$ is updated by stochastic gradient descent; the gradient of $\mathcal{L}_{cl}$ with respect to each cluster center $o_k$ is computed as

$$\frac{\partial \mathcal{L}_{cl}}{\partial o_k} = \frac{\beta+1}{\beta} \sum_{i} \left(1 + \frac{\| e_{u_i} - o_k \|^{2}}{\beta}\right)^{-1} \left(S_{i,k} - Y_{i,k}\right)\left(e_{u_i} - o_k\right)$$

wherein $\beta$ denotes the degree of freedom of the Student's t-distribution and $e_{u_i}$ denotes the embedding of user $u_i$; the k-th cluster center is then updated as

$$o_k \leftarrow o_k - \omega \, \frac{\partial \mathcal{L}_{cl}}{\partial o_k}$$

wherein $\omega$ is the learning rate controlling the update speed of the cluster centers.
Further, the attention score of each item in the interaction list in step S202 is expressed as

$$a_j = g(e_{v_j}; \eta)$$

wherein the function $g(\cdot; \eta)$ is the attention network, $\eta$ is a learnable parameter, and $e_{v_j}$ denotes the embedding of item $v_j$;

the weight of each item is expressed as

$$a'_j = \frac{\exp(a_j)}{\sum_{v_l \in B_{u_i}} \exp(a_l)}$$

wherein $B_{u_i}$ denotes the interaction item list of user $u_i$ in the source domain and $v_l$ denotes the l-th item in user $u_i$'s interaction item list.
Further, in step S203, learning the target user's personalized preference transfer function from the source domain to the target domain from the transferable feature with a preset neural network, and then generating the target user's transferred representation in the target domain according to the personalized preference transfer function, includes:

learning the target user's personalized preference transfer function from the source domain to the target domain from the transferable feature with a preset neural network:

$$w_{u_i} = h\big(t_{u_i}; \phi\big)$$

wherein the vector $w_{u_i}$ contains the parameters of the personalized preference transfer function, the function $h(\cdot; \phi)$ is a two-layer neural network parameterized by $\phi$, and $t_{u_i}$ denotes the transferable feature of the target user;

reshaping the vector $w_{u_i}$ into a matrix $M_{u_i} \in \mathbb{R}^{d_e \times d_e}$ to fit the size of the parameters of the preference transfer function, where $d_e$ denotes the embedding dimension of the user;

with the matrix $M_{u_i}$ as its parameters, using target user $u_i$'s personalized preference transfer function to generate target user $u_i$'s transferred representation in the target domain:

$$\hat{e}^{\,t}_{u_i} = M_{u_i}\, e^{\,s}_{u_i}$$

wherein $e^{\,s}_{u_i}$ denotes the representation of target user $u_i$ in the source domain.
Further, in step S1, computing the predicted ratings of items in the target domain from the transferred user embedding specifically includes: performing a dot product between the target user's transferred representation in the target domain and the representation of each item in the target domain to obtain the target user's predicted rating for each item in the target domain, and generating the target user's personalized recommendation according to the predicted ratings of the items.
Further, after obtaining the target user's predicted rating for each item in the target domain, the method further includes: training the personalized preference transfer function in a task-based optimization manner, minimizing the loss between each predicted rating and the corresponding ground-truth rating.
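As a concrete illustration of the prediction and task-based optimization steps above, the following minimal Python sketch computes dot-product predicted ratings and back-propagates a rating loss; the toy data, the stand-in transferred embedding, and the choice of MSE as the loss are assumptions of this sketch, not fixed by the text.

```python
# Sketch of the personalized recommendation and task-based optimization steps:
# predicted ratings are dot products between the transferred user embedding and
# target-domain item embeddings; the loss against observed ratings is minimized.
import torch

d_e, n_items = 16, 10
e_user_t = torch.randn(d_e, requires_grad=True)       # transferred user embedding (stand-in)
item_emb_t = torch.randn(n_items, d_e)                # target-domain item embeddings

pred = item_emb_t @ e_user_t                          # dot-product predicted ratings
true = torch.randint(1, 6, (n_items,)).float()        # observed ratings (toy data)

loss = torch.nn.functional.mse_loss(pred, true)       # assumed rating loss
loss.backward()                                       # gradients flow back to the transfer model
```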
The invention also provides a cross-domain recommendation system based on the personalized preference transfer model, comprising a computer device programmed or configured to execute any of the above cross-domain recommendation methods based on the personalized preference transfer model.
Compared with the prior art, the invention has the advantages that:
the invention firstly excavates hidden user-user and project-project relations according to basic information and corresponding comments, builds a heterogeneous information network comprising user and project nodes and three types of edges (user-project, user-user and project-project) in each domain as a multi-relation diagram, and calculates more representative user and project embedding of a source domain and a target domain based on information in the multi-relation diagram, thereby generating rich representation of the user and the project based on the diagram, and effectively solving the limitation of neglecting the user-user relation and the project-project relation when calculating the representation of the user and the project in the prior art;
The invention considers the individual characteristics of the user and the common characteristics of the similar users, adopts a soft clustering method to identify the similar users for each target user, simultaneously uses the individual characteristics of the user and the common characteristics of the similar users to learn the personalized preference transfer function of the user, learns a more effective personalized preference transfer function between two fields for the user, and effectively solves the limitation that the personalized characteristics and the common characteristics of the user are not considered at the same time when the user preference is transferred in the prior art.
Drawings
FIG. 1 is a schematic diagram of inferring user preferences for items by considering three heterogeneous relationships.
FIG. 2 is a schematic diagram showing three existing user preference transfer paradigms.
FIG. 3 is a flow chart of an embodiment of the present invention.
FIG. 4 is a schematic diagram of a personalized preference transfer model in an embodiment of the invention.
FIG. 5 is a graph showing the performance impact of parameter settings of a personalized preference transfer model in an embodiment of the invention.
FIG. 6 is a graph comparing the visualization test results of the personalized preference transfer model and existing models in an embodiment of the present invention.
Detailed Description
The invention is further described below in connection with the drawings and the specific preferred embodiments, but the scope of protection of the invention is not limited thereby.
Before describing the specific embodiments of the present solution, the related concepts will be described:
user cold start problem: conventional recommendation systems use CF technology to learn user preferences based on historical interactions of users, and thus cannot adequately handle cold-start users with sparse interactions in the system. To alleviate the cold start problem of users, most existing solutions propose to directly exploit auxiliary information, such as enriching the user and item representations by their profile and item attributes. Recent studies use HIN to model heterogeneous relationships between different objects and use high-level auxiliary information of users or items captured from HIN to further solve the user's cold start problem. Meanwhile, the concept of "learning" i.e. the scenario element learning paradigm, is also used to alleviate the cold start problem of the user. Such paradigms may quickly adjust the meta-learning priors (or preferences) globally shared among all users based on user-scarce interaction data to accommodate the personalized priors of each cold-start user. For example, MAMO and MetaHIN combine user profiles and constructed HINs with meta-learning paradigms, respectively, to alleviate user cold-start problems at the data level and model level. Furthermore, PAML and CBML suggest the use of local priors learned from users of similar tastes in the meta-learning process, rather than global priors, to avoid the negative effects of unrelated users. Cross-domain recommendation (CDR) is another important technique to address the problem of user cold-start. In the CDR solution, knowledge of the information auxiliary domain is transferred to the target domain to enrich the representation of the user or item and improve the recommendation performance of the cold start user. The available surveys show that the CDR is more complex than the aforementioned auxiliary information-based or meta-learning-based strategies, as it needs to take into account various recommendation scenarios (user overlapping or non-overlapping scenarios) and different recommendation tasks (e.g. intra-domain or inter-domain recommendations). Furthermore, the CDR needs to model not only user preferences within a single domain, but also consider transferring user preferences from a source domain to a target domain. Recently, CDRs are becoming increasingly interesting to both academia and industry to address the cold start problem in recommendations.
Cross-domain recommendation: cross-domain recommendations (CDRs) aim to transfer useful knowledge from a source domain (or called auxiliary domain) to a target domain, which has two core problems, namely "what to transfer" and "how to transfer".
To address the first core problem, EMCDR computes the representations of users and items from historical user-item interactions using conventional matrix factorization (MF) and Bayesian personalized ranking (BPR), and these representations are transferred to the target domain in a later stage. However, EMCDR does not use any auxiliary information, resulting in poor embeddings for users and items. RC-DFM uses a variant of the stacked denoising autoencoder (SDAE) to fuse auxiliary information (i.e., comment text and item content) with the user-item rating matrix and generate richer representations for users and items. At the same time, many studies have also utilized inferred auxiliary information, such as user-user or item-item relationships, to further refine the representations of users or items. For example, CDLFM determines user-user relationships through three predefined user similarity measures over user-item interactions and uses these relationships as constraints in the matrix factorization process. DCDIR improves the quality of item representations in the target domain by mining the relationships between items through meta-paths defined on an insurance-product knowledge graph. Recently, Zhu et al. established a heterogeneous graph in their recommendation method (i.e., GA-DTCDR) that captures heterogeneous relationships (i.e., user-item, user-user, and item-item) to enhance the modeling of users and items. However, GA-DTCDR uses a homogeneous embedding technique (i.e., Node2vec) to compute the embeddings of users and items, which cannot distinguish the different relations in the graph and reduces the quality of the computed embedding vectors. Therefore, a heterogeneous graph embedding method needs to be devised to better compute, based on the heterogeneous relationships, the knowledge that has to be transferred from the source domain.
To address the second core problem, information about users or items that overlap between the two domains is typically used to establish a connection between them; when there is no overlapping object between the two domains, the connection is often established by capturing tag correlations or by active learning. This work focuses mainly on the partially user-overlapping scenario, which is very common. In this scenario, previously proposed CDR methods assume that the relationship between user preferences in the source domain and in the target domain is shared by all users, and learn a generic preference transfer function for all users. However, due to users' complex individual characteristics, a generic preference transfer function may degrade CDR performance. To overcome this drawback, CGN uses a cycle generation network to develop personalized bidirectional transfer functions, while PTUPCDR uses a meta network to generate a personalized transfer function for each user. However, these efforts train their models using only each user's historical behavior (referred to herein as individual features), which makes their recommendations less than satisfactory when dealing with cold-start users. In the cold-start user scenario, common features from a group of users with similar preferences can significantly improve the quality of the personalized transfer function learned for a user. However, existing CDR methods do not apply both the individual and the common features of the user to learn the user's personalized preference transfer function, which makes their recommendation results less accurate.
Example 1
As shown in fig. 3, the present embodiment proposes a cross-domain recommendation method based on a personalized preference transfer model, which includes the following steps:
s1) constructing a personalized preference transfer model, wherein the personalized preference transfer model is shown in FIG. 4, firstly constructing a multi-relation graph according to interaction, basic information and corresponding comments of users and projects, and calculating embedding of all users and projects in a source domain and a target domain respectively according to the information of the multi-relation graph; then according to the learned user representation, identifying similar users for each target user in a source domain by adopting a soft clustering method, and learning the personalized preference transfer function of the target user by using the individual characteristics of the target user and the common characteristics of the similar users; finally, calculating user embedding transferred from the source domain to the target domain according to the learned personalized preference transfer function of each user, and calculating the prediction score of the transferred user embedding on the items in the target domain according to the transferred user embedding so as to realize personalized recommendation of the source domain user in the target domain;
s2) acquiring interaction, basic information and corresponding comments of the users and the items, and inputting a personalized preference transfer model to obtain personalized recommendation of each user.
Before describing the specific steps, we first formulate our problem and then define the key concepts related to it.
In this embodiment, we denote by $U=\{u_1, u_2, u_3, \ldots, u_N\}$ the set of users and by $V=\{v_1, v_2, v_3, \ldots, v_M\}$ the set of items. In addition, the rating matrix between users and items is denoted $R$, where $r_{i,j} \in R$ represents the rating of user $u_i$ on item $v_j$. Note that there are two domains in each CDR task; we distinguish them with the superscripts $\{s, t\}$. In the CDR scenario with partially overlapping users, some users have interactions in both domains while the items of the two domains are disjoint. We therefore call $U^{o} = U^{s} \cap U^{t}$ the overlapping users, where $|U^{o}| \ll |U^{s}|$ or $|U^{t}|$. The goal of this embodiment is to predict the rating $\hat{r}^{\,t}_{i,j}$ of user $u_i$ on item $v_j$ in the target domain.
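The following minimal Python sketch (with toy, hypothetical data and illustrative names) shows the notation above in concrete form: the two domains' rating data, their user and item sets, and the overlapping users $U^o = U^s \cap U^t$.

```python
# Minimal sketch of the problem formulation (illustrative toy data).
# Each domain has its own user set, item set and sparse rating data;
# overlapping users U^o are those appearing in both domains.
ratings_s = {("u1", "i1"): 5.0, ("u1", "i2"): 3.0, ("u2", "i1"): 4.0}   # source-domain ratings r^s_{i,j}
ratings_t = {("u1", "j1"): 2.0, ("u3", "j2"): 5.0}                      # target-domain ratings r^t_{i,j}

def users_items(ratings):
    users, items = set(), set()
    for (u, v) in ratings:
        users.add(u)
        items.add(v)
    return users, items

U_s, V_s = users_items(ratings_s)
U_t, V_t = users_items(ratings_t)

U_o = U_s & U_t                      # overlapping users U^o = U^s ∩ U^t
assert V_s.isdisjoint(V_t)           # the items of the two domains are disjoint
print("overlapping users:", U_o)     # {'u1'}
```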
To obtain high-quality user and item embeddings, explicit user-item relations and implicit user-user and item-item relations are derived from the multi-relation graph constructed in the proposed personalized preference transfer model (HCCDR). It should be noted that the multi-relation graph is a heterogeneous information network containing three distinct relations (user-item, user-user, item-item), and each relation in the graph is represented in this embodiment as a node-relation triplet. A method is then proposed to compute the embedding vectors of users and items based on the information in the multi-relation graph. We define the relevant key concepts as follows.
Definition 1 (multi-relation graph): a multi-relation graph is denoted $G=(V, E)$, where $n \in V$ is a node and $e \in E$ is an edge. Each multi-relation graph is associated with a node-type mapping function $\phi: V \to \mathcal{T}_V$ and an edge-type mapping function $\psi: E \to \mathcal{T}_E$. $\mathcal{T}_V$ and $\mathcal{T}_E$ denote the sets of node types and edge types respectively, satisfying $|\mathcal{T}_V| + |\mathcal{T}_E| > 2$.
Definition 2 (node-relation triplet): in the multi-relation graph, each relation is represented by a node-relation triplet of the form $\langle n_i, e, n_j \rangle$, indicating that a relation (i.e., edge) $e \in E$ connects two nodes $n_i, n_j \in V$; each node-relation triplet satisfies $\langle n_i, e, n_j \rangle \in P$, where $P$ is the set of all node-relation triplets.
Definition 3 (heterogeneous information network embedding): given a multi-relation graph $G=(V, E)$, a mapping function $f: V \to \mathbb{R}^{d}$, where $d \ll |V|$, is trained to project each node $n \in V$ into a low-dimensional vector space.
We summarize the key symbols and their descriptions in Table 1.
Table 1: Key symbols and their descriptions.
In this embodiment, the framework of the personalized preference transfer model constructed in step S1 is shown in FIG. 4 and consists of three parts: a heterogeneous latent factor modeling (HLFM) component, a cluster-enhanced preference transfer (CPT) component, and a personalized recommendation (PR) component. In FIG. 4, $u_i$ is one of the users overlapping between the source domain and the target domain, and $v_j$ is an item in the target domain. Specifically, the heterogeneous latent factor modeling component is responsible for computing the embeddings of users and items in each domain. The cluster-enhanced preference transfer component then considers both the common and the individual features of a user and learns a personalized preference transfer function for each user. Finally, the personalized recommendation component realizes personalized recommendation in the target domain according to the user preferences transferred from the source domain to the target domain. Therefore, in step S1 of this embodiment, the process of constructing the personalized preference transfer model includes: building the heterogeneous latent factor modeling component, building the cluster-enhanced preference transfer component, and building the personalized recommendation component.
For the heterogeneous latent factor modeling component: as mentioned above, besides the explicit user-item relations generated from users' historical interactions with items, implicit user-user and item-item relations play an important role in improving the representations of users and items. Taking these heterogeneous relations into account, we design a heterogeneous latent factor modeling (HLFM) component to learn the representations of users and items. In HLFM, the heterogeneous relations are first identified. Then, a multi-relation graph is built for each domain, containing two types of nodes (user and item nodes) and three types of relation edges (user-item, user-user, and item-item). Finally, we design a heterogeneous-relation-oriented embedding method that computes an enhanced representation of each user (or item) in the form of an embedding matrix. To realize the above functions, HLFM mainly consists of two modules: a multi-relation graph construction module and a heterogeneous-relation-oriented embedding module.
For the multi-relation graph construction module: in general, users (or items) are interrelated. Relationships between users (or items) can be identified from their basic profiles, such as the same educational background (or manufacturer), and their associated comments, which reflect the interests of a user or the characteristics of an item. Herein, $c_i$ (respectively $c_{N+j}$) denotes the document that stores the basic information and comments related to user $u_i$, where $1 \le i \le N$ (respectively item $v_j$, where $1 \le j \le M$). In this embodiment, all documents in $C = \{c_1, c_2, \ldots, c_{N+M}\}$ are segmented into words with the popular natural language tool Stanford CoreNLP. The document vectors of all users (i.e., $D_U = [d_{u_i}]_{N \times d_c}$) and of all items (i.e., $D_V = [d_{v_j}]_{M \times d_c}$) are computed with the widely used Doc2vec, where $d_c$ is the dimension of a document vector. Based on the derived document vectors of all users, i.e., $D_U$, the similarity probability between a pair of users $\langle u_i, u_j\rangle$ is computed as

$$sim(u_i, u_j) = T\big(S(d_{u_i}, d_{u_j}),\, \alpha\big) = \begin{cases} S(d_{u_i}, d_{u_j}), & \text{if } S(d_{u_i}, d_{u_j}) \ge \alpha \\ 0, & \text{otherwise} \end{cases} \tag{1}$$

wherein $S(d_{u_i}, d_{u_j})$ is the normalized cosine similarity between $u_i$'s document vector (i.e., $d_{u_i} \in D_U$) and $u_j$'s document vector (i.e., $d_{u_j} \in D_U$), and $\alpha$ is a hyper-parameter controlling the similarity threshold: if the value of $S(\cdot,\cdot)$ exceeds the threshold $\alpha$, the output of $T$ equals its input; otherwise the output of $T$ is set to 0. Similarly, the similarity probability between a pair of items can be computed.
Then, all users and items are collected as nodes, and three types of heterogeneous edges (i.e., user-item, user-user, and item-item) are generated to construct the multi-relation graph. Specifically, if there is a historical user-item interaction, a user-item edge is generated and its weight is computed as $r_{i,j}/\max(R)$, where $r_{i,j}$ is the rating user $u_i$ gave to item $v_j$ and $\max(R)$ is the maximum value in the rating matrix $R$. If the similarity probability of a user-user (or item-item) pair is greater than 0, the corresponding edge is generated and its weight is set to that similarity probability.
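A minimal Python sketch of the multi-relation graph construction described above; the random document vectors stand in for Doc2vec output (the embodiment uses Stanford CoreNLP and Doc2vec), the rescaling of the cosine similarity to [0, 1] is an assumption, and all variable names are illustrative.

```python
# Sketch of the multi-relation graph construction (step S101, Eq. (1)).
# D_U and D_V stand in for the Doc2vec document vectors of users and items.
import numpy as np

rng = np.random.default_rng(0)
N, M, d_c = 4, 5, 8                          # numbers of users/items, document-vector dimension
D_U = rng.normal(size=(N, d_c))              # user document vectors
D_V = rng.normal(size=(M, d_c))              # item document vectors
R = rng.integers(0, 6, size=(N, M)).astype(float)   # toy rating matrix, 0 = no interaction

def sim_prob(a, b, alpha=0.5):
    """Eq. (1): cosine similarity rescaled to [0, 1] and thresholded by alpha."""
    s = float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
    s = (s + 1.0) / 2.0                      # one possible normalization (assumption)
    return s if s >= alpha else 0.0

edges = []                                   # (node_a, node_b, relation type, weight)
for i in range(N):                           # user-user similarity edges
    for j in range(i + 1, N):
        w = sim_prob(D_U[i], D_U[j])
        if w > 0:
            edges.append((f"u{i}", f"u{j}", "user-user", w))
for p in range(M):                           # item-item similarity edges
    for q in range(p + 1, M):
        w = sim_prob(D_V[p], D_V[q])
        if w > 0:
            edges.append((f"v{p}", f"v{q}", "item-item", w))
for i in range(N):                           # user-item interaction edges
    for j in range(M):
        if R[i, j] > 0:
            edges.append((f"u{i}", f"v{j}", "user-item", R[i, j] / R.max()))
```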
For the heterogeneous-relation-oriented embedding module: after constructing the multi-relation graph, we apply a heterogeneous-relation-oriented embedding (HRE) strategy to compute the embedding matrices of users (or items) based on the graph. The multi-relation graph G contains three types of heterogeneous relations (i.e., user-item, user-user, and item-item), so the embedding strategy should take their different semantics into account. To facilitate subsequent modeling, we first regard a user-user (or item-item) relation as a similarity relation (SR), because it reflects the similarity of two users (or items) in terms of their attributes, while a user-item relation is regarded as an interaction relation (IR), representing a user-item interaction. In this embodiment, the HRE strategy handles the similarity relations (SRs) and the interaction relations (IRs) in the multi-relation graph with different approaches, to better capture their semantic differences.
For SRs, HRE ensures that the two similar users (or items) participating in the relation are mapped close to each other in the latent vector space. HRE uses the Euclidean distance to measure the distance between two nodes in the latent vector space. Given a node-relation triplet $\langle n_i, e, n_j \rangle$ whose relation $e$ is an SR, the distance between nodes $n_i$ and $n_j$ in the latent vector space is computed as

$$d_{sr}(n_i, n_j) = w_{i,j}\, \| h_{n_i} - h_{n_j} \|_2^2 \tag{2}$$

wherein $w_{i,j}$ is the weight of edge $e$ (or the relation), and $h_{n_i}, h_{n_j}$ are the d-dimensional embedding vectors of nodes $n_i$ and $n_j$ respectively. To ensure that the two nodes are close to each other in the latent vector space, we use a margin-based loss function to minimize $d_{sr}(n_i, n_j)$:

$$\mathcal{L}_{sr} = \sum_{\langle n_i, e, n_j\rangle \in P_{SR}} \; \sum_{\langle n'_i, e, n'_j\rangle \in P'_{SR}} \max\!\big(0,\; \gamma + d_{sr}(n_i, n_j) - d_{sr}(n'_i, n'_j)\big) \tag{3}$$

where $\gamma$ is a margin hyper-parameter; we set $\gamma = 1$. $P_{SR}$ is the set of node-relation triplets belonging to SRs, and $P'_{SR}$ is a negative-sample set of node-relation triplets for the SRs.

Since IRs convey the interaction information between user nodes and item nodes, HRE uses an explicit translation-based method to model IRs as translations between nodes in the latent vector space. Formally, given a node-relation triplet $\langle n_p, r, n_q \rangle$ whose relation $r$ is an IR, the distance between nodes $n_p$ and $n_q$ is defined as

$$d_{ir}(n_p, n_q) = w_{p,q}\, \| h_{n_p} + X_r - h_{n_q} \|_2^2 \tag{4}$$

wherein $w_{p,q}$ is the weight of relation $r$, $h_{n_p}, h_{n_q}$ are the embeddings of nodes $n_p$ and $n_q$ respectively, and $X_r$ is the embedding of relation $r$. Equation (4) penalizes the deviation of $(h_{n_p} + X_r)$ from $h_{n_q}$. Likewise, a margin-based loss function is defined to ensure the translation of the two nodes in the low-dimensional vector space:

$$\mathcal{L}_{ir} = \sum_{\langle n_p, r, n_q\rangle \in P_{IR}} \; \sum_{\langle n'_p, r, n'_q\rangle \in P'_{IR}} \max\!\big(0,\; \gamma + d_{ir}(n_p, n_q) - d_{ir}(n'_p, n'_q)\big) \tag{5}$$

wherein $P_{IR}$ is the set of node-relation triplets belonging to IRs and $P'_{IR}$ is a negative-sample set whose node-relation triplets do not belong to IRs.

Finally, the two loss functions are minimized jointly, as shown below, producing the embedding matrix of users (i.e., $E_U = [e_{u_i}]_{N \times d_e}$) and of items (i.e., $E_V = [e_{v_j}]_{M \times d_e}$), where $d_e$ is the dimension of the embedding vector of a user (or item):

$$\mathcal{L}_{emb} = \mathcal{L}_{sr} + \mathcal{L}_{ir} \tag{6}$$
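A minimal PyTorch sketch of the HRE objective in equations (2)-(6); the toy positive and corrupted (negative) triplets, the single user-item relation embedding, and the sampling scheme are assumptions of this sketch, not prescribed by the text.

```python
# Sketch of the HRE objective (Eqs. (2)-(6)): a weighted Euclidean distance with a
# margin loss for similarity relations (SR) and a translation-based distance with a
# margin loss for interaction relations (IR), minimized jointly.
import torch

n_nodes, d = 20, 16
H = torch.nn.Parameter(torch.randn(n_nodes, d))        # node embeddings h_n
X_r = torch.nn.Parameter(torch.randn(d))               # embedding of the user-item relation
gamma = 1.0                                            # margin hyper-parameter

def d_sr(i, j, w):                                     # Eq. (2)
    return w * ((H[i] - H[j]) ** 2).sum(-1)

def d_ir(p, q, w):                                     # Eq. (4)
    return w * ((H[p] + X_r - H[q]) ** 2).sum(-1)

def margin_loss(pos, neg):                             # Eqs. (3) and (5)
    return torch.clamp(gamma + pos - neg, min=0.0).sum()

# toy positive / corrupted (negative) triplets: (node_a indices, node_b indices, edge weights)
sr_pos = (torch.tensor([0, 1]), torch.tensor([2, 3]), torch.tensor([0.9, 0.7]))
sr_neg = (torch.tensor([0, 1]), torch.tensor([5, 6]), torch.tensor([0.9, 0.7]))
ir_pos = (torch.tensor([0, 2]), torch.tensor([10, 11]), torch.tensor([1.0, 0.8]))
ir_neg = (torch.tensor([0, 2]), torch.tensor([12, 13]), torch.tensor([1.0, 0.8]))

loss = margin_loss(d_sr(*sr_pos), d_sr(*sr_neg)) + \
       margin_loss(d_ir(*ir_pos), d_ir(*ir_neg))       # Eq. (6)
loss.backward()                                        # an SGD step on H and X_r would follow
```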
Therefore, based on the heterogeneous latent factor modeling component constructed in this embodiment, in step S1 the personalized preference transfer model constructs a multi-relation graph (user-item, user-user, and item-item relations) from user-item interactions, basic information, and the corresponding comments, and computes the embeddings of all users and items in the source domain and the target domain respectively from the information in the multi-relation graph, including the following steps:
S101) based on the multi-relation graph construction module, converting the basic information and comments related to each user into a user document vector and the basic information and comments related to each item into an item document vector, computing the similarity probability between each pair of user document vectors, and computing the similarity probability between each pair of item document vectors; then collecting all users and items as nodes of the multi-relation graph, generating the corresponding user-user heterogeneous edges in the multi-relation graph according to the similarity probability between each pair of users, generating the corresponding item-item heterogeneous edges according to the similarity probability between each pair of items, and generating the corresponding user-item edges according to the historical interactions between users and items;
S102) based on the heterogeneous-relation-oriented embedding module and according to the heterogeneous-relation-oriented embedding strategy, treating the user-user and item-item relations in the multi-relation graph as similarity relations and the user-item relations as interaction relations; for each similarity relation, measuring the distance $d_{sr}(n_i, n_j)$ between the two nodes in the latent vector space with the Euclidean distance and computing the loss function $\mathcal{L}_{sr}$ that minimizes $d_{sr}(n_i, n_j)$; modeling each interaction relation, with the explicit translation-based approach, as a translation between nodes in the latent vector space with distance $d_{ir}(n_p, n_q)$ and computing the loss function $\mathcal{L}_{ir}$ that minimizes $d_{ir}(n_p, n_q)$; finally, minimizing the loss functions $\mathcal{L}_{sr}$ and $\mathcal{L}_{ir}$ jointly to obtain the embedding matrices of users and items.
In step S101 of this embodiment, generating the corresponding user-user heterogeneous edges in the multi-relation graph according to the similarity probability between each pair of users, generating the corresponding item-item heterogeneous edges according to the similarity probability between each pair of items, and generating the corresponding user-item edges according to the historical interactions between users and items, includes:
if the similarity probability between a pair of users or between a pair of items is greater than zero, generating the corresponding user-user or item-item heterogeneous edge in the multi-relation graph and using that similarity probability as the weight of the edge;
if there is a historical interaction between a user and an item, generating the corresponding user-item edge in the multi-relation graph, and using the quotient of the user's historical rating of the item and the maximum value of the preset rating matrix as the weight of the user-item edge.
For the cluster-enhanced preference transfer component: we consider providing a personalized transfer function for each user to be important for ensuring CDR performance. The personalized transfer function can be learned from the user's preferences. Although the interactions between a user and items directly reflect the individual characteristics of the user's preferences, the interactions captured from the source domain are insufficient under the user cold-start setting. To better learn personalized transfer functions, a cluster-enhanced preference transfer (CPT) component is proposed, which transfers a target user's preferences from the source domain to the target domain based on the user's individual features and the common features learned from similar users. Specifically, a soft clustering method is first applied in CPT to identify users similar to the target user. From these similar users, the common features of the user's preferences can be derived, which are regarded as an important complement to the individual features. Finally, with these two kinds of user preference features, a high-quality personalized transfer function can be learned for each user. CPT consists of three main modules: the common feature generator, the individual feature generator, and the user preference transfer module.
For a common feature generator (CCG) module, a soft clustering method is employed in this embodiment to identify user groups with similar preferences, and then common features of users are generated from the clustering results. Note that soft clustering, also known as fuzzy clustering, allows each user to be assigned to several classes simultaneously with different probability of assignment.
Specifically, the CCG module groups all users in the source domain into K clusters, and the k-th cluster is identified by its cluster center $o_k$. Let the randomly initialized $O = [o_1, o_2, \ldots, o_K]$ denote the set of K cluster centers. Then, the Student's t-distribution is employed to compute the probability $s_{i,k}$ of assigning user $u_i$'s embedding $e_{u_i} \in E_U$ to the k-th cluster $o_k$; see formula (7):

$$s_{i,k} = \frac{\big(1 + \|e_{u_i} - o_k\|^{2} / \beta\big)^{-\frac{\beta+1}{2}}}{\sum_{k'} \big(1 + \|e_{u_i} - o_{k'}\|^{2} / \beta\big)^{-\frac{\beta+1}{2}}} \tag{7}$$

where $\beta$ is the degree of freedom of the Student's t-distribution and is usually set to 1.
Similar to the DEC method, unsupervised learning is used in the CCG module to optimize the clustering loss, denoted $\mathcal{L}_{cl}$. Specifically, we construct a soft-assignment matrix $S \in \mathbb{R}^{N \times K}$ that records the soft-assignment information of the K clusters of the source domain. Then, a target distribution Y is established to guide the unsupervised clustering loss $\mathcal{L}_{cl}$ and to further update the cluster centers. The target distribution Y is defined as

$$Y_{i,k} = \frac{S_{i,k}^{2} / \sum_{i} S_{i,k}}{\sum_{k'} \big( S_{i,k'}^{2} / \sum_{i} S_{i,k'} \big)} \tag{8}$$

wherein $\sum_i S_{i,k}$ is the soft cluster frequency of the k-th cluster $o_k$, and $S_{i,k}$ denotes the probability of assigning user $u_i$ to the k-th cluster $o_k$.

After computing Y, the KL divergence between the soft assignment S and the target distribution Y is used to define the clustering loss function $\mathcal{L}_{cl}$, which pushes the soft assignment toward the target distribution:

$$\mathcal{L}_{cl} = \mathrm{KL}(Y \,\|\, S) = \sum_{i}\sum_{k} Y_{i,k} \log \frac{Y_{i,k}}{S_{i,k}} \tag{9}$$

In each iteration, the set of cluster centers $O = [o_1, o_2, \ldots, o_K]$ is updated by stochastic gradient descent (SGD). The gradient of $\mathcal{L}_{cl}$ with respect to each cluster center $o_k$ is computed as

$$\frac{\partial \mathcal{L}_{cl}}{\partial o_k} = \frac{\beta+1}{\beta} \sum_{i} \Big(1 + \frac{\|e_{u_i} - o_k\|^{2}}{\beta}\Big)^{-1} \big(S_{i,k} - Y_{i,k}\big)\big(e_{u_i} - o_k\big) \tag{10}$$

The k-th cluster center is then updated as

$$o_k \leftarrow o_k - \omega\, \frac{\partial \mathcal{L}_{cl}}{\partial o_k} \tag{11}$$

wherein $\omega$ is the learning rate controlling the update speed of the cluster centers.

Finally, the assignment probability of the user to each cluster is multiplied by that cluster's center, and the results are summed to obtain the common feature of target user $u_i$:

$$e^{com}_{u_i} = \sum_{k=1}^{K} s_{i,k}\, o_k \tag{12}$$

wherein $s_{i,k}$, computed by equation (7), is the probability of assigning user $u_i$ to the k-th cluster $o_k$.
In summary, the CCG module divides all users into K groups according to the user embedding, and generates common features for each user according to the clustering result. The soft clustering used in the CCG module is more efficient than the traditional hard clustering method because it learns the assignment with high confidence with the help of the target distribution Y, iteratively improving the initial assignment result.
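The following PyTorch sketch illustrates the CCG computations in equations (7)-(12) on random stand-in embeddings; treating the target distribution as fixed within an update (the detach calls) and the learning rate value are choices of this sketch, not fixed by the text.

```python
# Sketch of the common feature generator (Eqs. (7)-(12)): Student-t soft assignment,
# DEC-style target distribution, KL clustering loss, and the cluster-weighted
# common feature. Cluster centers and embeddings are random stand-ins.
import torch

N, K, d_e, beta = 50, 4, 16, 1.0
E_U = torch.randn(N, d_e)                                    # user embeddings e_{u_i}
O = torch.nn.Parameter(torch.randn(K, d_e))                  # cluster centers o_k

def soft_assign(E, O):                                       # Eq. (7)
    dist2 = ((E.unsqueeze(1) - O.unsqueeze(0)) ** 2).sum(-1) # N x K squared distances
    q = (1.0 + dist2 / beta).pow(-(beta + 1.0) / 2.0)
    return q / q.sum(dim=1, keepdim=True)

def target_distribution(S):                                  # Eq. (8)
    weight = S ** 2 / S.sum(dim=0, keepdim=True)
    return weight / weight.sum(dim=1, keepdim=True)

S = soft_assign(E_U, O)
Y = target_distribution(S).detach()
loss_cl = (Y * (Y / S).log()).sum()                          # Eq. (9): KL(Y || S)
loss_cl.backward()                                           # gradients w.r.t. O, Eq. (10)
with torch.no_grad():
    O -= 0.01 * O.grad                                       # Eq. (11), learning rate ω = 0.01

e_common = S.detach() @ O.detach()                           # Eq. (12): N x d_e common features
```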
For an individual feature generator (ICG) module, to generate individual features of a user, the individual feature generator (ICG) module first employs an attention mechanism to weight all items interacted with by the user, reflecting the user's preferences in the source domain. Formally, set B ui = { v1, v2,..} is user u in the source domain i Is provided. The attention score for each item is learned by a defined attention network as follows:
Figure BDA0004175299180000148
wherein, the function g (·; η) is the attention network and η is the learnable parameter. Note that in this context, the attention network is a two-layer MLP. Then, calculate item v j Is a normalized attention score of a) j Item v j The weights (or contributions) of (a) are calculated as follows:
Figure BDA0004175299180000151
Finally, the ICG module aggregates the embeddings of all items that user $u_i$ has interacted with (i.e., the items in $B_{u_i}$), weighted by their attention weights, to compute the individual feature of user $u_i$ as follows:
$$e_{u_i}^{ind} = \sum_{v_j \in B_{u_i}} a_j \cdot e_{v_j}$$
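The attention-based aggregation of the ICG module can be sketched in PyTorch as follows; only the two-layer MLP structure is taken from the text, while the class name ItemAttention and the hidden size are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ItemAttention(nn.Module):
    """Two-layer MLP attention over a user's interacted items (illustrative sketch)."""
    def __init__(self, emb_dim: int, hidden_dim: int = 64):
        super().__init__()
        self.g = nn.Sequential(                       # attention network g(.; eta)
            nn.Linear(emb_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, item_embs: torch.Tensor) -> torch.Tensor:
        # item_embs: (n_items, emb_dim) embeddings of the items in B_{u_i}
        scores = self.g(item_embs).squeeze(-1)        # unnormalized scores a_j^*
        weights = torch.softmax(scores, dim=0)        # normalized weights a_j
        return (weights.unsqueeze(-1) * item_embs).sum(dim=0)   # individual feature e^{ind}_{u_i}
```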
For the user preference transfer (UPT) module: to compute the personalized user preference transfer function of user $u_i$, we first concatenate the common feature of user $u_i$ (i.e., $e_{u_i}^{com}$) and the individual feature (i.e., $e_{u_i}^{ind}$) to form the transferable feature of the user, denoted $e_{u_i}^{tra} = e_{u_i}^{com} \oplus e_{u_i}^{ind}$, where $\oplus$ denotes the concatenation operation. Then, the UPT module feeds the transferable feature $e_{u_i}^{tra}$ of user $u_i$ into a meta network to learn the personalized preference transfer function of user $u_i$ from the source domain to the target domain, as follows:
$$w_{u_i} = h(e_{u_i}^{tra}; \phi)$$
where the function $h(\cdot; \phi)$ is a two-layer neural network parameterized by $\phi$, and the vector $w_{u_i}$ contains the parameters of the preference transfer function.
Although the preference transfer function can take any structure, for simplicity we set it to a linear layer $f(\cdot)$ in the UPT module. Note that the resulting vector $w_{u_i}$ is reshaped into a matrix $M_{u_i} \in \mathbb{R}^{d_e \times d_e}$ to match the parameter size of the preference transfer function, where $d_e$ denotes the embedding dimension of the user. Using $M_{u_i}$ as the parameter of the derived personalized preference transfer function of user $u_i$, the transferred representation of user $u_i$ in the target domain is generated as follows:
$$\hat{e}_{u_i}^{t} = f\!\left(e_{u_i}^{s}; M_{u_i}\right)$$
where $e_{u_i}^{s}$ denotes the representation of user $u_i$ in the source domain, and $\hat{e}_{u_i}^{t}$ is the embedding of user $u_i$ transferred from the source domain to the target domain. The derived $\hat{e}_{u_i}^{t}$ can be regarded as the initial embedding of user $u_i$ in the target domain, which alleviates the user cold-start problem.
It follows that the user preference transfer module (UPT) learns the personalized preference transfer function for each user based on their personal and common characteristics, improving the quality of the learned transfer function compared to existing CDR solutions.
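A compact PyTorch sketch of the meta-network idea behind the UPT module follows. The hidden width of 5·d_e matches the experimental setup reported later; the input width of 2·d_e (from concatenating two d_e-dimensional features), the d_e × d_e reshape, and the class name MetaTransfer are assumptions made for illustration only.

```python
import torch
import torch.nn as nn

class MetaTransfer(nn.Module):
    """Generates a personalized linear transfer function from a user's transferable feature (sketch)."""
    def __init__(self, emb_dim: int):
        super().__init__()
        self.emb_dim = emb_dim
        # h(.; phi): two-layer network mapping the 2*d_e transferable feature to d_e*d_e parameters
        self.h = nn.Sequential(
            nn.Linear(2 * emb_dim, 5 * emb_dim),
            nn.ReLU(),
            nn.Linear(5 * emb_dim, emb_dim * emb_dim),
        )

    def forward(self, common_feat, individual_feat, src_user_emb):
        tra = torch.cat([common_feat, individual_feat], dim=-1)      # e^{tra} = e^{com} (+) e^{ind}
        w = self.h(tra)                                              # w_{u_i}: parameters of the transfer function
        m = w.view(-1, self.emb_dim, self.emb_dim)                   # reshape into the matrix M_{u_i}
        # apply the personalized linear transfer f(e^s; M) to obtain the transferred embedding
        return torch.bmm(src_user_emb.unsqueeze(1), m).squeeze(1)
```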
Therefore, based on the cluster-enhanced preference transfer component constructed in this embodiment, in step S1 the personalized preference transfer model identifies similar users for each target user in the source domain by soft clustering according to the learned user representations, and learns the personalized preference transfer function of the target user using the individual characteristics of the target user and the common characteristics of the similar users through the following steps:
S201) based on the common characteristics generator module, grouping all users in the source domain into K clusters, calculating the probability of assigning each user embedding to the k-th cluster, establishing the target distribution Y to guide the unsupervised clustering loss function $\mathcal{L}_{clu}$ and thereby update the cluster centers, and then multiplying the assignment probability of the target user to each cluster by the corresponding cluster center and accumulating the products to obtain the common characteristics of the target user;
S202) based on the individual characteristics generator module, obtaining the interaction item list of the target user in the source domain, calculating the attention score of each item in the interaction item list with an attention network, normalizing the scores to obtain the weight of each item, and weighting the embedding of each item in the target user's interaction item list by its corresponding weight and accumulating to obtain the individual characteristics of the target user;
S203) based on the user preference transfer module, concatenating the common characteristics and the individual characteristics of the target user to obtain the transferable characteristics of the target user, learning the personalized preference transfer function of the target user from the source domain to the target domain from the transferable characteristics using a preset neural network, and then generating the transferred representation of the target user in the target domain according to the personalized preference transfer function of the target user.
For the personalized recommendation component: in this embodiment, for a target user $u_i$ with transferred user representation $\hat{e}_{u_i}^{t}$, the predicted score $\hat{r}_{i,j}$ for item $v_j$ in the target domain is calculated as follows:
$$\hat{r}_{i,j} = \hat{e}_{u_i}^{t} \cdot e_{v_j}^{t}$$
where $e_{v_j}^{t}$ is the representation of item $v_j$ in the target domain, and $\cdot$ denotes the dot product operation.
Since the number of overlapping users between the two domains is typically very limited, training a mapping-based personalized preference transfer function may suffer from over-fitting. Therefore, the cluster-enhanced preference transfer (CPT) component in this embodiment adopts a task-based optimization method to train the personalized preference transfer function, which minimizes the loss between each predicted score and the true score; this increases the number of training samples and avoids over-fitting. The prediction loss $\mathcal{L}_{pre}$ is expressed as:
$$\mathcal{L}_{pre} = \frac{1}{\left|\mathcal{R}^{o}\right|} \sum_{(u_i, v_j) \in \mathcal{R}^{o}} \left(r_{i,j} - \hat{r}_{i,j}\right)^{2}$$
where $\mathcal{R}^{o}$ is the set of ratings given in the target domain by the users overlapping between the two domains.
Finally, the overall loss function of the proposed HCCDR model is defined as the sum of the clustering loss $\mathcal{L}_{clu}$ and the prediction loss $\mathcal{L}_{pre}$:
$$\mathcal{L} = \mathcal{L}_{clu} + \mathcal{L}_{pre}$$
Therefore, based on the personalized recommendation component constructed in this embodiment, the step in S1 of calculating the prediction scores for items in the target domain from the transferred user embedding specifically includes: performing a dot product between the transferred representation of the target user in the target domain and the representation of each item in the target domain to obtain the target user's predicted score for each item, and generating the personalized recommendation for the target user according to these predicted scores.
Further, when the personalized recommendation component is constructed in this embodiment, after the prediction score of the target user for each item in the target domain is obtained, the personalized preference transfer function is further trained in a task-based optimization manner so that the loss between each predicted score and the true score is minimized, which effectively increases the number of training samples and avoids over-fitting.
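The scoring rule and the combined objective can be sketched as follows; the dot-product scoring and the sum of the two losses follow the description above, while the helper names predict_scores and hccdr_loss are our own.

```python
import torch
import torch.nn.functional as F

def predict_scores(transferred_user_emb: torch.Tensor,
                   target_item_embs: torch.Tensor) -> torch.Tensor:
    """Dot product between transferred user embeddings and target-domain item embeddings."""
    # (n_users, d_e) x (d_e, n_items) -> (n_users, n_items) predicted scores
    return transferred_user_emb @ target_item_embs.t()

def hccdr_loss(pred_scores: torch.Tensor,
               true_scores: torch.Tensor,
               cluster_loss: torch.Tensor) -> torch.Tensor:
    """Overall objective: prediction loss (task-based optimization) plus clustering loss."""
    pred_loss = F.mse_loss(pred_scores, true_scores)   # loss between predicted and true ratings
    return pred_loss + cluster_loss
```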
The personalized preference transfer model of the present embodiment is explained below by specific experiments.
1. Experimental setup
To evaluate the effectiveness of the personalized preference transfer model (HCCDR) presented in this example, we used two published datasets well suited for CDR experiments, namely the Douban dataset and the Amazon dataset. The two datasets contain rich items from different domains and provide users' reviews and ratings. Each rating takes a value from 1 to 5, with a higher rating indicating greater user interest in the item. Details of the two datasets are provided below, and Table 2 summarizes their statistics. In Table 2, (s) and (t) denote the source domain and the target domain, respectively.
Douban dataset: In view of its broad user base and genuine user-generated reviews, Douban is one of the most authoritative rating platforms. It provides rich information about users and items, including user profiles, item descriptions, and users' ratings and reviews of items. In our setup, we selected the movie domain and the book domain of Douban to define CDR Task 1: Movie→Book. We filtered out all missing and duplicate data.
Amazon dataset: the amazon dataset is one of the most commonly used datasets for evaluating CDR tasks. It contains nearly 20 fields, and in each field the dataset provides the user's ratings and reviews, product metadata, images, and links. For the Amazon dataset, we have chosen two pairs of fields to define two CDR tasks, namely Sport→closing (task 2) and CD→Game (task 3).
Table 2 Statistics of the two datasets
We evaluated the proposed personalized preference transfer model using two indices, mean Absolute Error (MAE) and Root Mean Square Error (RMSE). These metrics are typically used to evaluate the performance of CDR tasks, where:
the Mean Absolute Error (MAE) is the average of the absolute errors between all predicted values and their corresponding ground truth values, reflecting the actual error of the predicted values. The smaller the MAE value, the closer the predicted value is to its ground truth value.
Root Mean Square Error (RMSE) is used to measure the deviation between all predictions and their corresponding ground truth values. An average of the squares of the errors between the predicted values and ground truth values is calculated, and then RMSE is equal to the square root of the average. Due to squaring operations, RMSE values are typically greater than or equal to MAE values and are very sensitive to outliers. The smaller the RMSE value, the more accurate the predicted value.
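For reference, the two metrics are typically computed as follows (a small sketch with our own function names):

```python
import torch

def mae(pred: torch.Tensor, truth: torch.Tensor) -> torch.Tensor:
    """Mean Absolute Error: average absolute deviation between predictions and ground truth."""
    return (pred - truth).abs().mean()

def rmse(pred: torch.Tensor, truth: torch.Tensor) -> torch.Tensor:
    """Root Mean Square Error: square root of the mean squared deviation."""
    return torch.sqrt(((pred - truth) ** 2).mean())
```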
We compared the proposed personalized preference transfer model with two types of baseline methods: 1) single-domain recommendation methods, including MF and GMF; 2) cross-domain recommendation methods, including CMF, EMCDR, DCDCSR, SSCDR, DCDIR, LACDR, and PTUPCDR. The detailed descriptions of these baselines are as follows:
MF is a classical recommendation method that decomposes a sparse user-item interaction matrix into two low-rank, dense matrices, a user latent factor matrix and an item latent factor matrix, with which a user's rating score for an item can be predicted (a short sketch contrasting MF with GMF is given after the baseline descriptions below).
GMF is a variant of the Neural Collaborative Filtering (NCF) model. When a dot product operation is performed between user embedding and item embedding, it assigns a different weight to each dimension to predict the score.
CMF introduces different side-information matrices to alleviate the cold-start problem. It shares the user-item interaction matrix between the source domain and the target domain and factorizes them jointly. However, CMF ignores the difference between the two domains.
EMCDR is a paradigm of the embedding-and-mapping framework, which trains a mapping function between the source domain and the target domain based on the information of overlapping users. Using the trained mapping function, the embeddings of users who are active in the source domain but have no behavior in the target domain are computed, and personalized recommendations are then made for them in the target domain.
The DCDCSR generates a reference matrix using matrix sparsity information between different domains, and then learns a mapping function between the target domain and the reference domain.
SSCDR considers that, in real life, the proportion of common users or items between two domains is typically small, which results in a poor mapping function. Therefore, SSCDR learns the mapping function from non-overlapping data in a semi-supervised manner to enhance the robustness of the learned function.
The DCDIR constructs a heterogeneous knowledge graph in the target domain and defines a plurality of meta-paths to obtain user embedding before training the mapping function.
The LACDR trains the encoder of the source domain and the decoder of the target domain respectively by utilizing the data of all users, thereby realizing better generalization.
The PTUPCDR considers that the preference transfer process should be different for each user. Thus, it employs personal interactions to create a personalized bridge for each user.
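As mentioned above, the contrast between MF's plain dot product and GMF's per-dimension weighting can be sketched as follows; this is a simplified illustration, not the baselines' original code.

```python
import torch
import torch.nn as nn

def mf_score(user_emb: torch.Tensor, item_emb: torch.Tensor) -> torch.Tensor:
    """MF: plain dot product of the latent factors."""
    return (user_emb * item_emb).sum(dim=-1)

class GMFScorer(nn.Module):
    """GMF: element-wise product followed by a learned per-dimension weighting."""
    def __init__(self, emb_dim: int):
        super().__init__()
        self.out = nn.Linear(emb_dim, 1, bias=False)   # learns one weight per embedding dimension

    def forward(self, user_emb: torch.Tensor, item_emb: torch.Tensor) -> torch.Tensor:
        return self.out(user_emb * item_emb).squeeze(-1)
```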
In the user preference transfer module of the personalized preference transfer model, a two-layer neural network is adopted, whose hidden layer is $5 \times d_e$-dimensional and whose output layer is $d_e \times d_e$-dimensional, so that its output can be reshaped into the transfer matrix $M_{u_i}$. We used the Adam optimizer with the learning rate set to 0.005 and the batch size set to 512. To ensure fairness of the comparison, for all baselines we use the neural-network-based model GMF to calculate the embeddings of users and items, while keeping the original optimal settings of their other parameters. To evaluate the effectiveness of the proposed personalized preference transfer model on cold-start users, we divide the data of the overlapping users between the two domains into a training set and a test set, where the ratio between the training set and the test set is set to 8:2, 5:5, and 2:8. We ignore the user-item interactions in the test set and treat these users as cold-start users in the target domain, while the ratings given to items by the users in the training set are used to train the personalized preference transfer function. We implement the personalized preference transfer model with PyTorch 1.10.0 and train it on a GPU (Nvidia RTX A5000) and a CPU (Intel Core i9-11900 @ 2.50 GHz).
2. Parameter analysis
We first studied the impact of different parameter settings on the performance of the personalized preference transfer model. The experimental results are shown in Fig. 5, where smaller Mean Absolute Error (MAE) or Root Mean Square Error (RMSE) values indicate better performance of the personalized preference transfer model (HCCDR).
Number of clusters K: As can be seen from Figs. 5(a)-5(c), as the number of clusters K increases, the Mean Absolute Error (MAE) or Root Mean Square Error (RMSE) of the personalized preference transfer model first decreases steadily, because too small a K causes dissimilar users to be grouped into the same cluster, reducing the quality of the clusters. Then, when K in Task 1 is greater than 12 and K in Task 2 and Task 3 is greater than 10, the MAE or RMSE of the personalized preference transfer model rises. This is because, as K grows larger, the number of users contained in each cluster decreases, reducing the common-characteristic information of users learned within a cluster. In the subsequent experiments we set K = 12 for Task 1 and K = 10 for Task 2 and Task 3 so that the personalized preference transfer model can achieve its best performance.
Embedding dimension d_e: Figs. 5(d)-5(f) show the performance of the personalized preference transfer model in the three tasks when the embedding dimension $d_e$ of users and items is varied simultaneously. These subfigures show that, as $d_e$ increases from 16 to 128, the performance of the personalized preference transfer model improves, because a larger $d_e$ generally means that more user or item information can be encoded into the embedding vectors. At the same time, we can also observe that in Task 1, when $d_e$ exceeds 32 (or when $d_e$ exceeds 64 in Task 2 and Task 3), further increases in $d_e$ bring smaller and smaller performance gains. Considering that the computational overhead of the personalized preference transfer model grows rapidly with $d_e$, in the rest of the experiments we set $d_e = 32$ for Task 1 and $d_e = 64$ for Task 2 and Task 3.
Similarity threshold α: The value of the similarity threshold α affects the quality of the learned embeddings of users and items when building user-user (or item-item) relationships in the personalized preference transfer model. Figs. 5(g)-5(i) show the performance of the personalized preference transfer model in the three tasks as α is varied. These subfigures reveal that, as α increases, the Mean Absolute Error (MAE) or Root Mean Square Error (RMSE) of the personalized preference transfer model first drops and then rises, because a larger α means that only users (or items) with higher similarity are connected, which improves the quality of the learned embedding vectors. However, when α exceeds its optimal value, the number of generated relationships decreases, reducing the learnable user-user (or item-item) relationship information and thereby degrading the quality of the learned embeddings of users and items. In the following experiments, α for Task 3 was set to 0.95, and α for Task 1 and Task 2 was set to 0.9.
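A small sketch of the threshold-based edge construction analyzed here (and spelled out in claim 3): pairs whose similarity exceeds α are connected, and the similarity is used as the edge weight. The cosine-based similarity and the function name are illustrative stand-ins for the similarity probability used in the patent.

```python
import torch
import torch.nn.functional as F

def build_similarity_edges(doc_vecs: torch.Tensor, alpha: float = 0.9):
    """Connect pairs of users (or items) whose similarity exceeds the threshold alpha (sketch)."""
    normed = F.normalize(doc_vecs, dim=1)
    sim = (normed @ normed.t() + 1.0) / 2.0            # map cosine similarity into [0, 1]
    edges = []
    n = sim.size(0)
    for i in range(n):
        for j in range(i + 1, n):
            if sim[i, j] > alpha:
                edges.append((i, j, sim[i, j].item()))  # edge weight = similarity value
    return edges
```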
3. Comparison with baseline
Based on the two real datasets, we compared the personalized preference transfer model (HCCDR) with the 9 baseline methods on the 3 CDR tasks. The values in Table 3 are the averages of the experimental results over five runs. The ratio denotes the ratio of the training set to the test set. The best value is shown in bold and the second-best value is underlined. The improvement in the table is the improvement of HCCDR relative to the second-best value. From the results in Table 3, we can draw the following findings:
found 1: we propose HCCDR that exceeds all baselines in 3 CDR tasks. In particular, HCCDR improves maximum performance of 12.69% (or 8.99%) for Root Mean Square Error (RMSE) (or Mean Absolute Error (MAE)) and 3.38% (or 1.88%) for minimum performance of RMSE (or MAE), demonstrating the effectiveness of our approach in handling CDR tasks.
Finding 2: all single domain recommendation methods (e.g., MF and GMF methods) are defeated by these CDR methods (e.g., CMF, EMCDR, DCDCSR, SSCDR, DCDIR, LACDR, PTUPCDR and HCCDR). This is because the CDR method uses information from other domains, thereby alleviating the user's cold start problem and thus improving recommendation performance.
Finding 3: the improvement in target domain recommendation performance is related to the density of source domains. For example, the average rise in MAE and RMSE of HCCDR in Task 1 (source domain density= 4.047% as shown in table 2) was 6.71%, while the average improvements in Task 2 (source domain density=0.049%) and Task3 (source domain density=0.044%) were only 4.51% and 3.74%, respectively. The reason is that the higher the density of the source domain, the more user interaction information it contains, thereby helping to model user preferences in the modeling target domain to achieve better recommendations. Note that the density value of the source domain is the smallest in Task3, which results in the weakest improvement of the target domain.
Finding 4: the larger the ratio of training set to test set, the better the recommended performance. Specifically, when the ratios are set to 8: 2. 5:5 and 2: at 8, the MAE or RMSE value of HCCDR increases gradually. This is because when the training set is larger, more training samples will be used to train the HCCDR, thus better generalizing the HCCDR in processing recommended tasks.
Finding 5: CMF is inferior to the other CDR methods (i.e., EMCDR, DCDCSR, SSCDR, DCDIR, LACDR, PTUPCDR, and HCCDR) in the three tasks. This is because CMF simply combines the data from the source domain and the target domain to learn the preferences of overlapping users, which ignores the shift between the two domains. In contrast, the other CDR approaches learn a preference transfer function to connect the two domains, which effectively mitigates the effect of domain shift.
Finding 6: the use of non-overlapping users (or items) in learning the preference transfer function greatly contributes to improving recommendation performance. In practical applications, overlapping users between two domains are very limited, which prevents training of preference transfer functions based on such information. That is why the SSCDR uses overlapping user information and non-overlapping source domain items and the laccdr uses overlapping and non-overlapping user information to perform better than the EMCDR paradigm (training the mapping function between the two domains based only on overlapping user information). Note that HCCDR also considers these overlapping users when learning the user's transfer function using common features extracted from similar users in the source domain.
Found 7: personalized preference transfer effectively facilitates performance enhancement of CDR tasks. Considering that the user preference relationship between the source domain and the target domain varies from user to user, the PTUPCDR and HCCDR learn personalized preference transfer functions for each user, rather than using a common preference transfer function for all users. The evaluation results listed in table 3 show that PTUPCDR and HCCDR are superior to all other methods, demonstrating the significance of the custom transfer function. Also, we can observe from the table that HCCDR is superior to ptupccdr due to the consideration of HCCDR for individual and common characteristics of the user when learning the personalized preference transfer function.
Table 3 Comparison of HCCDR with the baselines on the three CDR tasks
4. Ablation study
To analyze the effectiveness of the proposed module in HCCDR, we performed an ablation study. In particular, we consider the following four variants of HCCDR:
HCCDR(GMF) is a variant of HCCDR that uses the GMF method to generate user and item embeddings instead of the heterogeneous latent factor modeling (HLFM) component proposed in HCCDR. Note that GMF ignores the user-user and item-item relationships when computing the embeddings.
HCCDR(N2V) is a variant of HCCDR that computes the embeddings of users and items on the proposed multi-relationship graph using the homogeneous graph embedding technique Node2vec instead of the HLFM component proposed in HCCDR.
HCCDR-CCG is a variant of HCCDR that removes the common characteristics generator (CCG) module from the cluster-enhanced preference transfer (CPT) component of HCCDR; that is, the common features generated by clustering are removed and only the individual features of the user are used to train the user's personalized preference transfer function.
HCCDR(GMF)-CCG is a variant of HCCDR(GMF) in which the CCG module is disabled.
We performed the ablation study on the three tasks with the ratio of the training set to the test set set to 8:2; note that similar conclusions can be drawn with other ratio settings. The experimental results are shown in Table 4, and the ablation analysis is as follows.
Table 4 Ablation study (best results are shown in bold and second-best values are underlined)
Effectiveness of heterogeneous latent factor modeling (HLFM): We first evaluated the effectiveness of the HLFM component proposed in HCCDR by comparing HCCDR with its two variants, HCCDR(GMF) and HCCDR(N2V). The results in Table 4 show that the MAE (or RMSE) of HCCDR(GMF) is on average 1.69% (or 1.64%) higher than that of HCCDR, because the GMF method used in HCCDR(GMF) computes the embeddings of users and items from user-item interactions only and fails to take the implicit relationships (i.e., user-user and item-item relationships) into account. On the other hand, although HCCDR(N2V) considers the three types of relationships (user-item, user-user, and item-item) simultaneously, it applies a homogeneous graph embedding method, i.e., Node2vec, to generate the representations of users and items. Because of the heterogeneous nature of these relationships, HCCDR(N2V) produces less satisfactory recommendations than HCCDR. Another important observation from the table is that HCCDR(N2V) is the best variant, which again demonstrates the importance of learning representations from the three types of relationships. In summary, the two optimization strategies in the HLFM component, namely considering all three relationship types and adopting heterogeneity-oriented embedding, make the proposed HCCDR superior to both variants.
Effectiveness of cluster-enhanced preference transfer (CPT): The importance of the CPT component can be studied by comparing HCCDR with its variant HCCDR-CCG. The novelty of CPT lies in extracting the common features of the user and using this information, together with the individual features of the user, to learn a personalized transfer function between the two domains for each user. In HCCDR-CCG, this novelty is removed by disabling the key module, namely the common characteristics generator (CCG). The experimental results in Table 4 show an average improvement of HCCDR over HCCDR-CCG of 2.43% (or 2.93%) in MAE (or RMSE). This demonstrates the effectiveness of using common features extracted from similar users to learn a personalized transfer bridge between the two domains for each user.
Finally, by comparing HCCDR with its variant HCCDR(GMF)-CCG, we find from Table 4 that, without the advantages brought by the HLFM and CPT components, the MAE (or RMSE) of HCCDR(GMF)-CCG increases by 3.7% (or 3.86%) on average over HCCDR, which is a significant increase in error. This observation again demonstrates the importance of properly identifying and handling heterogeneous relationships (i.e., HLFM) when computing the representations of users and items, as well as the importance of considering the individual and common features of users (i.e., CPT) when learning the users' personalized transfer functions.
5. Visual experiment
To investigate the quality of the user embeddings after transfer to the target domain, we also compared the HCCDR proposed in this embodiment with the existing models EMCDR and PTUPCDR.
Specifically, PTUPCDR is the best-performing baseline, which also learns a personalized transfer function between the two domains for each user; EMCDR is a typical CDR paradigm that learns the mapping function between the source domain and the target domain based only on the information of the users overlapping between the two domains. Note that the baseline methods DCDCSR, DCDIR, and PTUPCDR all follow the EMCDR framework. We employ the t-SNE tool to visualize the transferred user embeddings of each compared method, and the ground-truth user embeddings in the target domain are consistently generated by the HLFM component of HCCDR. Due to space constraints, we only discuss the visualization of Task 1; the same conclusions can be drawn for Task 2 and Task 3. Note that the ratio of the training set to the test set is set to 8:2, and we randomly sample 512 user embeddings for each method for clear visualization. The visualization results are shown in Fig. 6, where the dark points represent the true user embeddings in the target domain and the light points represent the transferred user embeddings. The scatter plots from left to right in Fig. 6 correspond to the EMCDR, PTUPCDR, and HCCDR models, respectively.
As shown in Fig. 6(a), the distribution of the transferred user embeddings differs significantly from the distribution of the ground-truth user embeddings: the former is scattered while the latter is concentrated in a certain area. This is because EMCDR learns a common preference transfer function for all users, which cannot capture the different relationships between user preferences in the source and target domains. In contrast, Figs. 6(b) and 6(c) show that the distribution of the transferred user embeddings matches that of the true user embeddings well, since both methods customize the preference transfer function for each user.
Meanwhile, Fig. 6(c) shows that the proposed HCCDR successfully captures the common features of user groups in the target domain, as there are many clusters in the distribution of the transferred user embeddings (and of the real user embeddings). In contrast, there are hardly any obvious clusters in Fig. 6(b), because PTUPCDR does not take the common features of similar users into account. This also demonstrates that the cluster-enhanced personalized preference transfer proposed in HCCDR is quite effective in computing high-quality user embeddings in the target domain.
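The kind of t-SNE comparison described above can be reproduced with standard tooling, for example with scikit-learn and matplotlib; the sample size of 512 follows the text, while everything else in this sketch is illustrative.

```python
import numpy as np
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

def plot_embedding_comparison(true_embs: np.ndarray, transferred_embs: np.ndarray, title: str):
    """Project ground-truth and transferred user embeddings with t-SNE and overlay them."""
    idx = np.random.choice(len(true_embs), size=min(512, len(true_embs)), replace=False)
    stacked = np.vstack([true_embs[idx], transferred_embs[idx]])
    proj = TSNE(n_components=2, init="pca", random_state=0).fit_transform(stacked)
    n = len(idx)
    plt.scatter(proj[:n, 0], proj[:n, 1], c="darkblue", s=8, label="ground-truth (target domain)")
    plt.scatter(proj[n:, 0], proj[n:, 1], c="lightskyblue", s=8, label="transferred")
    plt.title(title)
    plt.legend()
    plt.show()
```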
Example two
This embodiment provides a cross-domain recommendation system based on the personalized preference transfer model, which comprises a computer device programmed or configured to execute the cross-domain recommendation method based on the personalized preference transfer model of Embodiment 1.
In summary, the present invention addresses two core issues of CDR, namely what to transfer and how to transfer, to optimize CDR performance. For the first issue, existing approaches learn the embeddings of users and items using only the explicit user-item relationships and ignore the implicit user-user and item-item relationships. For the second issue, existing works do not take into account the common characteristics of users, obtained by analyzing similar users, when learning a user's personalized preference transfer function between two domains. To this end, we construct a novel cluster-enhanced personalized preference transfer model, HCCDR. To address the first issue, it first models three types of relationships, namely user-item, user-user, and item-item relationships, in a multi-relationship graph, and then carefully designs a heterogeneous-relation-oriented embedding method in each domain to learn more effective user and item representations by absorbing the information of the multi-relationship graph. To alleviate the second issue, the model learns each user's personalized preference transfer function by combining the user's common features and individual features. Experimental results show that the model is significantly better than all baselines on two real datasets and three CDR tasks. The main contributions are summarized as follows:
1. To address the limitations of existing CDR approaches in terms of what to transfer and how to transfer, we propose a novel heterogeneous and cluster-enhanced personalized preference transfer model (HCCDR) for cross-domain recommendation that provides better recommendation performance for cross-domain recommendation.
2. To address the first limitation, we construct a multi-relationship graph in each domain to model three types of relationships between users and items (user-item, user-user, and item-item relationships), and design an efficient heterogeneous-relation-oriented embedding method to generate rich representations of users and items based on the graph.
3. To address the second limitation, the personalized preference transfer function of each user is trained not only on the basis of the user's individual characteristics, but is also enhanced with the common characteristics of the user and similar users derived by a soft clustering mechanism.
4. We conducted extensive experiments on two common datasets. Experimental results showed an average improvement of HCCDR of 4.94% over the optimal baseline in terms of Mean Absolute Error (MAE) and Root Mean Square Error (RMSE).
The foregoing is merely a preferred embodiment of the present invention and is not intended to limit the present invention in any way. While the invention has been described with reference to preferred embodiments, it is not intended to be limiting. Therefore, any simple modification, equivalent variation and modification of the above embodiments according to the technical substance of the present invention shall fall within the scope of the technical solution of the present invention.

Claims (10)

1. The cross-domain recommendation method based on the personalized preference transfer model is characterized by comprising the following steps of:
s1) constructing a personalized preference transfer model, wherein the personalized preference transfer model firstly constructs a multi-relation graph according to interaction, basic information and corresponding comments of users and items, and calculates embedding of all users and items in a source domain and a target domain respectively according to information of the multi-relation graph; then according to the learned user representation, identifying similar users for each target user in a source domain by adopting a soft clustering method, and learning the personalized preference transfer function of the target user by using the individual characteristics of the target user and the common characteristics of the similar users; finally, calculating user embedding transferred from the source domain to the target domain according to the learned personalized preference transfer function of each user, and calculating the prediction score of the transferred user embedding on the items in the target domain according to the transferred user embedding so as to realize personalized recommendation of the source domain user in the target domain;
s2) acquiring interaction, basic information and corresponding comments of the users and the items, and inputting a personalized preference transfer model to obtain personalized recommendation of each user.
2. The method for cross-domain recommendation based on the personalized preference transfer model according to claim 1, wherein in step S1, constructing the multiple relationship graph according to the interactions, basic information and corresponding comments of users and items, and calculating the embeddings of all users and items in the source domain and the target domain respectively according to the information of the multiple relationship graph, specifically comprises:
S101) converting the basic information and the user comments related to the user into a user document vector, converting the basic information and the user comments related to the project into a project document vector, calculating the similarity probability between each pair of users in the user document vector, and calculating the similarity probability between each pair of projects in the project document vector; then summarizing all users and items into nodes of a multi-relation graph, generating corresponding user-user heterogeneous edges in the multi-relation graph according to similarity probability between each pair of users, generating corresponding item-item heterogeneous edges in the multi-relation graph according to similarity probability between each pair of items, and generating corresponding user-item edges in the multi-relation graph according to historical interaction of the users and the items;
S102) according to the heterogeneous-relation-oriented embedding strategy, taking a user-user relationship or an item-item relationship in the multi-relationship graph as a similarity relationship, and taking a user-item relationship in the multi-relationship graph as an interaction relationship; for each similarity relationship, measuring the distance $d_{sr}(n_i, n_j)$ between the two nodes in the latent vector space by the Euclidean distance, and computing the loss function $\mathcal{L}_{sr}$ that minimizes $d_{sr}(n_i, n_j)$; modeling each interaction relationship as a translation $d_{ir}(n_p, n_q)$ between nodes in the latent vector space using an explicit translation-based approach, and computing the loss function $\mathcal{L}_{ir}$ that minimizes $d_{ir}(n_p, n_q)$; and finally minimizing the loss functions $\mathcal{L}_{sr}$ and $\mathcal{L}_{ir}$ together to obtain the embedding matrices of users and items.
3. The method for cross-domain recommendation based on personalized preference transfer model according to claim 2, wherein in step S101, when generating a corresponding user-user heterogeneous edge in the multiple relationship graph according to the similarity probability between each pair of users, generating a corresponding item-item heterogeneous edge in the multiple relationship graph according to the similarity probability between each pair of items, and generating a corresponding user-item edge in the multiple relationship graph according to the historical interactions between the users and the items, the method comprises:
if the similarity probability between a pair of users or the similarity probability between a pair of items is larger than a preset threshold, generating corresponding user-user heterogeneous edges or item-item heterogeneous edges in the multi-relation graph, and taking the similarity probability corresponding to the user-user heterogeneous edges or the item-item heterogeneous edges as the weight of the user-user heterogeneous edges or the item-item heterogeneous edges;
if a historical interaction between the user and the item exists, generating a corresponding user-item edge in the multi-relationship graph, and calculating the quotient of the user's historical score for the item and the maximum value in a preset scoring matrix as the weight of the user-item edge.
4. The method for cross-domain recommendation based on the personalized preference transfer model according to claim 1, wherein in step S1, identifying similar users for each target user in the source domain by the soft clustering method according to the learned user representations, and learning the personalized preference transfer function of the target user by using the individual characteristics of the target user and the common characteristics of the similar users, specifically comprises:
S201) grouping all users in the source domain into K clusters, calculating the probability of assigning each user embedding to the k-th cluster, then establishing the target distribution Y to guide the unsupervised clustering loss function $\mathcal{L}_{clu}$ and further update the cluster centers, and multiplying the assignment probability of the target user to each cluster by the corresponding cluster center and accumulating the products to obtain the common characteristics of the target user;
S202) obtaining the interaction item list of the target user in the source domain, calculating the attention score of each item in the interaction item list with an attention network, normalizing the scores to obtain the weight of each item, and weighting the embedding of each item in the target user's interaction item list by its corresponding weight and accumulating to obtain the individual characteristics of the target user;
S203) connecting the common characteristics and the individual characteristics of the target user through a connecting operation to obtain transferable characteristics of the target user, learning a personalized preference transfer function of the target user from a source domain to a target domain by using a preset neural network aiming at the transferable characteristics of the target user, and generating a representation of the target user transferred in the target domain according to the personalized preference transfer function of the target user.
5. The method for cross-domain recommendation based on the personalized preference transfer model according to claim 4, wherein establishing the target distribution Y in step S201 to guide the unsupervised clustering loss function $\mathcal{L}_{clu}$ and further update the cluster centers comprises:
constructing a soft assignment matrix $S = [s_{i,k}]$ that records the soft assignment information of the source-domain users over the K clusters, wherein the target distribution Y is defined as follows:
$$y_{i,k} = \frac{s_{i,k}^{2} / \sum_{i} s_{i,k}}{\sum_{k'=1}^{K}\left(s_{i,k'}^{2} / \sum_{i} s_{i,k'}\right)}$$
wherein $\sum_{i} s_{i,k}$ is the soft cluster frequency of the k-th cluster $o_k$, and $s_{i,k}$ denotes the probability that user $u_i$ is assigned to the k-th cluster $o_k$;
defining the clustering loss function $\mathcal{L}_{clu}$ using the KL divergence between the soft assignment matrix S and the target distribution Y, so that the soft assignment approaches the target distribution, as follows:
$$\mathcal{L}_{clu} = \mathrm{KL}(Y \,\|\, S) = \sum_{i}\sum_{k} y_{i,k} \log \frac{y_{i,k}}{s_{i,k}}$$
in each iteration, updating the set of cluster centers $O = [o_1, o_2, \ldots, o_K]$ by stochastic gradient descent, wherein the gradient of $\mathcal{L}_{clu}$ with respect to each cluster center $o_k$ is calculated as follows:
$$\frac{\partial \mathcal{L}_{clu}}{\partial o_k} = \frac{\beta+1}{\beta}\sum_{i}\left(1 + \frac{\|e_{u_i} - o_k\|^{2}}{\beta}\right)^{-1}\left(s_{i,k} - y_{i,k}\right)\left(e_{u_i} - o_k\right)$$
wherein $\beta$ denotes the degree of freedom of the Student's t-distribution and $e_{u_i}$ denotes the embedding of user $u_i$; the update of the k-th cluster center is as follows:
$$o_k \leftarrow o_k - \omega \frac{\partial \mathcal{L}_{clu}}{\partial o_k}$$
wherein ω is the learning rate controlling the update speed of the cluster center.
6. The method for cross-domain recommendation based on a personalized preference transfer model according to claim 4, wherein the attention score expression of each item in the interactive item list in step S202 is as follows:
$$a_j^{*} = g(e_{v_j}; \eta)$$
wherein the function $g(\cdot; \eta)$ is the attention network, $\eta$ is the learnable parameter, and $e_{v_j}$ denotes the embedding of item $v_j$;
the weight expression of each item is as follows:
$$a_j = \frac{\exp(a_j^{*})}{\sum_{v_l \in B_{u_i}} \exp(a_l^{*})}$$
wherein $B_{u_i}$ denotes the interaction item list of user $u_i$ in the source domain, and $v_l$ denotes the l-th item in the interaction item list of user $u_i$.
7. The method for cross-domain recommendation based on a personalized preference transfer model according to claim 4, wherein learning a personalized preference transfer function of a target user from a source domain to a target domain for transferable characteristics of the target user using a preset neural network in step S203, and then generating a transferred representation of the target user in the target domain according to the personalized preference transfer function of the target user comprises:
learning the personalized preference transfer function of the target user from the source domain to the target domain for the transferable characteristics of the target user using the preset neural network, as follows:
$$w_{u_i} = h(e_{u_i}^{tra}; \phi)$$
wherein the vector $w_{u_i}$ comprises the parameters of the personalized preference transfer function, the function $h(\cdot; \phi)$ is a two-layer neural network parameterized by $\phi$, and $e_{u_i}^{tra}$ denotes the transferable characteristics of the target user;
reshaping the vector $w_{u_i}$ into a matrix $M_{u_i} \in \mathbb{R}^{d_e \times d_e}$ to match the parameter size of the preference transfer function, wherein $d_e$ denotes the embedding dimension of the user;
using the matrix $M_{u_i}$ as the parameter of the derived personalized preference transfer function of target user $u_i$, generating the transferred representation of target user $u_i$ in the target domain as follows:
$$\hat{e}_{u_i}^{t} = f\!\left(e_{u_i}^{s}; M_{u_i}\right)$$
wherein $e_{u_i}^{s}$ denotes the representation of target user $u_i$ in the source domain.
8. The method for cross-domain recommendation based on personalized preference transfer model according to claim 4, wherein in step S1, when calculating the predicted score of the item in the target domain according to the transferred user embedding, the method specifically comprises: respectively performing dot product operation on the representation of the target user transferred in the target domain and each item representation in the target domain to obtain the predictive score of the target user on each item in the target domain, and generating personalized recommendation of the target user according to the predictive score of each item.
9. The method for cross-domain recommendation based on personalized preference transfer model according to claim 8, further comprising, after obtaining a predictive score of the target user for each item in the target domain: the personalized preference transfer function is trained in a task-based optimization mode, and loss between each prediction score and the real score is minimized.
10. A cross-domain recommendation system based on a personalized preference transfer model, comprising a computer device programmed or configured to perform the cross-domain recommendation method of a personalized preference transfer model according to any one of claims 1 to 9.
CN202310389375.0A 2023-04-12 2023-04-12 Cross-domain recommendation method and system based on personalized preference transfer model Pending CN116431914A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310389375.0A CN116431914A (en) 2023-04-12 2023-04-12 Cross-domain recommendation method and system based on personalized preference transfer model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310389375.0A CN116431914A (en) 2023-04-12 2023-04-12 Cross-domain recommendation method and system based on personalized preference transfer model

Publications (1)

Publication Number Publication Date
CN116431914A true CN116431914A (en) 2023-07-14

Family

ID=87086880

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310389375.0A Pending CN116431914A (en) 2023-04-12 2023-04-12 Cross-domain recommendation method and system based on personalized preference transfer model

Country Status (1)

Country Link
CN (1) CN116431914A (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116629983A (en) * 2023-07-24 2023-08-22 成都晓多科技有限公司 Cross-domain commodity recommendation method and system based on user preference
CN116629983B (en) * 2023-07-24 2023-09-22 成都晓多科技有限公司 Cross-domain commodity recommendation method and system based on user preference
CN116910375A (en) * 2023-09-13 2023-10-20 南京大数据集团有限公司 Cross-domain recommendation method and system based on user preference diversity
CN116910375B (en) * 2023-09-13 2024-01-23 南京大数据集团有限公司 Cross-domain recommendation method and system based on user preference diversity
CN117743694A (en) * 2024-02-06 2024-03-22 中国海洋大学 Multi-layer transfer learning cross-domain recommendation method and system based on graph node characteristic enhancement
CN117743694B (en) * 2024-02-06 2024-05-14 中国海洋大学 Multi-layer transfer learning cross-domain recommendation method and system based on graph node characteristic enhancement
CN117932165A (en) * 2024-03-22 2024-04-26 湖南快乐阳光互动娱乐传媒有限公司 Personalized social method, system, electronic equipment and storage medium
CN117932165B (en) * 2024-03-22 2024-06-11 湖南快乐阳光互动娱乐传媒有限公司 Personalized social method, system, electronic equipment and storage medium


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination