CN113609311A - Method and device for recommending items - Google Patents

Method and device for recommending items

Info

Publication number: CN113609311A
Application number: CN202111157843.9A
Authority: CN (China)
Legal status: Pending
Other languages: Chinese (zh)
Prior art keywords: user, loss function, item, items, users
Inventors: 王潇茵, 师博雅, 杜红艳, 张家华, 郑俊康
Current Assignee: Aerospace Hongkang Intelligent Technology Beijing Co ltd
Original Assignee: Aerospace Hongkang Intelligent Technology Beijing Co ltd


Classifications

    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G06F16/367 Ontology
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06N3/08 Neural networks; Learning methods

Abstract

A method and apparatus for recommending items is provided. The method comprises the following steps: acquiring a user set, an item set, and interaction data; acquiring a corresponding knowledge graph, wherein the knowledge graph comprises entities and relations; constructing a prediction model, wherein the prediction model comprises a knowledge graph model containing a first parameter and a user-item preference model containing a second parameter; constructing a first loss function based on all samples, including positive samples and negative samples, wherein a positive sample represents an entity having interaction information with a user and a negative sample represents an entity having no interaction information with the user; constructing a second loss function using a multi-head attention mechanism and the knowledge graph based on the user set and the item set; training the prediction model based on the first loss function to update the first parameter; training the prediction model to update the second parameter based on the updated first parameter and the second loss function; and selecting at least one item from the item set as a recommended item for one of the users in the user set based on the prediction model with the updated parameters.

Description

Method and device for recommending items
Technical Field
The present disclosure relates to a method and apparatus for recommending an item, and more particularly, to a method and apparatus for recommending an item using a knowledge graph and a multi-head attention mechanism.
Background
Effectively integrating the rich background knowledge in a knowledge graph into the vector representations of users and items has become a research hotspot. Some studies use knowledge graph embedding tasks to assist in mining associations between knowledge, and expand the embedded representations of users and items with knowledge from the knowledge graph to obtain better recommendation accuracy. However, such work often focuses only on feature mining of users' historical behaviors and neglects the complex semantic relation spaces among entities in the knowledge graph. As a result, high-order connectivity can introduce noise into user preferences, which can impair a large amount of collaborative information.
In addition, in a typical recommendation scenario, the items with recorded user interactions are often only a small fraction of all items; by comparison, the number of unobserved items is very large. In this sample-imbalance scenario, a model has two optimization strategies: a sampling learning strategy and a non-sampling learning strategy. Compared with sampling learning, non-sampling learning tends to be overlooked because of its high computational cost.
Disclosure of Invention
The present disclosure provides a method and apparatus for recommending an item.
According to a first aspect of embodiments of the present disclosure, there is provided a method of recommending an item, the method comprising: acquiring a user set, an item set, and interaction data representing the interaction information between users and items; acquiring a knowledge graph corresponding to the user set, the item set, and the interaction data, wherein the knowledge graph comprises entities and relations, and the entities comprise at least one of users and items; constructing a prediction model, wherein the prediction model comprises a knowledge graph model containing a first parameter and a user-item preference model containing a second parameter; constructing a first loss function based on all samples, including positive samples and negative samples, wherein a positive sample represents an entity having interaction information with a user and a negative sample represents an entity having no interaction information with the user; constructing a second loss function using a multi-head attention mechanism and the knowledge graph based on the user set and the item set; training the prediction model based on the first loss function to update the first parameter; training the prediction model to update the second parameter based on the updated first parameter and the second loss function; and selecting at least one item from the item set as a recommended item for one of the users in the user set based on the prediction model with the updated parameters.
Alternatively, the entities may include a head entity and a tail entity, the relationship representing a relationship between the head entity and the tail entity and being a directed edge.
Optionally, the step of constructing the first loss function based on all samples including the positive and negative samples may include: the entity is projected into the relationship space through a predetermined relationship matrix using the transR method, and the first loss function can be expressed as:
[equation image omitted]

where the first loss function is defined over the head entity h, the tail entity t, the relation r, and the relation matrix M_r; P represents the number of entities in a batch during training; (e⁺, r⁺) represent the entities and relations of the positive samples; (e, r) represent the entities and relations of all samples; and, for the triple (h, r, t), c⁺, c⁻, and c represent the weight of the correct-triple score, the weight of the false-triple score, and the weight of the total-triple score, respectively.
Optionally, the step of constructing the second loss function using the multi-head attention mechanism and the knowledge graph based on the user set and the item set may include: acquiring a user embedded representation vector and an item embedded representation vector based on the user set and the item set, respectively; performing feature extraction through graph convolution aggregation and combination operations based on the user embedded representation vector and the item embedded representation vector, using the multi-head attention mechanism and the knowledge graph, to obtain a user final representation and an item final representation; obtaining a user-item prediction value based on the user final representation and the item final representation; and constructing the second loss function based on the user-item prediction value.
Optionally, the multi-head attention mechanism may use a scaled dot-product attention mechanism to obtain the attention of a neighbor entity to the propagation of the user preference based on the head entity, the relation, and the tail entity, and the attention may be expressed as:

[equation image omitted]

where Attn denotes the attention function and d denotes the dimension of the head entity or the dimension of the relation.
Optionally, the multi-head attention mechanism may concatenate the multiple attention heads using a concatenation function:

[equation image omitted]

where S is the concatenation function, n is the number of attention heads (a positive integer), and W are weights.
Optionally, the user-item prediction value can be expressed as:

[equation images omitted]

where ŷ denotes the user-item prediction value, e_u the user final representation, e_v the item final representation, and ⊙ element-wise multiplication.
Optionally, the second loss function may be expressed as:

[equation image omitted]

where B represents the items of a training batch, U represents the users of a training batch, and c⁺ and c⁻ represent the weights of the positive and negative samples, respectively.
According to a second aspect of embodiments of the present disclosure, there is provided an apparatus for recommending an item, the apparatus comprising: a data acquisition unit configured to acquire a user set, an item set, and interaction data representing the interaction information between users and items, and to acquire a knowledge graph corresponding to them, the knowledge graph including entities and relations, the entities including at least one of users and items; a model construction unit configured to construct a prediction model, the prediction model comprising a knowledge graph model containing a first parameter and a user-item preference model containing a second parameter; a first loss function construction unit configured to construct a first loss function based on all samples, including positive samples representing entities having interaction information with a user and negative samples representing entities having no interaction information with the user; a second loss function construction unit configured to construct a second loss function using a multi-head attention mechanism and the knowledge graph based on the user set and the item set; a co-training unit configured to train the prediction model to update the first parameter based on the first loss function, and to train the prediction model to update the second parameter based on the second loss function and the updated first parameter; and a recommending unit configured to select at least one item from the item set as a recommended item for one of the users based on the prediction model with the updated parameters.
According to a third aspect of the embodiments of the present disclosure, there is provided an electronic apparatus including: at least one processor; at least one memory storing computer-executable instructions, wherein the computer-executable instructions, when executed by the at least one processor, cause the at least one processor to perform a method of recommending items as described above.
According to a fourth aspect of embodiments of the present disclosure, there is provided a computer-readable storage medium storing instructions that, when executed by at least one processor, cause the at least one processor to perform a method of recommending items as described above.
The technical scheme provided by the embodiment of the disclosure at least brings the following beneficial effects:
according to the method and the device for recommending the items, firstly, an adaptive multi-head attention framework is obtained in a weight calculation mode of graph convolution starting from semantic features of a knowledge graph, so that more noises introduced by high-order connectivity in user preference are reduced in the aspect of complex semantic relation of the knowledge graph, and the feature extraction capability of the graph convolution in the knowledge graph is improved through adaptive design of attention heads; secondly, by adopting a non-sampling method to jointly train two aspects of knowledge graph embedding and user-item preference, the overall characteristics of the data are comprehensively and fairly learned, and the model effect cannot generate large difference due to different sampling modes.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure and are not to be construed as limiting the disclosure.
Fig. 1 is a schematic diagram illustrating an NSKAR model framework according to an exemplary embodiment of the present disclosure.
Fig. 2 is a flowchart illustrating a method of recommending items according to an exemplary embodiment of the present disclosure.
Fig. 3 is a flowchart illustrating a method of constructing a first loss function according to an exemplary embodiment of the present disclosure.
Fig. 4 is a flowchart illustrating a method of constructing a second loss function according to an exemplary embodiment of the present disclosure.
Fig. 5 is a flowchart illustrating an algorithm according to an exemplary embodiment of the present disclosure.
Fig. 6 is a block diagram illustrating an apparatus for recommending items according to an exemplary embodiment of the present disclosure.
Fig. 7 is a block diagram illustrating an electronic device according to an exemplary embodiment of the present disclosure.
Detailed Description
In order to make the technical solutions of the present disclosure better understood by those of ordinary skill in the art, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.
It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are capable of operation in sequences other than those illustrated or otherwise described herein. The embodiments described in the following examples do not represent all embodiments consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
Herein, the expression "at least one of the items" covers three parallel cases: "any one of the items", "a combination of any plural ones of the items", and "all of the items". For example, "include at least one of A and B" covers the following three parallel cases: (1) includes A; (2) includes B; (3) includes A and B. Similarly, "perform at least one of step one and step two" covers: (1) performing step one; (2) performing step two; (3) performing step one and step two.
Hereinafter, a method and apparatus for recommending items according to the present disclosure will be described in detail with reference to fig. 1 to 7.
Fig. 1 is a schematic diagram illustrating an NSKAR model framework according to an exemplary embodiment of the present disclosure.
Referring to fig. 1, an exemplary embodiment of the present disclosure provides a Non-Sampling learning Knowledge-aware Attention Recommendation model (hereinafter referred to as NSKAR). Specifically, an adaptive multi-head attention architecture is first designed for the weight calculation of graph convolution based on the semantic features of the knowledge graph, and a non-sampling method is then introduced to jointly train the two tasks of knowledge graph embedding and user-item preference.
The goal of NSKAR is to enhance the propagation of user preferences and reduce the noise of the knowledge graph during high-order propagation, providing more accurate recommendations. In particular, when aggregating the embedded representations of neighbor nodes, an appropriate number and combination of heads is designed to fit the semantic features of the knowledge graph, so that different semantic spaces can be well represented, the importance of different knowledge connections can be distinguished, and the propagation of user preferences can be enhanced.
The input of NSKAR comprises a user set, an item set, interaction data representing the interaction information between users and items, and a knowledge graph corresponding to them, all of which are first converted into vector representations through a knowledge graph embedding technique. Serving as a bridge for co-training, the items in the recommendation scenario make full use of the non-sampling learning strategy to promote the joint optimization of the knowledge graph embedding task and the user-item preference task. Because all training samples are considered during gradient propagation, the model's predictions are more effective and stable.
As shown in fig. 1, the structure of NSKAR mainly includes three parts: 1) the knowledge graph embedding part, which learns and optimizes the model with a non-sampling method to further enhance the representation capability of the knowledge; 2) the graph convolution network modeling the user-item preference part, which reduces noise when user preferences propagate in a complex semantic space through an adaptive multi-head attention mechanism; and 3) the co-training part, which integrates the two parts in an end-to-end manner for joint optimization to achieve the best model performance.
According to an exemplary embodiment of the disclosure, the user set refers to the objects to be recommended to, or data indicating those objects, and the item set refers to the recommended objects, or data indicating them. For example, the user set may consist of user IDs and the item set of product numbers, and the interaction information between a user and an item may be a purchase, adding to a shopping cart, adding to favorites, browsing, and the like.
Fig. 2 is a flowchart illustrating a method of recommending items according to an exemplary embodiment of the present disclosure.
Referring to fig. 2, in step S10, a set of users, a set of items, and interaction data representing interaction information between the users and the items are acquired.
In step S20, a knowledge-graph corresponding to the set of users, the set of items, and the interaction data is obtained, the knowledge-graph including entities and relationships, the entities including at least one of the users and the items. For example, the entities may include a head entity and a tail entity, the relationship representing a relationship between the head entity and the tail entity and being a directed edge.
In step S30, a prediction model is constructed, the prediction model including a knowledge graph model (e.g., the knowledge graph embedding part described with reference to fig. 1) containing a first parameter and a user-item preference model (e.g., the graph convolution network modeling the user-item preference part described with reference to fig. 1) containing a second parameter. In the non-sampling knowledge graph embedding model, the transR method can be adopted to strengthen the feature representation of the knowledge graph's semantic relation space. In the user-item preference model, NSKAR introduces a multi-head attention mechanism so that different heads characterize different semantic spaces, and enhances the convolution through self-attention; furthermore, using different numbers and combinations of heads increases the adaptability of the multi-head attention mechanism to different knowledge graphs. By designing an adaptive multi-head attention mechanism, the NSKAR model improves the weight calculation of graph convolution and further promotes the feature extraction capability of the aggregation.
In step S40, a first loss function of the knowledge graph model is constructed based on all samples including positive samples representing entities having interactive information with the user and negative samples representing entities having no interactive information with the user.
In step S50, a second loss function of the user-item preference model is constructed using a multi-head attention mechanism and a knowledge graph based on the set of users and the set of items.
In step S60, the predictive model is trained to update the first parameter based on the first loss function.
In step S70, the predictive model is trained to update the second parameter based on the updated first parameter and the second loss function.
In step S80, at least one item from the item set is selected as a recommended item for one of the users in the user set, based on the prediction model with the updated parameters.
According to an exemplary embodiment of the present disclosure, suppose the user data is a user named C who wants to watch a movie, and the item data consists of movie D, movie E, movie F, movie G, movie H, movie I, and movie J. The trained recommendation model predicts C's preference scores for movies D, E, F, G, H, I, and J as 0.61, 0.82, 0.93, 0.45, 0.56, 0.77, and 0.95, respectively. Arranging the predicted preference scores in descending order gives: movie J (0.95), movie F (0.93), movie E (0.82), movie I (0.77), movie D (0.61), movie H (0.56), and movie G (0.45). If the preset threshold is 0.6, movie J, movie F, movie E, movie I, and movie D are recommended to C.
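The selection step in the example above can be sketched in a few lines: sort the predicted scores in descending order and keep the items whose scores exceed the threshold. The function name and the default threshold are illustrative, not from the patent:

```python
def recommend(pred_scores, threshold=0.6):
    """Rank items by predicted preference score (descending) and keep
    those whose score exceeds the threshold."""
    ranked = sorted(pred_scores.items(), key=lambda kv: kv[1], reverse=True)
    return [item for item, score in ranked if score > threshold]

# Preference scores for user C from the example above.
scores = {"D": 0.61, "E": 0.82, "F": 0.93, "G": 0.45,
          "H": 0.56, "I": 0.77, "J": 0.95}
print(recommend(scores))  # ['J', 'F', 'E', 'I', 'D']
```

With a stricter threshold of 0.9, only movies J and F would survive, so the threshold directly controls the trade-off between recommendation precision and coverage.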
It can be seen that the recommendation method in the exemplary embodiment of the present disclosure is obtained based on the prediction result of the trained recommendation model, and the training process of the recommendation model will be further described below.
Fig. 3 is a flowchart illustrating a method of constructing a first loss function according to an exemplary embodiment of the present disclosure.
In the non-sampling knowledge graph embedding part, the transR method is adopted to strengthen the feature representation of the semantic relation space of the knowledge graph.
Specifically, in step S410, the entity is projected into the relationship space through a predetermined relationship matrix using the transR method.
In step S420, a first loss function is constructed. The first loss function may be expressed as:
[equation images omitted]

where the first loss function is defined over the head entity h, the tail entity t, the relation r, and the relation matrix M_r; P represents the number of entities in a batch during training; (e⁺, r⁺) represent the entities and relations of the positive samples; (e, r) represent the entities and relations of all samples; and, for the triple (h, r, t), c⁺, c⁻, and c represent the weight of the correct-triple score, the weight of the false-triple score, and the weight of the total-triple score, respectively. One of the terms can be used to represent the loss over all the data, and another to represent the loss over only the positive data.
In an example embodiment, a positive sample represents an entity that has interaction information with a user, and a negative sample represents an entity that has none. Because the entities in a positive sample have interaction information with the user, the knowledge conveyed by the corresponding head and tail entities through their relation can be obtained from the user's historical knowledge graph and/or the item's historical knowledge graph, so the relation between the head and tail entities expresses correct knowledge. Thus, a correct triple is the head entity, relation, and tail entity corresponding to a positive sample, which may be the data obtained from the positive sample. A false triple may be the head entity, relation, and tail entity corresponding to a negative sample.
In an example embodiment, the first loss function may be used to characterize the loss of unobserved data, thereby reducing the noise introduced into user preferences by high-order connectivity in the complex semantic relations of the knowledge graph.
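The patent's exact weighted loss is shown only as an image, but the transR projection it builds on is standard: entities are projected into the relation space by the relation matrix, and a triple is scored by how close the projected head plus the relation lands to the projected tail. A minimal sketch (function and variable names are illustrative):

```python
def mat_vec(M, v):
    """Multiply a relation matrix (given as a list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def transr_score(h, r, t, M_r):
    """transR plausibility score for a triple (h, r, t): project the head
    and tail entity embeddings into the relation space with M_r, then
    measure the squared distance ||M_r h + r - M_r t||^2.
    A lower score means a more plausible triple."""
    h_r = mat_vec(M_r, h)
    t_r = mat_vec(M_r, t)
    return sum((a + b - c) ** 2 for a, b, c in zip(h_r, r, t_r))

# With the identity projection, h + r == t gives a perfect score of 0.
M_identity = [[1.0, 0.0], [0.0, 1.0]]
print(transr_score([1.0, 0.0], [0.0, 1.0], [1.0, 1.0], M_identity))  # 0.0
```

Training would then combine such scores over all triples in a batch with the positive/false/total weights c⁺, c⁻, and c described above; those weights are part of the patented loss and are not reproduced here.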
Fig. 4 is a flowchart illustrating a method of constructing a second loss function according to an exemplary embodiment of the present disclosure.
In step S510, a user embedded representation vector and an item embedded representation vector are acquired based on the user set and the item set, respectively.
In step S520, feature extraction is performed through a graph convolution aggregation operation based on the user embedded representation vector and the item embedded representation vector, using the multi-head attention mechanism and the knowledge graph, to obtain a user final representation and an item final representation.
In an example embodiment, the multi-head attention mechanism may utilize a scaled dot product attention mechanism to obtain attention of neighboring entities to user preference propagation based on head, relationship, and tail entities, and the attention may be expressed as
Figure 828748DEST_PATH_IMAGE034
Wherein the content of the first and second substances,
Figure 862432DEST_PATH_IMAGE016
to represent the attention function of the attention,
Figure 463177DEST_PATH_IMAGE035
representing the dimensions of the head entity or the dimensions of the relationship. Furthermore, according to the semantic relation characteristics of the knowledge graph, a multi-head mechanism is introduced into the NSKAR to represent the characteristics of different semantic relation spaces, and multiple groups of weights are trained to fully model the complex relation spaces, so that the characteristic representation capability of the model is improved.
In an exemplary embodiment, the multi-head attention mechanism may concatenate the multiple attention heads using a concatenation function:

[equation image omitted]

where S is the concatenation function, n is the number of attention heads (a positive integer), and W are weights.
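The attention and concatenation formulas above are shown only as images, so the following is a sketch of the standard scaled dot-product attention they name, plus a toy head-concatenation step. All names are illustrative, and the per-head scaling stands in for the learned weights W:

```python
import math

def scaled_dot_attention(query, keys):
    """Scaled dot-product attention weights: softmax(q . k_i / sqrt(d))
    over a set of neighbor keys."""
    d = len(query)
    logits = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    peak = max(logits)                       # subtract max for stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def concat_heads(head_outputs, weights):
    """Concatenate per-head outputs, scaling each head by its weight
    (a toy stand-in for the patent's learned weights W)."""
    out = []
    for head, w in zip(head_outputs, weights):
        out.extend(w * x for x in head)
    return out

# A head entity attends over two relation-conditioned neighbor keys:
# the weights form a distribution, and the more aligned key dominates.
attn = scaled_dot_attention([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
print(round(sum(attn), 6))  # 1.0
```

In the real model each head would apply its own learned projections to the query and keys before the dot product, which is what lets different heads specialize to different semantic relation spaces.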
In step S530, a user-item prediction value is obtained based on the user final representation and the item final representation.
For example, after the user's embedded representation vector and the item's embedded representation vector are propagated through L layers (see fig. 1), feature extraction is performed through a weighted aggregation operation to obtain the different attention assignments.
In an example embodiment, the user-item prediction value may be expressed as:

[equation images omitted]

where ŷ denotes the user-item prediction value, e_u the user final representation, e_v the item final representation, and ⊙ element-wise multiplication.
In step S540, a second loss function is constructed based on the user-item prediction value. The second loss function can be expressed as:

[equation image omitted]

where B represents the items of a training batch, U represents the users of a training batch, and c⁺ and c⁻ represent the weights of the positive and negative samples, respectively.
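The non-sampling idea behind this loss is that every user-item pair in the batch contributes, with observed interactions weighted differently from unobserved ones, so no negative-sampling step is needed. A minimal sketch; the squared-error form and the default weights are assumptions, since the patent's exact formula is shown only as an image:

```python
def non_sampling_loss(pred, observed, c_pos=1.0, c_neg=0.1):
    """Weighted regression loss over ALL user-item pairs in a batch:
    observed interactions get target 1 with weight c_pos, every
    unobserved pair gets target 0 with the smaller weight c_neg."""
    loss = 0.0
    for user, item_scores in pred.items():
        seen = observed.get(user, set())
        for item, score in item_scores.items():
            if item in seen:
                loss += c_pos * (1.0 - score) ** 2
            else:
                loss += c_neg * score ** 2
    return loss

# One user, two items, only item "a" was interacted with.
batch = {"u1": {"a": 0.5, "b": 0.5}}
print(non_sampling_loss(batch, {"u1": {"a"}}))  # approximately 0.275
```

Because unobserved pairs carry a small but nonzero weight, the gradient sees the whole item set each step, which is the property the co-training in NSKAR relies on for stable predictions.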
Fig. 5 is a flowchart illustrating an algorithm according to an exemplary embodiment of the present disclosure.
The symbols shown in fig. 5 have the same meanings as described above and as commonly used in the prior art, and redundant description is omitted here. Referring to fig. 5, the algorithm performs steps S10 through S70 described with reference to fig. 2 and returns the updated parameters of the prediction model.
Theoretical significance of NSKAR: to overcome the problem that higher-order connectivity introduces more noise into user preferences, the NSKAR model effectively characterizes different semantic relation spaces by flexibly setting the number and combination of heads in multi-head attention, and enhances graph convolution with self-attention. In practical applications, NSKAR can adjust the number and combination of heads according to the semantic features of different knowledge graphs, giving it strong adaptability and good generalization capability. More importantly, NSKAR compensates for the fact that conventional knowledge-graph-based recommendation models focus only on user-item interaction information and fail to recognize that the noise produced by high-order connectivity mainly comes from the complex semantic relation space of the knowledge graph. This is complementary to existing work exploring advanced neural network architectures.
The results of model recommendation tests based on the Last.FM dataset, the Book-Crossing dataset, and the Dianping-Food dataset are shown in tables 1 to 4. BPRMF, CKE, KGCN-LS, KGAT, JNSKR, CKAN, and NSKAR are the compared recommendation models.
Table 1 shows the dataset-based click-through rate prediction results in the CTR scenario.
As shown in table 1, the CTR scenario is a click-through rate prediction scenario, i.e., the probability of each interaction in the data set is predicted, with AUC and F1 as indicators. Table 1 shows the click-through rate prediction results of the recommendation models based on the Last.FM, Book-Crossing, and Dianping-Food datasets. Both the AUC and F1 values of NSKAR are superior to those of the other models shown.
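The AUC and F1 indicators used in the CTR evaluation can be computed as follows; this is a generic metric sketch, not the disclosure's evaluation code:

```python
def auc(labels, scores):
    """AUC as the probability that a random positive outranks a random
    negative (ties count as half)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def f1(labels, preds):
    """F1 score: harmonic mean of precision and recall."""
    tp = sum(1 for l, p in zip(labels, preds) if l == 1 and p == 1)
    fp = sum(1 for l, p in zip(labels, preds) if l == 0 and p == 1)
    fn = sum(1 for l, p in zip(labels, preds) if l == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

labels = [1, 0, 1, 0]
scores = [0.9, 0.4, 0.6, 0.7]
print(auc(labels, scores))  # 3 of the 4 positive-negative pairs are ranked correctly -> 0.75
print(f1(labels, [1 if s >= 0.5 else 0 for s in scores]))
```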
Table 2 shows the dataset-based Recall@K prediction results in the top-K scenario.
Table 2 shows the Recall@K prediction results of the recommendation models based on the Last.FM, Book-Crossing, and Dianping-Food datasets. The top-K scenario recommends, for each user in the test set, the K items the user is most likely to interact with, and Recall@K is the recall over the top-K list. In exemplary embodiments of the present disclosure, the Recall@K of NSKAR is superior to that of the other models shown. In particular, Recall@K improves by 10.02%, 19.17%, and 10.35% on the music, book, and restaurant datasets, respectively, which makes NSKAR more practical in real-world scenarios.
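Recall@K as used in the top-K evaluation can be computed as follows (generic metric code, not the disclosure's implementation):

```python
def recall_at_k(recommended, relevant, k):
    """Recall@K: fraction of the user's relevant items that appear
    among the top-K recommended items."""
    if not relevant:
        return 0.0
    top_k = set(recommended[:k])
    return len(top_k & set(relevant)) / len(relevant)

recommended = ["a", "b", "c", "d", "e"]  # ranked recommendation list
relevant = ["b", "e", "f"]               # items the user actually interacted with
print(recall_at_k(recommended, relevant, k=3))  # only "b" is hit among top-3 -> 1/3
```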
Tables 3 and 4 show click-through rate predictions based on the Book-Crossing dataset in a cold-start scenario.
TABLE 3 prediction results based on 20% of the Book-Crossing dataset
TABLE 4 prediction results based on 40% of the Book-Crossing dataset
The cold-start problem is that, for a newly registered user or a newly warehoused item, common algorithms that depend on a large amount of user behavior, such as collaborative filtering and deep learning, cannot train an accurate recommendation model well; the questions are how to recommend items that satisfy a new user, and how to distribute a newly warehoused item and recommend it to the users who will like it. Here NDCG represents the normalized discounted cumulative gain. In the cold-start scenario, the Recall@K and NDCG@K of NSKAR outperform those of the other models shown with only 20% and 40% of the training data.
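NDCG@K with binary relevance, as referenced for the cold-start tables, can be sketched as:

```python
import math

def ndcg_at_k(recommended, relevant, k):
    """NDCG@K with binary relevance: the DCG of the top-K list divided
    by the DCG of an ideal ranking of the relevant items."""
    rel = set(relevant)
    dcg = sum(1.0 / math.log2(i + 2)
              for i, item in enumerate(recommended[:k]) if item in rel)
    ideal_hits = min(len(rel), k)  # ideal list puts all relevant items first
    idcg = sum(1.0 / math.log2(i + 2) for i in range(ideal_hits))
    return dcg / idcg if idcg else 0.0

# "b" and "c" are relevant but ranked 2nd and 3rd, so NDCG@3 < 1.
print(ndcg_at_k(["a", "b", "c"], ["b", "c"], k=3))
```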
The time complexity of NSKAR can be divided into two parts. For the knowledge-graph embedding part, updating a batch of items requires O(|P_B| d²) time, wherein P_B denotes the positive knowledge triples of the batch and d denotes the embedding dimension. For the user-item preference part modeled by the graph convolutional network, updating a batch of items requires O(|B| d²) time (the time overhead of the attention network is small and negligible). Thus, the total cost over all parameters is the sum of both, maintaining the same level of time overhead compared to existing prediction models.
Table 5 shows training efficiency according to an exemplary embodiment of the present disclosure.
TABLE 5 training efficiency
Therefore, although the effective learning algorithm proposed for NSKAR is based on a non-sampling method, a suitable model training strategy makes it applicable to models with a nonlinear prediction layer, so a deep neural network prediction model can be added in different recommendation scenarios. Even compared with the most advanced deep learning methods, NSKAR significantly improves recommendation performance while maintaining training efficiency.
According to the method and the device for recommending items, first, starting from the semantic features of the knowledge graph, an adaptive multi-head attention framework is obtained through the weight calculation of the graph convolution. This reduces the noise that high-order connectivity introduces into user preferences within the complex semantic relations of the knowledge graph, and the adaptive design of the attention heads improves the feature extraction capability of the graph convolution on the knowledge graph. Second, the non-sampling method jointly trains the knowledge graph embedding and the user-item preference, so that the overall characteristics of the data are learned comprehensively and fairly, the large differences in model effect caused by different sampling modes are avoided, and training efficiency is not affected even though a non-sampling mode is used.
Fig. 6 is a block diagram illustrating an apparatus 10 for recommending items according to an exemplary embodiment of the present disclosure.
Referring to fig. 6, the apparatus 10 for recommending items includes: the data acquisition unit 110, the model construction unit 120, the first loss function construction unit 130, the second loss function construction unit 140, the collaborative training unit 150, and the recommendation unit 160.
The data acquisition unit 110 is configured to acquire a user set, an item set, and interaction data representing interaction information between users and items, and to acquire a knowledge graph corresponding to at least one of the user set, the item set, and the interaction data, the knowledge graph including entities and relations, the entities including at least one of the users and the items. The data acquisition unit 110 is configured to perform the method described with reference to steps S10 and S20 in fig. 2.
The model construction unit 120 is configured to construct a prediction model comprising a knowledge graph model comprising a first parameter and a user-item preference model comprising a second parameter. The model building unit 120 is configured to perform the method described with reference to step S30 in fig. 2.
The first loss function constructing unit 130 is configured to construct a first loss function based on all samples including a positive sample and a negative sample, the positive sample representing an entity having interactive information with a user. The first loss function building unit 130 is configured to perform the method described with reference to step S40 in fig. 2.
The second loss function construction unit 140 is configured to construct a second loss function using a multi-head attention mechanism and a knowledge graph based on the set of users and the set of items. The second loss function building unit 140 is configured to perform the method described with reference to step S50 in fig. 2.
The co-training unit 150 is configured to train the prediction model to update the first parameter based on the first loss function and to train the prediction model to update the second parameter based on the second loss function and the updated first parameter. The co-training unit 150 is configured to perform the method described with reference to steps S60 and S70 in fig. 2.
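The alternating update performed by the co-training unit 150 can be sketched with toy gradient functions standing in for the first and second loss functions; the loss forms, learning rate, and epoch count below are illustrative assumptions:

```python
def co_train(first_param, second_param, loss1_grad, loss2_grad,
             lr=0.01, epochs=10):
    """Alternating training sketch: each epoch first updates the
    knowledge-graph parameter with the first loss, then updates the
    user-item preference parameter with the second loss, which depends
    on the freshly updated first parameter."""
    for _ in range(epochs):
        first_param = first_param - lr * loss1_grad(first_param)
        second_param = second_param - lr * loss2_grad(second_param, first_param)
    return first_param, second_param

# Toy quadratic losses standing in for the patent's loss functions.
g1 = lambda p1: 2 * (p1 - 1.0)      # gradient of (p1 - 1)^2
g2 = lambda p2, p1: 2 * (p2 - p1)   # gradient of (p2 - p1)^2 w.r.t. p2
p1, p2 = co_train(0.0, 0.0, g1, g2, lr=0.1, epochs=200)
print(round(p1, 3), round(p2, 3))  # both parameters approach 1.0
```

The key design point mirrored here is the ordering: the second update always sees the already-updated first parameter, as in steps S60 and S70.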
The recommendation unit 160 is configured to select at least one of the set of items as a recommended item for one of the set of users based on the prediction model of the updated parameters. The recommending unit 160 is configured to perform the method described with reference to step S80 in fig. 2.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module/unit performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated herein.
Fig. 7 is a block diagram illustrating an electronic device 1200 according to an example embodiment of the present disclosure.
Referring to fig. 7, an electronic device 1200 includes at least one memory 1201 and at least one processor 1202, the at least one memory 1201 storing computer-executable instructions that, when executed by the at least one processor 1202, cause the at least one processor 1202 to perform a method of recommending items according to an embodiment of the present disclosure.
By way of example, the electronic device 1200 may be a PC computer, tablet device, personal digital assistant, smartphone, or other device capable of executing the instructions described above. Here, the electronic device 1200 need not be a single electronic device but can be any arrangement or collection of circuits capable of executing the above-described instructions (or instruction sets), individually or in combination. The electronic device 1200 may also be part of an integrated control system or system manager, or may be configured as a portable electronic device that interfaces with local or remote devices (e.g., via wireless transmission).
In the electronic device 1200, the processor 1202 may include a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a programmable logic device, a special purpose processor system, a microcontroller, or a microprocessor. By way of example, and not limitation, processors may also include analog processors, digital processors, microprocessors, multi-core processors, processor arrays, network processors, and the like.
The processor 1202 may execute instructions or code stored in the memory 1201, where the memory 1201 may also store data. The instructions and data may also be transmitted or received over a network via a network interface device, which may employ any known transmission protocol.
The memory 1201 may be integrated with the processor 1202, for example, by having RAM or flash memory disposed within an integrated circuit microprocessor or the like. Further, memory 1201 may include a stand-alone device, such as an external disk drive, storage array, or any other storage device usable by a database system. The memory 1201 and the processor 1202 may be operatively coupled or may communicate with each other, e.g., through I/O ports, network connections, etc., such that the processor 1202 is able to read files stored in the memory.
In addition, the electronic device 1200 may also include a video display (such as a liquid crystal display) and a user interaction interface (such as a keyboard, mouse, touch input device, etc.). All components of the electronic device 1200 may be connected to each other via a bus and/or a network.
According to an embodiment of the present disclosure, there may also be provided a computer-readable storage medium, wherein when instructions stored in the computer-readable storage medium are executed by at least one processor, the at least one processor is caused to perform a method of recommending items according to an embodiment of the present disclosure. Examples of the computer-readable storage medium herein include: read-only memory (ROM), random-access programmable read-only memory (PROM), electrically erasable programmable read-only memory (EEPROM), random-access memory (RAM), dynamic random-access memory (DRAM), static random-access memory (SRAM), flash memory, non-volatile memory, CD-ROM, CD-R, CD+R, CD-RW, CD+RW, DVD-ROM, DVD-R, DVD+R, DVD-RW, DVD+RW, DVD-RAM, BD-ROM, BD-R, BD-R LTH, BD-RE, Blu-ray or optical disc storage, hard disk drive (HDD), solid-state drive (SSD), card-type memory (such as a multimedia card, a Secure Digital (SD) card, or an eXtreme Digital (XD) card), magnetic tape, a floppy disk, a magneto-optical data storage device, an optical data storage device, a hard disk, a solid-state disk, and any other device configured to store a computer program and any associated data, data files, and data structures in a non-transitory manner and to provide them to a processor or computer so that the processor or computer can execute the computer program.
The computer program in the computer-readable storage medium described above can run in an environment deployed in computer equipment, such as a client, a host, a proxy device, or a server. Further, in one example, the computer program and any associated data, data files, and data structures are distributed across networked computer systems so that the computer program and any associated data, data files, and data structures are stored, accessed, and executed in a distributed fashion by one or more processors or computers.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (11)

1. A method of recommending items, the method comprising:
acquiring a user set, a project set and interaction data for representing interaction information between users and projects;
obtaining a knowledge graph corresponding to at least one of the set of users, the set of items, and the interaction data, the knowledge graph including entities and relationships, the entities including at least one of the users and the items;
constructing a predictive model comprising a knowledge graph model comprising a first parameter and a user-item preference model comprising a second parameter;
constructing a first loss function based on all samples including a positive sample and a negative sample, wherein the positive sample represents an entity having interactive information with the user, and the negative sample represents an entity having no interactive information with the user;
constructing a second loss function using a multi-head attention mechanism and the knowledge graph based on the set of users and the set of items;
training the predictive model to update the first parameter based on the first loss function;
training the predictive model to update the second parameter based on the second loss function and the updated first parameter; and
selecting at least one of the set of items as a recommended item for a user of the set of users based on the predictive model with the updated parameters.
2. The method of claim 1, wherein the entities comprise a head entity and a tail entity, and wherein the relationship represents a relationship between the head entity and the tail entity and is a directed edge.
3. The method of claim 2, wherein constructing the first loss function based on all samples including positive samples and negative samples comprises:
projecting the entities into a relationship space through a predetermined relationship matrix using a transR method, and
the first loss function is expressed as:

L₁ = Σ_{h∈B} [ Σ_{(r,t)∈P_h} ( (w_c − w_f) s(h, r, t)² − 2 w_c s(h, r, t) ) + Σ_{(r,t)∈A_h} w_a s(h, r, t)² ]

wherein L₁ is the first loss function, h is the head entity, t is the tail entity, r is the relation, M_r is the relation matrix, s(h, r, t) is the score of the triple (h, r, t) composed of h, r, and t after the entities are projected into the relation space through M_r, B represents the number of entities in a batch during the training process, P_h represents the entities of the positive samples and the relations of the positive samples, A_h represents the entities of the total samples and the relations of the total samples, and w_c, w_f, and w_a represent the weight of the correct triple score, the weight of the false triple score, and the weight of the total triple score, respectively.
4. The method of claim 3, wherein said constructing a second loss function using a multi-head attention mechanism and said knowledge graph based on said set of users and said set of items comprises:
respectively acquiring a user embedded expression vector and a project embedded expression vector based on the user set and the project set;
performing feature extraction through graph convolution aggregation and combination operations based on the user embedded representation vector and the item embedded representation vector by using a multi-head attention mechanism and the knowledge graph, to obtain a user final representation and an item final representation;
obtaining a user-item predictor based on the user final representation and the item final representation; and
constructing the second loss function based on the user-item prediction value.
5. The method of claim 4, wherein the multi-head attention mechanism utilizes a scaled dot-product attention mechanism to obtain, based on the head entity, the relation, and the tail entity, the attention of neighboring entities to the propagation of user preferences, the attention being represented as

Attention(e_h, e_r, e_t) = softmax( e_hᵀ e_r / √d ) e_t

wherein Attention represents the attention function, the softmax is taken over the neighboring entities, and d represents the dimension of the head entity or the dimension of the relation.
6. The method of claim 5, wherein the multi-head attention mechanism splices the multi-head attention using a splicing function:

MultiHead = S(head₁, …, head_n) W

wherein S is the splicing function, n is the number of attention heads and is a positive integer, and W is a weight matrix.
7. The method of claim 6, wherein the user-item prediction value is represented as

ŷ(u, v) = Σ ( e*_u ⊙ e*_v )

wherein ŷ(u, v) is the user-item prediction value, e*_u is the user final representation, e*_v is the item final representation, ⊙ represents the multiplication of corresponding elements, and Σ sums over the elements of the resulting vector.
8. The method of claim 7, wherein the second loss function is expressed as

L₂ = Σ_{u∈B_U} Σ_{v∈B} w_{uv} ( y_{uv} − ŷ(u, v) )²

wherein B represents the items of a training batch, B_U represents the users of a training batch, y_{uv} indicates whether user u has interacted with item v, and w_{uv} takes the value w⁺ for positive samples and w⁻ for negative samples, w⁺ and w⁻ representing the weights of the positive and negative samples, respectively.
9. An apparatus for recommending items, the apparatus comprising:
a data acquisition unit configured to acquire a set of users, a set of items, and interaction data representing interaction information between users and items, and acquire a knowledge graph corresponding to at least one of the set of users, the set of items, and the interaction data, the knowledge graph including entities and relations, the entities including at least one of the users and the items;
a model construction unit configured to construct a prediction model comprising a knowledge graph model comprising a first parameter and a user-item preference model comprising a second parameter;
a first loss function construction unit configured to construct a first loss function based on all samples including a positive sample and a negative sample, the positive sample representing an entity having interactive information with the user;
a second loss function construction unit configured to construct a second loss function using a multi-head attention mechanism and the knowledge graph based on the user set and the item set;
a co-training unit configured to train the predictive model to update the first parameter based on the first loss function and train the predictive model to update the second parameter based on the second loss function and the updated first parameter; and
a recommending unit configured to select at least one of the item sets as a recommended item for one of the users based on the prediction model of the updated parameters.
10. An electronic device, comprising:
at least one processor;
at least one memory storing computer-executable instructions,
wherein the computer-executable instructions, when executed by the at least one processor, cause the at least one processor to perform the method of any one of claims 1 to 8.
11. A computer-readable storage medium storing instructions that, when executed by at least one processor, cause the at least one processor to perform the method of any one of claims 1 to 8.
CN202111157843.9A 2021-09-30 2021-09-30 Method and device for recommending items Pending CN113609311A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111157843.9A CN113609311A (en) 2021-09-30 2021-09-30 Method and device for recommending items

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111157843.9A CN113609311A (en) 2021-09-30 2021-09-30 Method and device for recommending items

Publications (1)

Publication Number Publication Date
CN113609311A true CN113609311A (en) 2021-11-05

Family

ID=78343292

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111157843.9A Pending CN113609311A (en) 2021-09-30 2021-09-30 Method and device for recommending items

Country Status (1)

Country Link
CN (1) CN113609311A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116151892A (en) * 2023-04-20 2023-05-23 中国科学技术大学 Item recommendation method, system, device and storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109902171A (en) * 2019-01-30 2019-06-18 中国地质大学(武汉) Text Relation extraction method and system based on layering knowledge mapping attention model
CN110275960A (en) * 2019-06-11 2019-09-24 中国电子科技集团公司电子科学研究院 Representation method and system based on the knowledge mapping and text information for censuring sentence
CN110598006A (en) * 2019-09-17 2019-12-20 南京医渡云医学技术有限公司 Model training method, triplet embedding method, apparatus, medium, and device
US20200073933A1 (en) * 2018-08-29 2020-03-05 National University Of Defense Technology Multi-triplet extraction method based on entity-relation joint extraction model
US20200125957A1 (en) * 2018-10-17 2020-04-23 Peking University Multi-agent cooperation decision-making and training method
CN111581395A (en) * 2020-05-06 2020-08-25 西安交通大学 Model fusion triple representation learning system and method based on deep learning
CN112148883A (en) * 2019-06-29 2020-12-29 华为技术有限公司 Embedding representation method of knowledge graph and related equipment


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116151892A (en) * 2023-04-20 2023-05-23 中国科学技术大学 Item recommendation method, system, device and storage medium
CN116151892B (en) * 2023-04-20 2023-08-29 中国科学技术大学 Item recommendation method, system, device and storage medium

Similar Documents

Publication Publication Date Title
US20210311595A1 (en) Multi-level table grouping
Wei et al. Neighborhood change in metropolitan America, 1990 to 2010
Miao et al. Context‐based dynamic pricing with online clustering
US10713236B2 (en) Systems and methods for analysis of data stored in a large dataset
US7805010B2 (en) Cross-ontological analytics for alignment of different classification schemes
SG192380A1 (en) Social media data analysis system and method
US9767417B1 (en) Category predictions for user behavior
CN111782951B (en) Method and device for determining display page, computer system and medium
JP7171471B2 (en) LEARNING MODEL GENERATION SUPPORT DEVICE AND LEARNING MODEL GENERATION SUPPORT METHOD
Morozov Measuring benefits from new products in markets with information frictions
US20070233668A1 (en) Method, system, and computer program product for semantic annotation of data in a software system
EP3842958A1 (en) Platform for conversation-based insight search in analytics systems
US20170300461A1 (en) Representation of an Interactive Document as a Graph of Entities
Ryan Deep learning with structured data
CN113609311A (en) Method and device for recommending items
US9760933B1 (en) Interactive shopping advisor for refinancing product queries
CN109376152A (en) Big data system file data preparation method and system
CN113570058B (en) Recommendation method and device
CN115080856A (en) Recommendation method and device and training method and device of recommendation model
WO2023286087A1 (en) Providing personalized recommendations based on users behavior over an e-commerce platform
US20190065987A1 (en) Capturing knowledge coverage of machine learning models
CN114791968A (en) Processing method, device and system for graph calculation and computer readable medium
Duboue Feature Engineering: Human-in-the-Loop Machine Learning
CN113537403A (en) Training method and device and prediction method and device of image processing model
US20160148095A1 (en) Electronic calculating apparatus, method thereof and non-transitory machine-readable medium thereof for sensing context and recommending information

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20211105