CN113609311A

CN113609311A - Method and device for recommending items

Info

Publication number: CN113609311A
Application number: CN202111157843.9A
Authority: CN
Inventors: 王潇茵; 师博雅; 杜红艳; 张家华; 郑俊康
Original assignee: Aerospace Hongkang Intelligent Technology Beijing Co ltd
Current assignee: Aerospace Hongkang Intelligent Technology Beijing Co ltd
Priority date: 2021-09-30
Filing date: 2021-09-30
Publication date: 2021-11-05

Abstract

A method and apparatus for recommending items is provided. The method comprises the following steps: acquiring a user set, a project set and interactive data; acquiring a corresponding knowledge graph, wherein the knowledge graph comprises entities and relations; constructing a prediction model, wherein the prediction model comprises a knowledge graph model containing a first parameter and a user-item preference model containing a second parameter; constructing a first loss function based on all samples including a positive sample and a negative sample, wherein the positive sample represents an entity having interactive information with a user, and the negative sample represents an entity having no interactive information with the user; constructing a second loss function by using a multi-head attention mechanism and a knowledge graph based on the user set and the item set; training a predictive model based on a first loss function to update a first parameter; training the predictive model to update the second parameter based on the updated first parameter and the second loss function; and selecting at least one of the set of items as a recommended item for one of the set of users based on the updated parametric predictive model.

Description

Method and device for recommending items

Technical Field

The present disclosure relates to a method and apparatus for recommending an item, and more particularly, to a method and apparatus for recommending an item using a knowledge graph and a multi-head attention mechanism.

Background

The rich background knowledge in the knowledge graph is effectively integrated into vector representation of users and projects to become a research hotspot, and some researches utilize knowledge graph embedding tasks to assist in mining the association between knowledge and expand the embedded representation of the users and the projects through the knowledge in the knowledge graph so as to obtain better recommendation accuracy. But often only pay attention to feature mining of historical behaviors of users, and processing of complex semantic relation spaces among entities in the knowledge graph is omitted. Resulting in high-order connectivity can introduce noise to user preferences that can impair large amounts of collaborative information.

In addition, in a general recommendation scenario, items recorded by user interaction are often only a small part of all commodities, and comparatively, the size of the items which are not observed is very large. The model has two optimization strategies in this sample imbalance scenario: a sampling learning strategy and a non-sampling learning strategy. Non-sampled learning tends to be overlooked compared to sampled learning because of its high computational cost.

Disclosure of Invention

The present disclosure provides a method and apparatus for recommending an item.

According to a first aspect of embodiments of the present disclosure, there is provided a method of recommending an item, the method comprising: acquiring a user set, a project set and interaction data for representing interaction information between users and projects; acquiring a knowledge graph corresponding to a user set, a project set and interactive data, wherein the knowledge graph comprises entities and relations, and the entities comprise at least one of users and projects; constructing a prediction model, wherein the prediction model comprises a knowledge graph model containing a first parameter and a user-item preference model containing a second parameter; constructing a first loss function based on all samples including a positive sample and a negative sample, wherein the positive sample represents an entity having interactive information with a user, and the negative sample represents an entity having no interactive information with the user; constructing a second loss function by using a multi-head attention mechanism and a knowledge graph based on the user set and the item set; training a predictive model based on a first loss function to update a first parameter; training the predictive model to update the second parameter based on the updated first parameter and the second loss function; and selecting at least one of the items as a recommended item for one of the users in the user set based on the updated parametric predictive model.

Alternatively, the entities may include a head entity and a tail entity, the relationship representing a relationship between the head entity and the tail entity and being a directed edge.

Optionally, the step of constructing the first loss function based on all samples including the positive and negative samples may include: the entity is projected into the relationship space through a predetermined relationship matrix using the transR method, and the first loss function can be expressed as:

wherein the content of the first and second substances,

in order to be a function of the first loss,

is a head entity, and is characterized in that,

is a tail entity, and is a new tail entity,

in order to be in a relationship of,

in the form of a relationship matrix, the relationship matrix,

representing the number of entities in a batch during the training process,

、

respectively representing the entity of the positive sample and the relation of the positive sample,

、

entities representing all samples respectively and relationships of all samples, for

、

The three-element group of the composition,

、

、

representing the weight of the correct triple score, the weight of the false triple score, and the weight of the total triple score, respectively.

Optionally, wherein the step of constructing the second loss function using the multi-head attention mechanism and the knowledge graph based on the user set and the item set may include: respectively acquiring a user embedded expression vector and a project embedded expression vector based on the user set and the project set; performing feature extraction through graph volume aggregation and combination operation based on the user embedded expression vector and the item embedded expression vector by using a multi-head attention mechanism and a knowledge graph to obtain a user final expression and an item final expression; obtaining a user-item predictor based on the user final representation and the item final representation; and constructing a second loss function based on the user-item predicted values.

Alternatively, the multi-head attention mechanism may utilize a scaled dot product attention mechanism to obtain the attention of the neighbor entity to the propagation of the user preference based on the head entity, the relationship, and the tail entity, and the attention may be expressed as

Wherein the content of the first and second substances,

to represent the attention function of the attention,

representing the dimensions of the head entity or the dimensions of the relationship.

Alternatively, the multi-head attention mechanism may stitch the multi-head attention using a stitching function,

wherein S is a stitching function, n is the number of attentions and n is a positive integer,

are weights.

Alternatively, the user-item predictor can be expressed as

Wherein the content of the first and second substances,

for the user-item prediction value(s),

in order to be presented in the end for the user,

in order to be the final representation of the item,

representing the multiplication of the corresponding elements.

Alternatively, the second loss function may be expressed as

Wherein B represents the item of a training batch,

representing the users of a training batch,

and

representing the weight of the positive and negative samples, respectively.

According to a second aspect of embodiments of the present disclosure, there is provided an apparatus for recommending an item, the apparatus comprising: a data acquisition unit configured to acquire a set of users, an item, and interaction data representing interaction information between the users and the item, and acquire a knowledge graph corresponding to the interaction data, the knowledge graph including entities and relations, the entities including at least one of the users and the item; a model construction unit configured to construct a prediction model, the prediction model comprising a knowledge graph model comprising a first parameter and a user-item preference model comprising a second parameter; a first loss function construction unit configured to construct a first loss function based on all samples including a positive sample representing an entity having interactive information with a user and a negative sample representing an entity having no interactive information with the user; a second loss function construction unit configured to construct a second loss function using a multi-head attention mechanism and a knowledge graph based on the user set and the item set; a co-training unit configured to train the prediction model to update the first parameter based on the first loss function, and train the prediction model to update the second parameter based on the second loss function and the updated first parameter; and a recommending unit configured to select at least one of the item sets as a recommended item for one of the users based on the prediction model of the updated parameters.

According to a third aspect of the embodiments of the present disclosure, there is provided an electronic apparatus including: at least one processor; at least one memory storing computer-executable instructions, wherein the computer-executable instructions, when executed by the at least one processor, cause the at least one processor to perform a method of recommending items as described above.

According to a fourth aspect of embodiments of the present disclosure, there is provided a computer-readable storage medium storing instructions that, when executed by at least one processor, cause the at least one processor to perform a method of recommending items as described above.

The technical scheme provided by the embodiment of the disclosure at least brings the following beneficial effects:

according to the method and the device for recommending the items, firstly, an adaptive multi-head attention framework is obtained in a weight calculation mode of graph convolution starting from semantic features of a knowledge graph, so that more noises introduced by high-order connectivity in user preference are reduced in the aspect of complex semantic relation of the knowledge graph, and the feature extraction capability of the graph convolution in the knowledge graph is improved through adaptive design of attention heads; secondly, by adopting a non-sampling method to jointly train two aspects of knowledge graph embedding and user-item preference, the overall characteristics of the data are comprehensively and fairly learned, and the model effect cannot generate large difference due to different sampling modes.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure and are not to be construed as limiting the disclosure.

Fig. 1 is a schematic diagram illustrating an NSKAR model framework according to an exemplary embodiment of the present disclosure.

Fig. 2 is a flowchart illustrating a method of recommending items according to an exemplary embodiment of the present disclosure.

Fig. 3 is a flowchart illustrating a method of constructing a first loss function according to an exemplary embodiment of the present disclosure.

Fig. 4 is a flowchart illustrating a method of constructing a second loss function according to an exemplary embodiment of the present disclosure.

Fig. 5 is a flowchart illustrating an algorithm according to an exemplary embodiment of the present disclosure.

Fig. 6 is a block diagram illustrating an apparatus for recommending items according to an exemplary embodiment of the present disclosure.

Fig. 7 is a block diagram illustrating an electronic device according to an exemplary embodiment of the present disclosure.

Detailed Description

In order to make the technical solutions of the present disclosure better understood by those of ordinary skill in the art, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.

It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are capable of operation in sequences other than those illustrated or otherwise described herein. The embodiments described in the following examples do not represent all embodiments consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.

In this case, the expression "at least one of the items" in the present disclosure means a case where three types of parallel expressions "any one of the items", "a combination of any plural ones of the items", and "the entirety of the items" are included. For example, "include at least one of a and B" includes the following three cases in parallel: (1) comprises A; (2) comprises B; (3) including a and B. For another example, "at least one of the first step and the second step is performed", which means that the following three cases are juxtaposed: (1) executing the step one; (2) executing the step two; (3) and executing the step one and the step two.

Hereinafter, a method and apparatus for recommending items according to the present disclosure will be described in detail with reference to fig. 1 to 7.

Referring to fig. 1, an exemplary embodiment of the present disclosure provides a Non-Sampling Learning Knowledge-aware Attention-enhancing Recommendation model (Non-Sampling Learning for Knowledge-aware Attention communication Recommendation, hereinafter referred to as NSKAR). Specifically, firstly, an adaptive multi-head attention architecture is designed for a weight calculation mode of graph convolution based on semantic features of the knowledge graph, and then a non-sampling method is introduced to jointly train two tasks of knowledge graph embedding and user-item preference.

The goal of NSKAR is to enhance the propagation of user preferences and reduce the noise of the knowledge-graph during higher-order propagation, providing more accurate recommendations for recommendations. Especially, when the embedded expression of the neighbor nodes is aggregated, the number and the combination of the proper heads are designed to adapt to the semantic features of the knowledge graph, so that different semantic spaces can be well represented, the importance of different knowledge connections can be distinguished, and the propagation of user preference can be enhanced.

The input of the NSKAR comprises a user set, an item set, interaction data representing interaction information between users and items and a knowledge graph corresponding to the user set and the item set and the interaction data and the knowledge graph are firstly converted into vector representation through a knowledge graph embedding technology. As a bridge of cooperative training, items in a recommended scene fully utilize non-adopted learning strategies to promote the common optimization of a knowledge map embedding task and a user-item preference task. Because all training samples are considered in the gradient propagation process, the model prediction value is more effective and stable.

As shown in fig. 1, the structure of NSKAR mainly includes three parts: 1) the knowledge graph embedding part is used for learning an optimization model by a non-sampling method so as to further enhance the representation capability of knowledge; 2) modeling a user item preference part by the graph convolution network, and reducing noise when user preference is transmitted in a complex semantic space through an adaptive multi-head attention mechanism; 3) and the cooperative training part integrates the two parts together in an end-to-end mode to jointly optimize, so that the optimal efficiency of the model is realized.

According to an exemplary embodiment of the disclosure, the user set refers to an object required to be recommended or data indicating the object required to be recommended, the item set refers to a recommended object or data indicating the recommended object, for example, the user set is a user ID, the item set is a product number, and the interaction information between the user and the item may be purchase, shopping cart addition, collection addition, browsing, and the like.

Referring to fig. 2, in step S10, a set of users, a set of items, and interaction data representing interaction information between the users and the items are acquired.

In step S20, a knowledge-graph corresponding to the set of users, the set of items, and the interaction data is obtained, the knowledge-graph including entities and relationships, the entities including at least one of the users and the items. For example, the entities may include a head entity and a tail entity, the relationship representing a relationship between the head entity and the tail entity and being a directed edge.

In step S30, a prediction model is constructed, the prediction model including a knowledge graph model (e.g., the knowledge graph embedding portion described with reference to fig. 1) including a first parameter and a user-item preference model (e.g., the graph convolution network modeling user item preference portion described with reference to fig. 1) including a second parameter. In the non-sampling knowledge graph embedding model, a transR method can be adopted to strengthen the characteristic representation of the knowledge graph semantic relation space. In the user-item preference model, NSKAR introduces a multi-head attention mechanism to perform feature description/characterization information on different semantic spaces through different heads, and enhances convolution by self-attention; second, the use of the number of different heads and their combinations increases the adaptability of the multi-headed attention mechanism to different knowledge maps. By designing an adaptive multi-head attention mechanism, the weight calculation mode of graph convolution is improved by the NSKAR model, and the feature extraction capability of aggregation is further promoted.

In step S40, a first loss function of the knowledge graph model is constructed based on all samples including positive samples representing entities having interactive information with the user and negative samples representing entities having no interactive information with the user.

In step S50, a second loss function of the user-item preference model is constructed using a multi-head attention mechanism and a knowledge graph based on the set of users and the set of items.

In step S60, the predictive model is trained to update the first parameter based on the first loss function.

In step S70, the predictive model is trained to update the second parameter based on the updated first parameter and the second loss function.

In step S80, at least one of the items is selected as a recommended item for one of the users in the user set based on the updated parameter prediction model.

According to the exemplary embodiment of the present disclosure, taking any user data as a user with a user name of C who wants to watch a movie, and taking item data as movie D, movie E, movie F, movie G, movie H, movie I and movie J as examples, the preference score prediction values of C for movie D, movie E, movie F, movie G, movie H, movie I and movie J are 0.61, 0.82, 0.93, 0.45, 0.56, 0.77 and 0.95, respectively, obtained by a trained recommendation model, then the preference score prediction values of C for each item data are arranged in descending order of size to obtain: and if the preset threshold is 0.6, recommending the movie J, the movie F, the movie E, the movie I and the movie D to C.

It can be seen that the recommendation method in the exemplary embodiment of the present disclosure is obtained based on the prediction result of the trained recommendation model, and the training process of the recommendation model will be further described below.

In the non-sampling knowledge graph embedding part, a transR method is adopted to strengthen the characteristic representation of the semantic relation space of the knowledge graph.

Specifically, in step S410, the entity is projected into the relationship space through a predetermined relationship matrix using the transR method.

In step S420, a first loss function is constructed. The first loss function may be expressed as:

wherein the content of the first and second substances,

in order to be a function of the first loss,

is a head entity, and is characterized in that,

is a tail entity, and is a new tail entity,

in order to be in a relationship of,

in the form of a relationship matrix, the relationship matrix,

representing the number of entities in a batch during the training process,

、

、

、

The three-element group of the composition,

、

、

Can be used to represent the loss of all data,

can be used to represent the loss of positive data.

In an example embodiment, a positive sample represents an entity that has interactive information with a user, and a negative sample represents an entity that has no interactive information with a user. Because the entities in the positive sample have interactive information with the user, the knowledge conveyed by the corresponding head and tail entities based on the relationship is acquirable from the user historical knowledge map and/or the project historical knowledge map, and therefore the correct knowledge can be expressed by the relationship between the head and tail entities. Thus, the correct triples represent the head entity, the relationship, and the tail entity corresponding to the positive sample, which may be the data obtained from the positive sample. The error triples may represent the head entity, the relationship, and the tail entity corresponding to the negative examples.

In an example embodiment, a first loss function may be used to characterize the loss of unobserved data, thereby reducing noise introduced in user preferences due to higher order connectivity in terms of the complex semantic relationships of the knowledge-graph.

In step S510, a user embedded representation vector and an item embedded representation vector are acquired based on the user set and the item set, respectively.

In step S520, feature extraction is performed by a graph volume aggregation operation based on the user embedded representation vector and the item embedded representation vector using a multi-head attention mechanism and a knowledge graph to obtain a user final representation and an item final representation.

In an example embodiment, the multi-head attention mechanism may utilize a scaled dot product attention mechanism to obtain attention of neighboring entities to user preference propagation based on head, relationship, and tail entities, and the attention may be expressed as

Wherein the content of the first and second substances,

to represent the attention function of the attention,

representing the dimensions of the head entity or the dimensions of the relationship. Furthermore, according to the semantic relation characteristics of the knowledge graph, a multi-head mechanism is introduced into the NSKAR to represent the characteristics of different semantic relation spaces, and multiple groups of weights are trained to fully model the complex relation spaces, so that the characteristic representation capability of the model is improved.

In an exemplary embodiment, the multi-head attention mechanism may stitch the multi-head attention using a stitching function,

are weights.

In step S530, a user-item prediction value is obtained based on the user final representation and the item final representation.

For example, after the embedded representation vector of the user and the embedded representation vector of the item are propagated through the L layer (see fig. 1), feature extraction is performed through an aggregation operation after weighting to obtain different attention assignments.

In an example embodiment, user-item prediction values may be expressed as

Wherein the content of the first and second substances,

for the user-item prediction value(s),

in order to be presented in the end for the user,

in order to be the final representation of the item,

representing the multiplication of the corresponding elements.

In step S540, a second loss function is constructed based on the user-item predictor. The second loss function can be expressed as

Wherein B represents the item of a training batch,

representing the users of a training batch,

and

representing the weight of the positive and negative samples, respectively.

The symbols and functions corresponding to the symbols shown in fig. 5 have the same meanings as those described above and commonly used in the prior art, and redundant description is omitted here. Referring to fig. 5, the algorithm flowchart may perform steps S10 through S70 described with reference to fig. 2. And returns the updated parameters of the predictive model.

Theoretical significance of NSKAR: in overcoming the problem that higher-order connectivity introduces more noise in user preference, the NSKAR model effectively characterizes different semantic relation spaces by flexibly setting the number and combination of heads in multi-head attention, and enhances graph convolution by self-attention. In practical application, the NSKAR can adjust the number and the combination of the heads according to the semantic features of different knowledge maps, has strong adaptability and better generalization capability. More importantly, NSKAR compensates that the conventional recommendation model based on the knowledge graph only focuses on user item interaction information, and cannot realize that noise generated by high-order connectivity mainly comes from a complex semantic relation space in the knowledge graph. This is complementary to the existing work of exploring advanced neural network architectures.

The results of model recommendation tests based on the last. fm dataset, the boost-cross dataset, and the Dianping-Food dataset are shown in tables 1 to 4. BPRMF, CKE, KGCN-LS, KGAT, JNSSKR, CKAN and NSKAR are recommendation models. model is a model.

Table 1 shows data set-based click rate prediction results in CTR scene

As shown in table 1, the CTR scenario is a click-through rate prediction scenario, i.e. the probability of each data interaction in the data set is predicted, and AUC and F1 are used as indicators. Table 1 shows the click rate prediction results of the recommendation model based on last. Both the AUC and F1 values for NSKAR were superior to the other models shown.

Table 2 shows the result of prediction of Recall @ K value based on data set under top-K scene

Table 2 shows the prediction results of the recommendation model based on the Recall @ K of last. the top-K scenario recommends the K most likely interactive items for each user in the test set. Recall @ K is the Recall. In exemplary embodiments of the present disclosure, Recall @ K for NSKAR is superior to the other models shown. In particular, Recall @ K improves music, book and restaurant datasets by 10.02%, 19.17%, 10.35%, respectively, which makes it more practical in real world scenarios.

Tables 3 and 4 show click-through rate predictions based on the Book-Crossing dataset in a cold-boot scenario.

TABLE 3 predicted results based on 20% Book-Cross dataset

TABLE 4 prediction results based on a 40% Book-Cross dataset

The cold start problem is that for a newly registered user or a newly warehoused project, an accurate recommendation model cannot be well trained by common algorithms depending on a large number of user behaviors such as collaborative filtering, deep learning and the like, how to recommend the project for the new user to satisfy the user, how to distribute the newly warehoused project and recommend the project to a user who likes the project. Where NDCG represents the normalized loss cumulative gain. In the cold start scenario, the Recall @ K and NDCG @ K of NSKAR outperform the other models shown with only 20% and 40% training data.

The complexity of NSKAR can be divided into two parts for the temporal complexity of the model. For the knowledge-graph embedding part, updating a batch of items requires

Wherein

A positive knowledge triple representing the batch. For the user item preference part of graph volume network modeling, updating a batch of items needs

(the time overhead of the attention network is small and negligible). Thus, the total cost over all parameters is a sum of both, maintaining the same level of time overhead compared to existing predictive models.

Table 5 shows training efficiency according to an exemplary embodiment of the present disclosure.

TABLE 5 training efficiency

Therefore, although the effective learning algorithm proposed by NSKAR is based on a non-sampling method, the effective learning algorithm is also suitable for a model with a nonlinear prediction layer due to a proper model training strategy, and a deep neural network prediction model can be added in different recommendation scenes in an expanded mode. And even compared with the most advanced deep learning method, NSKAR has remarkable improvement on the recommendation performance and can keep the training efficiency at the same time.

According to the method and the device for recommending the items, firstly, an adaptive multi-head attention framework is obtained in a weight calculation mode of graph convolution starting from semantic features of a knowledge graph, so that more noises introduced by high-order connectivity in user preference are reduced in the aspect of complex semantic relation of the knowledge graph, and the feature extraction capability of the graph convolution in the knowledge graph is improved through adaptive design of attention heads; secondly, the non-sampling method is adopted to jointly train two aspects of knowledge graph embedding and user-item preference, so that the overall characteristics of the data are comprehensively and fairly learned, great difference of model effects caused by different sampling modes is avoided, and training efficiency is not influenced even if the non-sampling mode is used.

Fig. 6 is a block diagram illustrating an apparatus 10 for recommending items according to an exemplary embodiment of the present disclosure.

Referring to fig. 6, the apparatus 10 for recommending items includes: the data acquisition unit 110, the model construction unit 120, the first loss function construction unit 130, the second loss function construction unit 140, the collaborative training unit 150, and the recommendation unit 160.

The data acquisition unit 110 is configured to acquire a set of users, an item, and interaction data representing interaction information between the users and the item, and acquire a knowledge graph corresponding to at least one of the set of users, the set of items, and the interaction data, the knowledge graph including entities and relations, the entities including at least one of the users and the items. The data acquisition unit 110 is configured to perform the method described with reference to steps S10 and S20 in fig. 2.

The model construction unit 120 is configured to construct a prediction model comprising a knowledge graph model comprising a first parameter and a user-item preference model comprising a second parameter. The model building unit 120 is configured to perform the method described with reference to step S30 in fig. 2.

The first loss function constructing unit 130 is configured to construct a first loss function based on all samples including a positive sample and a negative sample, the positive sample representing an entity having interactive information with a user. The first loss function building unit 130 is configured to perform the method described with reference to step S40 in fig. 2.

The second loss function construction unit 140 is configured to construct a second loss function using a multi-head attention mechanism and a knowledge graph based on the set of users and the set of items. The second loss function building unit 140 is configured to perform the method described with reference to step S50 in fig. 2.

The co-training unit 150 is configured to train the prediction model to update the first parameter based on the first loss function and to train the prediction model to update the second parameter based on the second loss function and the updated first parameter. The co-training unit 150 is configured to perform the method described with reference to steps S60 and S70 in fig. 2.

The recommendation unit 160 is configured to select at least one of the set of items as a recommended item for one of the set of users based on the prediction model of the updated parameters. The recommending unit 160 is configured to perform the method described with reference to step S80 in fig. 2.

With regard to the apparatus in the above-described embodiment, the specific manner in which each module/unit performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated herein.

Fig. 7 is a block diagram illustrating an electronic device 1200 according to an example embodiment of the present disclosure.

Referring to fig. 7, an electronic device 1200 includes at least one memory 1201 and at least one processor 1202, the at least one memory 1201 storing computer-executable instructions that, when executed by the at least one processor 1202, cause the at least one processor 1202 to perform a method of recommending items according to an embodiment of the present disclosure.

By way of example, the electronic device 1200 may be a PC computer, tablet device, personal digital assistant, smartphone, or other device capable of executing the instructions described above. Here, the electronic device 1200 need not be a single electronic device, but can be any arrangement or collection of circuits capable of executing the above-described instructions (or sets of instructions), either individually or in combination. The electronic device 1200 may also be part of an integrated control system or system manager, or may be configured as a portable electronic device that interfaces with local or remote (e.g., via wireless transmission).

In the electronic device 1200, the processor 1202 may include a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a programmable logic device, a special purpose processor system, a microcontroller, or a microprocessor. By way of example, and not limitation, processors may also include analog processors, digital processors, microprocessors, multi-core processors, processor arrays, network processors, and the like.

The processor 1202 may execute instructions or code stored in the memory 1201, where the memory 1201 may also store data. The instructions and data may also be transmitted or received over a network via a network interface device, which may employ any known transmission protocol.

The memory 1201 may be integrated with the processor 1202, for example, by having RAM or flash memory disposed within an integrated circuit microprocessor or the like. Further, memory 1201 may include a stand-alone device, such as an external disk drive, storage array, or any other storage device usable by a database system. The memory 1201 and the processor 1202 may be operatively coupled or may communicate with each other, e.g., through I/O ports, network connections, etc., such that the processor 1202 is able to read files stored in the memory.

In addition, the electronic device 1200 may also include a video display (such as a liquid crystal display) and a user interaction interface (such as a keyboard, mouse, touch input device, etc.). All components of the electronic device 1200 may be connected to each other via a bus and/or a network.

According to an embodiment of the present disclosure, there may also be provided a computer-readable storage medium, wherein when instructions stored in the computer-readable storage medium are executed by at least one processor, the at least one processor is caused to perform a method of recommending items according to an embodiment of the present disclosure. Examples of the computer-readable storage medium herein include: read-only memory (ROM), random-access programmable read-only memory (PROM), electrically erasable programmable read-only memory (EEPROM), random-access memory (RAM), dynamic random-access memory (DRAM), static random-access memory (SRAM), flash memory, non-volatile memory, CD-ROM, CD-R, CD + R, CD-RW, CD + RW, DVD-ROM, DVD-R, DVD + R, DVD-RW, DVD + RW, DVD-RAM, BD-ROM, BD-R, BD-R LTH, BD-RE, Blu-ray or compact disc memory, Hard Disk Drive (HDD), solid-state drive (SSD), card-type memory (such as a multimedia card, a Secure Digital (SD) card or a extreme digital (XD) card), magnetic tape, a floppy disk, a magneto-optical data storage device, an optical data storage device, a hard disk, a magnetic tape, a magneto-optical data storage device, a hard disk, a magnetic tape, a magnetic data storage device, a magnetic tape, a magnetic data storage device, a magnetic tape, a magnetic data storage device, a magnetic tape, a magnetic data storage device, a magnetic tape, a magnetic data storage device, A solid state disk, and any other device configured to store and provide a computer program and any associated data, data files, and data structures to a processor or computer in a non-transitory manner such that the processor or computer can execute the computer program. The computer program in the computer-readable storage medium described above can be run in an environment deployed in a computer apparatus, such as a client, a host, a proxy device, a server, and the like, and further, in one example, the computer program and any associated data, data files, and data structures are distributed across a networked computer system such that the computer program and any associated data, data files, and data structures are stored, accessed, and executed in a distributed fashion by one or more processors or computers.

Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.

It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims

1. A method of recommending items, the method comprising:

acquiring a user set, a project set and interaction data for representing interaction information between users and projects;

obtaining a knowledge graph corresponding to at least one of the set of users, the set of items, and the interaction data, the knowledge graph including entities and relationships, the entities including at least one of the users and the items;

constructing a predictive model comprising a knowledge graph model comprising a first parameter and a user-item preference model comprising a second parameter;

constructing a first loss function based on all samples including a positive sample and a negative sample, wherein the positive sample represents an entity having interactive information with the user, and the negative sample represents an entity having no interactive information with the user;

constructing a second loss function using a multi-head attention mechanism and the knowledge graph based on the set of users and the set of items;

training the predictive model to update the first parameter based on the first loss function;

training the predictive model to update the second parameter based on the second loss function and the updated first parameter; and

selecting at least one of the set of items as a recommended item for a user of the set of users based on the updated parametric predictive model.

2. The method of claim 1, wherein the entities comprise a head entity and a tail entity, and wherein the relationship represents a relationship between the head entity and the tail entity and is a directed edge.

3. The method of claim 2, wherein constructing the first loss function based on all samples including positive samples and negative samples comprises:

projecting the entities into a relationship space through a predetermined relationship matrix using a transR method, and

the first loss function is expressed as:

wherein the content of the first and second substances,

for the purpose of said first loss function,

in order to be the head entity,

is the tail entity of the tail entity,

in order to be in the said relationship,

in order to be said relation matrix, the relation matrix,

representing the number of entities in a batch during the training process,

、

entities representing the positive samples and the relations of the positive samples respectively,

、

entities representing the total samples and the relationships of the total samples respectively,

for the product composed of

、

Composed triad，

、

、

4. The method of claim 3, wherein said constructing a second loss function using a multi-point attention mechanism and said knowledge-graph based on said set of users and said set of items comprises:

respectively acquiring a user embedded expression vector and a project embedded expression vector based on the user set and the project set;

performing feature extraction through graph volume aggregation and combination operation based on the user embedded representation vector and the item embedded representation vector by using a multi-head attention mechanism and the knowledge graph to obtain a user final representation and an item final representation;

obtaining a user-item predictor based on the user final representation and the item final representation; and

constructing the second loss function based on the user-item prediction value.

5. The method of claim 4, wherein the multi-head attention mechanism utilizes a scaled dot product attention mechanism to obtain attention of neighboring entities to propagation of user preferences based on the head entity, the relationship, and the tail entity, the attention being represented as

Wherein，

To represent the attention function of the attention,

a dimension representing the head entity or a dimension of the relationship.

6. The method of claim 5, wherein the multi-head attention mechanism stitches multi-head attention using a stitching function,

wherein the content of the first and second substances,

for the purpose of the said splicing function,

is the amount of attention and

is a positive integer and is a non-zero integer,

are weights.

7. The method of claim 6, wherein the user-item prediction value is represented as

Wherein the content of the first and second substances,

for the user-item prediction value or values,

in order to be presented in the end for the user,

for the final representation of the item in question,

representing the multiplication of the corresponding elements.

8. The method of claim 7, wherein the second loss function is expressed as

Wherein the content of the first and second substances,

the items representing a training batch are represented by,

representing the users of a training batch,

and

representing the weight of the positive and negative samples, respectively.

9. An apparatus for recommending items, the apparatus comprising:

a data acquisition unit configured to acquire a set of users, a set of items, and interaction data representing interaction information between users and items, and acquire a knowledge graph corresponding to at least one of the set of users, the set of items, and the interaction data, the knowledge graph including entities and relations, the entities including at least one of the users and the items;

a model construction unit configured to construct a prediction model comprising a knowledge graph model comprising a first parameter and a user-item preference model comprising a second parameter;

a first loss function construction unit configured to construct a first loss function based on all samples including a positive sample and a negative sample, the positive sample representing an entity having interactive information with the user;

a second loss function construction unit configured to construct a second loss function using a multi-head attention mechanism and the knowledge graph based on the user set and the item set;

a co-training unit configured to train the predictive model to update the first parameter based on the first loss function and train the predictive model to update the second parameter based on the second loss function and the updated first parameter; and

a recommending unit configured to select at least one of the item sets as a recommended item for one of the users based on the prediction model of the updated parameters.

10. An electronic device, comprising:

at least one processor;

at least one memory storing computer-executable instructions,

wherein the computer-executable instructions, when executed by the at least one processor, cause the at least one processor to perform the method of any one of claims 1 to 8.

11. A computer-readable storage medium storing instructions that, when executed by at least one processor, cause the at least one processor to perform the method of any one of claims 1 to 8.