CN112085559B

CN112085559B - Interpretable commodity recommendation method and system based on time sequence knowledge graph

Info

Publication number: CN112085559B
Application number: CN202010833009.6A
Authority: CN
Inventors: 刘士军; 崔志红; 潘丽; 崔立真
Original assignee: Shandong University
Current assignee: Shandong University
Priority date: 2020-08-18
Filing date: 2020-08-18
Publication date: 2024-07-02
Anticipated expiration: 2040-08-18
Also published as: CN112085559A

Abstract

The invention discloses an interpretable commodity recommendation method and system based on a time sequence knowledge graph, wherein the method comprises the following steps: acquiring a historical click sequence of a user, and constructing a time sequence knowledge graph based on the historical click sequence, wherein the time sequence knowledge graph comprises entities in a plurality of time periods and relations among the entities; based on the initial entity, carrying out behavior selection of setting step length according to the knowledge graph of each time period in sequence, and obtaining the state and the true value of each time period; the state comprises an initial entity, a final entity and a corresponding path; and according to the state and the true value of each time period, adopting a GRU network to recommend commodities and giving out an explanation path. According to the method, through constructing the time sequence knowledge graph, more potential interests of the user can be mined, and the recommendation effectiveness, accuracy and diversity can be improved.

Description

Interpretable commodity recommendation method and system based on time sequence knowledge graph

Technical Field

The invention belongs to the technical field of data mining, and particularly relates to an interpretable commodity recommendation method and system based on a time sequence knowledge graph.

Background

The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.

The knowledge graph (Knowledge Graphs, KGs) contains rich semantic heterogeneous information and is widely applied to the field of interpretable recommendation systems. On one hand, the various and huge entities in the knowledge graph enrich the choices of the potential interests of the user, so that the recommendation system is facilitated to provide accurate suggestions for the user, and the user of the recommendation system is helped to make commodity purchase decisions efficiently. On the other hand, various relation links exist among the entities in the knowledge graph, and the relation can be used as a specific reason for the final selection of the commodity by the user, so that the recommendation system is endowed with better interpretation capability. The application of the knowledge graph in the field of recommendation systems is mainly divided into two genres. The first application is commodity recommendation based on KGs embedding methods, which typically map entities and relationships into vectors of fixed length and recommend commodities to users according to their semantic similarity to each other. The second is a path query-based approach, which typically trains a recommendation model to walk as many entities and relationships as possible in KGs according to a certain policy. Both methods can better recommend goods for users and give the recommendation system a certain interpretation capability. In fact, the information provided by the method has a certain limitation, and the influence of the historical click sequence of the user on the final selected commodity is often ignored, so that the interest of the hidden commodity of the user cannot be truly and objectively judged.

In the field of commodity recommendation with interpretable time sequence, a single-path time sequence task is often adopted. The method generally starts from clicking commodities in a certain history of a plurality of time periods of a user, combines rich and diversified entities and relations in KGs to generate a plurality of paths, maps the paths into vectors, and scores each path according to evaluation indexes to reflect the interest degree of the user in the final commodities, so that the commodities possibly liked by the maximum probability are recommended to the user according to the complete scores. However, such scoring does not take into account the user's global history information and does not provide a sufficiently valuable recommendation.

Disclosure of Invention

In order to overcome the defects in the prior art, the invention provides an interpretable commodity recommendation method and system based on a time sequence knowledge graph, which comprehensively presumes the potential preference of the user in the next time period by combining the global history of the user and rich entity and diversified relations in KGs.

To achieve the above object, one or more embodiments of the present invention provide the following technical solutions:

An interpretable commodity recommendation method based on a time sequence knowledge graph comprises the following steps:

Acquiring a historical click sequence of a user, and constructing a time sequence knowledge graph based on the historical click sequence, wherein the time sequence knowledge graph comprises entities in a plurality of time periods and relations among the entities;

Based on the initial entity, carrying out behavior selection of setting step length according to the knowledge graph of each time period in sequence, and obtaining the state and the true value of each time period; the state comprises an initial entity, a final entity and a corresponding path;

And according to the state and the true value of each time period, adopting a GRU network to recommend commodities and giving out an explanation path.

One or more embodiments provide an interpretable merchandise recommendation system based on a time-series knowledge graph, comprising:

The knowledge graph construction module is configured to acquire a historical click sequence of a user and construct a time sequence knowledge graph based on the historical click sequence, wherein the time sequence knowledge graph comprises entities of a plurality of time periods and relations among the entities;

the state evaluation module is configured to sequentially select behaviors of setting step length according to the knowledge graph of each time period based on the initial entity, and acquire the state and the true value of each time period; the state comprises an initial entity, a final entity and a corresponding path;

and the commodity recommending module is configured to adopt the GRU network to recommend commodities and give explanation paths according to the states of the time periods and the true values of the time periods.

One or more embodiments provide an electronic device including a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the interpretable merchandise recommendation method based on a time-series knowledge graph when executing the program.

One or more embodiments provide a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the interpretable commodity recommendation method based on a time-series knowledge graph.

The one or more of the above technical solutions have the following beneficial effects:

The method can be used for explaining and recommending as a time sequence task, and can efficiently and accurately acquire more information from the constructed time sequence knowledge graph in the running process, so that the potential interests of users are enriched to the greatest extent, and the effectiveness, accuracy and diversity of recommendation are improved.

The method and the device can further infer the potential interest of the user in the next stage according to the acquired abundant and diversified information of the user in the near-term stage and give out effective path interpretation, effectively relieve the effect of the recommended commodity caused by insufficient information in the past, recommend the most satisfactory commodity for the user, and improve the interactive experience of the user.

Additional aspects of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.

Drawings

The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention.

FIG. 1 is a flowchart of an interpretable commodity recommendation method based on a time-series knowledge graph in an embodiment of the present invention;

FIG. 2 is a schematic diagram of selecting candidate actions according to an embodiment of the present invention.

Detailed Description

It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the invention. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.

It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of exemplary embodiments according to the present invention. As used herein, the singular is also intended to include the plural unless the context clearly indicates otherwise, and furthermore, it is to be understood that the terms "comprises" and/or "comprising" when used in this specification are taken to specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof.

Embodiments of the invention and features of the embodiments may be combined with each other without conflict.

Example 1

The embodiment discloses an interpretable commodity recommendation method based on a time sequence knowledge graph, which is based on RL (Reinforcement Learning ) and GRU (Gated Recurrent Unit, gate-controlled circulation unit), combines rich entities and diversified relations in a global history of a user and KGs, can estimate the potential preference of the user in the next time period more quickly, comprehensively and objectively, provides more reasonable recommended commodities for the user and gives recommendation reasons. Specifically, the method comprises two stages:

Stage one: acquiring a historical click sequence of a user, and constructing a time sequence knowledge graph based on the historical click sequence, wherein the time sequence knowledge graph comprises entities of a plurality of time periods and relations among the entities

The first stage specifically comprises:

Step 1: acquiring a historical click sequence of a user for commodities, and dividing according to time;

the step 1 specifically includes:

Step1.1: acquiring a historical user click sequence, wherein the historical S' is sequenced according to a time sequence;

Step1.2: the ordered user click sequence is segmented into a plurality of time segments S' = { S ₁,s₂,....,s_k-1,s_k }, where k is the number of segments.

Step 2: for the historical click sequence in each segmentation time period, acquiring corresponding entities and relations by combining the historical click sequence with AmazonDatabase knowledge maps;

The Amazon Database comprises a large number of user browsing records, corresponding commodity sets, relation sets among commodities and the like, and each history record of the user in each segmentation time period is used for mining corresponding entities and corresponding relations thereof in AmazonDatabase knowledge maps. In this embodiment, the entity and the relationship are subjected to class abstraction, and the formats of the relationship dictionary and the entity dictionary E in the time-series knowledge graph are defined.

Specifically, the category abstraction of an entity is the following five classes: USER, PRODUCT, WORD, RPRODUCT, BRAND, CATEGORY. The category of relationships includes 8 categories: PURCHASE, MENTION, DESCRIBRED _AS, PRODUCED_BY, BELONG_TO, ALSO_BOUGHT, ALSO_ VIEWED, BOUGHT _ TOGETHER.

The format of the physical dictionary E comprises a user dictionary U, a commodity physical dictionary P, a brand physical dictionary B, a catalog physical dictionary C, a characteristic dictionary F and a relation dictionary, and is defined as follows ：USER：{PURCHASE:PRODUCT,MENTION:WORD},WORD：{MENTION:USER,DESCRIBED_AS:PRODUCT},PRODUCT：{PURCAHSE:USER,DESCRIBED_AS:WORD,PRODUCED_BY:BRAND,BELONG_TO:CATEGORY,ALSO_BOUGHT:PRODUCT,ALSO_VIEWD:PRODUCT,BOUGHT_TOUGHTER:PRODUCT},BRAND：{PRODUCED_BY:PRODUCT},CATEGORY：{BELONG_TO:PRODUCT},PRODUCT：{ALSO_BOUGHT:PRODUCT,ALSO_VIEWED:PRODUCT,BOUGHT_TOGHTER:PRODUCT}.

Step 3: and for each segmentation time period, constructing a corresponding knowledge graph of the time period according to the entity and the relation.

The step 3 specifically includes:

step 3.1: embedding each entity and relation mined in the step 2 into a vector with fixed length by using a TransE method, and labeling each entity and relation, namely, each entity has a unique identifier eid, and each relation has a unique identifier rid.

Step 3.2: all entity and relationship information extracted is loaded using the Pickle component.

Step 3.3: and constructing an entity set epsilon of the time sequence knowledge graph G, and storing the entity set epsilon into an entity set of the dictionary G. And (3) sequentially and completely importing the user click sequence s _k and the entity epsilon ^* mined in the AmazonDatabase knowledge graph in each time period in the step (2) into the entity set of the dictionary G, storing the embedded expression of each entity and the entity ID corresponding to each entity each time, and calculating the final entity number. And storing all the entities into the dictionary G according to the above flow by the divided K time periods, wherein the format of the dictionary is G [ entity ] [ eid1], and eid1 represents the id of the entity.

Step 3.4: and constructing a relation set R of the time sequence knowledge graph set G, and storing the relation set R into a dictionary G. The addition pattern of each path is as follows.

For example: the added relationship is PURCAHSE, then the format added in the time series knowledge graph is (USER, uid, PURCAHSE, PRODCUT, pid), where uid is the id of the USER and pid is the id of the USER purchasing the product. It should be noted that in order to perform information mining on the timing knowledge graph more square-variant and efficient in the later period, each relation is added in a bidirectional manner. Illustrating the principle of adding relationships in a time-series knowledge graph: each time a relation (etype 1, eid1, relation, etype2, eid 2) is added, it is necessary to store G [ etype1] [ eid1] [ relation ] [ etype2] [ eid2] and G [ etype2] [ eid2] [ relation ] [ etype1] [ eid1], i.e. a bi-directional edge, if there is an edge from e ₁ to e ₂, then there is an edge from e ₂ to e ₁.

Step 3.5: the degree of each entity is calculated, i.e. the number of relations each entity is connected to is calculated at the time of logging.

Step 3.6: and storing all the information in the steps 3.1-3.5 into a storage path of the time sequence knowledge graph.

The time sequence knowledge graph dictionary G is constructed, and the above steps describe a time period of the user click sequence s _k combined with the corresponding knowledge graph to construct a time period of the knowledge graph dictionary G.

Step 4: and combining the knowledge maps corresponding to all the segmentation time periods to obtain a time sequence knowledge map.

And constructing a corresponding knowledge graph of each period according to the steps 3.1-3.5 by combining each click sequence of the user with the entity and relation information mined in the AmazonDatabase knowledge graph, and finally obtaining the constructed time sequence knowledge graph.

Step 5: generating a training set and a testing set, and carrying out the following steps of 8 on the data: 2, a part of the proportion is divided into a training set and the other part is a data set.

In the knowledge-graph, once the type of a relationship is determined, the entity to which it is linked can also be determined, and when the term "behavior" is used in this specification, it is indicated as a relationship in the time-series knowledge-graph, it refers to the relationship and the entity to which it is linked.

Stage two: based on the initial entity, carrying out behavior selection of setting step length according to the knowledge graph of each time period in sequence, and obtaining the state and the true value of each time period; the state includes an initial entity, a final entity, and a corresponding path.

The basic principle of the stage two is that the training agent searches and selects paths in the established time sequence knowledge graph according to a certain strategy, and the final purpose is to find the most likely interactive product I for the user. From this point of view and according to the method of constructing a time-series knowledge graph in the first embodiment, the initial entity of the path search is defined as the user U. In addition, the steps in the second embodiment represent a specific flow of dividing a time period in the first embodiment. Fig. 2 shows in detail the functional block diagram of the present system.

In this embodiment, a set of entity states S, a set of behaviors A, a set of paths P, a set of rewards R, a set of probabilities Q are defined, wherein each state of an entity is a tripletWherein U is from the initial entity U, e _t is the final entity after T steps and its historical footprint(Which contains all entities and relationships intermediate from the starting entity to the final entity),Wherein r _t represents a relationship. Secondly, the build format of each behavior is as follows: The selection of the next relationship can only be made from the unselected relationships.

The second stage specifically comprises:

step 1: acquiring and formatting the entity and relation of the time sequence knowledge graph;

step 2: and receiving an initial user entity, and carrying out probability evaluation on the possible actions selected by the user according to the knowledge graph to obtain the current state, the path and the corresponding probability corresponding to each step of action selection of the user, and a path set when the set step length is reached.

The step 2 specifically includes:

Step 2.1: for the knowledge graph in each segmentation time period, pruning the redundant relation according to a certain strategy;

Specifically, according to the number of relationships of each entity link counted in step 3 in the first stage, if the number of relationships exceeds a set behavior number threshold, in order to improve the searching efficiency, the less prominent features are eliminated by using Term Frequency-inverse document Frequency (TF-IDF), so as to further tailor the redundant relationships in the knowledge graph. If the TF-TDF of a relationship is greater than the set TF-TDF threshold or the frequency of occurrence is less than the set frequency threshold, the relationship is deleted. The definition of TF-IDF is as follows:

Where c represents a document, j represents a relationship of some type, tf _c,j is the number of times that the relationship j occurs in the current document c, tf _c is the number of all relationships contained in the current document c. df is the total number of documents for all relationships, df _j is the number of documents containing relationship j, and 1 is added to avoid the case where the denominator is 0.

Step 2.2: starting from an initial user entity, searching all behaviors connected with the initial user entity in the knowledge graph as candidate behavior sets, performing probability evaluation, and selecting M behaviors with highest probability evaluation;

Step 2.3: for each of the M behaviors, searching all the behaviors connected with the behavior in the knowledge graph as candidate behavior sets of the behavior, performing probability evaluation, and selecting M behaviors with highest probability evaluation; repeating the steps until reaching a preset T;

step 2.4: each behavior selected and its corresponding probability are stored.

In this embodiment, the behavior that the user has a high probability of interacting with is filtered. Recording candidate behavior set in current stateWhere t represents the number of steps taken, i.e., the off coefficient.

After pruning in step 2.1, the remaining link relationship of each entity is about 250, and if each step is searched 250 times in the subsequent step T, the number of later searches increases exponentially, so the search strategy design in each step in the system is shown in fig. 2: starting from the first entity u, searching all the behaviors linked with the first entity u as candidate behavior sets, calculating scores according to a multi-step evaluation function, and selecting M behaviors with highest evaluation scores according to the evaluation scores (M is a preset candidate behavior number threshold).

In the second step, selecting all behaviors linked with the M behaviors selected in the first step as candidate behavior sets, calculating scores again according to a multi-step evaluation function, selecting the M behaviors with the highest evaluation scores again, sequentially selecting according to the strategy until reaching a set step length T, storing probabilities of the M behaviors selected each time into a probability set Q, and finally selecting the M candidate behavior sets as follows

Where f ((r, e) |u) represents a multi-step path score.

In order to realize multi-step searching and storing in the time sequence knowledge graph, we construct Path-patterns with steps of 3 and 4 as follows, and label different types of Path-patterns, namely each Path-Pattern is marked with a unique pid as shown below.

Path-Pattern＝{

1:((NONE,USER),(MENTION,WORD),(DESCRIBED_AS,PRODUCT),

11:((NONE,USER),(PURCAHSE,PRODUCT),(PURCAHSE,USER),(PURCAHSE,PRODUCT)),

12:((NONE,USER),(PURCAHSE,PRODUCT),(DESCRIBED_AS,WORD),(DESCRIBED_AS,PRODUCT)),

13:((NONE,USER),(PURCAHSE,PRODUCT),(PRODUCED_BY,BRAND),(PRODUCED_BY,PRODUCT)),

14:((NONE,USER),(PURCAHSE,PRODUCT),(BELONG_TO,CATEGORY),(BELONG_TO,PRODUCT)),

15:((NONE,USER),(PURCAHSE,PRODUCT),(ALSO_BOUGHT,PRODUCT),(ALSO_BOUGHT,PRODUCT)),

16:((NONE,USER),(PURCAHSE,PRODUCT),(ALSO_VIEWED,PRODUCT),(ALSO_VIEWED,PRODUCT)),

17:((NONE,USER),(PURCAHSE,PRODUCT),(BOUGHT_TOGETHER,PRODUCT),(BOUGHT_TOGETHER,PRODUCT)),

18:((NONE,USER),(MENTION,WORD),(MENTION,USER),(PURCHASE,PRODUCT)),

}

It should be noted that there may be hundreds or thousands of Path-patterns from the beginning entity to the end entity, so to reduce the amount of computation, we only compute Path-patterns with the smallest forward-transfer relationship, i.e.

As shown in Path-Pattern obtained by the user after T steps, a multipath evaluation function is designed for more accurately calculating the probability of the user selecting the final entity. In general, a user must be affected by all of the previous behavior selections a ₁,…,a_t when selecting behavior a _t+1 for each state s _t, so the probability of the user selecting the final entity should be the accumulation of all previous selection probabilities. From this point of view, the present multipath evaluation function is defined as follows:

Wherein, <, >, represents a dot product, PATH PATTERN, which represents a 1-inverse t-step, j represents the number of forward-passed relationships in the Path-Pattern, i.eT-j+1 represents the number of backward-transferred relationships, i.eT represents the total number of relationships in the Path-Pattern, s represents the currently selected relationship location in the forward or backward transfer relationship, and b _et is the offset of entity e.

When t=j=0, he measures the cosine similarity between the two entities e ₀ and e _t as follows,

When t=j=1, it estimates the similarity of the two entities e ₀ and e _t embedded by the link relationship as follows,

Step 3: and for the path set when the set step length is reached, evaluating the possibility of each path, and rewarding according to the possibility.

And rewarding or punishing possible results after each state s _t selects the action a _t+1, if the rewarding mechanism indicates that the actions selected by the agent are possible to interact by the user, amplifying the selection possibility of the actions, otherwise, reducing the selection possibility of the actions, storing rewards of each action into a rewards set R, and storing the actions into a path set P. If each state selects one action and then immediately rewards or penalizes, there is often a defect of local optimum, so in order to increase the accuracy of the final entity of user interaction and reduce the calculation amount, a global evaluation method is used herein, that is, the evaluation is that the interaction possibility between the user and the final entity reached after T steps are taken, the higher the possibility is, the rewards are carried out, and otherwise, the penalization is carried out, as shown below.

Where e _T represents the final entity reached after T steps, f (u, e _T) estimates the probability of user interaction with the final entity, and Σf (u, i) represents the sum of the probabilities of all items selectable by the user.

The current state, optional behavior, path and rewards of the agent after random walk in the time sequence knowledge graph are obtained, and the information is stored.

Step 4: and carrying out diversity evaluation on each path for the path set when the set step length is reached.

When the agent starts to walk the selectable behavior from the user, a path set of the T steps can be obtained. To improve the interpretability of the system, we use the diversity assessment feedback described below to perform diversity assessment for all interpretation paths that the user finds.

Where F is the number of existing paths,Is an embedded vector of all relationships in the existing path, i.e Is the current path.

Step 5: from the initial user perspective, the path mined by the knowledge graph of the current segmentation time period is optimized based on the information such as the probability, the rewards and the like of the alternative path acquired by each step, so that the accumulated rewards of the arrival state after T steps are selected are maximized, and the accumulated rewards function is as follows.

The step5 specifically includes:

and 5.1, from the perspective of a user, quantifying the state after the T steps by using a Policy Network and a Value Network, and calculating the accumulated rewards of the state after the T steps of each path.

Wherein the Policy network aggregates the states s _t and the candidate behaviorAs input, the probability of each behavior is output, with a probability output of 0 for behaviors that are not in the candidate behavior set. All states are mapped to a true Value using Value Network to measure the final prize rating for the state reached after T walk is selected. The structure of the two is defined as follows:

Wherein, Is a Policy Network that is configured to provide a Network,Is the Value Network, x is the hidden feature of the learned state s, as well as Hadamard product, W _p and W _v are the parameter settings of the Policy Network and the Value Network, respectively.

Step 5.2: GD (GRADIENT DESCENT ) was used to optimize the system to maximize the jackpot for the T-step post-state, as shown by the optimization function below.

Where Θ represents the parameters of the Policy Network and Value Networkde and G is the jackpot from the initial state s to the final state after the T steps.

Step 5.3: outputting the final state set S _T and the optimized accumulated prize optimal value of each state, namely, based on the user angle, the observation information of knowledge graph mining in the current segmentation time periodWherein state set s _t∈S_T is for initial entity u, final entity e _t, and historyAnd (3) cascading the three.

Step 6: processing the information of each segmentation period of the time sequence knowledge graph according to the steps 6-8 to finally obtain the observation information of each segmentation period of the constructed time sequence knowledge graph

Stage three: and according to the state and the true value of each time period, adopting a GRU network to recommend commodities and giving out an explanation path.

Step 1: data preprocessing, namely cascading states and true values of the states in the time sequence knowledge graph output in the second stage to serve as input;

step 2: and the GRU network performs data analysis and integration according to the knowledge graph information of the K segmentation periods mined in the second embodiment, and recommends the final potential Top-N commodities of the user.

Step2.1: setting initial parameters of the GRU neural network, updating weights, biases and the like of the gate and the reset gate, and outputting a hidden description h ₀ of the current state according to the initial parameters, and storing the hidden description h ₀ into a memory module.

Step2.2: knowledge graph observation information of each stage of miningA memory module for each period stored in the GRU.

Step2.3: knowledge-graph observation information of first period obtained in neural network processing stage two in GRUThe hidden state description h ₀ and the current knowledge-graph observation information which are initially set are describedTogether, a hidden description of the current state, h ₁, is estimated. Knowledge-graph observation information at each period is calculated as follows:

Wherein, Is an input vector representing knowledge graph observation information in the current period,Representing the final output hidden state. The U _z is used for processing the data,U _c is the parameter transfer matrix of the GRU neural network, σ (x) =1/(1+e ^x) is the sigmoid function used to implement the nonlinear mapping, and is the product in elements between the two vectors,Is a candidate state value activated by tanh (x).

Step2.4: according to the principle in Step2.3, observing the knowledge graph of all periodsData analysis is carried out to finally obtain the hidden description of the next period of the userThe description includes the observation information O _k in all previous knowledge maps.

Step2.5: and outputting the interaction possibility between the user and the commodity in the final hidden description by using a softmax function, and selecting Top-N commodities to recommend to the user.

Step3: more efficient interpretation pathways were selected in conjunction with the Top-N products recommended in Step 2. In the second stage, paths between each user and the recommended commodity are obtained, and the paths can be multiple, and one more efficient path is selected as the interpretation path for the current user to select the commodity. The efficiency of one path is inversely proportional to its path length, as shown below.

Wherein,Is all the relationships in the path, i.e

Step4: the system loss function is optimized using the average cross entropy to obtain the most likely user interactive product from a time perspective, as shown below.

Where y is the positive and negative sample training data,Is the optimal expected cumulative value that each entity in embodiment two is based on the user.

The first stage is optimizing the space resource acquisition, so that the subsequent recommendation system can acquire more accurate, comprehensive and various commodity and path information as efficiently as possible, and further excavate more comprehensive and deep abundant semantic information, structural information and the like in KGs, which is generally regarded as a space completion task.

The second stage is aimed at optimizing the time sequence selection, and selecting the commodity of most interest for the user from the time sequence. The task is to score a user selected commodity according to an evaluation index based on a historical click commodity sequence of the user, and further recommend commodities which are likely to be purchased in the future to the user, and is generally regarded as a time sequence prediction optimization task.

In the recommending process, the recommended commodities are subjected to joint optimization again according to scores obtained by evaluation indexes of two optimizing tasks. Our task is to recommend diversified products to the user and give a reasonable interpretation for the final score derived from the optimization task according to two aspects.

Example two

An object of the present embodiment is to provide an interpretable commodity recommendation system based on a time-series knowledge graph, including:

Example III

An object of the present embodiment is to provide an electronic apparatus.

An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the method of embodiment one when the program is executed.

Example IV

An object of the present embodiment is to provide a computer-readable storage medium.

A computer readable storage medium having stored thereon a computer program which when executed by a processor implements the method described in embodiment one.

The steps involved in the devices of the second, third and fourth embodiments correspond to those of the first embodiment of the method, and the detailed description of the embodiments can be found in the related description section of the first embodiment. The term "computer-readable storage medium" should be taken to include a single medium or multiple media including one or more sets of instructions; it should also be understood to include any medium capable of storing, encoding or carrying a set of instructions for execution by a processor and that cause the processor to perform any one of the methods of the present invention.

It will be appreciated by those skilled in the art that the modules or steps of the invention described above may be implemented by general-purpose computer means, alternatively they may be implemented by program code executable by computing means, whereby they may be stored in storage means for execution by computing means, or they may be made into individual integrated circuit modules separately, or a plurality of modules or steps in them may be made into a single integrated circuit module. The present invention is not limited to any specific combination of hardware and software.

While the foregoing description of the embodiments of the present invention has been presented in conjunction with the drawings, it should be understood that it is not intended to limit the scope of the invention, but rather, it is intended to cover all modifications or variations within the scope of the invention as defined by the claims of the present invention.

Claims

1. An interpretable commodity recommendation method based on a time sequence knowledge graph is characterized by comprising the following steps of:

acquiring a historical click sequence of a user for commodities, and constructing a time sequence knowledge graph based on the historical click sequence, wherein the time sequence knowledge graph comprises entities in a plurality of time periods and relations among the entities; the entity comprises a user and a commodity;

Based on the initial entity, carrying out behavior selection of setting step length according to the knowledge graph of each time period in sequence, and obtaining the state and the true value of each time period; the state comprises an initial entity, a final entity and a corresponding path; the initial entity is a user, and the final entity is a commodity;

according to the state and the true value of each time period, adopting a GRU network to recommend commodities and giving out explanation paths;

The step of acquiring the state of each time period comprises the following steps:

Searching all behaviors connected with the initial entity in the knowledge graph as candidate behavior sets, carrying out probability evaluation, and selecting M behaviors with highest probability evaluation;

For each of the M behaviors, searching all the behaviors connected with the behavior in the knowledge graph as candidate behavior sets of the behavior, performing probability evaluation, and selecting M behaviors with highest probability evaluation; repeating the steps until the set step length is reached; obtaining a state set comprising an initial entity, a final entity and a corresponding path;

The step of obtaining the state reality value of each time period comprises the following steps:

for the path set when the set step length is reached, evaluating the possibility of each path, rewarding according to the possibility, and evaluating the diversity; the evaluation is that the interaction possibility between the user and the final entity reached after the step T is completed, the higher the possibility is, the rewards are carried out, and otherwise, the punishment is carried out;

The calculation formula of the rewards is as follows:

Wherein, Representative of the final entity reached after T steps,Estimated is the probability of the user interacting with the final entity,Representing the sum of probabilities for all items selectable by the user; PATH PATTERN, representing a 1-inverse t step, j represents the number of forward-transferred relationships in the PATH PATTERN;

The calculation formula of the diversity evaluation is as follows:

Where F is the number of existing paths, Is an embedded vector of all relationships in the existing path, i.e，Is the current path;

starting from an initial user entity, optimizing a path of a current time period to maximize a state cumulative reward obtained through the selection of a T step length, and mapping each state into a true value to measure a final reward;

the calculation formula of the jackpot is as follows:

the method for recommending the commodity and giving the explanation path by adopting the GRU network comprises the following steps:

Taking the state and the true value of each time period as input, and obtaining a plurality of candidate commodities and paths between a user and each candidate commodity by adopting a GRU network;

Selecting an interpretation path with highest efficiency as a commodity selected by the user based on the efficiency of the path between the user and each candidate commodity; the efficiency of the path is inversely proportional to the length of the path, and the specific expression formula is as follows:

。

2. The method for interpretable commodity recommendation based on a temporal knowledge base of claim 1, wherein constructing the temporal knowledge base based on a historical click sequence comprises:

dividing the historical click sequence according to a set time interval;

And for the historical click sequence in each time period, acquiring the entity and the relation in the historical click sequence, and constructing a knowledge graph corresponding to the time period according to the entity and the relation.

3. The method for recommending an interpretable commodity based on a time-series knowledge graph according to claim 2, wherein constructing a knowledge graph corresponding to the time period according to the entity and the relationship comprises:

embedding each entity and relationship into a fixed length vector;

writing each entity and ID thereof into a knowledge graph in the form of a set expression;

relationships are added bi-directionally between the respective entities.

4. The method for recommending an interpretable commodity based on a time-series knowledge graph according to claim 1, wherein the knowledge graph of each time period is further pruned according to the degree of the entity before the state of each time period is acquired.

5. A system for an interpretable commodity recommendation method based on a time-series knowledge-graph according to any one of claims 1-4, comprising:

the knowledge graph construction module is configured to acquire a historical click sequence of a user for the commodity and construct a time sequence knowledge graph based on the historical click sequence, wherein the time sequence knowledge graph comprises entities of a plurality of time periods and relations among the entities; the entity comprises a user and a commodity;

The state evaluation module is configured to sequentially select behaviors of setting step length according to the knowledge graph of each time period based on the initial entity, and acquire the state and the true value of each time period; the state comprises an initial entity, a final entity and a corresponding path; the initial entity is a user, and the final entity is a commodity;

6. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the interpretable commodity recommendation method based on a time-series knowledge-graph according to any one of claims 1-4 when the program is executed by the processor.

7. A computer-readable storage medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, implements the interpretable commodity recommendation method based on a time-series knowledge graph according to any one of claims 1-4.