WO2023097929A1 - Knowledge graph recommendation method and system based on an improved KGAT model - Google Patents

Knowledge graph recommendation method and system based on an improved KGAT model

Info

Publication number
WO2023097929A1
Authority
WO
WIPO (PCT)
Prior art keywords
recommendation
layer
user
items
item
Prior art date
Application number
PCT/CN2022/081055
Other languages
English (en)
French (fr)
Inventor
徐慧英
朱信忠
靳林通
Original Assignee
浙江师范大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 浙江师范大学
Publication of WO2023097929A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/048 Activation functions

Definitions

  • This application belongs to the technical field of machine learning, and in particular relates to a knowledge graph recommendation method and system based on an improved KGAT model.
  • A recommendation system is one of the effective tools for addressing information overload.
  • Such a system generally includes user profile modeling, item profile modeling and a recommendation algorithm module, of which the recommendation algorithm is the core.
  • Recommendation algorithms are generally divided into demographic-based recommendation, content-based recommendation, and collaborative filtering.
  • Collaborative filtering is the most widely used and most successful of these, because it does not depend on feature data of users or items and recommends solely from the historical interaction data between users and items; however, it still suffers from problems such as data sparsity and cold start.
  • The knowledge graph is part of knowledge engineering technology. It is a large heterogeneous information network whose basic elements are triples: for example, (h, r, t) denotes a triple in which h, r and t are the head node, the relation and the tail node, respectively.
  • A knowledge graph establishes deep semantic relationships between items, so that more association information between items can be mined. Applying a knowledge graph to a recommendation system can effectively alleviate problems such as data sparsity and cold start.
  • The Knowledge Graph Attention Network (KGAT) model is a knowledge graph embedding learning model based on a hybrid method: it uses both the node information and the edge (relation) information of the graph, and its attention-based learning yields weight information for the edges.
  • The existing KGAT model first performs embedding learning on the knowledge graph with the TransR graph embedding algorithm.
  • The TransR model projects the triple data into the vector space of the relation r. This projection introduces noise, which degrades model performance.
  • In addition, the attention distribution function of the existing KGAT model is too simple, and improving it can also improve model performance to some extent.
  • In view of the above analysis, this application discloses a knowledge graph recommendation method and system based on an improved KGAT model, which addresses these problems of the existing KGAT model and enables the generation of personalized recommendation reasons.
  • In one aspect, the present application discloses a knowledge graph recommendation method based on an improved KGAT model, including:
  • constructing a domain knowledge graph by constructing the schema layer and the graph database of the knowledge graph;
  • taking the user as a node of the domain knowledge graph, and adding the historical interactions between users and items to the knowledge graph as relation edges to construct a collaborative knowledge graph;
  • learning the collaborative knowledge graph with the improved KGAT model to obtain the vectorized representations of users and items and the weight information of the edges adjacent to item nodes, where the improved KGAT model uses a three-layer MLP network as the attention distribution function to generate the weight information;
  • ranking the full set of items to be recommended, and selecting the top N items as the final set of items to recommend to the user;
  • generating a recommendation reason based on the ranking result and the weight information of the edges adjacent to the corresponding item nodes in the collaborative knowledge graph; when the user requests a recommendation, the recommendation reason and the item are combined into a recommendation list that is returned to the user.
  • Further, when learning the collaborative knowledge graph, the improved KGAT model jointly learns the structural information of the graph and the collaborative interaction information of users and items, yielding vectorized representations of the graph structure and of the historical user-item interactions; an attention mechanism is used to learn the weight information of the edges adjacent to item nodes in the collaborative knowledge graph; the learned vectorized representations of users and items and the edge weight information are then saved.
  • Further, the improved KGAT model includes a graph embedding learning layer, an attention propagation layer and a prediction layer;
  • the graph embedding learning layer uses the distance-based translation model TransR graph embedding algorithm to learn vector representations of the entity nodes of the graph;
  • the attention propagation layer performs information propagation, knowledge-aware attention, and information aggregation to obtain multi-layer user node vector representations and multi-layer item node vector representations;
  • the prediction layer performs model prediction and optimization on the output of the information propagation layer to obtain the representation vectors of users and items.
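  • Further, in the information propagation layer, the three-layer MLP network used as the attention distribution function is expressed as: (equation image: Figure PCTCN2022081055-appb-000001)
  • where ω′(h, r, t) is the attention distribution function;
  • relu is the hidden-layer activation function;
  • (Figure PCTCN2022081055-appb-000002), e_r and (Figure PCTCN2022081055-appb-000003) are the embedding vectors in the vector space determined by the relation r;
  • W_1 and W_2 are trainable weight parameter matrices;
  • ⊙ is the element-wise multiplication operation;
  • the weight learned by the attention mechanism is (equation image: Figure PCTCN2022081055-appb-000004), where N_h is the set of triples (h, r, t).
  • Further, the aggregation function for information aggregation in the information propagation layer is as follows: (equation image: Figure PCTCN2022081055-appb-000005)
  • where e_h is the node representation learned by the graph embedding layer, and (Figure PCTCN2022081055-appb-000006) represents the head node h as the weighted sum of the tail nodes connected to it; "⊙" denotes element-wise multiplication, and W_3 and W_4 are trainable parameter matrices;
  • the node information fused with one layer of relations is expressed as: (equation image: Figure PCTCN2022081055-appb-000007)
  • by iteration, the L-th layer representation of an entity is obtained as: (equation image: Figure PCTCN2022081055-appb-000008), where (Figure PCTCN2022081055-appb-000009) and (Figure PCTCN2022081055-appb-000010) is the representation of the tail node t generated by the previous information propagation step.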
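  • Further, ranking the full set of items to be recommended includes:
  • 1) performing recall on the full set of items to be recommended to obtain a coarsely ranked candidate set of items;
  • 2) for the target user and the candidate item set, loading the trained user and item feature vectors, and multiplying the user feature vector by the item feature vectors to obtain the predicted probability that the user clicks each item;
  • 3) sorting the candidate item set by click probability in descending order, selecting the top N items as the final set of items to recommend to the user, and saving it.
  • Further, when generating the recommendation reason, the weight information of all edges adjacent to the corresponding node in the collaborative knowledge graph is obtained according to the ranking result, the edge with the largest weight is selected, and a recommendation explanation is generated from the type or attribute of that edge; the generated recommendation reason is saved together with the corresponding item, and when the user requests a recommendation, it is combined with the item into a recommendation list that is returned to the user.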
  • Another aspect of the present application discloses a knowledge graph recommendation system based on an improved KGAT model, including:
  • a domain knowledge graph construction module, used to construct and store the schema layer of the knowledge graph, and to import the acquired data-layer data into the graph database after knowledge fusion;
  • a collaborative knowledge graph construction module, used to take the user as a node of the knowledge graph and add the historical interactions between users and items to the knowledge graph as relation edges to construct a collaborative knowledge graph;
  • a graph embedding and recommendation model training module, used to learn the collaborative knowledge graph with the KGAT model to obtain the full set of items to be recommended;
  • a ranking module, used to rank the full set of items to be recommended and select the top N items as the final set of items to recommend to the user;
  • a recommendation reason generation module, used to generate a recommendation reason based on the ranking result and the weight information of the edges adjacent to the corresponding item nodes in the collaborative knowledge graph; the recommendation reason is saved together with the corresponding item, and when the user requests a recommendation, it is combined with the item into a recommendation list that is returned to the user.
  • Another aspect of the present application discloses an electronic device, including:
  • one or more processors; and a storage device for storing one or more programs;
  • when the one or more programs are executed by the one or more processors, the one or more processors implement the above-mentioned knowledge graph recommendation method based on the improved KGAT model.
  • Another aspect of the present application discloses a computer-readable medium on which a computer program is stored; when the program is executed by a processor, the above-mentioned knowledge graph recommendation method based on the improved KGAT model is implemented.
  • This application generates a personalized recommendation list for each user and can also generate personalized recommendation reasons, thereby improving the credibility of the recommendation results.
  • Moreover, compared with the existing KGAT model, the improved KGAT model reduces the noise introduced by the TransR projection process, and the improved attention distribution function replaces the dot-product operation of the reference model's attention distribution function with a three-layer MLP network, so that training yields more accurate weight data.
  • Experimental results on three public datasets demonstrate that the proposed model outperforms existing methods.
  • FIG. 1 is a flowchart of the knowledge graph recommendation method in an embodiment of the present application;
  • FIG. 2 is a schematic structural diagram of the collaborative knowledge graph in an embodiment of the present application;
  • FIG. 3 is a schematic structural diagram of the improved KGAT model in an embodiment of the present application;
  • FIG. 4 is a schematic diagram of mining the first-order connectivity information of adjacent nodes in an embodiment of the present application;
  • FIG. 5 is a schematic block diagram of the composition and connections of the knowledge graph recommendation system in an embodiment of the present application;
  • FIG. 6 is a schematic block diagram of the composition and connections of the electronic device in an embodiment of the present application;
  • FIG. 7 is a block diagram of the computer-readable medium in an embodiment of the present application.
  • This embodiment discloses a knowledge graph recommendation method based on the improved KGAT model, as shown in FIG. 1, including:
  • Step S1: constructing a domain knowledge graph by constructing the schema layer and the graph database of the knowledge graph;
  • Step S2: taking the user as a node of the domain knowledge graph, and adding the historical interactions between users and items to the knowledge graph as relation edges to construct a collaborative knowledge graph;
  • Step S3: learning the collaborative knowledge graph with the improved KGAT model to obtain the vectorized representations of users and items and the weight information of the edges adjacent to item nodes;
  • the improved KGAT model uses a three-layer MLP network as the attention distribution function to generate the weight information;
  • Step S4: ranking the full set of items to be recommended, and selecting the top N items as the final set of items to recommend to the user;
  • Step S5: generating a recommendation reason based on the ranking result and the weight information of the edges adjacent to the corresponding item nodes in the collaborative knowledge graph; when the user requests a recommendation, the recommendation reason and the item are combined into a recommendation list that is returned to the user.
  • In step S1, a top-down construction method is adopted: the schema-layer ontology structure of the knowledge graph is constructed first and stored. The data-layer data is then obtained from structured or unstructured data sources and stored in the graph database after knowledge fusion; the data sources are real, actually collected data.
  • In step S2, the collaborative knowledge graph mainly serves the improved KGAT model of this embodiment. It is used to learn the features of items in the knowledge graph and the association features between items, and at the same time to learn user features, so that the idea of collaborative filtering can be applied for recommendation.
  • The structure of the collaborative knowledge graph is shown in FIG. 2.
  • To construct it, the user is first taken as a node of the knowledge graph, and the historical interactions between users and items are then added to the knowledge graph as relation edges. Building the collaborative knowledge graph does not change the structure of the domain knowledge graph from step S1; users and the historical user-item interaction data are only added to the graph data, stored as triples (h, r, t), in the data file exported from the domain knowledge graph.
  • For example, items 1, 2, 3 and 4 in FIG. 2 can be movie titles and entities 1, 2 and 3 can be the names of film stars, where the relation r_1 is "interaction", r_2 is "director", and r_3 is "leading role".
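  • The construction described above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function name `build_collaborative_kg`, the relation label `"interact"` and the sample triples are assumptions made for the example.

```python
from typing import Iterable, List, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail)


def build_collaborative_kg(
    kg_triples: Iterable[Triple],
    interactions: Iterable[Tuple[str, str]],
    relation: str = "interact",  # hypothetical label for relation r_1
) -> List[Triple]:
    """Append user-item interactions to the domain KG as relation edges."""
    ckg = list(kg_triples)  # copy: the domain graph itself is left unchanged
    seen = set(ckg)
    for user, item in interactions:
        triple = (user, relation, item)
        if triple not in seen:  # skip repeated interactions
            ckg.append(triple)
            seen.add(triple)
    return ckg


# Hypothetical data in the spirit of FIG. 2 (movies and film stars).
domain_kg = [
    ("item1", "director", "entity1"),
    ("item2", "leading_role", "entity1"),
]
history = [("user1", "item1"), ("user1", "item2"), ("user1", "item1")]
ckg = build_collaborative_kg(domain_kg, history)
```

  • The domain triples are copied rather than modified, which mirrors the statement that building the collaborative graph does not change the structure of the domain knowledge graph.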
  • In step S3, when learning the collaborative knowledge graph, the improved KGAT model jointly learns the structural information of the graph and the collaborative interaction information of users and items, yielding vectorized representations of the graph structure and of the historical user-item interactions; an attention mechanism is used to learn the weight information of the edges adjacent to item nodes in the collaborative knowledge graph; the learned vectorized representations of users and items and the edge weight information are then saved.
  • The structure of the improved KGAT model is shown in FIG. 3 and includes a graph embedding learning layer, an attention propagation layer and a prediction layer;
  • the graph embedding learning layer uses the distance-based translation model TransR graph embedding algorithm to learn vector representations of the entity nodes of the graph;
  • the attention propagation layer performs information propagation, knowledge-aware attention, and information aggregation to obtain multi-layer user node vector representations and multi-layer item node vector representations;
  • the prediction layer performs model prediction and optimization on the output of the information propagation layer to obtain the representation vectors of users and items.
  • In the graph embedding learning layer, the distance-based translation model TransR graph embedding algorithm is used to learn vector representations of the entity nodes of the graph. Assume that the head node, the relation and the tail node of the triple (h, r, t) each have an embedded vector representation in the vector space determined by the relation r (e_r for the relation, together with the projections of e_h and e_t); they are then related by:
  • ‖·‖_2 is the L2 norm
  • W_r ∈ R^(k×d) is a trainable parameter transformation matrix
  • e_h, e_t ∈ R^d and e_r ∈ R^k; e_h and e_t are the d-dimensional embedding vector representations of entity h and entity t
  • e_r is the k-dimensional embedding vector representation of the relation r.
  • g(h, r, t′) is the score of a negatively sampled triple
  • σ(·) is the sigmoid function
  • ln is the natural logarithm
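  • For orientation, the sketch below uses the standard TransR score and pairwise loss as they appear in the original KGAT model, which match the symbols listed above (W_r, e_h, e_r, e_t, the L2 norm, the sigmoid and ln). It is an assumption that the formulas of this application take the same form; the improved model may differ in detail. The function names are illustrative.

```python
import numpy as np


def transr_score(e_h, e_r, e_t, W_r):
    """Standard TransR score g(h, r, t) = ||W_r e_h + e_r - W_r e_t||_2^2.

    e_h, e_t: d-dimensional entity embeddings; e_r: k-dimensional relation
    embedding; W_r: (k, d) projection matrix. Lower means more plausible.
    """
    diff = W_r @ e_h + e_r - W_r @ e_t
    return float(np.sum(diff ** 2))


def kg_pair_loss(pos_score, neg_score):
    """Pairwise loss -ln(sigmoid(g(h, r, t') - g(h, r, t))) for one sample.

    Uses -ln(sigmoid(x)) = ln(1 + exp(-x)), computed stably with logaddexp.
    """
    return float(np.logaddexp(0.0, -(neg_score - pos_score)))
```

  • The loss shrinks as the negative triple (h, r, t′) scores worse (higher) than the observed triple (h, r, t).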
  • In the attention propagation layer, the first-order connectivity information of adjacent nodes is mined first (see FIG. 4), and higher-order connectivity information is then mined recursively.
  • The head node h can be expressed as the weighted sum of the tail nodes connected to it:
  • ω(h, r, t) is the weight of the relation r
  • e_t is the embedding vector of the tail node learned with TransR in the previous layer
  • e_h is the embedding vector of the head node learned with TransR in the previous layer
  • The weight ω(h, r, t) is learned with the attention mechanism.
  • The attention distribution function adopts a three-layer MLP network.
  • The three-layer MLP network used as the attention distribution function is expressed as:
  • relu is the hidden-layer activation function; the projected vectors are the mappings of e_t and e_h into the vector space of the relation r, and the translated head vector is an approximate representation of the tail node in that space. W_1 and W_2 are trainable weight parameter matrices, and ⊙ denotes element-wise multiplication. The softmax function is then used for normalization:
  • N_h is the set of triples (h, r, t).
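  • The attention step can be sketched as follows. The exact MLP formula of this application is given only as an image, so the composition used in `attention_logit` (element-wise product of the translated head vector and the projected tail vector, one relu hidden layer, one scalar output) is purely an assumption built from the symbols named above; the softmax over the neighborhood N_h and the weighted sum of tail nodes follow the text. All function names are hypothetical.

```python
import numpy as np


def attention_logit(e_h_r, e_r, e_t_r, W1, W2):
    """Hypothetical three-layer MLP (input, relu hidden layer, scalar output).

    e_h_r, e_t_r: head/tail embeddings projected into the space of relation r.
    The input (e_h_r + e_r) * e_t_r is an ASSUMED way of combining them.
    """
    x = (e_h_r + e_r) * e_t_r            # element-wise product
    hidden = np.maximum(0.0, W1 @ x)     # relu hidden layer
    return float(W2 @ hidden)            # unnormalized weight w'(h, r, t)


def softmax(logits):
    """Normalize the logits of all triples in N_h so the weights sum to 1."""
    z = np.asarray(logits, dtype=float)
    z = z - z.max()                      # subtract the max for stability
    w = np.exp(z)
    return w / w.sum()


def aggregate_neighbors(weights, tail_embeddings):
    """Represent head node h as the weighted sum of its tail embeddings."""
    return np.asarray(weights) @ np.asarray(tail_embeddings)
```

  • The normalized weights are also what the later recommendation-reason step reads: the edge with the largest weight is taken as the most influential one.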
  • In the prediction layer, the multi-layer vector representations of user and item nodes are obtained by concatenating the vector representations of the different layers, namely:
  • The final predicted score is:
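  • A minimal sketch of this step, assuming (as in the original KGAT model) that the predicted score is the inner product of the concatenated user and item representations; the function names are illustrative.

```python
import numpy as np


def final_representation(layer_vectors):
    """Concatenate a node's representations from layer 0 up to layer L."""
    return np.concatenate(layer_vectors)


def predict_score(user_layers, item_layers):
    """Inner product of the concatenated user and item representations."""
    user_vec = final_representation(user_layers)
    item_vec = final_representation(item_layers)
    return float(user_vec @ item_vec)
```

  • Concatenation keeps the contribution of every propagation depth visible in the final vector, at the cost of a dimension that grows with the number of layers.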
  • The final loss function is the collaborative loss L_CF of user-item interactions plus the graph loss L_KG of the first layer, plus a parameter regularization term:
  • The collaborative loss L_CF for user-item interactions is defined as:
  • O is the training set of triples (u, i, j)
  • R⁺ is the set of positive samples
  • R⁻ is the set of negative samples
  • σ(·) is the sigmoid function.
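  • The loss can be sketched as below. The pairwise form of L_CF shown here is the one used in the original KGAT model and is consistent with the training triples (u, i, j), the positive and negative samples and the sigmoid named above, but it is an assumption that this application uses exactly this form. `reg_coeff` is an illustrative name for the regularization coefficient.

```python
import numpy as np


def cf_loss(pos_scores, neg_scores):
    """Sum over (u, i, j) of -ln(sigmoid(y(u, i) - y(u, j))).

    pos_scores[n] is the score of an observed pair (u, i); neg_scores[n] is
    the score of the sampled unobserved pair (u, j) of the same triple.
    """
    diff = np.asarray(pos_scores, dtype=float) - np.asarray(neg_scores, dtype=float)
    return float(np.sum(np.logaddexp(0.0, -diff)))


def total_loss(l_kg, l_cf, params, reg_coeff):
    """L = L_KG + L_CF + reg_coeff * (squared L2 norm of all parameters)."""
    l2 = sum(float(np.sum(np.asarray(p) ** 2)) for p in params)
    return l_kg + l_cf + reg_coeff * l2
```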
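  • In step S4, ranking the full set of items to be recommended includes recalling a candidate set, scoring each candidate for the target user with the trained user and item vectors, and sorting the candidates by score to select the top N items.
  • The recall step uses recall algorithms based on items, collaborative filtering, popularity statistics and the like; the recalled candidate set is much smaller than the full set of items to be recommended.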
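  • The scoring and top-N selection can be sketched as follows, assuming the candidate set has already been recalled and that the score is the product of the trained user vector and each item vector. The function name and data layout are illustrative.

```python
import numpy as np


def top_n_items(user_vec, item_vecs, candidate_ids, n):
    """Score recalled candidates for one user and keep the N best.

    item_vecs maps item id -> trained embedding; the score is the inner
    product with the user embedding, used here as the click-probability proxy.
    """
    scores = {i: float(user_vec @ item_vecs[i]) for i in candidate_ids}
    ranked = sorted(scores, key=scores.get, reverse=True)  # descending
    return [(i, scores[i]) for i in ranked[:n]]


# Hypothetical embeddings for three recalled candidates.
user = np.array([1.0, 0.0])
items = {
    "a": np.array([0.2, 1.0]),
    "b": np.array([0.9, 0.0]),
    "c": np.array([0.5, 0.0]),
}
result = top_n_items(user, items, ["a", "b", "c"], n=2)
```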
  • In step S5, the recommendation reason is generated from the ranking result of step S4 and the weight information, obtained by model training in step S3, of the edges adjacent to the corresponding item nodes in the knowledge graph.
  • Specifically, for each item in the ranking result, the weight information of all edges adjacent to its node in the knowledge graph is obtained, the edge with the largest weight is selected, and a recommendation explanation is generated from the type or attribute of that edge; the generated recommendation reason is stored together with the corresponding item, and when the user requests a recommendation, it is combined with the item into a recommendation list that is returned to the user.
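  • A minimal sketch of the reason generation, assuming the attention weights of an item's adjacent edges are available as a dictionary keyed by (relation, neighbor). The template texts and names are invented for the example; the source only states that the explanation is derived from the type or attribute of the highest-weight edge.

```python
def recommendation_reason(item, edge_weights, templates):
    """Pick the adjacent edge with the largest weight and fill a template.

    edge_weights: {(relation, neighbor): weight}, must be non-empty.
    templates: {relation: text using {item} and {neighbor}}.
    """
    (relation, neighbor), _ = max(edge_weights.items(), key=lambda kv: kv[1])
    template = templates.get(
        relation, "Recommended because of its link to {neighbor}."
    )
    return template.format(item=item, neighbor=neighbor)


# Hypothetical weights and templates for one ranked item.
weights = {("director", "entity1"): 0.2, ("leading_role", "entity2"): 0.7}
templates = {"leading_role": "{item} stars {neighbor}, whom you may like."}
reason = recommendation_reason("item3", weights, templates)
```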
  • In summary, this embodiment generates a personalized recommendation list for each user and also generates personalized recommendation reasons, thereby improving the credibility of the recommendation results.
  • Moreover, compared with the existing KGAT model, the improved KGAT model reduces the noise introduced by the TransR projection process, and the improved attention distribution function replaces the dot-product operation of the reference model's attention distribution function with a three-layer MLP network, so that training yields more accurate weight data.
  • Experimental results on three public datasets demonstrate that the proposed model outperforms existing methods.
  • This embodiment discloses a knowledge graph recommendation system based on the improved KGAT model, as shown in FIG. 5, including:
  • a domain knowledge graph construction module, used to construct and store the schema layer of the knowledge graph, and to import the acquired data-layer data into the graph database after knowledge fusion;
  • a collaborative knowledge graph construction module, used to take the user as a node of the knowledge graph and add the historical interactions between users and items to the knowledge graph as relation edges to construct a collaborative knowledge graph;
  • a graph embedding and recommendation model training module, used to learn the collaborative knowledge graph with the KGAT model to obtain the full set of items to be recommended;
  • a ranking module, used to rank the full set of items to be recommended and select the top N items as the final set of items to recommend to the user;
  • a recommendation reason generation module, used to generate a recommendation reason based on the ranking result and the weight information of the edges adjacent to the corresponding item nodes in the collaborative knowledge graph; the recommendation reason is saved together with the corresponding item, and when the user requests a recommendation, it is combined with the item into a recommendation list that is returned to the user.
  • Details already described in Embodiment 1 are not repeated here.
  • This embodiment compares the proposed model with existing methods on multiple datasets to verify the effectiveness of the improved knowledge graph recommendation method proposed in this application.
  • Recommended items that are of interest to users are defined as TP;
  • items that are recommended but not of interest to users are defined as FP;
  • items that are not recommended but are of interest to users are defined as FN;
  • items that are not recommended and not of interest to users are defined as TN.
  • As shown in Table 2, the above concepts can be described as a confusion matrix.
  • Accuracy is the ratio of correctly handled items (recommended items that users are interested in plus non-recommended items that users are not interested in) to the total number of samples, that is: Accuracy = (TP + TN) / (TP + FP + FN + TN).
  • Precision is the proportion of recommended items that users are interested in among all recommended items, namely: Precision = TP / (TP + FP).
  • Recall is the proportion of recommended items that users are interested in among all items that users are interested in, namely: Recall = TP / (TP + FN).
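  • The three metrics follow directly from the confusion-matrix counts, where TP and FP count recommended items that are and are not of interest, and FN and TN count non-recommended items that are and are not of interest. The sketch below uses the standard definitions; the counts in the usage lines are made up for illustration and are not results from this application.

```python
def accuracy(tp, fp, fn, tn):
    """Share of all samples that were handled correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def precision(tp, fp):
    """Share of recommended items that the user is interested in."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Share of interesting items that were actually recommended."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Illustrative counts only.
tp, fp, fn, tn = 30, 10, 20, 40
acc, prec, rec = accuracy(tp, fp, fn, tn), precision(tp, fp), recall(tp, fn)
```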
  • The dataset is split in a ratio of 8:2: 80% is used as the training set and 20% as the validation set.
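  • A minimal sketch of such a split, assuming a simple random shuffle of the interaction records; the source does not say whether the split is random, per user, or time-based, so the shuffle and the `seed` parameter are assumptions.

```python
import random


def split_dataset(interactions, train_ratio=0.8, seed=0):
    """Shuffle the records and split them into training and validation sets."""
    records = list(interactions)
    random.Random(seed).shuffle(records)  # seeded for reproducibility
    cut = int(len(records) * train_ratio)
    return records[:cut], records[cut:]


train_set, val_set = split_dataset(range(10))
```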
  • The embedding vector size is set to 64, the batch size is set to 1024, the learning rate is tuned within the range 0.001 to 0.1, and the regularization coefficient is initialized to 0.
  • the accuracy of the model reaches the best.
  • This embodiment discloses an electronic device whose block diagram is shown in FIG. 6. The electronic device 600 shown in FIG. 6 is only an example and should not impose any limitation on the function or scope of application of the embodiments of the present application.
  • The electronic device 600 takes the form of a general-purpose computing device.
  • Components of the electronic device 600 may include, but are not limited to: at least one processing unit 610, at least one storage unit 620, a bus 630 connecting different system components (including the storage unit 620 and the processing unit 610), a display unit 640, and the like.
  • The storage unit stores program code that can be executed by the processing unit 610, so that the processing unit 610 executes the knowledge graph recommendation method based on the improved KGAT model described in Embodiment 1.
  • The storage unit 620 may include a readable medium in the form of a volatile storage unit, such as a random access storage unit (RAM) 6201 and/or a cache storage unit 6202, and may further include a read-only storage unit (ROM) 6203.
  • The storage unit 620 may also include a program/utility 6204 having a set of (at least one) program modules 6205. Such program modules 6205 include but are not limited to: an operating system, one or more application programs, other program modules, and program data; each or some combination of these examples may include an implementation of a network environment.
  • The bus 630 may represent one or more of several types of bus structures, including a storage unit bus or storage unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus architectures.
  • The technical solution according to the embodiments of the present application can be embodied in the form of a software product. The software product can be stored in a non-volatile storage medium (which can be a CD-ROM, a USB flash drive, a portable hard disk, etc.) or on a network, and includes several instructions that cause a computing device (which may be a personal computer, a server, a network device, etc.) to execute the above-mentioned method according to the embodiments of the present application.
  • A software product may use any combination of one or more readable media.
  • The readable medium may be a readable signal medium or a readable storage medium.
  • The readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more conductors, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the foregoing.
  • A readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying readable program code. Such a propagated data signal may take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • A readable signal medium may also be any readable medium other than a readable storage medium that can send, propagate or transport a program for use by or in conjunction with an instruction execution system, apparatus or device.
  • The program code contained on the readable medium may be transmitted by any suitable medium, including but not limited to wireless, wired, optical cable, RF, etc., or any suitable combination of the above.
  • Program code for performing the operations of the present application can be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as "C" or similar programming languages.
  • The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on a remote computing device or server.
  • Where a remote computing device is involved, the remote computing device may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (for example, via the Internet using an Internet service provider).
  • This embodiment also discloses a computer-readable medium, shown in FIG. 7, that carries one or more programs which, when executed, implement the knowledge graph recommendation method based on the improved KGAT model described in Embodiment 1.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Animal Behavior & Ethology (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

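This application relates to a knowledge graph recommendation method and system based on an improved KGAT model. The method includes: constructing a domain knowledge graph; taking users as nodes of the domain knowledge graph and adding the historical interactions between users and items to the knowledge graph as relation edges to construct a collaborative knowledge graph; learning the collaborative knowledge graph with the improved KGAT model to obtain the vectorized representations of users and items and the weight information of the edges adjacent to item nodes; ranking the full set of items to be recommended and selecting the top N items as the item set to recommend to the user; generating recommendation reasons based on the ranking result and the weight information of the edges adjacent to the corresponding item nodes in the collaborative knowledge graph; and, when the user requests a recommendation, returning to the user a recommendation list generated from the recommendation reasons together with the items. The application generates a personalized recommendation list for each user and can also generate personalized recommendation reasons, thereby improving the credibility of the recommendation results.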

Description

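Knowledge graph recommendation method and system based on an improved KGAT model
Technical Field
This application belongs to the technical field of machine learning, and in particular relates to a knowledge graph recommendation method and system based on an improved KGAT model.
Background
A recommendation system is one of the effective tools for addressing information overload. Such a system generally includes user profile modeling, item profile modeling and a recommendation algorithm module, of which the recommendation algorithm is the core. Recommendation algorithms are generally divided into demographic-based recommendation, content-based recommendation, and collaborative filtering. Collaborative filtering is the most widely used and most successful of these, because it does not depend on feature data of users or items and recommends solely from the historical interaction data between users and items; however, it still suffers from problems such as data sparsity and cold start.
The knowledge graph is part of knowledge engineering technology. It is a large heterogeneous information network whose basic elements are triples: for example, (h, r, t) denotes a triple in which h, r and t are the head node, the relation and the tail node, respectively. A knowledge graph establishes deep semantic relationships between items, so that more association information between items can be mined; applying a knowledge graph to a recommendation system can effectively alleviate problems such as data sparsity and cold start.
There are generally three ways to apply a knowledge graph to a recommendation system: embedding-based methods, path-based methods and hybrid methods. Embedding-based and path-based methods each have their own advantages and disadvantages; hybrid methods combine the two and are more efficient. The Knowledge Graph Attention Network (KGAT) model is a knowledge graph embedding learning model based on a hybrid method: it uses both the node information and the edge (relation) information of the graph, and its attention-based learning yields weight information for the edges.
The existing KGAT model first performs embedding learning on the knowledge graph with the TransR graph embedding algorithm. The TransR model projects the triple data into the vector space of the relation r, and this projection introduces noise that degrades model performance. In addition, the attention distribution function of the existing KGAT model is too simple, and improving it can also improve model performance to some extent.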
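Summary
In view of the above analysis, this application discloses a knowledge graph recommendation method and system based on an improved KGAT model, which addresses the problems of the existing KGAT model and enables the generation of personalized recommendation reasons.
In one aspect, this application discloses a knowledge graph recommendation method based on an improved KGAT model, including:
constructing a domain knowledge graph by constructing the schema layer and the graph database of the knowledge graph;
taking the user as a node of the domain knowledge graph, and adding the historical interactions between users and items to the knowledge graph as relation edges to construct a collaborative knowledge graph;
learning the collaborative knowledge graph with the improved KGAT model to obtain the vectorized representations of users and items and the weight information of the edges adjacent to item nodes, where the improved KGAT model uses a three-layer MLP network as the attention distribution function to generate the weight information;
ranking the full set of items to be recommended, and selecting the top N items as the final set of items to recommend to the user;
generating a recommendation reason based on the ranking result and the weight information of the edges adjacent to the corresponding item nodes in the collaborative knowledge graph; when the user requests a recommendation, the recommendation reason and the item are combined into a recommendation list that is returned to the user.
Further, when the improved KGAT model is used to learn the collaborative knowledge graph,
the improved KGAT model jointly learns the structural information of the graph and the collaborative interaction information of users and items, yielding vectorized representations of the graph structure and of the historical user-item interactions; an attention mechanism is used to learn the weight information of the edges adjacent to item nodes in the collaborative knowledge graph; the learned vectorized representations of users and items and the edge weight information are saved.
Further, the improved KGAT model includes a graph embedding learning layer, an attention propagation layer and a prediction layer;
the graph embedding learning layer uses the distance-based translation model TransR graph embedding algorithm to learn vector representations of the entity nodes of the graph;
the attention propagation layer performs information propagation, knowledge-aware attention, and information aggregation to obtain multi-layer user node vector representations and multi-layer item node vector representations;
the prediction layer performs model prediction and optimization on the output of the information propagation layer to obtain the representation vectors of users and items.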
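Further, in the information propagation layer, the three-layer MLP network used as the attention distribution function is expressed as:
Figure PCTCN2022081055-appb-000001
where ω′(h, r, t) is the attention distribution function; relu is the hidden-layer activation function;
Figure PCTCN2022081055-appb-000002
e_r
Figure PCTCN2022081055-appb-000003
are the embedding vectors in the vector space determined by the relation r; W_1 and W_2 are trainable weight parameter matrices, and ⊙ is the element-wise multiplication operation;
the weight learned by the attention mechanism is
Figure PCTCN2022081055-appb-000004
N_h is the set of triples (h, r, t).
Further, the aggregation function for information aggregation in the information propagation layer is as follows:
Figure PCTCN2022081055-appb-000005
where e_h is the node representation learned by the graph embedding layer,
Figure PCTCN2022081055-appb-000006
represents the head node h as the weighted sum of the tail nodes connected to it; "⊙" denotes element-wise multiplication, and W_3 and W_4 are trainable parameter matrices;
the node information fused with one layer of relations is expressed as:
Figure PCTCN2022081055-appb-000007
By iteration, the L-th layer representation of an entity is obtained as:
Figure PCTCN2022081055-appb-000008
where
Figure PCTCN2022081055-appb-000009
Figure PCTCN2022081055-appb-000010
is the representation of the tail node t generated by the previous information propagation step.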
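Further, ranking the full set of items to be recommended comprises:
1) performing recall on the full set of items to be recommended to obtain a coarsely ranked candidate item set;
2) for the target user and the candidate item set, loading the trained feature vectors of the user and the items, and multiplying the user feature vector by each item feature vector to obtain the predicted probability that the user clicks the item;
3) sorting the candidate item set by click probability in descending order, and selecting and saving the top N items as the final item set to recommend to the user.
Further, when generating a recommendation reason, the weight information of all edges adjacent to the corresponding node in the collaborative knowledge graph is obtained according to the ranking result, the edge with the largest weight is selected, and a recommendation explanation is generated from the type or attributes of that edge; the generated recommendation reason is saved with the corresponding item and, when the user requests a recommendation, is returned to the user together with the item in the recommendation list.
In another aspect, the present application discloses a knowledge graph recommendation system based on an improved KGAT model, comprising:
a domain knowledge graph construction module, configured to build and store the schema layer of the knowledge graph and to import the acquired data-layer data into a graph database after knowledge fusion;
a collaborative knowledge graph construction module, configured to add users as nodes of the knowledge graph and the historical interactions between users and items as edges representing relations, thereby constructing a collaborative knowledge graph;
a graph embedding and recommendation model training module, configured to learn the collaborative knowledge graph with the KGAT model to obtain the full set of items to be recommended;
a ranking module, configured to rank the full set of items to be recommended and select the top N items as the final item set to recommend to the user;
a recommendation reason generation module, configured to generate recommendation reasons based on the ranking result and the weight information of the edges adjacent to the item nodes in the collaborative knowledge graph that correspond to the ranking result, to save each recommendation reason with its item and, when the user requests a recommendation, to return them to the user together in a recommendation list.
In another aspect, the present application discloses an electronic device, comprising:
one or more processors;
a storage device for storing one or more programs;
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the knowledge graph recommendation method based on the improved KGAT model described above.
In another aspect, the present application discloses a computer-readable medium storing a computer program which, when executed by a processor, implements the knowledge graph recommendation method based on the improved KGAT model described above.
The present application achieves at least one of the following beneficial effects:
The application generates a personalized recommendation list for each user together with personalized recommendation reasons, which improves the credibility of the recommendation results.
Moreover, compared with the existing KGAT model, the improved KGAT model reduces the noise introduced by the TransR projection, and the improved attention allocation function uses a three-layer MLP network in place of the dot-product operation in the reference model's attention allocation function, so that more accurate weight data are obtained in training. Experimental results on three public datasets show that the model outperforms existing methods.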
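Brief Description of the Drawings
The drawings serve only to illustrate specific embodiments and are not to be regarded as limiting the present application; throughout the drawings, the same reference symbols denote the same components.
Figure 1 is a flowchart of the knowledge graph recommendation method in an embodiment of the present application;
Figure 2 is a schematic diagram of the structure of the collaborative knowledge graph in an embodiment of the present application;
Figure 3 is a schematic diagram of the structure of the improved KGAT model in an embodiment of the present application;
Figure 4 is a schematic diagram of mining the first-order connectivity information of adjacent nodes in an embodiment of the present application;
Figure 5 is a block diagram of the components and connections of the knowledge graph recommendation system in an embodiment of the present application;
Figure 6 is a block diagram of the components and connections of the electronic device in an embodiment of the present application;
Figure 7 is a block diagram of the computer-readable medium in an embodiment of the present application.
Detailed Description
Preferred embodiments of the present application are described in detail below with reference to the drawings, which form part of the application and, together with the embodiments, serve to explain its principles.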
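Embodiment 1
This embodiment discloses a knowledge graph recommendation method based on an improved KGAT model which, as shown in Figure 1, comprises:
Step S1: constructing a domain knowledge graph by building the schema layer of the knowledge graph and a graph database;
Step S2: adding users as nodes of the domain knowledge graph and the historical interactions between users and items as relation edges, thereby constructing a collaborative knowledge graph;
Step S3: learning the collaborative knowledge graph with the improved KGAT model to obtain vectorized representations of users and items and weight information for the edges adjacent to item nodes, wherein the improved KGAT model uses a three-layer MLP network as the attention allocation function to generate the weight information;
Step S4: ranking the full set of items to be recommended and selecting the top N items as the final item set to recommend to the user;
Step S5: generating recommendation reasons based on the ranking result and the weight information of the edges adjacent to the item nodes in the collaborative knowledge graph that correspond to the ranking result; when the user requests a recommendation, returning to the user a recommendation list generated from the recommendation reasons together with the items.
Specifically, step S1 uses a top-down construction approach: the ontology architecture of the knowledge graph schema layer is built first and stored. Data-layer data are then obtained from structured or unstructured data sources and, after knowledge fusion, saved to the graph database; the data sources are actually collected real-world data.
Specifically, the collaborative knowledge graph built in step S2 mainly serves the improved KGAT model of this embodiment. It is used to learn the features of items in the knowledge graph and the association features between items, and at the same time to learn user features, so that the idea of collaborative filtering can be applied to recommendation. The structure of the collaborative knowledge graph is shown in Figure 2. Construction first adds users as nodes of the knowledge graph and then adds the historical interactions between users and items as edges representing relations. Building the collaborative knowledge graph does not change the structure of the domain knowledge graph from step S1; users and their historical interactions with items are merely added to the graph data, stored as triples (h, r, t), in the data file exported from the domain knowledge graph built in step S1.
Without loss of generality, taking movie recommendation as an example, items 1, 2, 3 and 4 in Figure 2 may be movie titles and entities 1, 2 and 3 may be the names of film stars, where relation r_1 is interaction, r_2 is director and r_3 is leading actor.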
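The construction in step S2 can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function name `build_ckg`, the relation label `"interact"` and the movie triples are all hypothetical, chosen to mirror the movie example above.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail)

def build_ckg(kg_triples: List[Triple],
              interactions: List[Tuple[str, str]],
              relation: str = "interact") -> List[Triple]:
    """Return the exported KG triples plus one triple per user-item interaction."""
    ckg = list(kg_triples)  # copy: the domain knowledge graph itself is left unchanged
    for user, item in interactions:
        ckg.append((user, relation, item))  # the user becomes a node, the interaction an edge
    return ckg

# Hypothetical domain knowledge graph exported as (h, r, t) triples.
kg = [("movie_1", "directed_by", "person_1"),
      ("movie_2", "directed_by", "person_1"),
      ("movie_2", "starring", "person_2")]
# Hypothetical interaction history as (user, item) pairs.
history = [("user_1", "movie_1"), ("user_2", "movie_2")]

ckg = build_ckg(kg, history)
```

Keeping the interaction edges in the same triple format as the knowledge edges is what lets a single model learn from both kinds of information.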
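Specifically, in step S3 the improved KGAT model jointly learns the structural information of the graph and the collaborative interaction information between users and items, obtaining vectorized representations of the graph structure and of the historical user-item interactions; the weight information of the edges adjacent to item nodes in the collaborative knowledge graph is learned through an attention mechanism; and the learned vectorized representations of users and items and the edge weight information are saved.
The structure of the improved KGAT model is shown in Figure 3 and comprises a graph embedding learning layer, an attention propagation layer and a prediction layer;
the graph embedding learning layer uses the distance-based translation model TransR as the graph embedding algorithm to learn vector representations of the entity nodes of the graph;
the attention propagation layer performs information propagation, knowledge-aware attention and information aggregation to obtain multi-layer user node vector representations and multi-layer item node vector representations;
the prediction layer performs model prediction and optimization on the output of the information propagation layer to obtain the representation vectors of users and items.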
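The model is implemented as follows:
1) Graph embedding layer:
The distance-based translation model TransR is used as the graph embedding algorithm to learn vector representations of the entity nodes of the graph. Suppose the embedding vectors of a triple $(h,r,t)$ in the vector space determined by relation $r$ are $e_h^r$, $e_r$ and $e_t^r$ respectively; they then satisfy:
$$e_h^r + e_r \approx e_t^r$$
Let $W_r$ be the projection matrix; the distance between the three is then given by:
$$g(h,r,t) = \lVert W_r e_h + e_r - W_r e_t \rVert_2^2$$
where $\lVert\cdot\rVert_2$ is the L2 norm, $e_h^r = W_r e_h$ and $e_t^r = W_r e_t$; $W_r \in R^{k \times d}$ is a trainable transformation matrix; $e_h, e_t \in R^d$ and $e_r \in R^k$, where $e_h$ and $e_t$ are the $d$-dimensional embedding vectors of entities $h$ and $t$, and $e_r$ is the $k$-dimensional embedding vector of relation $r$.
A loss function is then built from this distance:
$$L_{KG} = \sum_{(h,r,t,t') \in T} -\ln \delta\big(g(h,r,t') - g(h,r,t)\big)$$
where
[formula: Figure PCTCN2022081055-appb-000016]
$g(h,r,t')$ is the score of a negatively sampled triple, $\delta$ is the sigmoid function and $\ln$ is the natural logarithm.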
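To make the graph embedding step concrete, the sketch below evaluates the distance g(h,r,t) and one term of the loss L_KG for a single positive and negative triple. It is an illustration under assumptions: the squared L2 norm follows the published KGAT formulation, and the tiny vectors and the helper names `matvec`, `transr_score` and `kg_loss` are invented for the example.

```python
import math

def matvec(W, v):
    """Multiply a k x d matrix (list of rows) by a d-dimensional vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def transr_score(W_r, e_h, e_r, e_t):
    """g(h,r,t) = ||W_r e_h + e_r - W_r e_t||_2^2 (squared norm assumed)."""
    proj_h, proj_t = matvec(W_r, e_h), matvec(W_r, e_t)
    return sum((a + b - c) ** 2 for a, b, c in zip(proj_h, e_r, proj_t))

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def kg_loss(W_r, e_h, e_r, e_t, e_t_neg):
    """One summand of L_KG: -ln sigmoid(g(h,r,t') - g(h,r,t))."""
    pos = transr_score(W_r, e_h, e_r, e_t)
    neg = transr_score(W_r, e_h, e_r, e_t_neg)
    return -math.log(sigmoid(neg - pos))

# Toy values with d = 3 (entity space) and k = 2 (relation space).
W_r = [[1.0, 0.0, 0.0],
       [0.0, 1.0, 0.0]]
e_h = [1.0, 0.0, 5.0]
e_r = [0.0, 1.0]
e_t = [1.0, 1.0, -2.0]     # projects exactly onto W_r e_h + e_r
e_t_neg = [3.0, 3.0, 0.0]  # a corrupted tail

pos_score = transr_score(W_r, e_h, e_r, e_t)
neg_score = transr_score(W_r, e_h, e_r, e_t_neg)
loss = kg_loss(W_r, e_h, e_r, e_t, e_t_neg)
```

The loss is small when the true triple has a much smaller distance than the corrupted one, which is the behaviour the pairwise objective is meant to reward.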
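2) Information propagation layer:
In the manner of graph convolution, the first-order connectivity information adjacent to a node is mined first, and higher-order connectivity information is then mined recursively. As shown in Figure 4, to mine the first-order connectivity information of adjacent nodes, the head node $h$ can be represented as the weighted sum of the tail nodes connected to it:
$$e_{N_h} = \sum_{(h,r,t) \in N_h} \omega(h,r,t)\, e_t$$
where $\omega(h,r,t)$ is the weight of relation $r$, $e_t$ is the embedding vector learned by TransR in the previous layer, and $e_{N_h}$ is the node representation after the weighted sum, as distinct from the graph embedding vector $e_h$ of the head node learned by TransR. The weight $\omega(h,r,t)$ is learned through an attention mechanism.
Preferably, the attention allocation function is a three-layer MLP network, expressed as
[formula: Figure PCTCN2022081055-appb-000019]
where relu is the activation function of the hidden layer; $e_t^r = W_r e_t$ and $e_h^r = W_r e_h$ are the mappings of $e_t$ and $e_h$ into the vector space of relation $r$; $e_h^r + e_r$ is the approximate representation of the tail node in the vector space of relation $r$, that is, $e_h^r + e_r \approx e_t^r$; $W_1$ and $W_2$ are trainable weight matrices; and $\odot$ denotes the element-wise product. A softmax function is then used for normalization, giving the weight of relation $r$:
$$\omega(h,r,t) = \frac{\exp(\omega'(h,r,t))}{\sum_{(h,r',t') \in N_h} \exp(\omega'(h,r',t'))}$$
$e_{N_h}$ is then aggregated with the node representation $e_h$ learned by the graph embedding layer; the aggregation function is as follows:
[formula: Figure PCTCN2022081055-appb-000025]
$e_h^{(1)}$ denotes the node representation that fuses one layer of relations, where $\odot$ denotes the element-wise product and $W_3$ and $W_4$ are trainable parameter matrices.
Repeating the above steps iteratively yields the $l$-th layer representation $e^{(l)}$ of an entity:
[formula: Figure PCTCN2022081055-appb-000027]
where
$$e_{N_h}^{(l-1)} = \sum_{(h,r,t) \in N_h} \omega(h,r,t)\, e_t^{(l-1)}$$
$e_t^{(l-1)}$ is the representation of tail node $t$ generated by the previous information propagation step, and $N_h$ is the set of triples $(h,r,t)$.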
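The normalization and weighted-sum steps of the propagation layer can be sketched as follows. The raw attention scores are taken as given inputs here, standing in for the output of the three-layer MLP; the function names and toy vectors are hypothetical.

```python
import math

def softmax(scores):
    """Normalize raw attention scores over one head node's neighborhood."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def aggregate_neighbors(raw_scores, tail_embeddings):
    """Return the edge weights and the weighted sum of the tail embeddings."""
    weights = softmax(raw_scores)
    dim = len(tail_embeddings[0])
    e_nh = [sum(w * e[k] for w, e in zip(weights, tail_embeddings))
            for k in range(dim)]
    return weights, e_nh

# Two neighbors with equal raw scores: each receives weight 0.5.
weights, e_nh = aggregate_neighbors([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])

# A higher raw score gives the first neighbor the larger weight.
skewed_weights, _ = aggregate_neighbors([2.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

The normalized weights are the per-edge values that the later explanation step reads back to decide which relation best justifies a recommendation.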
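3) Prediction layer:
The multi-layer user node vector representation $e_u^*$ and the multi-layer item node vector representation $e_i^*$ obtained above are multiplied to give the predicted score $\hat{y}(u,i)$, where the multi-layer representations of user and item nodes are formed by concatenating the representations from the different layers, that is:
$$e_u^* = e_u^{(0)} \,\Vert\, \cdots \,\Vert\, e_u^{(L)}, \qquad e_i^* = e_i^{(0)} \,\Vert\, \cdots \,\Vert\, e_i^{(L)}$$
The final predicted score is:
$$\hat{y}(u,i) = {e_u^*}^{\top} e_i^*$$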
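As a small illustration of the prediction layer, the sketch below concatenates per-layer representations and takes the inner product. The two-layer toy vectors and the function names are hypothetical.

```python
def concat_layers(layer_vectors):
    """Concatenate the representations from all propagation layers."""
    return [x for vec in layer_vectors for x in vec]

def predict_score(user_layers, item_layers):
    """Inner product of the concatenated user and item representations."""
    e_u = concat_layers(user_layers)
    e_i = concat_layers(item_layers)
    return sum(a * b for a, b in zip(e_u, e_i))

# Layer-0 and layer-1 representations, each of dimension 2.
user_layers = [[1.0, 2.0], [0.5, 0.0]]
item_layers = [[3.0, 1.0], [2.0, 4.0]]

score = predict_score(user_layers, item_layers)  # 1*3 + 2*1 + 0.5*2 + 0*4
```

Concatenation keeps the contribution of each propagation depth separate, so the score sums one inner product per layer.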
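The final loss function is the collaborative loss $L_{CF}$ of user-item interactions plus the graph loss $L_{KG}$ of the first layer, plus a parameter regularization term:
$$L = L_{KG} + L_{CF} + \lambda \lVert \Theta \rVert_2^2$$
The collaborative loss $L_{CF}$ of user-item interactions is defined as:
$$L_{CF} = \sum_{(u,i,j) \in O} -\ln \delta\big(\hat{y}(u,i) - \hat{y}(u,j)\big)$$
where $O = \{(u,i,j) \mid (u,i) \in R^+, (u,j) \in R^-\}$, $R^+$ is the set of positive samples, $R^-$ is the set of negative samples, and $\delta$ is the sigmoid function.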
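The pairwise collaborative loss can be sketched as follows, assuming the predicted scores for each (user, positive item, negative item) triple have already been computed. The function name `cf_loss` and the sample scores are hypothetical.

```python
import math

def cf_loss(pos_scores, neg_scores):
    """Sum of -ln sigmoid(y(u,i) - y(u,j)) over paired positive/negative scores."""
    total = 0.0
    for y_ui, y_uj in zip(pos_scores, neg_scores):
        sig = 1.0 / (1.0 + math.exp(-(y_ui - y_uj)))
        total += -math.log(sig)
    return total

tied = cf_loss([0.0], [0.0])   # equal scores give ln 2
good = cf_loss([5.0], [0.0])   # positive item ranked well above the negative one
bad = cf_loss([0.0], [5.0])    # negative item ranked above the positive one
```

The loss shrinks as interacted items are scored further above non-interacted ones, which is what drives the ranking used in step S4.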
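Specifically, in step S4, ranking the full set of items to be recommended comprises:
1) performing recall on the full set of items to be recommended to obtain a coarsely ranked candidate item set;
Recall uses algorithms such as item-based recall, collaborative filtering and popularity statistics; the candidate item set obtained by recall is much smaller than the full set of items to be recommended.
2) for the target user and the candidate item set, loading the trained feature vectors of the user and the items, and multiplying the user feature vector by each item feature vector to obtain the predicted probability that the user clicks the item;
3) sorting the candidate item set by click probability in descending order, and selecting and saving the top N items as the final item set to recommend to the user.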
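The three ranking steps can be sketched as follows. The recall stage is represented only by an already filtered candidate list, and the raw inner product is used directly as the ranking score; the function name, item identifiers and vectors are hypothetical.

```python
def top_n(user_vec, item_vecs, candidates, n):
    """Score each candidate by inner product with the user vector and keep the top n."""
    scored = []
    for item in candidates:
        score = sum(u * v for u, v in zip(user_vec, item_vecs[item]))
        scored.append((item, score))
    scored.sort(key=lambda pair: pair[1], reverse=True)  # descending by score
    return scored[:n]

user_vec = [1.0, 0.0]
item_vecs = {"a": [0.9, 0.1], "b": [0.2, 0.8], "c": [0.5, 0.5], "d": [1.0, 0.0]}
candidates = ["a", "b", "c"]  # "d" is assumed to have been dropped by the recall stage

ranked = top_n(user_vec, item_vecs, candidates, n=2)
recommended = [item for item, _ in ranked]
```

Scoring only the recalled candidates rather than the whole catalogue is what keeps the per-request cost low.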
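Specifically, step S5 relies on the ranking result of step S4 and on the weight information, obtained by model training in step S3, of the edges adjacent to the item nodes in the knowledge graph that correspond to the ranking result.
More specifically, according to the ranking result, the weight information of all edges adjacent to the corresponding node in the knowledge graph is obtained, the edge with the largest weight is selected, and a recommendation explanation is generated from the type or attributes of that edge; the generated recommendation reason is saved with the corresponding item and, when the user requests a recommendation, is returned to the user together with the item in the recommendation list.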
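A minimal sketch of the explanation step follows: choose the adjacent edge with the largest learned weight and fill a template keyed by the relation type. The edge weights, relation names and template wording are hypothetical.

```python
def explain(item, edge_weights, templates):
    """Build a recommendation reason from the highest-weight edge adjacent to the item."""
    edges = edge_weights[item]  # list of (relation, neighbor, weight)
    relation, neighbor, _ = max(edges, key=lambda edge: edge[2])
    return templates[relation].format(item=item, entity=neighbor)

edge_weights = {
    "movie_2": [("directed_by", "person_1", 0.7),
                ("starring", "person_2", 0.3)],
}
templates = {
    "directed_by": "Recommended because {item} was directed by {entity}.",
    "starring": "Recommended because {item} stars {entity}.",
}

reason = explain("movie_2", edge_weights, templates)
```

Because the reason is derived from weights the model learned during training, it can be computed and stored ahead of the user's request, as the text describes.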
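In summary, this embodiment generates a personalized recommendation list for each user together with personalized recommendation reasons, which improves the credibility of the recommendation results.
Moreover, compared with the existing KGAT model, the improved KGAT model reduces the noise introduced by the TransR projection, and the improved attention allocation function uses a three-layer MLP network in place of the dot-product operation in the reference model's attention allocation function, so that more accurate weight data are obtained in training. Experimental results on three public datasets show that the model outperforms existing methods.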
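Embodiment 2
This embodiment discloses a knowledge graph recommendation system based on an improved KGAT model which, as shown in Figure 5, comprises:
a domain knowledge graph construction module, configured to build and store the schema layer of the knowledge graph and to import the acquired data-layer data into a graph database after knowledge fusion;
a collaborative knowledge graph construction module, configured to add users as nodes of the knowledge graph and the historical interactions between users and items as edges representing relations, thereby constructing a collaborative knowledge graph;
a graph embedding and recommendation model training module, configured to learn the collaborative knowledge graph with the KGAT model to obtain the full set of items to be recommended;
a ranking module, configured to rank the full set of items to be recommended and select the top N items as the final item set to recommend to the user;
a recommendation reason generation module, configured to generate recommendation reasons based on the ranking result and the weight information of the edges adjacent to the item nodes in the collaborative knowledge graph that correspond to the ranking result, to save each recommendation reason with its item and, when the user requests a recommendation, to return them to the user together in a recommendation list.
The specific technical details and beneficial effects of this embodiment are the same as those of Embodiment 1; please refer to Embodiment 1, and they are not repeated here.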
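Embodiment 3
This embodiment verifies the effectiveness of the improved KGAT model proposed in the present application by comparing it with existing methods on several datasets.
1. Datasets
The experiments use three datasets: Amazon-book, LastFM and Yelp2018. The datasets are described in Table 1:
Table 1
[table: Figure PCTCN2022081055-appb-000037]
2. Evaluation Metrics
This embodiment uses three model evaluation metrics: accuracy, precision and recall. On the test dataset, given the predictions, items that are recommended and are of interest to the user are defined as TP; items that are recommended but are not of interest to the user are defined as FP; items that are not recommended but are of interest to the user are defined as FN; and items that are not recommended and are not of interest to the user are defined as TN. These concepts can be described as the confusion matrix in Table 2.
Table 2
                   Of interest to user (positive)   Not of interest to user (negative)
Recommended        TP                               FP
Not recommended    FN                               TN
Accuracy is the proportion of the total number of samples made up of recommended items that interest the user and non-recommended items that do not interest the user, that is:
$$\text{accuracy} = \frac{TP + TN}{TP + FP + FN + TN}$$
Precision is the proportion of all recommended items that are of interest to the user, that is:
$$\text{precision} = \frac{TP}{TP + FP}$$
Recall is the proportion of all items of interest to the user that are recommended, that is:
$$\text{recall} = \frac{TP}{TP + FN}$$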
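The three metrics follow directly from the confusion-matrix counts, as the sketch below shows. The counts are invented for illustration and are not experimental results from this application.

```python
def evaluate(tp, fp, fn, tn):
    """Compute accuracy, precision and recall from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall

# Hypothetical counts: 10 items recommended, 12 items of interest, 100 in total.
accuracy, precision, recall = evaluate(tp=8, fp=2, fn=4, tn=86)
```

With these counts accuracy is 94/100, precision is 8/10 and recall is 8/12, which shows how a large TN count can keep accuracy high even when recall is modest.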
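3. Learning Process
During learning, the dataset is split in a 2:8 ratio, with 20% used as the validation set and 80% as the training set. The embedding size is set to 64, the batch size to 1024, the learning rate is tuned within the range 0.001 to 0.1, and the regularization coefficient is initialized to 0. In this embodiment the model reached its best accuracy after 100 iterations.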
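The split and the stated settings can be written down as a short sketch. The shuffle seed, the function name and the dictionary keys are assumptions for illustration; only the numeric values come from the text.

```python
import random

def split_dataset(records, val_ratio=0.2, seed=0):
    """Shuffle the records and split them into training and validation parts."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed so the split is reproducible
    n_val = int(len(shuffled) * val_ratio)
    return shuffled[n_val:], shuffled[:n_val]

# Settings as stated in the text; the key names are hypothetical.
config = {
    "embedding_size": 64,
    "batch_size": 1024,
    "learning_rate_range": (0.001, 0.1),
    "regularization_coefficient": 0.0,
    "iterations": 100,
}

train, val = split_dataset(range(10))
```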
4. Comparison Results
The present application is compared on the three datasets with BPRMF [Rendle S, Freudenthaler C, Gantner Z, et al. BPR: Bayesian personalized ranking from implicit feedback[C]. //The 25th Conference on Uncertainty in Artificial Intelligence. Canada: UAI, 2009: 452-461], CKE [Zhang F, Yuan N J, Lian D, et al. Collaborative knowledge base embedding for Recommender Systems[C]. //Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. San Francisco, USA, 2016: 353-362] and KGAT [Wang X, He X, Cao Y, et al. KGAT: Knowledge Graph Attention Network for recommendation[C]. //Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Anchorage: KDD, 2019: 950-958]. The comparison results are shown in Table 3. For top-20 recommendation, the improved knowledge graph attention network model proposed in the present application clearly outperforms the compared methods.
Table 3
[table: Figure PCTCN2022081055-appb-000041]
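Embodiment 4
This embodiment discloses an electronic device, a block diagram of which is shown in Figure 6. The electronic device 600 shown in Figure 6 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present application.
As shown in Figure 6, the electronic device 600 takes the form of a general-purpose computing device. Its components may include, but are not limited to: at least one processing unit 610, at least one storage unit 620, a bus 630 connecting the different system components (including the storage unit 620 and the processing unit 610), a display unit 640, and so on.
The storage unit stores program code that can be executed by the processing unit 610, causing the processing unit 610 to carry out the knowledge graph recommendation method based on the improved KGAT model described in Embodiment 1.
The storage unit 620 may include readable media in the form of volatile storage units, such as a random access memory (RAM) unit 6201 and/or a cache unit 6202, and may further include a read-only memory (ROM) unit 6203.
The storage unit 620 may also include a program/utility 6204 having a set of (at least one) program modules 6205. Such program modules 6205 include, but are not limited to: an operating system, one or more application programs, other program modules and program data; each of these examples, or some combination of them, may include an implementation of a network environment.
The bus 630 may represent one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus structures.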
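Embodiment 5
From the description of the above embodiments, those skilled in the art will readily understand that the example embodiments described here can be implemented in software, or in software combined with the necessary hardware.
Accordingly, as shown in Figure 7, the technical solution according to the embodiments of the present application can be embodied in the form of a software product. The software product can be stored in a non-volatile storage medium (such as a CD-ROM, USB flash drive or removable hard disk) or on a network, and includes a number of instructions that cause a computing device (such as a personal computer, a server or a network device) to carry out the above method according to the embodiments of the present application.
The software product may use any combination of one or more readable media. A readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example but without limitation, an electrical, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any combination of these. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal or any suitable combination of the above. A readable signal medium may also be any readable medium other than a readable storage medium that can send, propagate or transmit a program for use by, or in connection with, an instruction execution system, apparatus or device. Program code contained on a readable medium may be transmitted over any suitable medium, including but not limited to wireless, wired, optical cable, RF and so on, or any suitable combination of the above.
Program code for carrying out the operations of the present application may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++ as well as conventional procedural programming languages such as the "C" language or similar. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on a remote computing device or server. Where a remote computing device is involved, it may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it may be connected to an external computing device (for example, through the Internet using an Internet service provider).
The above computer-readable medium carries one or more programs which, when executed by a device, implement the knowledge graph recommendation method based on the improved KGAT model described in Embodiment 1.
The above are merely preferred specific embodiments of the present application, and the scope of protection of the application is not limited to them. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the present application shall fall within its scope of protection.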

Claims (10)

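  1. A knowledge graph recommendation method based on an improved KGAT model, characterized by comprising:
    constructing a domain knowledge graph by building the schema layer of the knowledge graph and a graph database;
    adding users as nodes of the domain knowledge graph and the historical interactions between users and items as relation edges, thereby constructing a collaborative knowledge graph;
    learning the collaborative knowledge graph with the improved KGAT model to obtain vectorized representations of users and items and weight information for the edges adjacent to item nodes, wherein the improved KGAT model uses a three-layer MLP network as the attention allocation function to generate the weight information;
    ranking the full set of items to be recommended and selecting the top N items as the final item set to recommend to the user;
    generating recommendation reasons based on the ranking result and the weight information of the edges adjacent to the item nodes in the collaborative knowledge graph that correspond to the ranking result; and, when the user requests a recommendation, returning to the user a recommendation list generated from the recommendation reasons together with the items.
  2. The knowledge graph recommendation method according to claim 1, characterized in that, in learning the collaborative knowledge graph with the improved KGAT model,
    the improved KGAT model jointly learns the structural information of the graph and the collaborative interaction information between users and items, obtaining vectorized representations of the graph structure and of the historical user-item interactions; the weight information of the edges adjacent to item nodes in the collaborative knowledge graph is learned through an attention mechanism; and the learned vectorized representations of users and items and the edge weight information are saved.
  3. The knowledge graph recommendation method according to claim 2, characterized in that the improved KGAT model comprises a graph embedding learning layer, an attention propagation layer and a prediction layer;
    the graph embedding learning layer uses the distance-based translation model TransR as the graph embedding algorithm to learn vector representations of the entity nodes of the graph;
    the attention propagation layer performs information propagation, knowledge-aware attention and information aggregation to obtain multi-layer user node vector representations and multi-layer item node vector representations;
    the prediction layer performs model prediction and optimization on the output of the information propagation layer to obtain the representation vectors of users and items.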
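  4. The knowledge graph recommendation method according to claim 3, characterized in that, in the information propagation layer, the three-layer MLP network serving as the attention allocation function is expressed as:
    [formula: Figure PCTCN2022081055-appb-100001]
    where $\omega'(h,r,t)$ is the attention allocation function; relu is the activation function of the hidden layer; $e_h^r$, $e_r$ and $e_t^r$ are the embedding vectors in the vector space determined by relation $r$; $W_1$ and $W_2$ are trainable weight matrices; and $\odot$ is the element-wise product;
    the weight learned through the attention mechanism is
    $$\omega(h,r,t) = \frac{\exp(\omega'(h,r,t))}{\sum_{(h,r',t') \in N_h} \exp(\omega'(h,r',t'))}$$
    where $N_h$ is the set of triples $(h,r,t)$.
  5. The knowledge graph recommendation method according to claim 4, characterized in that the aggregation function used for information aggregation in the information propagation layer is as follows:
    [formula: Figure PCTCN2022081055-appb-100005]
    where $e_h$ is the node representation learned by the graph embedding layer, and $e_{N_h}$ represents the head node $h$ as the weighted sum of the tail nodes connected to it; $W_3$ and $W_4$ are trainable parameter matrices;
    the node representation that fuses one layer of relations is:
    [formula: Figure PCTCN2022081055-appb-100007]
    through iteration, the $l$-th layer representation of an entity is obtained as:
    [formula: Figure PCTCN2022081055-appb-100008]
    where $e_t^{(l-1)}$ is the representation of tail node $t$ generated by the previous information propagation step.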
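  6. The knowledge graph recommendation method according to claim 1, characterized in that
    ranking the full set of items to be recommended comprises:
    1) performing recall on the full set of items to be recommended to obtain a coarsely ranked candidate item set;
    2) for the target user and the candidate item set, loading the trained feature vectors of the user and the items, and multiplying the user feature vector by each item feature vector to obtain the predicted probability that the user clicks the item;
    3) sorting the candidate item set by click probability in descending order, and selecting and saving the top N items as the final item set to recommend to the user.
  7. The knowledge graph recommendation method according to claim 1, characterized in that, when generating a recommendation reason, the weight information of all edges adjacent to the corresponding node in the collaborative knowledge graph is obtained according to the ranking result, the edge with the largest weight is selected, and a recommendation explanation is generated from the type or attributes of that edge; the generated recommendation reason is saved with the corresponding item and, when the user requests a recommendation, is returned to the user together with the item in the recommendation list.
  8. A knowledge graph recommendation system based on an improved KGAT model, characterized by comprising:
    a domain knowledge graph construction module, configured to build and store the schema layer of the knowledge graph and to import the acquired data-layer data into a graph database after knowledge fusion;
    a collaborative knowledge graph construction module, configured to add users as nodes of the knowledge graph and the historical interactions between users and items as edges representing relations, thereby constructing a collaborative knowledge graph;
    a graph embedding and recommendation model training module, configured to learn the collaborative knowledge graph with the KGAT model to obtain the full set of items to be recommended;
    a ranking module, configured to rank the full set of items to be recommended and select the top N items as the final item set to recommend to the user;
    a recommendation reason generation module, configured to generate recommendation reasons based on the ranking result and the weight information of the edges adjacent to the item nodes in the collaborative knowledge graph that correspond to the ranking result, to save each recommendation reason with its item and, when the user requests a recommendation, to return them to the user together in a recommendation list.
  9. An electronic device, characterized by comprising:
    one or more processors;
    a storage device for storing one or more programs;
    wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the knowledge graph recommendation method based on the improved KGAT model according to any one of claims 1-7.
  10. A computer-readable medium storing a computer program, characterized in that the program, when executed by a processor, implements the knowledge graph recommendation method based on the improved KGAT model according to any one of claims 1-7.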
PCT/CN2022/081055 2021-12-01 2022-03-16 一种基于改进型kgat模型的知识图谱推荐方法及系统 WO2023097929A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111457641.6 2021-12-01
CN202111457641.6A CN114048331A (zh) 2021-12-01 2021-12-01 一种基于改进型kgat模型的知识图谱推荐方法及系统

Publications (1)

Publication Number Publication Date
WO2023097929A1 true WO2023097929A1 (zh) 2023-06-08

Family

ID=80212007

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/081055 WO2023097929A1 (zh) 2021-12-01 2022-03-16 一种基于改进型kgat模型的知识图谱推荐方法及系统

Country Status (2)

Country Link
CN (1) CN114048331A (zh)
WO (1) WO2023097929A1 (zh)

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116541538A (zh) * 2023-07-06 2023-08-04 广东信聚丰科技股份有限公司 基于大数据的智慧学习知识点挖掘方法及系统
CN117033775A (zh) * 2023-07-28 2023-11-10 广东工业大学 基于知识图谱的工业软件的组件推荐方法及系统
CN117056575A (zh) * 2023-10-12 2023-11-14 深圳市华图测控系统有限公司 一种基于智能图书推荐系统数据采集的方法
CN117216417A (zh) * 2023-11-07 2023-12-12 北京智谱华章科技有限公司 融合知识信息和协同信息的推荐方法、装置、设备及介质
CN117539996A (zh) * 2023-11-21 2024-02-09 北京拓医医疗科技服务有限公司 一种基于用户画像的咨询问答方法及系统
CN117573904A (zh) * 2024-01-17 2024-02-20 广东讯飞启明科技发展有限公司 基于识别分析的多媒体教学资源知识图谱生成方法及系统
CN117609466A (zh) * 2023-12-04 2024-02-27 北方工业大学 一种基于大数据分析的语音智能问答系统
CN117688247A (zh) * 2024-01-31 2024-03-12 云南大学 推荐方法、终端设备及存储介质
CN117952724A (zh) * 2024-03-21 2024-04-30 烟台大学 基于知识图谱和神经网络的物品推荐方法、系统和设备
CN117992679A (zh) * 2024-02-23 2024-05-07 宁夏大学 一种项目推荐方法、系统与计算机设备
CN118013135A (zh) * 2024-02-29 2024-05-10 重庆理工大学 基于关系图卷积神经网络的图对比学习推荐方法
CN118052291A (zh) * 2024-04-16 2024-05-17 北京海纳数聚科技有限公司 一种基于扩张因果图嵌入的垂直领域大语言模型训练方法
CN118133883A (zh) * 2024-05-06 2024-06-04 杭州海康威视数字技术股份有限公司 图采样方法、图谱预测方法及存储介质

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114048331A (zh) * 2021-12-01 2022-02-15 浙江师范大学 一种基于改进型kgat模型的知识图谱推荐方法及系统
CN115221413B (zh) * 2022-08-03 2023-04-14 湖北工业大学 一种基于交互式图注意力网络的序列推荐方法及系统
CN115934990B (zh) * 2022-10-24 2023-05-12 北京数慧时空信息技术有限公司 基于内容理解的遥感影像推荐方法
CN117252664A (zh) * 2023-11-10 2023-12-19 浙江口碑网络技术有限公司 药品推荐理由生成方法、装置、介质及设备
CN117290611B (zh) * 2023-11-24 2024-02-23 北京信立方科技发展股份有限公司 基于多层次知识图谱的仪器推荐方法及装置

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190266497A1 (en) * 2018-02-23 2019-08-29 Microsoft Technology Licensing, Llc Knowledge-graph-driven recommendation of career path transitions
CN112199508A (zh) * 2020-08-10 2021-01-08 淮阴工学院 一种基于远程监督的参数自适应农业知识图谱推荐方法
CN113158033A (zh) * 2021-03-19 2021-07-23 浙江工业大学 一种基于知识图谱偏好传播的协同推荐模型构建方法
CN114048331A (zh) * 2021-12-01 2022-02-15 浙江师范大学 一种基于改进型kgat模型的知识图谱推荐方法及系统

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
LI, KUNLUN ET AL.: "Recommendation Algorithm Based on Attention Mechanism and Improved TF-IDF", COMPUTER ENGINEERING, vol. 47, no. 8, 31 August 2021 (2021-08-31), pages 69 - 76, XP009546862, ISSN: 1000-3428 *
WANG, XIANG ET AL.: "KGAT: Knowledge Graph Attention Network for Recommendation", KDD '19: PROCEEDINGS OF THE 25TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 25 July 2019 (2019-07-25), pages 950 - 958, XP058635204, DOI: 10.1145/3292500.3330989 *

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116541538A (zh) * 2023-07-06 2023-08-04 广东信聚丰科技股份有限公司 基于大数据的智慧学习知识点挖掘方法及系统
CN116541538B (zh) * 2023-07-06 2023-09-01 广东信聚丰科技股份有限公司 基于大数据的智慧学习知识点挖掘方法及系统
CN117033775A (zh) * 2023-07-28 2023-11-10 广东工业大学 基于知识图谱的工业软件的组件推荐方法及系统
CN117033775B (zh) * 2023-07-28 2024-03-19 广东工业大学 基于知识图谱的工业软件的组件推荐方法及系统
CN117056575A (zh) * 2023-10-12 2023-11-14 深圳市华图测控系统有限公司 一种基于智能图书推荐系统数据采集的方法
CN117056575B (zh) * 2023-10-12 2024-01-30 深圳市华图测控系统有限公司 一种基于智能图书推荐系统数据采集的方法
CN117216417B (zh) * 2023-11-07 2024-02-20 北京智谱华章科技有限公司 融合知识信息和协同信息的推荐方法、装置、设备及介质
CN117216417A (zh) * 2023-11-07 2023-12-12 北京智谱华章科技有限公司 融合知识信息和协同信息的推荐方法、装置、设备及介质
CN117539996A (zh) * 2023-11-21 2024-02-09 北京拓医医疗科技服务有限公司 一种基于用户画像的咨询问答方法及系统
CN117609466A (zh) * 2023-12-04 2024-02-27 北方工业大学 一种基于大数据分析的语音智能问答系统
CN117573904A (zh) * 2024-01-17 2024-02-20 广东讯飞启明科技发展有限公司 基于识别分析的多媒体教学资源知识图谱生成方法及系统
CN117573904B (zh) * 2024-01-17 2024-04-30 广东讯飞启明科技发展有限公司 基于识别分析的多媒体教学资源知识图谱生成方法及系统
CN117688247A (zh) * 2024-01-31 2024-03-12 云南大学 推荐方法、终端设备及存储介质
CN117688247B (zh) * 2024-01-31 2024-04-12 云南大学 推荐方法、终端设备及存储介质
CN117992679A (zh) * 2024-02-23 2024-05-07 宁夏大学 一种项目推荐方法、系统与计算机设备
CN118013135A (zh) * 2024-02-29 2024-05-10 重庆理工大学 基于关系图卷积神经网络的图对比学习推荐方法
CN117952724A (zh) * 2024-03-21 2024-04-30 烟台大学 基于知识图谱和神经网络的物品推荐方法、系统和设备
CN118052291A (zh) * 2024-04-16 2024-05-17 北京海纳数聚科技有限公司 一种基于扩张因果图嵌入的垂直领域大语言模型训练方法
CN118133883A (zh) * 2024-05-06 2024-06-04 杭州海康威视数字技术股份有限公司 图采样方法、图谱预测方法及存储介质

Also Published As

Publication number Publication date
CN114048331A (zh) 2022-02-15

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22899752

Country of ref document: EP

Kind code of ref document: A1