CN116662637A - Content recommendation method, device, apparatus, storage medium and program product - Google Patents


Publication number
CN116662637A
Authority
CN
China
Prior art keywords
content
node
features
feature
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210153239.7A
Other languages
Chinese (zh)
Inventor
常亚宁
马建强
林宇澄
李作潮
黄海兵
亓超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN202210153239.7A
Publication of CN116662637A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/02 Knowledge representation; Symbolic representation
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a content recommendation method, device, equipment, storage medium and program product, and relates to the field of artificial intelligence. The method comprises: obtaining a content heterogeneous graph, wherein the content heterogeneous graph comprises a homogeneous meta-path and a heterogeneous meta-path, the homogeneous meta-path being formed by content nodes corresponding to content, and the heterogeneous meta-path being formed by content nodes and attribute nodes corresponding to content attributes; performing feature extraction on a target content node based on the homogeneous meta-path to obtain a content basic feature, wherein the target content node is one of the content nodes; performing feature extraction on the target content node based on the heterogeneous meta-path to obtain a content attribute feature; performing feature fusion on the content basic feature and the content attribute feature to obtain a target content feature of the target content node; and performing content recommendation based on the target content feature. The method provided by the embodiments of the application can improve the accuracy with which the target content feature represents the target content, which helps improve the accuracy of content recommendation.

Description

Content recommendation method, device, apparatus, storage medium and program product
Technical Field
The embodiment of the application relates to the field of artificial intelligence, in particular to a content recommendation method, a content recommendation device, a storage medium and a program product.
Background
Currently, when a user views content such as videos or articles, a content recommendation function recommends content that may be of interest to the user for viewing or browsing.
In the related art, when content recommendation is performed, the relevance between pieces of content is determined based on attribute information of the content, and content is then recommended based on that relevance. However, when content recommendation is based only on the attribute information of the content, only similar content with the same attributes can be recommended, resulting in lower recommendation accuracy.
Disclosure of Invention
The embodiments of the application provide a content recommendation method, device, equipment, storage medium and program product, which help improve the accuracy of content recommendation. The technical solutions are as follows:
In one aspect, an embodiment of the present application provides a content recommendation method, the method including:
obtaining a content heterogeneous graph, wherein the content heterogeneous graph includes a homogeneous meta-path and a heterogeneous meta-path, the homogeneous meta-path is formed by content nodes corresponding to content, and the heterogeneous meta-path is formed by the content nodes and attribute nodes corresponding to content attributes;
performing feature extraction on a target content node based on the homogeneous meta-path to obtain a content basic feature, wherein the target content node is one of the content nodes;
performing feature extraction on the target content node based on the heterogeneous meta-path to obtain a content attribute feature;
performing feature fusion on the content basic feature and the content attribute feature to obtain a target content feature of the target content node;
and performing content recommendation based on the target content feature.
In another aspect, an embodiment of the present application provides a content recommendation apparatus, including:
an acquisition module, configured to obtain a content heterogeneous graph, wherein the content heterogeneous graph includes a homogeneous meta-path and a heterogeneous meta-path, the homogeneous meta-path is formed by content nodes corresponding to content, and the heterogeneous meta-path is formed by the content nodes and attribute nodes corresponding to content attributes;
a first extraction module, configured to perform feature extraction on a target content node based on the homogeneous meta-path to obtain a content basic feature, wherein the target content node is one of the content nodes;
a second extraction module, configured to perform feature extraction on the target content node based on the heterogeneous meta-path to obtain a content attribute feature;
a feature fusion module, configured to perform feature fusion on the content basic feature and the content attribute feature to obtain a target content feature of the target content node;
and a content recommendation module, configured to perform content recommendation based on the target content feature.
In another aspect, embodiments of the present application provide a computer device, where the computer device includes a processor and a memory, where at least one instruction, at least one program, a code set, or an instruction set is stored, where the at least one instruction, the at least one program, the code set, or the instruction set is loaded and executed by the processor to implement the content recommendation method as described in the above aspect.
In another aspect, a computer readable storage medium having stored therein at least one instruction, at least one program, a set of codes, or a set of instructions loaded and executed by a processor to implement the content recommendation method as described in the above aspect is provided.
In another aspect, embodiments of the present application provide a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions so that the computer device performs the content recommendation method provided in the above aspect.
The technical solutions provided by the embodiments of the application have at least the following beneficial effects:
In the embodiments of the application, the homogeneous meta-path and the heterogeneous meta-path in the content heterogeneous graph are each used to perform feature extraction on the target content node corresponding to the target content, and the resulting features are fused to obtain the target content feature that finally represents the target content. Because the target content feature is determined by fusing the association relationships between pieces of content with the attribute information of the content, it combines multiple kinds of information and represents the target content more accurately. Consequently, when content is recommended based on the target content feature, both the association relationships between pieces of content and the attribute information of the content are taken into account, which helps improve the accuracy of content recommendation.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of a content recommendation method according to an embodiment of the present application;
FIG. 2 illustrates a schematic diagram of an implementation environment provided by an exemplary embodiment of the present application;
FIG. 3 illustrates a flowchart of a content recommendation method provided by an exemplary embodiment of the present application;
FIG. 4 illustrates a flowchart of a content recommendation method provided by another exemplary embodiment of the present application;
FIG. 5 illustrates a schematic diagram of an implementation of a feature extraction process provided by an exemplary embodiment of the present application;
FIG. 6 illustrates a flowchart of a content recommendation method provided by another exemplary embodiment of the present application;
FIG. 7 illustrates a schematic diagram of an implementation of a contrastive learning process provided by an exemplary embodiment of the present application;
FIG. 8 illustrates a schematic diagram of a content recommendation process provided by an exemplary embodiment of the present application;
FIG. 9 illustrates a block diagram of the structure of a content recommendation apparatus according to an exemplary embodiment of the present application;
FIG. 10 illustrates a schematic diagram of the structure of a computer device according to an exemplary embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present application more apparent, the embodiments of the present application will be described in further detail with reference to the accompanying drawings.
References herein to "a plurality" mean two or more. "And/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, "A and/or B" may indicate that A exists alone, that A and B exist together, or that B exists alone. The character "/" generally indicates an "or" relationship between the objects before and after it.
In the related art, content is recommended based on attribute information of the content. For example, pieces of content with the same label are determined to be relevant to each other, so content with the same label is recommended. Recommending content in this way has certain limitations: the recommended content may not meet the needs of the user, that is, the recommendation is inaccurate. Therefore, in the embodiments of the application, content recommendation uses both the association information between pieces of content and the attribute information of the content, which improves the accuracy of content recommendation.
FIG. 1 is a schematic diagram of content recommendation according to an embodiment of the present application. When content recommendation is performed, feature extraction can be performed on the target content node according to the homogeneous meta-path and the heterogeneous meta-path in the content heterogeneous graph 101, respectively, to obtain a content basic feature 102 and a content attribute feature 103. Feature fusion can then be performed on the content basic feature 102 and the content attribute feature 103 to obtain a target content feature 104, so that content recommendation can be performed based on the target content feature 104.
In the embodiment of the application, the accuracy of representing the target content by the target content characteristics can be improved because the association relation between the target content and other contents and the attribute information of the target content are considered when the target content characteristics of the target content are determined, so that the accuracy of content recommendation can be improved when the content recommendation is performed based on the target content characteristics.
The method provided by the embodiment of the application can be applied to any scene of content recommendation. The following describes schematically an application scenario of the content recommendation method provided by the embodiment of the present application.
1. Video recommendation scenario
When applied to a video recommendation scenario, the method provided by the embodiments of the application can run on a background server of a video playing platform. The background server can perform feature extraction on a video based on the homogeneous meta-paths in a video heterogeneous graph to obtain video basic features, perform feature extraction on the video based on the heterogeneous meta-paths to obtain video attribute features, and perform feature fusion on the video basic features and the video attribute features to obtain video features. Similar videos are then determined based on the video features of each video, and video recommendation is performed.
2. Article recommendation scenario
When applied to an article recommendation scenario, the method provided by the embodiments of the application can run on a background server of an article browsing platform. The background server can perform feature extraction on an article based on the homogeneous meta-paths in an article heterogeneous graph to obtain article basic features, perform feature extraction on the article based on the heterogeneous meta-paths to obtain article attribute features, and perform feature fusion on the article basic features and the article attribute features to obtain article features. Similar articles are then determined based on the article features of each article, and article recommendation is performed.
The above application scenarios are only illustrative. The method provided by the embodiments of the present application may also be applied to other scenarios where content recommendation is required, and the embodiments of the present application do not limit the actual application scenario.
FIG. 2 illustrates a schematic diagram of an implementation environment provided by an exemplary embodiment of the present application. The implementation environment includes a terminal 210 and a server 220. Data communication between the terminal 210 and the server 220 is performed through a communication network. Optionally, the communication network may be a wired network or a wireless network, and may be at least one of a local area network, a metropolitan area network, and a wide area network.
The terminal 210 is an electronic device provided with a content recommendation function. The electronic device may be a mobile terminal such as a smart phone, a tablet computer, or a notebook computer, or a terminal such as a desktop computer, a projection computer, a smart television, or an intelligent vehicle terminal, which is not limited in the embodiments of the present application.
The server 220 may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, a content delivery network (Content Delivery Network, CDN), and big data and artificial intelligence platforms. In the embodiments of the present application, the server 220 is the background server of the client in the terminal 210 that provides the content recommendation function. It may construct a content heterogeneous graph based on the association relationships between pieces of content and the attribute information of the content, and perform feature extraction on each piece of content based on the content heterogeneous graph to obtain the content feature of each piece of content, so that content recommendation is performed based on the content features. In one possible implementation, when the user views target content, the server 220 may obtain the target content viewed at the terminal 210, determine recommended content corresponding to the target content, and transmit the recommended content to the terminal 210 for presentation.
For convenience of description, the following embodiments are described using an example in which the content recommendation method is performed by a computer device. The computer device may be the terminal or the server shown in FIG. 2. The embodiments of the application can be applied to various scenarios, including but not limited to cloud technology, artificial intelligence, intelligent transportation, and assisted driving.
Referring to FIG. 3, a flowchart of a content recommendation method according to an exemplary embodiment of the present application is shown. This embodiment is described using an example in which the method is applied to a computer device, and the method includes the following steps.
Step 301, obtaining a content heterogeneous graph, wherein the content heterogeneous graph includes homogeneous meta-paths and heterogeneous meta-paths, the homogeneous meta-paths are formed by content nodes corresponding to content, and the heterogeneous meta-paths are formed by content nodes and attribute nodes corresponding to content attributes.
Optionally, a graph is made up of a set of nodes and the edges connecting the nodes. A heterogeneous graph is a graph that contains multiple types of nodes or multiple types of edges. The content heterogeneous graph is composed of content nodes corresponding to content, attribute nodes corresponding to content attributes, and edges connecting the nodes, where different edges may indicate different relationships between nodes. For example, when the content is video content, the content heterogeneous graph may include video nodes corresponding to videos and attribute nodes corresponding to video attributes, such as label nodes corresponding to video labels, actor nodes corresponding to actors in the videos, and director nodes corresponding to video directors.
In the embodiments of the application, the content heterogeneous graph is a heterogeneous graph formed by homogeneous meta-paths and heterogeneous meta-paths, where a meta-path refers to a particular path pattern connecting two entities. A homogeneous meta-path is a meta-path formed by nodes of the same type; taking video as an example, this is "video-video (I-I)". The content corresponding to the nodes connected in a homogeneous meta-path has an association relationship, and the association relationship is of the same kind; for example, the videos are watched by the same user, collected by the same user, or liked or commented on by the same user. A heterogeneous meta-path is a meta-path formed by nodes of different types, including content nodes and attribute nodes, for example "video-actor-video (I-A-I)". The content heterogeneous graph may include heterogeneous meta-paths of different types, each representing the relationship between content and a different attribute, for example "video-actor-video", "video-director-video (I-D-I)", and "video-tag-video (I-T-I)".
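As an illustration of the two kinds of meta-path, the following minimal Python sketch derives, for each video node, the neighbors reachable along "video-video (I-I)" and along a "video-attribute-video" pattern such as I-A-I. The node identifiers, the edge lists, and the function names are hypothetical and are not taken from the application; the application does not prescribe any particular data structure.

```python
from collections import defaultdict

# Hypothetical toy data; all ids and values are invented for illustration.
co_watch_edges = [("v1", "v2"), ("v2", "v3")]  # I-I edges (e.g. watched by the same user)
attribute_edges = [                            # (video, attribute type, attribute value)
    ("v1", "actor", "a1"),
    ("v2", "actor", "a1"),
    ("v1", "director", "d1"),
    ("v3", "director", "d1"),
    ("v3", "tag", "t1"),
]

def homogeneous_neighbors(edges):
    """Map each content node to the content nodes it reaches along I-I."""
    nbrs = defaultdict(set)
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    return nbrs

def heterogeneous_neighbors(edges, attr_type):
    """Map each content node to the content nodes it reaches along
    content-attribute-content for one attribute type (e.g. I-A-I)."""
    by_attr = defaultdict(set)
    for video, kind, value in edges:
        if kind == attr_type:
            by_attr[value].add(video)
    nbrs = defaultdict(set)
    for videos in by_attr.values():
        for v in videos:
            nbrs[v] |= videos - {v}
    return nbrs

ii = homogeneous_neighbors(co_watch_edges)                  # video-video
iai = heterogeneous_neighbors(attribute_edges, "actor")     # video-actor-video
idi = heterogeneous_neighbors(attribute_edges, "director")  # video-director-video
```

Keeping one neighbor map per meta-path type mirrors the description above, in which each heterogeneous meta-path type expresses the relationship between content and one attribute.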
Step 302, performing feature extraction on a target content node based on the homogeneous meta-path to obtain a content basic feature, wherein the target content node is one of the content nodes.
The target content node is one of the content nodes in the content heterogeneous graph.
In one possible implementation, because the content nodes in a homogeneous meta-path have an association relationship, feature extraction can be performed on the target content node based on the homogeneous meta-paths related to the target content node in the content heterogeneous graph to obtain the content basic feature. In this way, the features shared by the target content corresponding to the target content node and the content associated with it can be learned.
Optionally, the content basic feature is a vector representation of the content in a low-dimensional space; it is an implicit representation in the form of a multi-dimensional vector.
Step 303, performing feature extraction on the target content node based on the heterogeneous meta-path to obtain a content attribute feature.
Feature extraction is also performed on the target content node based on the heterogeneous meta-paths related to the target content node. Because a heterogeneous meta-path indicates attributes of the content, performing feature extraction on the target content node based on the heterogeneous meta-paths allows the attribute features of the target content to be learned.
Accordingly, the content attribute feature is also a vector representation of the content in a low-dimensional space.
The content heterogeneous graph includes different types of heterogeneous meta-paths, and correspondingly the heterogeneous meta-paths related to the target content node are also of different types. Therefore, feature extraction is performed on the target content node separately for each type of heterogeneous meta-path, yielding a different content attribute feature for each type.
For example, feature extraction is performed on a target video node corresponding to a target video based on "video-actor-video" to obtain a video actor attribute feature; based on "video-director-video" to obtain a video director attribute feature; and based on "video-tag-video" to obtain a video tag attribute feature.
Step 304, performing feature fusion on the content basic feature and the content attribute feature to obtain the target content feature of the target content node.
In the embodiments of the application, feature extraction is performed on the target content node based on the homogeneous meta-path and the heterogeneous meta-path respectively, yielding the content basic feature and the content attribute feature of the target content in different dimensions. The features obtained in these different dimensions are then fused to obtain the final target content feature of the target content node, which is used to represent the target content. Fusing the features learned from the homogeneous and heterogeneous meta-paths makes the target content feature a more accurate representation of the target content. Because the target content feature is learned from both the association relationships between pieces of content and the attributes of the content, it is close to the features of content strongly associated with the target content and of content with the same attributes. As a result, content more similar to the target content can be recommended when recommendation is based on the target content feature.
Optionally, when feature fusion is performed on the content basic feature and the content attribute feature, the features may be averaged, or they may be fused using an attention mechanism.
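The two fusion options mentioned above can be sketched as follows. This is a minimal illustration in plain Python, not the application's implementation: the function names are hypothetical, and the attention variant uses a simple dot-product score against a `query` vector that would normally be a learned parameter (here it is just an assumed input).

```python
import math

def mean_fusion(features):
    """Element-wise average of equally sized feature vectors."""
    n = len(features)
    return [sum(col) / n for col in zip(*features)]

def attention_fusion(features, query):
    """Softmax-weighted sum of feature vectors.

    Scores are dot products with `query` (an assumed stand-in for a
    learned attention vector). Returns the fused vector and the weights.
    """
    scores = [sum(f_i * q_i for f_i, q_i in zip(f, query)) for f in features]
    peak = max(scores)                      # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    fused = [sum(w * f[d] for w, f in zip(weights, features))
             for d in range(len(query))]
    return fused, weights

# Hypothetical 2-dimensional features for one target content node.
basic_feature = [1.0, 0.0]   # from the homogeneous meta-path
actor_feature = [0.0, 1.0]   # from one heterogeneous meta-path
averaged = mean_fusion([basic_feature, actor_feature])
fused, weights = attention_fusion([basic_feature, actor_feature], query=[1.0, 0.0])
```

Averaging treats every meta-path as equally informative, whereas the attention variant lets the weights differ per feature; with an all-zero query the two coincide.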
Step 305, performing content recommendation based on the target content feature.
After obtaining the target content feature, the computer device may recommend related content based on the target content feature.
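One way to recommend related content from the features is to rank the other pieces of content by the similarity of their features to the target content feature. The sketch below assumes cosine similarity and a top-k cutoff; neither is stated in this part of the application, and the feature values and function names are invented for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def recommend(target_id, features, k=2):
    """Return the ids of the k pieces of content most similar to the target."""
    target = features[target_id]
    scored = [(cosine(target, vec), cid)
              for cid, vec in features.items() if cid != target_id]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [cid for _, cid in scored[:k]]

# Hypothetical fused content features keyed by content id.
features = {
    "v1": [1.0, 0.0],
    "v2": [0.9, 0.1],
    "v3": [0.0, 1.0],
    "v4": [0.5, 0.5],
}
top = recommend("v1", features, k=2)
```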
In summary, in the embodiments of the present application, the homogeneous meta-path and the heterogeneous meta-path in the content heterogeneous graph are each used to perform feature extraction on the target content node corresponding to the target content, and the resulting features are fused to obtain the target content feature that finally represents the target content. Because the target content feature is determined by fusing the association relationships between pieces of content with the attribute information of the content, it combines multiple kinds of information and represents the target content more accurately. Consequently, when content is recommended based on the target content feature, both the association relationships between pieces of content and the attribute information of the content are taken into account, which helps improve the accuracy of content recommendation.
Referring to FIG. 4, a flowchart of a content recommendation method according to another exemplary embodiment of the present application is shown. This embodiment is described using an example in which the method is applied to a computer device, and the method includes the following steps.
Step 401, constructing a homogeneous meta-path based on association information between pieces of content.
In one possible implementation, the computer device first constructs the content heterogeneous graph based on acquired information, so that the content feature of the content corresponding to each node can be extracted based on the content heterogeneous graph.
Optionally, the content may be any content that can be recommended, such as video content, article content (books, papers, journals, etc.), or merchandise.
Optionally, the computer device constructs the homogeneous meta-path based on the association information between pieces of content. The association information is determined based on user consumption behavior, such as information about viewing, commenting on, or collecting the content. When two pieces of content are watched, commented on, or collected by the same user, it is determined that an association relationship exists between them. In this embodiment, when a homogeneous meta-path is constructed based on the association information, the content nodes connected in the homogeneous meta-path share the same kind of association information. For example, the content corresponding to the connected content nodes in a homogeneous meta-path is content watched by the same user.
Some of the acquired association information may be invalid, so the acquired information can be filtered to obtain valid association information. For example, when the association information is determined based on user viewing information, a user may have viewed a piece of content only briefly; in that case the user cannot be regarded as having viewed the content, and the record is treated as invalid information. Therefore, when the association information is determined based on viewing information, invalid views are filtered out first, using a filtering threshold set in advance. Different filtering thresholds may be set for different kinds of content. For example, when the content is a video, the filtering threshold may be a threshold on the proportion of the video watched, for example 20%: when the ratio of the viewing duration to the video duration is greater than or equal to 20%, the record is determined to be valid viewing information. When the content is an article, the filtering threshold may be a threshold on the viewing duration, for example 1 minute: when the viewing duration is longer than 1 minute, the record is determined to be valid viewing information.
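The filtering rule can be sketched as below. The two thresholds (20% of the video duration, 1 minute for an article) are the example values given above; the record layout with `type`, `watch_seconds`, and `duration_seconds` fields and the function name are hypothetical.

```python
VIDEO_MIN_RATIO = 0.20      # example from the text: at least 20% of the video watched
ARTICLE_MIN_SECONDS = 60    # example from the text: viewed for longer than 1 minute

def is_valid_view(record):
    """Return True if a viewing record counts as valid viewing information.

    `record` is a hypothetical dict with the keys "type" and "watch_seconds",
    plus "duration_seconds" for videos.
    """
    if record["type"] == "video":
        duration = record["duration_seconds"]
        if duration <= 0:
            return False
        return record["watch_seconds"] / duration >= VIDEO_MIN_RATIO
    if record["type"] == "article":
        return record["watch_seconds"] > ARTICLE_MIN_SECONDS
    return False

records = [
    {"type": "video", "watch_seconds": 20, "duration_seconds": 100},
    {"type": "video", "watch_seconds": 5, "duration_seconds": 100},
    {"type": "article", "watch_seconds": 90},
]
valid_records = [r for r in records if is_valid_view(r)]
```

Note that the video rule is inclusive (greater than or equal to 20%), while the article rule is strict (longer than 1 minute), following the wording of the example.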
Optionally, when homogeneous meta-paths are constructed based on the association information between pieces of content, the degree of association between pieces of content is determined from the association information, and a path can be constructed between the corresponding content nodes when the association is strong. This approach may include the following steps 401a-401c (not shown):
Step 401a, determining a mutual transition probability between pieces of content based on the number of co-occurrences between them, where the number of co-occurrences indicates the number of times the pieces of content have the same association information, and the mutual transition probability indicates the degree of association between the pieces of content.
The computer device determines the number of co-occurrences between pieces of content based on the acquired association information. The number of co-occurrences is the number of times two pieces of content have the same association information. For example, when the content is video and the association information is that the videos were watched by the same user, the number of co-occurrences of two videos is the number of users who watched both of them. After the number of co-occurrences between pieces of content is determined, the mutual transition probability between them can be determined based on it.
Taking the mutual transition probability between a first content and a second content as an example, the transition probability from the first content to the second content may be the ratio of the number of co-occurrences between the first content and the second content to the sum of the numbers of co-occurrences between the first content and all other contents. If the correlation between the two contents is determined from a one-way transition probability alone, the determined correlation may deviate from the actual correlation. Continuing the example above, suppose the first content is a popular video and the second content is a non-popular video. Because a popular video is likely to have been watched together with many other videos by the same users, the transition probability toward the popular video is relatively high even though the actual correlation between the two contents may be low, which makes the correlation determination inaccurate. Therefore, the mutual transition probability between two contents is the product of the two one-way transition probabilities, namely the product of the transition probability from the first content to the second content and the transition probability from the second content to the first content.
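The computation described above can be illustrated with a small sketch. The function names and the `histories` mapping (user to valid viewed contents) are assumptions made for illustration; only the ratio and the product follow the text.

```python
from collections import defaultdict
from itertools import combinations

def co_occurrence_counts(user_histories):
    """Count, for each content pair, how many users consumed both contents."""
    counts = defaultdict(lambda: defaultdict(int))
    for items in user_histories.values():
        for a, b in combinations(sorted(set(items)), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts

def transition_prob(counts, a, b):
    """One-way probability a -> b: co-occurrences(a, b) / all co-occurrences of a."""
    total = sum(counts[a].values())
    return counts[a].get(b, 0) / total if total else 0.0

def mutual_transition_prob(counts, a, b):
    """Product of the two one-way transition probabilities."""
    return transition_prob(counts, a, b) * transition_prob(counts, b, a)

# "hit" is a popular video that co-occurs with everything.
histories = {
    "u1": ["hit", "x"],
    "u2": ["hit", "y"],
    "u3": ["hit", "z"],
    "u4": ["x", "y"],
}
counts = co_occurrence_counts(histories)
one_way = transition_prob(counts, "z", "hit")        # high: z only co-occurs with hit
mutual = mutual_transition_prob(counts, "hit", "z")  # damped by hit's many partners
```

In this toy data the one-way probability from the niche video to the popular one is 1.0, while the product is only 1/3, which shows how the product damps the popularity bias.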
In step 401b, co-occurrence content is determined based on the probability of mutual transition between the content, and the probability of mutual transition between the co-occurrence content is higher than the probability of mutual transition between other content.
After the mutual transition probabilities between the contents are obtained, the co-occurrence content can be determined from them. The mutual transition probability is positively correlated with the correlation between contents: the higher the mutual transition probability, the higher the correlation, so content pairs with high mutual transition probabilities can be determined to be co-occurrence content. It should be noted that the co-occurrence content may comprise a plurality of content pairs with relatively high mutual transition probabilities. For example, for video A, the mutual transition probabilities between video A and the other videos are ranked from high to low, and each of the top-ranked videos is determined to be a co-occurrence video of video A.
In step 401c, a homogeneous element path is constructed based on co-occurrence content.
After the co-occurrence content is determined, paths can be constructed between the content nodes corresponding to the co-occurrence content to obtain a plurality of homogeneous element paths, which completes the construction of the homogeneous element paths. For example, if video A and video B are co-occurrence content and video B and video C are co-occurrence content, a homogeneous element path (I_A - I_B - I_C) can be constructed.
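Steps 401b and 401c can be sketched as a top-k selection followed by path enumeration. This is an illustrative sketch only: the value of `k`, the rule that an edge is kept when either endpoint ranks the other in its top k, and all function names are assumptions, not details stated in the text.

```python
def top_k_co_occurring(mutual, k):
    """mutual maps an unordered pair (a, b) to its mutual transition probability.
    Returns, for each content, its k partners with the highest probability."""
    ranked = {}
    for (a, b), p in mutual.items():
        ranked.setdefault(a, []).append((p, b))
        ranked.setdefault(b, []).append((p, a))
    return {
        node: [other for _, other in sorted(lst, key=lambda t: (-t[0], t[1]))[:k]]
        for node, lst in ranked.items()
    }

def co_occurrence_edges(top):
    """Undirected edges between co-occurrence contents (assumed keep rule)."""
    return {tuple(sorted((a, b))) for a, partners in top.items() for b in partners}

def two_hop_paths(edges):
    """Enumerate homogeneous paths of the form I_a - I_b - I_c."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return sorted((a, b, c) for b in adj for a in adj[b] for c in adj[b] if a < c)

mutual = {("A", "B"): 0.4, ("B", "C"): 0.3, ("A", "C"): 0.01}
edges = co_occurrence_edges(top_k_co_occurring(mutual, k=1))
paths = two_hop_paths(edges)
```

With `k=1` the weak pair (A, C) is dropped, and the only path through a middle node is A-B-C, matching the example in the text.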
Step 402, constructing heterogeneous element paths based on the attribute information of each content, wherein the contents corresponding to the content nodes in the same heterogeneous element path have the same attribute.
Accordingly, the computer device constructs the heterogeneous element paths based on the attribute information of the contents. The contents corresponding to the content nodes in the same heterogeneous element path share an attribute. For example, for video content, when the heterogeneous element path is I-D-I, the two videos I in the path are directed by the same director.
In one possible implementation, for different types of content, the heterogeneous element paths may be constructed based on the attribute information corresponding to each type.
Optionally, when the content is video content, the content attribute includes at least one of a video tag, a video actor, and a video director; optionally, when the content is article content, the content attribute includes at least one of an article tag, an article author, and an article source; optionally, when the content is a commodity, the content attribute includes at least one of a commodity label, a commodity merchant, and a commodity price.
The computer device constructs a plurality of homogeneous element paths and heterogeneous element paths from the acquired information to obtain the content heterogeneous graph, which represents both the association relationships between contents and the membership relationships between contents and attributes.
This step and step 401 may be performed sequentially or simultaneously; this embodiment merely describes the steps and does not limit their execution order.
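Construction of heterogeneous element paths from attribute information can be sketched as grouping contents by a shared attribute value. The dictionary layout, the attribute names and the node label format below are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from itertools import combinations

def build_hetero_paths(content_attrs, attr_type):
    """Return (content, attribute node, content) triples for contents
    that share a value of the given attribute type."""
    by_value = defaultdict(list)
    for content_id, attrs in sorted(content_attrs.items()):
        for value in attrs.get(attr_type, []):
            by_value[value].append(content_id)
    paths = []
    for value, content_ids in sorted(by_value.items()):
        for a, b in combinations(content_ids, 2):
            paths.append((a, f"{attr_type}:{value}", b))
    return paths

# Hypothetical video metadata.
videos = {
    "v1": {"director": ["D1"], "tag": ["comedy"]},
    "v2": {"director": ["D1"], "tag": ["drama"]},
    "v3": {"director": ["D2"], "tag": ["comedy"]},
}
director_paths = build_hetero_paths(videos, "director")  # I-D-I paths
tag_paths = build_hetero_paths(videos, "tag")            # I-T-I paths
```

Contents with a missing attribute simply contribute no path of that type, which is the situation the later masking step is meant to make the model robust against.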
Step 403, obtaining a content heterogeneous map.
When the content characteristics corresponding to the content nodes are extracted, a pre-built content heterogeneous diagram can be obtained, so that the characteristic extraction of the content nodes is performed based on a plurality of homogeneous element paths and a plurality of heterogeneous element paths in the content heterogeneous diagram, and the content node characteristics corresponding to each node are obtained.
Step 404, performing graph convolution on the target content node according to the homogeneous element paths to obtain the content basic features.
In this embodiment of the application, the feature extraction process is a graph convolution process, which is performed by the convolution layers of a graph neural network. Graph convolution aggregates the target node features of the target content node with the neighbor node features of its neighbor nodes. The node features of each node are preprocessed features used to represent that node. For example, the node features of a content node may be features of the content itself, such as, for a video, the video material and the video duration, while the node features of an attribute node may be features of the attribute. The preprocessing includes feature cleaning, feature encoding, feature normalization, and the like. Feature cleaning filters out abnormal features; feature encoding encodes the original features, using different encodings for different kinds of features; and the encoded features can then be normalized.
The graph convolution process includes aggregating the target node features and the neighbor node features according to the homogeneous element paths, and aggregating the target node features and the neighbor node features according to the heterogeneous element paths. In one possible implementation, the graph convolution on the target content node according to the homogeneous element paths may include the following steps 404a-404b (not shown in the figure):
Step 404a, selecting neighbor content nodes of at least one order for the target content node based on the homogeneous element paths.
First, taking the target content node as the center, neighbor content nodes of at least one order are selected for the target content node based on the homogeneous element paths in the content heterogeneous graph. A first-order neighbor content node of the target content node is a neighbor node directly connected to the target content node, a second-order neighbor content node is a neighbor node directly connected to a first-order neighbor node, and so on; in this way, neighbor content nodes of at least one order can be selected for the target content node.
The number of neighbor nodes to select may be preset, for example 10 neighbor nodes for each node. Take the case where the selected neighbor content nodes comprise first-order and second-order neighbor nodes as an example. When the target content node is I_0, the first-order neighbor nodes selected according to the homogeneous element paths are I_1 and the second-order neighbor nodes are I_2; that is, selection follows a plurality of homogeneous element paths I_0-I_1-I_2. During selection, 10 first-order neighbor nodes I_1 of I_0 are selected, and then 10 neighbor nodes I_2 are selected for each content node I_1, yielding 100 second-order neighbor nodes I_2 of I_0.
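The fixed fan-out selection described above can be sketched as follows. This is a minimal sketch assuming uniform random sampling and sampling with replacement when a node has fewer neighbors than the fan-out; neither choice is specified in the text, and the adjacency structure is a toy example.

```python
import random

def sample_neighbors(adj, node, fanout, rng):
    """Pick `fanout` neighbors of `node` (with replacement if it has too few)."""
    neighbors = adj.get(node, [])
    if not neighbors:
        return []
    if len(neighbors) >= fanout:
        return rng.sample(neighbors, fanout)
    return [rng.choice(neighbors) for _ in range(fanout)]

def sample_two_orders(adj, target, fanout, rng):
    """First-order neighbors of the target, then `fanout` neighbors of each."""
    first_order = sample_neighbors(adj, target, fanout, rng)
    second_order = [sample_neighbors(adj, n1, fanout, rng) for n1 in first_order]
    return first_order, second_order

# Toy homogeneous graph: 30 content nodes, every pair connected.
adj = {i: [j for j in range(30) if j != i] for i in range(30)}
first_order, second_order = sample_two_orders(adj, 0, 10, random.Random(0))
```

With a fan-out of 10 this yields 10 first-order and 10 x 10 = 100 second-order neighbor slots for I_0, as in the example. Note that a second-order list may contain the target itself, since the target is a neighbor of its own neighbors.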
Step 404b, performing feature aggregation on the neighbor node features of the selected neighbor content nodes and the target node features of the target content node to obtain the content basic features.
After the neighbor content nodes of at least one order are selected, feature aggregation is performed on their neighbor node features and the target node features of the target content node to obtain the content basic features of the target content.
Optionally, the selected neighbor content nodes comprise first-order neighbor nodes and second-order neighbor nodes of the target content node. Feature aggregation may include the following process:
Step one, performing feature aggregation on the node features of the first-order neighbor nodes and the node features of the second-order neighbor nodes to obtain neighbor aggregation features.
First, feature aggregation is performed on the node features of the second-order neighbor nodes and the node features of the first-order neighbor nodes. In the aggregation process, the node features of the second-order neighbor nodes are fused to obtain a second-order fusion feature, the node features of the first-order neighbor nodes are fused to obtain a first-order fusion feature, and the first-order and second-order fusion features are aggregated to obtain the neighbor aggregation features.
As shown in fig. 5, in the first graph convolution layer 501, the features of the second-order neighbor nodes I_2 are aggregated to the first-order neighbor nodes I_1 to obtain the neighbor aggregation features.
Step two, performing feature aggregation on the neighbor aggregation features and the target node features to obtain the content basic features.
After the neighbor aggregation features are obtained, they are aggregated with the target node features to obtain the content basic features.
As shown in fig. 5, in the second graph convolution layer 502, feature aggregation is performed on the neighbor aggregation features and the target node features to obtain the content basic features of I_0.
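The two-layer aggregation can be illustrated with a simplified sketch. It assumes a plain mean aggregator with no learned weights or nonlinearity, which is a deliberate simplification; the actual layers of the graph neural network (for example GraphSAGE or GAT layers) would apply trainable transformations. All names are illustrative.

```python
def mean(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]

def aggregate(self_feature, neighbor_features):
    """Average the node's own feature with the mean of its neighbors' features."""
    neighbor_mean = mean(neighbor_features)
    return [(s + m) / 2 for s, m in zip(self_feature, neighbor_mean)]

def two_layer_feature(features, target, first_order, second_order):
    """Layer 1: aggregate second-order neighbors into each first-order neighbor.
    Layer 2: aggregate the resulting neighbor features into the target node."""
    hidden = [
        aggregate(features[n1], [features[n2] for n2 in n2_list])
        for n1, n2_list in zip(first_order, second_order)
    ]
    return aggregate(features[target], hidden)

features = {
    "I0": [1.0, 0.0],
    "I1a": [0.0, 2.0], "I1b": [0.0, 4.0],
    "I2a": [2.0, 0.0], "I2b": [4.0, 0.0],
}
basic_feature = two_layer_feature(
    features, "I0", ["I1a", "I1b"], [["I2a"], ["I2b"]]
)
```

The same routine applies to a heterogeneous element path by passing attribute nodes as the first-order neighbors and content nodes as the second-order neighbors.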
Step 405, performing graph convolution on the target content node according to the heterogeneous element paths to obtain the content attribute features.
The computer device also performs, at the same time, graph convolution on the target content node according to the heterogeneous element paths. The process is the same as graph convolution according to the homogeneous element paths. First, neighbor nodes of at least one order are selected for the target content node based on a heterogeneous element path; these neighbor nodes include attribute nodes and content nodes. Taking the heterogeneous element path I-T-I as an example, selection follows the path I_0-T_0-I_1: the first-order neighbor nodes T_0 of I_0 are selected first, and then neighbor nodes I_1 are selected for each node T_0, yielding the second-order neighbor nodes I_1 of I_0.
As shown in fig. 5, in the first graph convolution layer 503, the features of the second-order neighbor nodes I_1 are aggregated to the first-order neighbor nodes T_0 to obtain the neighbor aggregation features, and in the second graph convolution layer 504, the neighbor aggregation features are aggregated with the node features of the target content node I_0 to obtain the content attribute features.
Step 406, masking the content attribute features, where the masking is used to conceal a portion of the content attribute features.
After the content basic features and the content attribute features are obtained, they are fused to obtain the target content features of the target content node, which are thus extracted from multiple dimensions. Illustratively, graph convolution based on the homogeneous element paths yields the content basic features, and graph convolution based on the heterogeneous element paths I-T-I, I-A-I and I-D-I respectively yields the corresponding content attribute features. The content basic features and the three content attribute features are then fused and normalized to obtain the target video features (embedding) of the target video.
Since there may be some missing feature information corresponding to the content, for example, the author information or the tag information of the video is missing, in one possible implementation, before feature fusion, masking is performed on the content attribute features, so that part of the content attribute features are hidden with a certain probability.
In the masking process, the mask matrix is multiplied by the feature matrix corresponding to the content attribute features to obtain the masked content attribute features.
Step 407, performing feature fusion on the content basic features and the masked content attribute features to obtain the target content features.
Because masking hides part of the content attribute features, fusing the content basic features with the remaining, unhidden content attribute features reduces the influence of missing feature information on the extracted target content features and improves the robustness of the model.
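Masking and fusion can be sketched as below. This is a sketch under stated assumptions: each attribute feature vector is hidden as a whole with probability `drop_prob` (a 0/1 mask multiplied into the feature matrix), and fusion is taken to be an element-wise sum followed by L2 normalization. The text does not specify the mask granularity or the fusion operator, so both are illustrative choices.

```python
import math
import random

def mask_attribute_features(attribute_features, drop_prob, rng):
    """Multiply each attribute feature vector by a 0/1 mask entry."""
    masked = []
    for vector in attribute_features:
        keep = 0.0 if rng.random() < drop_prob else 1.0
        masked.append([keep * x for x in vector])
    return masked

def fuse(basic_feature, attribute_features):
    """Assumed fusion: element-wise sum, then L2 normalization."""
    total = list(basic_feature)
    for vector in attribute_features:
        total = [t + x for t, x in zip(total, vector)]
    norm = math.sqrt(sum(t * t for t in total))
    return [t / norm for t in total] if norm else total

basic = [3.0, 4.0]
attributes = [[1.0, 1.0], [2.0, 0.0], [0.0, 5.0]]   # e.g. I-T-I, I-A-I, I-D-I
masked = mask_attribute_features(attributes, drop_prob=0.3, rng=random.Random(0))
target_feature = fuse(basic, masked)
```

Masking would normally be applied only during training; at inference time all available attribute features are fused.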
Step 408, determining similar content features in the content feature index based on the target content features, wherein the content feature index contains content features corresponding to each content.
Optionally, after the content features corresponding to each content are determined, a content feature index may be established from them, for example with a vector index engine such as Annoy or Faiss. Before the index is established, the contents may be preliminarily screened on factors such as content integrity or content characteristics to obtain candidate contents, and the content feature index is then established from the content features (embeddings) of the candidate contents. Illustratively, when a video index is established, the videos may first be filtered on factors including video integrity, video playing source, and video category (such as movie, TV series, variety show, animation, or children's content) to obtain candidate videos, and the video feature index is then established from the embedding of each candidate video.
After the content feature index is established, similar content can be determined for each content based on the index. To determine the similar content of the target content, similar content features are determined in the content feature index based on the target content features. In one possible implementation, the feature distance between the target content features and the content features of each candidate content is calculated to determine feature similarity, the candidate content features are ranked by feature similarity, and the top-ranked features are determined to be the similar content features. For example, the 5 top-ranked content features may be determined to be the similar content features.
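The ranking step can be illustrated with an exhaustive search over a small in-memory index. The sketch assumes cosine similarity as the feature similarity and a plain dictionary as the index; a production system would use an approximate nearest-neighbor engine such as Annoy or Faiss instead, whose APIs are not shown here.

```python
import math

def cosine(u, v):
    """Cosine similarity; 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def top_k_similar(index, target_id, k):
    """Rank every other candidate by similarity to the target and keep the top k."""
    target = index[target_id]
    scored = [
        (cosine(target, vector), content_id)
        for content_id, vector in index.items()
        if content_id != target_id
    ]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [content_id for _, content_id in scored[:k]]

# Hypothetical embeddings of candidate videos.
index = {
    "v0": [1.0, 0.0],
    "v1": [0.9, 0.1],
    "v2": [0.0, 1.0],
    "v3": [-1.0, 0.0],
}
similar = top_k_similar(index, "v0", k=2)
```

The contents whose features are returned would then be recommended as similar content of the target.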
Step 409, content recommendation is performed based on similar content characteristics.
After determining the similar content features of the target content features, the content corresponding to the similar content features can be determined as the content with strong association with the target content, and when the similar content of the target content is recommended, the content corresponding to the similar content features can be recommended.
In this embodiment, the computer device performs graph convolution on the target content node based on the constructed homogeneous element path and heterogeneous element path, so that the obtained target content feature fuses prior knowledge, that is, content attribute information, and posterior knowledge, that is, association information between the contents, thereby improving accuracy of extracting the target content feature.
In the embodiment, when the homogeneous element path is constructed, extraction is performed based on the mutual transition probability between the contents, so that strong correlation between the connected contents in the homogeneous element path can be ensured, and the accuracy of feature extraction is improved.
In one possible implementation, the graph convolution is performed by the convolution layers of the graph neural network. The graph neural network is trained on sample content, and its training process is described schematically below.
Referring to fig. 6, a flowchart of a graph neural network training process according to an exemplary embodiment of the present application is shown. This embodiment is described by taking the application of the method to a computer device as an example, and the method includes the following steps.
Step 601, performing graph convolution on the sample content nodes according to the homogeneous element paths to obtain predicted content basic features.
Optionally, the graph algorithm used in the graph neural network may be GraphSAGE, a Graph Attention Network (GAT), DisenGCN, or the like, which is not limited in this embodiment. Optionally, a portion of the content nodes in the content heterogeneous graph may be determined as sample content nodes. Graph convolution is performed on the sample content nodes according to the homogeneous element paths to obtain the predicted content basic features; for this process, reference may be made to the graph convolution on the target content node according to the homogeneous element paths in step 404, which is not repeated here.
Step 602, performing graph convolution on the sample content nodes according to the heterogeneous element paths to obtain predicted content attribute features.
Likewise, graph convolution is performed on the sample content nodes according to each heterogeneous element path to obtain different predicted content attribute features; for this process, reference may be made to the graph convolution on the target content node according to the heterogeneous element paths in step 405, which is not repeated in this embodiment.
Step 603, performing feature fusion on the predicted content basic features and the predicted content attribute features to obtain predicted content features.
And carrying out feature fusion on the predicted content basic features and each predicted content attribute feature to obtain predicted content features corresponding to the sample content nodes.
Step 604, updating and training the graph neural network based on the predicted content basic characteristics, the predicted content attribute characteristics and the predicted content characteristics of each sample content node.
By the above manner, the predicted content basic feature, the predicted content attribute feature and the predicted content feature after feature fusion corresponding to each sample content node can be obtained, so that based on each feature of each sample content node, the neural network is updated and trained, where the training process may include the following steps 604a-604c (not shown in the figure):
In step 604a, a correlation loss is determined based on the predicted content characteristics of the positive sample content node pair and the negative sample content node pair, wherein the positive sample content node pair contains content nodes corresponding to two sample contents having an association relationship or the same attribute, and the negative sample content node pair contains content nodes corresponding to two sample contents having no association relationship and no same attribute.
A correlation loss is first determined based on the predicted content features of the sample content nodes. The correlation between the contents corresponding to the two nodes in a node pair can be determined from the predicted content features of the two nodes. The correlation within a positive sample content node pair should be high and is positively correlated with the accuracy of the extracted content features: the higher the correlation within positive sample content node pairs as determined from the predicted content features, the more accurate those predicted content features are. Conversely, the correlation within negative sample content node pairs as determined from the predicted content features is negatively correlated with the accuracy of the extracted content features. The correlation loss can therefore be determined from the correlation within the positive sample content node pairs and the correlation within the negative sample content node pairs, and used to update the network parameters of the graph neural network.
Before determining the correlation loss, first a positive sample content node pair and a negative sample content node pair are determined, and the process may include the steps of:
Step one, determining two connected sample content nodes in the content heterogeneous graph as a positive sample content node pair.
First, sampling can be performed on the content heterogeneous graph. Two connected nodes in the content heterogeneous graph have an association relationship, that is, the contents corresponding to the two nodes are highly correlated, so two connected sample content nodes in the content heterogeneous graph can be determined as a positive sample content node pair. Illustratively, when sample content nodes I_0 and I_1 are connected in the content heterogeneous graph, I_0 and I_1 can be determined as a positive sample content node pair.
Step two, replacing a single node in the positive sample content node pair with another content node to obtain a negative sample content node pair, wherein the other content node is not connected, in the content heterogeneous graph, to the non-replaced node of the positive sample content node pair.
After the positive sample content node pair is determined, a single node in the pair may be replaced with a content node that is not connected to the non-replaced node, thereby obtaining a negative sample content node pair. For example, I_1 in the positive sample content node pair I_0 and I_1 is replaced with I_3, where I_3 and I_0 are not connected in the content heterogeneous graph, thereby obtaining the negative sample content node pair I_0 and I_3.
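The sampling of positive and negative pairs can be sketched as follows. The edge list, the uniform choice of the replacement node, and the convention of always replacing the second node of the pair are assumptions made for illustration.

```python
import random

def build_adjacency(edges):
    """Undirected adjacency sets from a list of connected content-node pairs."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def negative_pair(positive, nodes, adj, rng):
    """Replace the second node of a positive pair with a node that is
    not connected to the first (non-replaced) node."""
    anchor, _ = positive
    candidates = [n for n in nodes if n != anchor and n not in adj.get(anchor, set())]
    return (anchor, rng.choice(candidates)) if candidates else None

edges = [("I0", "I1"), ("I1", "I2"), ("I2", "I3")]
nodes = sorted({n for edge in edges for n in edge})
adj = build_adjacency(edges)
positives = list(edges)                       # connected nodes form positive pairs
negative = negative_pair(("I0", "I1"), nodes, adj, random.Random(0))
```

Here I0 is connected only to I1, so the replacement node is drawn from I2 and I3.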
After the positive and negative sample content node pairs are determined, the correlation loss may be determined based on the predicted content features of the nodes in each node pair, which may include the following steps:
Step one, calculating a positive correlation value between the predicted content features corresponding to the positive sample content node pair.
The positive correlation value represents the correlation between the predicted content features of the two sample content nodes in the positive sample content node pair. It is calculated as follows:
[Formula missing from the extracted text.]
The formula is expressed in terms of the predicted content feature of i and the predicted content feature of i', where i and i' denote the two different sample content nodes in a positive sample content node pair.
And step two, calculating a negative correlation value between the predicted content characteristics corresponding to the negative sample content node pair.
The negative correlation value represents the correlation between the corresponding predicted content characteristics of the sample content nodes in the negative sample content node pair. The calculation method can be as described in the first step.
And step three, determining the correlation loss based on the positive correlation value and the negative correlation value.
The correlation loss may be determined as follows:
[Formula missing from the extracted text.]
Here k is the total number of samples, and the term indexed by n represents the negative correlation value corresponding to the n-th negative sample content node pair.
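The formulas for the correlation values and the correlation loss are not legible in this text, so the following is only a hedged sketch of one common choice, not the formula of this application. It assumes that the correlation value is the dot product of two predicted content features and that the loss takes the negative-sampling form used in unsupervised GraphSAGE training (log-sigmoid of the positive value plus log-sigmoid of each negated negative value). All function names are hypothetical.

```python
import math

def dot(u, v):
    """Assumed correlation value: inner product of two feature vectors."""
    return sum(a * b for a, b in zip(u, v))

def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))

def correlation_loss(z_i, z_positive, z_negatives):
    """Assumed loss: reward a high positive value, penalize high negative values."""
    positive_term = -log_sigmoid(dot(z_i, z_positive))
    negative_term = -sum(log_sigmoid(-dot(z_i, z_n)) for z_n in z_negatives)
    return positive_term + negative_term

good = correlation_loss([1.0, 0.0], [1.0, 0.0], [[-1.0, 0.0]])
bad = correlation_loss([1.0, 0.0], [-1.0, 0.0], [[1.0, 0.0]])
```

Under these assumptions the loss is small when positive pairs have similar features and negative pairs have dissimilar ones, which matches the qualitative behavior described for the correlation loss.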
Step 604b, determining a contrast loss based on first content feature pairs and second content feature pairs, wherein a first content feature pair is a feature pair composed of different types of predicted features corresponding to the same sample content node, the different types of predicted features being predicted content basic features or predicted content attribute features, and a second content feature pair is a feature pair composed of different types of predicted features corresponding to different sample content nodes.
In this embodiment, a contrast loss is introduced to further improve the learning effect of the graph neural network. The contrast loss is determined from the differences between features obtained from meta-paths constructed from different views. A first content feature pair is a pair of features extracted from the same sample content node under different views. For example, the pair formed by a predicted content basic feature obtained by graph convolution according to a homogeneous element path and a predicted content attribute feature obtained by graph convolution according to a heterogeneous element path is a first content feature pair, as is the pair formed by two predicted content attribute features obtained by graph convolution according to two different heterogeneous element paths. A second content feature pair is a pair of features extracted from different sample content nodes under different views. For example, the pair formed by the predicted content basic feature of a first sample content node and the predicted content attribute feature of a second sample content node is a second content feature pair.
The contrast loss may be determined based on the correlation between the features in each first content feature pair and the correlation between the features in each second content feature pair. The process of determining the contrast loss may include the following steps:
Step one, calculating a first correlation value for the first content feature pair.
The first correlation value indicates the correlation between the features in the first content feature pair. It is calculated as follows:
[Formula missing from the extracted text.]
Here n denotes the n-th sample node, and i and j denote different views (different meta-paths). h_ni is the embedding of sample node n under view i, namely the predicted content basic feature or predicted content attribute feature obtained by graph convolution based on meta-path i, and h_nj is the embedding of sample node n under view j.
Step two, calculating a second correlation value for the second content feature pair.
The second correlation value indicates the correlation between the features in the second content feature pair. It is calculated as follows:
[Formula missing from the extracted text.]
Here n' is a sample content node different from n.
Step three, determining the contrast loss based on the first correlation values and the second correlation values, wherein different first correlation values correspond to different first content feature pairs and different second correlation values correspond to different second content feature pairs.
After the first correlation value of the first content feature pair and the second correlation values of the second content feature pairs under two views are obtained, the contrast loss under the two views can be determined as follows:
[Formula missing from the extracted text.]
Here l_cl is the contrast loss of sample node n between views i and j, τ is the temperature coefficient, which is an adjustable constant, and N is the total number of sample nodes.
Because features may be extracted under multiple views, the contrast loss of sample node n between every two views can be determined in the above manner, and these contrast losses can be summed to obtain the total contrast loss of sample node n. Because there are N samples, the total contrast losses of the N samples can be summed to obtain the total contrastive learning loss of one batch.
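Since the contrast loss formula is not legible in this text, the sketch below shows one standard form that is consistent with the symbols described (a temperature τ, N sample nodes, a positive pair from the same node under two views and negative pairs from different nodes). It assumes cosine similarity as the correlation value and an InfoNCE-style loss whose denominator includes the positive pair; both are assumptions, and the function names are hypothetical.

```python
import math
from itertools import combinations

def cosine(u, v):
    """Assumed correlation value between two embeddings."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def pair_contrast_loss(view_i, view_j, n, tau):
    """Loss of node n between two views: -log(exp(s_nn/tau) / sum_m exp(s_nm/tau))."""
    logits = [cosine(view_i[n], h) / tau for h in view_j]
    log_denominator = math.log(sum(math.exp(x) for x in logits))
    return log_denominator - logits[n]

def batch_contrast_loss(views, tau=0.5):
    """Sum over every pair of views and over all N sample nodes in the batch."""
    num_nodes = len(views[0])
    return sum(
        pair_contrast_loss(views[a], views[b], n, tau)
        for a, b in combinations(range(len(views)), 2)
        for n in range(num_nodes)
    )

# views[v][n] is the embedding of sample node n under meta-path (view) v.
aligned = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]]]
swapped = [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]]]
loss_aligned = batch_contrast_loss(aligned)
loss_swapped = batch_contrast_loss(swapped)
```

Under these assumptions the loss is lower when the embeddings of the same node agree across views than when they agree with a different node, which is the behavior the contrast loss is described as encouraging.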
Schematically, as shown in FIG. 7, the corresponding contrast loss is determined between every two of the features obtained by graph convolution based on each meta-path for sample content node i_1 and the features obtained by graph convolution based on each meta-path for sample content node i_2.
In step 604c, the graph neural network is updated based on the correlation loss and the contrast loss.
After the correlation loss and the contrast loss are determined, a gradient descent method can be used to update the network parameters of the graph neural network until the correlation loss and the contrast loss converge, yielding the trained graph neural network. In one possible implementation, taking video recommendation as an example, the learning framework is shown in fig. 8. The recommendation process involves a heterogeneous information acquisition module 801, a data preprocessing module 802, a graph representation learning module 803, and an evaluation and use module 804. The heterogeneous information acquisition module 801 acquires node information, including videos, tags, directors, actors, and the like, and acquires relationships between nodes, including co-occurrence relationships and attribute relationships, where the co-occurrence relationships are used to construct the homogeneous element paths and the attribute relationships are used to construct the heterogeneous element paths. Data preprocessing is then performed, including data cleaning, feature processing, and heterogeneous graph construction. Graph representation learning can be performed after preprocessing. The graph representation learning module 803 comprises a graph framework and a graph algorithm. The graph framework refers to a graph neural network framework, that is, tools and platforms that integrate and implement graph algorithms, such as DGL (Deep Graph Library) and Plato. A graph neural network can be constructed based on the graph framework and the graph algorithm, graph convolution is performed with the graph neural network to obtain video features, losses are determined from the video features to update and train the graph neural network, and finally the video features of each video are extracted with the trained graph neural network.
After obtaining the video features of each video, the evaluation and use module 804 may construct an index based on each video feature and determine recommended videos corresponding to each video based on the index.
In this embodiment, update training is performed on the graph neural network through the correlation loss and the contrast loss, that is, embedding of noise information is suppressed by using self-supervision learning and contrast learning, so that the learning effect of the graph neural network is enhanced, and the accuracy of extracting the corresponding features of the target content by the graph neural network is improved.
It should be noted that, the information (including but not limited to user equipment information, user personal information, etc.), data (including but not limited to data for analysis, stored data, presented data, etc.), and signals related to the present application are all authorized by the user or are fully authorized by the parties, and the collection, use, and processing of the related data is required to comply with the relevant laws and regulations and standards of the relevant countries and regions. For example, the consumer behavior of the user involved in the present application is obtained under the condition of sufficient authorization.
FIG. 9 is a block diagram illustrating a content recommendation device according to an exemplary embodiment of the present application. As shown in fig. 9, the device includes:
the obtaining module 901 is configured to obtain a content heterogeneous graph, where the content heterogeneous graph includes a homogeneous element path and a heterogeneous element path, the homogeneous element path is formed by a content node corresponding to content, and the heterogeneous element path is formed by the content node and an attribute node corresponding to a content attribute.
The first extracting module 902 is configured to perform feature extraction on a target content node based on the homogeneous element path, so as to obtain a content basic feature, where the target content node is one of the content nodes.
The second extraction module 903 is configured to perform feature extraction on the target content node based on the heterogeneous element path, so as to obtain a content attribute feature.
And a first fusion module 904, configured to perform feature fusion on the content basic feature and the content attribute feature to obtain a target content feature of the target content node.
The content recommendation module 905 is configured to recommend content based on the target content feature.
Optionally, the first extracting module 902 is further configured to:
performing graph convolution on the target content node according to the homogeneous element path to obtain the content basic feature;
the second extraction module 903 is further configured to:
and carrying out graph convolution on the target content node according to the heterogeneous element path to obtain the content attribute characteristics.
Optionally, the first extracting module 902 includes:
the node selection unit is used for selecting at least one first-order neighbor content node corresponding to the target content node based on the homogeneous element path;
And the feature aggregation unit is used for carrying out feature aggregation on the neighbor node features of the at least one first-order neighbor content node and the target node features of the target content node to obtain the content basic features.
Optionally, the at least one first-order neighbor content node includes a first-order neighbor node and a second-order neighbor node of the target content node;
the feature aggregation unit is further configured to:
performing feature aggregation on the node features of the first-order neighbor nodes and the node features of the second-order neighbor nodes to obtain neighbor aggregation features;
and performing feature aggregation on the neighbor aggregation features and the target node features to obtain the content basic features.
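The two-stage aggregation described above can be sketched as follows. This is only an illustrative sketch: the mean aggregator, the adjacency dictionary `neighbors`, and the function names are assumptions introduced here, and the embodiment does not fix a particular aggregation function.

```python
from typing import Dict, List, Set

Vector = List[float]


def mean_vectors(vectors: List[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def aggregate_base_feature(target: str,
                           neighbors: Dict[str, Set[str]],
                           features: Dict[str, Vector]) -> Vector:
    """Aggregate first- and second-order neighbors, then merge with the target."""
    # First-order neighbors along the homogeneous element path.
    first = neighbors.get(target, set()) - {target}
    # Second-order neighbors: neighbors of the first-order neighbors.
    second: Set[str] = set()
    for node in first:
        second |= neighbors.get(node, set())
    second -= first | {target}

    neighbor_vectors = [features[node] for node in sorted(first | second)]
    if not neighbor_vectors:
        # Isolated node: fall back to the node's own feature.
        return list(features[target])

    # Stage 1: neighbor aggregation feature (mean is an assumed aggregator).
    neighbor_aggregate = mean_vectors(neighbor_vectors)
    # Stage 2: combine the neighbor aggregate with the target node feature.
    return mean_vectors([neighbor_aggregate, features[target]])
```

In a trained graph neural network, each stage would typically also apply learned weights and a non-linearity; they are omitted here to keep the data flow visible.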
Optionally, the graph convolution process is performed by a convolution layer in the graph neural network, and the apparatus further includes:
the first convolution module is used for carrying out graph convolution on the sample content nodes according to the homogeneous element paths to obtain predicted content basic characteristics;
the second convolution module is used for carrying out graph convolution on the sample content nodes according to the heterogeneous element path to obtain predicted content attribute characteristics;
the second fusion module is used for carrying out feature fusion on the predicted content basic features and the predicted content attribute features to obtain predicted content features;
And the training module is used for updating and training the graph neural network based on the predicted content basic characteristics, the predicted content attribute characteristics and the predicted content characteristics of each sample content node.
Optionally, the training module includes:
a first determining unit, configured to determine a correlation loss based on the predicted content characteristics of a positive sample content node pair including content nodes corresponding to two sample contents having an association relationship or having the same attribute and a negative sample content node pair including content nodes corresponding to two sample contents having no association relationship and no same attribute;
a second determining unit, configured to determine a contrast loss based on a first content feature pair and a second content feature pair, where the first content feature pair is a feature pair composed of different types of predicted features corresponding to the same sample content node, the different types of predicted features are the predicted content base feature or the predicted content attribute feature, and the second content feature pair is a feature pair composed of different types of predicted features corresponding to different sample content nodes;
And the training unit is used for updating and training the graph neural network based on the correlation loss and the contrast loss.
Optionally, the first determining unit is further configured to:
calculating a positive correlation value between the predicted content characteristics corresponding to the positive sample content node pair;
calculating a negative correlation value between the predicted content characteristics corresponding to the negative sample content node pair;
the correlation loss is determined based on the positive correlation value and the negative correlation value.
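As a rough illustration of how a correlation loss could be derived from the positive and negative correlation values, the sketch below assumes cosine similarity as the correlation value and a margin-based hinge as the loss. Both choices, as well as the `margin` parameter, are assumptions made for this example and are not stated in the embodiment.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]
FeaturePair = Tuple[Vector, Vector]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def correlation_loss(positive_pairs: List[FeaturePair],
                     negative_pairs: List[FeaturePair],
                     margin: float = 0.5) -> float:
    """Hinge loss pushing positive correlation above negative correlation."""
    positive = sum(cosine(u, v) for u, v in positive_pairs) / len(positive_pairs)
    negative = sum(cosine(u, v) for u, v in negative_pairs) / len(negative_pairs)
    # The loss is zero once positives exceed negatives by at least the margin.
    return max(0.0, margin - positive + negative)
```

With this formulation the loss decreases as the predicted content features of associated contents move closer together and those of unrelated contents move apart.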
Optionally, the apparatus further includes:
the node determining module is used for determining two connected sample content nodes in the content heterogeneous graph as the positive sample content node pair;
and the node replacement module is used for replacing a single node in the positive sample content node pair with other content nodes to obtain the negative sample content node pair, wherein the other content nodes are not connected with the non-replaced node in the positive sample content node pair in the content heterogeneous graph.
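The pair construction handled by these two modules can be sketched as below. The edge-list input, the function names, and the choice of always keeping the first node of each edge and replacing the second are assumptions made for illustration only.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence, Set, Tuple

Pair = Tuple[str, str]


def build_adjacency(edges: Sequence[Pair]) -> Dict[str, Set[str]]:
    """Undirected adjacency between content nodes."""
    adjacency: Dict[str, Set[str]] = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    return adjacency


def sample_pairs(edges: Sequence[Pair],
                 nodes: Sequence[str],
                 rng: random.Random) -> Tuple[List[Pair], List[Pair]]:
    """Connected nodes form positive pairs; negatives replace one node."""
    adjacency = build_adjacency(edges)
    positives: List[Pair] = []
    negatives: List[Pair] = []
    for kept, replaced in edges:
        positives.append((kept, replaced))
        # Candidate replacements must not be connected to the kept node.
        candidates = [n for n in sorted(nodes)
                      if n != kept and n not in adjacency[kept]]
        if candidates:
            negatives.append((kept, rng.choice(candidates)))
    return positives, negatives
```

Passing a seeded `random.Random` instance keeps the sampled negative pairs reproducible between runs.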
Optionally, the second determining unit is further configured to:
calculating a first correlation value between the first pair of content features;
calculating a second correlation value between the second pair of content features;
and determining the contrast loss based on each first correlation value and each second correlation value, wherein different first correlation values correspond to different first content feature pairs, and different second correlation values correspond to different second content feature pairs.
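One common way to combine such first and second correlation values is an InfoNCE-style objective. The sketch below assumes that form, with cosine similarity as the correlation value and a `temperature` parameter; these are illustrative assumptions and the embodiment does not prescribe a specific formula. Here `base[i]` and `attr[i]` stand for the predicted content basic feature and predicted content attribute feature of sample content node i.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def contrast_loss(base: List[Vector],
                  attr: List[Vector],
                  temperature: float = 0.5) -> float:
    """Mean InfoNCE-style loss over all sample content nodes."""
    total = 0.0
    for i, base_feature in enumerate(base):
        # scores[i] is the first correlation value (same node);
        # the remaining scores are second correlation values (different nodes).
        scores = [cosine(base_feature, a) / temperature for a in attr]
        log_denominator = math.log(sum(math.exp(s) for s in scores))
        total += log_denominator - scores[i]
    return total / len(base)
```

The loss is small when the two feature types of the same node agree with each other more than with the features of other nodes, which is the behavior the contrast loss is meant to encourage.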
Optionally, the content attribute features include at least one content attribute feature corresponding to the content attribute;
the first fusing module 904 includes:
the mask processing unit is used for carrying out mask processing on the content attribute characteristics, and the mask processing is used for hiding part of the content attribute characteristics;
and the fusion unit is used for carrying out feature fusion on the content basic features and the mask processed content attribute features to obtain the target content features.
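A minimal sketch of mask processing followed by feature fusion is given below. It assumes that masking zeroes out whole attribute features with probability `drop_prob` and that fusion is plain concatenation; both are assumptions chosen for illustration, since the embodiment only states that part of the content attribute features is hidden.

```python
import random
from typing import Dict, List

Vector = List[float]


def mask_attribute_features(attribute_features: Dict[str, Vector],
                            drop_prob: float,
                            rng: random.Random) -> Dict[str, Vector]:
    """Hide each attribute feature with probability drop_prob (set to zeros)."""
    masked: Dict[str, Vector] = {}
    for name, vector in attribute_features.items():
        if rng.random() < drop_prob:
            masked[name] = [0.0] * len(vector)
        else:
            masked[name] = list(vector)
    return masked


def fuse(base_feature: Vector,
         attribute_features: Dict[str, Vector]) -> Vector:
    """Concatenate the basic feature with attribute features in a fixed order."""
    fused = list(base_feature)
    for name in sorted(attribute_features):
        fused.extend(attribute_features[name])
    return fused
```

Hiding part of the attribute features during training is a form of regularization: the fused target content feature cannot rely on any single attribute always being present.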
Optionally, the apparatus further includes:
the first construction module is used for constructing the homogeneous element path based on the association information among the contents;
and the second construction module is used for constructing the heterogeneous element path based on the attribute information of each content, wherein the contents corresponding to the content nodes in the same heterogeneous element path have the same attribute.
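The construction of heterogeneous element paths from attribute information can be illustrated as follows. The sketch assumes a content-attribute-content path shape and a nested dictionary input of the form `attributes[content_id][attribute_type] = [values]`; these names and the data layout are hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, List, Set, Tuple

Path = Tuple[str, str, str]


def build_heterogeneous_paths(
        attributes: Dict[str, Dict[str, List[str]]]) -> List[Path]:
    """Link every two contents that share an attribute value."""
    contents_by_attribute: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for content_id, typed_values in attributes.items():
        for attribute_type, values in typed_values.items():
            for value in values:
                contents_by_attribute[(attribute_type, value)].add(content_id)

    paths: List[Path] = []
    for attribute_type, value in sorted(contents_by_attribute):
        sharing = sorted(contents_by_attribute[(attribute_type, value)])
        # Each pair of contents sharing the attribute yields one path
        # of the form content -> attribute node -> content.
        for first, second in combinations(sharing, 2):
            paths.append((first, f"{attribute_type}:{value}", second))
    return paths
```

For example, two videos with the same director are joined through that director's attribute node, even if they never co-occur in any association information.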
Optionally, the first building module includes:
a third determining unit configured to determine a mutual transition probability between contents based on a number of co-occurrence times between contents, the number of co-occurrence times being used to indicate a number of times that the contents have the same association information, the mutual transition probability being used to indicate an association between the contents;
A fourth determining unit configured to determine co-occurrence content based on a mutual transition probability between the content, the mutual transition probability between the co-occurrence content being higher than a mutual transition probability between other content;
and the construction unit is used for constructing the homogeneous element path based on the co-occurrence content.
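The following sketch shows one possible way to turn co-occurrence counts into mutual transition probabilities and then select co-occurring content. The formula used here, the product of the two one-directional transition probabilities, and the `threshold` parameter are assumptions for illustration; this section of the embodiment does not give the exact formula.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Pair = Tuple[str, str]


def mutual_transition_probabilities(
        cooccurrence: Dict[Pair, int]) -> Dict[Pair, float]:
    """Assumed form: P(a -> b) * P(b -> a), each from co-occurrence counts."""
    totals: Dict[str, int] = defaultdict(int)
    for (a, b), count in cooccurrence.items():
        totals[a] += count
        totals[b] += count

    mutual: Dict[Pair, float] = {}
    for (a, b), count in cooccurrence.items():
        # P(a -> b) = count(a, b) / total co-occurrences involving a.
        mutual[(a, b)] = (count / totals[a]) * (count / totals[b])
    return mutual


def select_cooccurring(mutual: Dict[Pair, float],
                       threshold: float) -> List[Pair]:
    """Keep the pairs whose mutual transition probability reaches the threshold."""
    return sorted(pair for pair, p in mutual.items() if p >= threshold)
```

The selected pairs would then become edges between content nodes, from which the homogeneous element paths are built.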
Optionally, the content recommendation module 905 includes:
a fifth determining unit, configured to determine similar content features in a content feature index based on the target content features, where the content feature index includes content features corresponding to each content;
and the content recommendation unit is used for recommending the content based on the similar content characteristics.
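The lookup of similar content features in a content feature index can be sketched with a brute-force search, as below. Cosine similarity and the exhaustive scan are assumptions for illustration; a deployed system would more likely use an approximate nearest-neighbor index, which the embodiment does not specify.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend(target_id: str,
              feature_index: Dict[str, Vector],
              k: int) -> List[str]:
    """Return the ids of the k contents most similar to the target content."""
    target = feature_index[target_id]
    scored = [(cosine(target, vector), content_id)
              for content_id, vector in feature_index.items()
              if content_id != target_id]
    # Highest similarity first; ties are broken by content id for determinism.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [content_id for _, content_id in scored[:k]]
```

Because the target content features already fuse association and attribute information, a single similarity search over the index takes both into account.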
Optionally, when the content is video content, the content attribute includes at least one of a video tag, a video director, and a video actor;
when the content is the article content, the content attribute comprises at least one of article labels, article authors and article sources;
when the content is commodity, the content attribute comprises at least one of commodity label, commodity merchant and commodity price.
In the embodiment of the application, the homogeneous element path and the heterogeneous element path in the content heterogeneous graph are used to separately extract features of the target content node corresponding to the target content, and the extracted features are fused to obtain the target content feature that finally represents the target content. Because the target content feature is determined by fusing the association relationships between contents with the attribute information of the contents, it carries multiple kinds of information and represents the target content more accurately. As a result, content recommendation based on the target content feature takes both the association relationships between contents and the attribute information of the contents into account, which helps to improve the accuracy of content recommendation.
Referring to fig. 10, a schematic structural diagram of a computer device according to an exemplary embodiment of the present application is shown. Specifically, the computer device 1000 includes a central processing unit (Central Processing Unit, CPU) 1001, a system memory 1004 including a random access memory 1002 and a read-only memory 1003, and a system bus 1005 connecting the system memory 1004 and the central processing unit 1001. The computer device 1000 also includes a basic input/output system (I/O system) 1006, which helps to transfer information between various devices within the computer, and a mass storage device 1007 for storing an operating system 1013, application programs 1014, and other program modules 1015.
In some embodiments, the basic input/output system 1006 includes a display 1008 for displaying information and an input device 1009, such as a mouse or a keyboard, for a user to input information. The display 1008 and the input device 1009 are both connected to the central processing unit 1001 via an input/output controller 1010 connected to the system bus 1005. The basic input/output system 1006 may also include the input/output controller 1010 for receiving and processing input from a number of other devices, such as a keyboard, a mouse, or an electronic stylus. Similarly, the input/output controller 1010 also provides output to a display screen, a printer, or another type of output device.
The mass storage device 1007 is connected to the central processing unit 1001 through a mass storage controller (not shown) connected to the system bus 1005. The mass storage device 1007 and its associated computer-readable media provide non-volatile storage for the computer device 1000. That is, the mass storage device 1007 may include a computer readable medium (not shown) such as a hard disk or drive.
Without loss of generality, the computer readable medium may include computer storage media and communication media. Computer storage media include volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. Computer storage media include random access memory (RAM), read-only memory (ROM), flash memory or other solid state memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices. Of course, those skilled in the art will recognize that the computer storage media are not limited to those described above. The system memory 1004 and the mass storage device 1007 described above may be collectively referred to as memory.
The memory stores one or more programs configured to be executed by the one or more central processing units 1001, the one or more programs containing instructions for implementing the methods described above, the central processing unit 1001 executing the one or more programs to implement the methods provided by the various method embodiments described above.
According to various embodiments of the application, the computer device 1000 may also operate by being connected to a remote computer on a network, such as the Internet. That is, the computer device 1000 may be connected to the network 1012 through a network interface unit 1011 connected to the system bus 1005, or the network interface unit 1011 may be used to connect to other types of networks or remote computer systems (not shown).
The memory also includes one or more programs stored in the memory, the one or more programs including steps for performing the methods provided by the embodiments of the present application, as performed by the computer device.
The embodiment of the present application also provides a computer readable storage medium in which at least one instruction, at least one program, a code set, or an instruction set is stored, where the at least one instruction, the at least one program, the code set, or the instruction set is loaded and executed by a processor to implement the content recommendation method described in any one of the embodiments above.
Embodiments of the present application provide a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions so that the computer device performs the content recommendation method provided in the above aspect.
Those of ordinary skill in the art will appreciate that all or part of the steps in the various methods of the above embodiments may be implemented by a program for instructing related hardware, and the program may be stored in a computer readable storage medium, which may be a computer readable storage medium included in the memory of the above embodiments; or may be a computer-readable storage medium, alone, that is not incorporated into the terminal. The computer readable storage medium stores at least one instruction, at least one program, a code set, or an instruction set, where the at least one instruction, the at least one program, the code set, or the instruction set is loaded and executed by a processor to implement the content recommendation method according to any of the method embodiments described above.
Optionally, the computer-readable storage medium may include: ROM, RAM, a solid state drive (SSD), an optical disc, or the like. The RAM may include, among other things, resistive random access memory (ReRAM) and dynamic random access memory (DRAM). The foregoing embodiment numbers of the present application are merely for the purpose of description, and do not represent the advantages or disadvantages of the embodiments.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program for instructing relevant hardware, where the program may be stored in a computer readable storage medium, and the storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The foregoing description of the preferred embodiments of the present application is not intended to limit the application, but is intended to cover all modifications, equivalents, alternatives, and improvements falling within the spirit and principles of the application.

Claims (18)

1. A content recommendation method, the method comprising:
obtaining a content heterogeneous diagram, wherein the content heterogeneous diagram comprises a homogeneous element path and a heterogeneous element path, the homogeneous element path is formed by content nodes corresponding to content, and the heterogeneous element path is formed by the content nodes and attribute nodes corresponding to content attributes;
Performing feature extraction on a target content node based on the homogeneous element path to obtain content basic features, wherein the target content node belongs to the content node;
performing feature extraction on the target content node based on the heterogeneous element path to obtain content attribute features;
performing feature fusion on the content basic feature and the content attribute feature to obtain a target content feature of the target content node;
and recommending the content based on the target content characteristics.
2. The method according to claim 1, wherein the performing feature extraction on the target content node based on the homogeneous element path to obtain content basic features comprises:
performing graph convolution on the target content node according to the homogeneous element path to obtain the content basic feature;
and the performing feature extraction on the target content node based on the heterogeneous element path to obtain content attribute features comprises:
performing graph convolution on the target content node according to the heterogeneous element path to obtain the content attribute features.
3. The method of claim 2, wherein the performing graph convolution on the target content node according to the homogeneous element path to obtain the content basic feature comprises:
Selecting at least one first-order neighbor content node corresponding to the target content node based on the homogeneous element path;
and performing feature aggregation on the neighbor node features of the at least one first-order neighbor content node and the target node features of the target content node to obtain the content basic features.
4. A method according to claim 3, wherein the at least one first-order neighbor content node comprises a first-order neighbor node and a second-order neighbor node of the target content node;
the feature aggregation is performed on the neighbor node features of the at least one first-order neighbor content node and the target node features of the target content node to obtain the content basic feature, and the method comprises the following steps:
performing feature aggregation on the node features of the first-order neighbor nodes and the node features of the second-order neighbor nodes to obtain neighbor aggregation features;
and performing feature aggregation on the neighbor aggregation features and the target node features to obtain the content basic features.
5. The method of claim 2, wherein the graph convolution process is performed by a convolution layer in a graph neural network, the method further comprising:
carrying out graph convolution on the sample content nodes according to the homogeneous element paths to obtain predicted content basic characteristics;
Carrying out graph convolution on the sample content nodes according to the heterogeneous element paths to obtain predicted content attribute characteristics;
performing feature fusion on the predicted content basic features and the predicted content attribute features to obtain predicted content features;
and updating and training the graph neural network based on the predicted content basic characteristics, the predicted content attribute characteristics and the predicted content characteristics of each sample content node.
6. The method of claim 5, wherein the updating training the graph neural network based on the predicted content base features, the predicted content attribute features, and the predicted content features for each sample content node comprises:
determining a correlation loss based on the predicted content characteristics of a positive sample content node pair and a negative sample content node pair, wherein the positive sample content node pair comprises content nodes corresponding to two sample contents with association relation or same attribute, and the negative sample content node pair comprises content nodes corresponding to two sample contents without association relation and same attribute;
determining a contrast loss based on a first content feature pair and a second content feature pair, wherein the first content feature pair is a feature pair consisting of different types of predicted features corresponding to the same sample content node, the different types of predicted features are the predicted content basic features or the predicted content attribute features, and the second content feature pair is a feature pair consisting of different types of predicted features corresponding to different sample content nodes;
And based on the correlation loss and the contrast loss, updating and training the graph neural network.
7. The method of claim 6, wherein the determining a correlation loss based on the predicted content characteristics of a positive sample content node pair and a negative sample content node pair comprises:
calculating a positive correlation value between the predicted content characteristics corresponding to the positive sample content node pair;
calculating a negative correlation value between the predicted content characteristics corresponding to the negative sample content node pair;
the correlation loss is determined based on the positive correlation value and the negative correlation value.
8. The method of claim 6, wherein the method further comprises:
determining two connected sample content nodes in the content heterogeneous graph as the positive sample content node pair;
and replacing a single node in the positive sample content node pair with other content nodes to obtain the negative sample content node pair, wherein the other content nodes are not connected with the non-replaced node in the positive sample content node pair in the content heterogeneous graph.
9. The method of claim 6, wherein determining a contrast loss based on the first pair of content features and the second pair of content features comprises:
Calculating a first correlation value between the first pair of content features;
calculating a second correlation value between the second pair of content features;
and determining the contrast loss based on each first correlation value and each second correlation value, wherein different first correlation values correspond to different first content feature pairs, and different second correlation values correspond to different second content feature pairs.
10. The method according to any one of claims 1 to 9, wherein the content attribute features comprise content attribute features corresponding to at least one content attribute;
and performing feature fusion on the content basic feature and the content attribute feature to obtain a target content feature of the target content node, wherein the method comprises the following steps:
performing mask processing on the content attribute features, wherein the mask processing is used for hiding part of the content attribute features;
and carrying out feature fusion on the content basic features and the content attribute features after mask processing to obtain the target content features.
11. The method according to any one of claims 1 to 9, wherein prior to the obtaining the content heterogram, the method further comprises:
constructing the homogeneous element path based on the association information among the contents;
and constructing the heterogeneous element path based on the attribute information of each content, wherein the contents corresponding to the content nodes in the same heterogeneous element path have the same attribute.
12. The method of claim 11, wherein the constructing the homogeneous element path based on association information between respective contents comprises:
determining the mutual transition probability between the contents based on the co-occurrence times between the contents, wherein the co-occurrence times are used for indicating the times of the same association information between the contents, and the mutual transition probability is used for indicating the association between the contents;
determining co-occurrence content based on the mutual transition probability between the content, wherein the mutual transition probability between the co-occurrence content is higher than the mutual transition probability between other content;
and constructing the homogeneous element path based on the co-occurrence content.
13. The method according to any one of claims 1 to 9, wherein said content recommendation based on said target content characteristics comprises:
determining similar content characteristics in a content characteristic index based on the target content characteristics, wherein the content characteristic index comprises content characteristics corresponding to each content;
and recommending the content based on the similar content characteristics.
14. The method according to any one of claims 1 to 9, wherein,
when the content is video content, the content attribute comprises at least one of a video tag, a video director and a video actor;
when the content is the article content, the content attribute comprises at least one of article labels, article authors and article sources;
when the content is commodity, the content attribute comprises at least one of commodity label, commodity merchant and commodity price.
15. A content recommendation device, the device comprising:
the acquisition module is used for acquiring a content heterogeneous graph, wherein the content heterogeneous graph comprises a homogeneous element path and a heterogeneous element path, the homogeneous element path is formed by content nodes corresponding to content, and the heterogeneous element path is formed by the content nodes and attribute nodes corresponding to content attributes;
the first extraction module is used for extracting characteristics of a target content node based on the homogeneous element path to obtain content basic characteristics, wherein the target content node belongs to the content node;
the second extraction module is used for extracting the characteristics of the target content node based on the heterogeneous element path to obtain content attribute characteristics;
The feature fusion module is used for carrying out feature fusion on the content basic features and the content attribute features to obtain target content features of the target content node;
and the content recommendation module is used for recommending the content based on the target content characteristics.
16. A computer device comprising a processor and a memory having stored therein at least one instruction, at least one program, code set or instruction set that is loaded and executed by the processor to implement the content recommendation method of any one of claims 1 to 14.
17. A computer readable storage medium having stored therein at least one instruction, at least one program, code set, or instruction set, the at least one instruction, the at least one program, the code set, or instruction set being loaded and executed by a processor to implement the content recommendation method of any one of claims 1 to 14.
18. A computer program product, characterized in that it comprises computer instructions stored in a computer-readable storage medium, from which computer instructions a processor of a computer device reads, the processor executing the computer instructions to implement the content recommendation method according to any of claims 1 to 14.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210153239.7A CN116662637A (en) 2022-02-18 2022-02-18 Content recommendation method, device, apparatus, storage medium and program product

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210153239.7A CN116662637A (en) 2022-02-18 2022-02-18 Content recommendation method, device, apparatus, storage medium and program product

Publications (1)

Publication Number Publication Date
CN116662637A true CN116662637A (en) 2023-08-29

Family

ID=87717641

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210153239.7A Pending CN116662637A (en) 2022-02-18 2022-02-18 Content recommendation method, device, apparatus, storage medium and program product

Country Status (1)

Country Link
CN (1) CN116662637A (en)

Similar Documents

Publication Publication Date Title
JP7104244B2 (en) User tag generation method and its devices, computer programs and computer equipment
US20230208793A1 (en) Social media influence of geographic locations
CN107220365B (en) Accurate recommendation system and method based on collaborative filtering and association rule parallel processing
CN112364204B (en) Video searching method, device, computer equipment and storage medium
CN111429161B (en) Feature extraction method, feature extraction device, storage medium and electronic equipment
CN113761359B (en) Data packet recommendation method, device, electronic equipment and storage medium
WO2024021685A1 (en) Reply content processing method and media content interactive content interaction method
CN111522979B (en) Picture sorting recommendation method and device, electronic equipment and storage medium
CN115344698A (en) Label processing method, label processing device, computer equipment, storage medium and program product
CN113656589B (en) Object attribute determining method, device, computer equipment and storage medium
CN113821676A (en) Video retrieval method, device, equipment and storage medium
CN112948681A (en) Time series data recommendation method fusing multi-dimensional features
CN110851708B (en) Negative sample extraction method, device, computer equipment and storage medium
CN117251622A (en) Method, device, computer equipment and storage medium for recommending objects
CN116561443A (en) Item recommendation method and device for double-message propagation diagram based on attribute expansion
CN117216362A (en) Content recommendation method, device, apparatus, medium and program product
CN115809339A (en) Cross-domain recommendation method, system, device and storage medium
CN117033754A (en) Model processing method, device, equipment and storage medium for pushing resources
CN117688390A (en) Content matching method, apparatus, computer device, storage medium, and program product
KR20230148523A (en) Multimedia recommendation method and system preserving the unique characteristics of modality
Mohamad et al. Collaborative filtering approach for movie recommendations
CN116662637A (en) Content recommendation method, device, apparatus, storage medium and program product
CN114090848A (en) Data recommendation and classification method, feature fusion model and electronic equipment
CN111414538A (en) Text recommendation method and device based on artificial intelligence and electronic equipment
Chu et al. Towards a sparse low-rank regression model for memorability prediction of images

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 40091023; Country of ref document: HK)