WO2020147594A1 - Method, system, and device for obtaining expression of relationship between entities, and advertisement retrieval system - Google Patents


Info

Publication number
WO2020147594A1
WO2020147594A1 · PCT/CN2020/070249 · CN2020070249W
Authority
WO
WIPO (PCT)
Prior art keywords
heterogeneous
node
nodes
graph
sample data
Prior art date
Application number
PCT/CN2020/070249
Other languages
French (fr)
Chinese (zh)
Inventor
Chen Yiran (陈怡然)
Wen Shiyang (温世阳)
Wu Wenjin (吴文金)
Lin Wei (林伟)
Zhu Xiaoyu (朱晓宇)
Original Assignee
Alibaba Group Holding Limited (阿里巴巴集团控股有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Limited
Publication of WO2020147594A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9024 Graphs; Linked lists
    • G06F16/20 Information retrieval of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0623 Item investigation
    • G06Q30/0625 Directed, with specific intent or strategy

Definitions

  • The present invention relates to the technical field of data mining, and in particular to a method, system, and device for obtaining expressions of relationships between entities, and to an advertisement recall system.
  • the inventor of the present invention found:
  • a graph is composed of nodes and edges.
  • a node is used to represent an entity, and the edge between nodes is used to represent the relationship between nodes.
  • A graph generally includes more than two nodes and more than one edge, so it can also be understood as a collection of nodes together with a collection of edges, usually written G(V, E), where G is the graph, V is the set of nodes in G, and E is the set of edges in G.
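The G(V, E) formulation above can be sketched directly in code. This is a minimal illustration, not the patent's data structure; all names are invented for the example.

```python
# A graph G(V, E): a node set V and an edge set E of weighted pairs.
V = {"q1", "item1", "ad1"}                      # nodes: entities
E = {("q1", "item1", 3.0), ("q1", "ad1", 1.0)}  # edges: (u, v, weight)

def neighbors(node, edges):
    """Return the neighbors of `node` in an undirected weighted edge set."""
    out = set()
    for u, v, _w in edges:
        if u == node:
            out.add(v)
        elif v == node:
            out.add(u)
    return out

print(neighbors("q1", E))  # {'item1', 'ad1'}
```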
  • Graphs can be divided into homogeneous graphs and heterogeneous graphs.
  • A heterogeneous graph is a graph with different types of nodes (the types of edges can be the same or different) or different types of edges (the types of nodes can be the same or different). Therefore, when many kinds of entities need to be expressed with multiple types of nodes, or the relationships between entities need to be expressed with multiple types of edges, it is preferable to express these entities and their relationships through a heterogeneous graph.
  • When the number of nodes and edges in a heterogeneous graph is very large, the graph becomes extremely complex and its data volume very large. Reducing the complexity and data volume of heterogeneous graph processing is therefore a technical problem faced by those skilled in the art.
  • The present invention is proposed to provide a method, system, and device for obtaining expressions of relationships between entities, and an advertisement recall system, that overcome or at least partially solve the above problems.
  • the embodiment of the present invention provides an advertisement recall system, including a system for obtaining relationship expressions between entities and an advertisement recall matching system;
  • The system for obtaining expressions of relationships between entities is used to construct a heterogeneous graph for advertisement search scenarios; the node types in the heterogeneous graph include at least one of advertisements, commodities, and query words, and the edge types include at least one of click edges, co-click edges, collaborative filtering edges, content-semantically-similar edges, and attribute-similar edges;
  • the preset graph convolution model learns a batch of sample data according to heterogeneous subgraphs to obtain the vector expression of nodes in heterogeneous subgraphs.
  • a graph convolution model corresponds to a heterogeneous subgraph;
  • the preset aggregation model is based on sample data and aggregates the vector expressions of the same node in different heterogeneous subgraphs to obtain the same vector expression of the same node in different heterogeneous subgraphs;
  • the preset loss function optimizes the parameters of the model based on the sample data and the same vector expression of the same node;
  • a node in the heterogeneous graph corresponds to an entity in the sample data;
  • The advertisement recall matching system is configured to use the low-dimensional vector expressions of query-word nodes, commodity nodes, and search-advertisement nodes obtained by the system for obtaining inter-entity relationship expressions to determine the matching degree among query-word nodes, commodity nodes, and search-advertisement nodes, and to select, according to the matching degree, search advertisements that match the commodity and the query words to a set requirement.
  • a meta-path corresponds to a heterogeneous subgraph
  • That the meta-path is used to express the structure of the heterogeneous subgraph and the node types and edge types included in the heterogeneous subgraph means, specifically, that one meta-path expresses the structure of one heterogeneous subgraph and the node types and edge types included in that subgraph;
  • the splitting the heterogeneous graph into at least two heterogeneous subgraphs according to the preset meta-path specifically includes:
  • The system for obtaining expressions of relationships between entities uses a preset graph convolution model to learn the sample data according to the heterogeneous subgraphs to obtain vector expressions of the nodes in the heterogeneous subgraphs, specifically including:
  • the preset graph convolution model obtains the vector expressions of the nodes in the heterogeneous subgraph according to the attribute information of each node in the heterogeneous subgraph and the structure information and attribute information of at least the first-order neighbor nodes of each node.
  • That the system for obtaining expressions of relationships between entities aggregates, through a preset aggregation model and based on the sample data, the vector expressions of the same node in different heterogeneous subgraphs to obtain the same vector expression of the same node in different heterogeneous subgraphs includes:
  • the preset aggregation model, based on the sample data, uses attention-mechanism aggregation learning, fully connected aggregation learning, or weighted-average aggregation learning to aggregate the vector expressions of the same node in different heterogeneous subgraphs to obtain the same vector expression of the same node in the different heterogeneous subgraphs.
  • the advertisement recall matching system determining the degree of matching among query term nodes, commodity nodes and search advertisement nodes includes:
  • the virtual request node is a virtual node constructed from a query-word node and the commodity nodes previously clicked by users under that same query word;
  • the matching degree between the query term node, the product node and the search advertisement node is determined.
  • That the advertisement recall matching system selects, according to the matching degree, search advertisements that match the commodity and the query word includes: selecting search advertisements whose distance meets the set requirement.
  • the embodiment of the present invention also provides a method for obtaining expressions of relationships between entities, including:
  • the preset graph convolution model learns a batch of sample data according to heterogeneous subgraphs to obtain the vector expression of nodes in heterogeneous subgraphs.
  • a graph convolution model corresponds to a heterogeneous subgraph;
  • the preset aggregation model is based on sample data and aggregates the vector expressions of the same node in different heterogeneous subgraphs to obtain the same vector expression of the same node in different heterogeneous subgraphs;
  • the preset loss function optimizes the parameters of the model based on the same vector expression of the sample data and the same node;
  • a node in the heterogeneous graph corresponds to an entity in the sample data.
  • a meta-path corresponds to a heterogeneous subgraph
  • That the meta-path is used to express the structure of the heterogeneous subgraph and the node types and edge types included in the heterogeneous subgraph means, specifically, that one meta-path expresses the structure of one heterogeneous subgraph and the node types and edge types included in that subgraph;
  • the splitting the heterogeneous graph into at least two heterogeneous subgraphs according to the preset meta-path specifically includes:
  • That one meta-path is used to express the structure of a heterogeneous subgraph and the node types and edge types included in the heterogeneous subgraph is specifically: a meta-path includes node types and edge types alternately arranged in order, with a node type first and last; the order of the node types and edge types expresses the structure of the heterogeneous subgraph;
  • the splitting a heterogeneous graph into at least two heterogeneous subgraphs according to at least two preset meta-paths specifically includes:
  • the preset graph convolution model learns the sample data according to the heterogeneous subgraph to obtain the vector expression of the nodes in the heterogeneous subgraph, which specifically includes:
  • the preset graph convolution model learns the sample data according to the attribute information of each node in the heterogeneous subgraph and the structure information and attribute information of at least the first-order neighbor nodes of each node, to obtain the vector expressions of the nodes in the heterogeneous subgraph.
  • That the preset graph convolution model learns the sample data according to the attribute information of each node in the heterogeneous subgraph and the structure information and attribute information of at least the first-order neighbor nodes of each node, to obtain the vector expression of each node in the heterogeneous subgraph, specifically includes:
  • the preset graph convolution model performs an N-layer convolution operation according to the attribute information of the node and the attribute information and structure information of the first to Nth-order neighbor nodes to obtain the vector expression of the node.
  • That the preset graph convolution model learns the sample data according to the attribute information of each node in the heterogeneous subgraph and the structure information and attribute information of at least the first-order neighbor nodes of each node, to obtain the vector expression of each node in the heterogeneous subgraph, specifically includes:
  • the preset graph convolution model performs an N-layer convolution operation according to the attribute information of the node and the attribute information and structure information of the first to Nth-order neighbor nodes after sampling to obtain the vector expression of the node.
  • That the preset aggregation model aggregates, based on the sample data, the vector expressions of the same node in different heterogeneous subgraphs to obtain the same vector expression of the same node in different heterogeneous subgraphs is specifically:
  • the preset aggregation model, based on the sample data, uses an attention mechanism, a fully connected aggregation mechanism, or a weighted-average aggregation mechanism to aggregate the vector expressions of the same node in different heterogeneous subgraphs to obtain the same vector expression of the same node.
  • the embodiment of the present invention also provides a system for obtaining expressions of relationships between entities, including: a registration device, a storage device, a calculation device, and a parameter exchange device;
  • the storage device is used to store the data of the heterogeneous subgraphs;
  • the computing device is used to obtain the data of the heterogeneous subgraphs from the storage device through the registration device, and to learn the sample data based on the heterogeneous graph using the above method of obtaining expressions of relationships between entities, to obtain the low-dimensional vector expression of each node in the heterogeneous graph;
  • the parameter exchange device is used for parameter interaction with the computing device.
  • The graph convolution models are used to learn the sample data; the vector expressions of the same node obtained from the learning of each heterogeneous subgraph are fused, and the fusion result optimizes the parameters of the machine learning model, which are then used to learn the next batch of samples. This realizes iterative learning of the samples and finally yields low-dimensional vector expressions for the nodes in the heterogeneous graph, thereby reducing the complexity and data volume of the heterogeneous graph learning process and improving the speed and efficiency of heterogeneous graph learning.
  • When this heterogeneous graph learning method is used in the advertisement search scenario, it mines the entity relationships in that scenario so that a large amount of information can be used to accurately recall advertisements and improve the quality of advertisement recall. Using all advertisements as candidates ensures that enough advertisements can be recalled under any traffic, and the vector-based method achieves this in one step.
  • FIG. 1 is a flowchart of a method for obtaining expressions of relationships between entities in Embodiment 1 of the present invention
  • FIG. 2 is a schematic diagram of the principle of a method for obtaining expressions of relationships between entities in Embodiment 2 of the present invention
  • FIG. 3 is a flowchart of a method for obtaining expressions of relationships between entities in Embodiment 2 of the present invention;
  • Figure 4a is an exemplary diagram of a heterogeneous graph constructed in Embodiment 2 of the present invention.
  • Figure 4b is another example diagram of a heterogeneous graph constructed in Embodiment 2 of the present invention.
  • FIG. 5 is an exemplary diagram of splitting a heterogeneous graph into heterogeneous subgraphs in Embodiment 2 of the present invention.
  • FIG. 6 is an example diagram of a convolutional network model of heterogeneous subgraphs in the second embodiment of the present invention.
  • Fig. 7 is an example diagram of neighbor node sampling in the second embodiment of the present invention.
  • FIG. 8 is a schematic structural diagram of a system for obtaining an expression of a relationship between entities in an embodiment of the present invention.
  • Fig. 9 is a schematic structural diagram of an advertisement recall system in an embodiment of the present invention.
  • Graph learning has a wide range of applications in mining various data relationships in the real world. For example, it is used in search advertising platforms to mine the correlation between search requests and advertisements and click-through-rate (CTR). That is to say, the method of the present invention can be used in the field of advertisement search for the recall of search advertisements.
  • Search advertising refers to advertisements for which advertisers determine relevant keywords based on the content and characteristics of their products or services, write the advertising content, and independently set bids, the advertisements then being displayed in the search results corresponding to those keywords.
  • Search ads recall refers to the selection of the most relevant ads from a large collection of ads through a certain algorithm or model.
  • Existing search-ad recall technologies may screen "high-quality" advertisements based on the degree of matching between query words and advertiser bid words, the advertiser's purchase price, and users' statistical preferences for advertisements; or they may add each user's historical behavior data to perform personalized matching recall of ads.
  • The inventor found in researching the prior art that existing recall technologies either emphasize only the matching degree between the advertisement and the query word, or emphasize only improving recall advertisement revenue, and lack an integrated model that balances the two. Since the quality of advertisement recall is very important to search advertising revenue and user experience, the inventor provides a graph learning technology that can be used to obtain expressions of relationships between entities in the advertisement recall process and can obtain an ad recall set that is of higher quality and of greater interest to users.
  • the first embodiment of the present invention provides a method for obtaining expressions of relationships between entities.
  • the process is shown in FIG. 1, and includes the following steps:
  • Step S101 Divide the pre-built heterogeneous graph into at least two heterogeneous subgraphs according to the pre-defined meta-path.
  • The meta-path is used to express the structure of the heterogeneous subgraph and the types of nodes and edges included in the heterogeneous subgraph.
  • a meta path corresponds to a heterogeneous subgraph.
  • That the meta-path is used to express the structure of the heterogeneous subgraph and the node types and edge types included in the heterogeneous subgraph means, specifically: one meta-path is used to express the structure of one heterogeneous subgraph and the node types and edge types included in that subgraph.
  • A meta-path includes node types and edge types alternately arranged in order, with a node type first and last; the order of the node types and edge types expresses the structure of the heterogeneous subgraph.
  • Splitting the heterogeneous graph into at least two heterogeneous subgraphs according to a preset meta-path specifically includes splitting the heterogeneous graph into at least two heterogeneous subgraphs according to at least two preset meta-paths.
  • Nodes of the corresponding types are obtained from the heterogeneous graph according to the node types included in the meta-path; edges that meet the requirements are obtained from the heterogeneous graph according to the types of the edges connecting adjacent nodes; the obtained nodes and edges form the heterogeneous subgraph corresponding to the meta-path.
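The splitting step above can be sketched as follows. This is an illustrative sketch only: the node-type and edge-type names are invented for the example, and the patent does not specify this data layout.

```python
from collections import namedtuple

Edge = namedtuple("Edge", "u v etype weight")

# Toy heterogeneous graph: a node -> type map plus a typed edge list.
node_type = {"q1": "Query", "i1": "Item", "a1": "Ad", "a2": "Ad"}
edges = [
    Edge("q1", "i1", "click", 5.0),
    Edge("i1", "a1", "co_click", 2.0),
    Edge("a1", "a2", "attr_sim", 1.0),
]

def split_by_metapath(node_type, edges, allowed_node_types, allowed_edge_types):
    """Keep only nodes whose type appears in the meta-path and edges of the
    meta-path's edge types whose endpoints both survive; the result is one
    heterogeneous subgraph corresponding to that meta-path."""
    keep_nodes = {n for n, t in node_type.items() if t in allowed_node_types}
    keep_edges = [e for e in edges
                  if e.etype in allowed_edge_types
                  and e.u in keep_nodes and e.v in keep_nodes]
    return keep_nodes, keep_edges

# E.g. a meta-path Item/Ad - co_click - Item/Ad - attr_sim - Item/Ad:
nodes_a, edges_a = split_by_metapath(
    node_type, edges,
    allowed_node_types={"Item", "Ad"},
    allowed_edge_types={"co_click", "attr_sim"},
)
```

One subgraph is produced per meta-path; splitting with several meta-paths yields the at least two heterogeneous subgraphs the method requires.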
  • Step S102 Acquire sample data of a batch.
  • The sample data can be divided into multiple batches and learned batch by batch based on the heterogeneous subgraphs.
  • Step S103 The preset graph convolution model learns a batch of sample data (a sample data set) according to the heterogeneous subgraphs to obtain the vector expressions of the nodes in the heterogeneous subgraphs; one graph convolution model corresponds to one heterogeneous subgraph.
  • The preset graph convolution model learns the sample data according to the attribute information of each node in the heterogeneous subgraph and the structure information and attribute information of at least the first-order neighbor nodes of each node, to obtain the vector expressions of the nodes in the heterogeneous subgraph.
  • One way is to learn the sample data based on all nodes in the heterogeneous subgraph, including:
  • the preset graph convolution model performs an N-layer convolution operation according to the attribute information of the node and the attribute information and structure information of the first to Nth-order neighbor nodes to obtain the vector expression of the node.
  • for the first- to Nth-order neighbor nodes of a node, neighbor nodes of the same order are sampled down to a preset number according to the weights of the edges between nodes, to obtain the sampled first- to Nth-order neighbor nodes;
  • the preset graph convolution model performs an N-layer convolution operation according to the attribute information of the node and the attribute information and structure information of the first to Nth-order neighbor nodes after sampling to obtain the vector expression of the node.
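The weighted neighbor sampling and N-layer convolution described above can be sketched as below. This is a simplified mean-aggregation layer with invented dimensions and a hand-written adjacency list, not the patent's exact model.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_neighbors(nbrs, weights, k, rng):
    """Sample up to k neighbors with probability proportional to edge weight
    (a sketch of the weighted same-order neighbor sampling)."""
    if len(nbrs) <= k:
        return list(nbrs)
    p = np.asarray(weights, dtype=float)
    p = p / p.sum()
    return list(rng.choice(nbrs, size=k, replace=False, p=p))

def gcn_layer(h, adj, W):
    """One convolution layer: each node averages its own and its (sampled)
    neighbors' vectors, applies a learned projection W, then a ReLU."""
    out = np.zeros((h.shape[0], W.shape[1]))
    for v in range(h.shape[0]):
        group = [v] + adj[v]
        out[v] = np.maximum(np.mean(h[group], axis=0) @ W, 0.0)
    return out

# Toy run: 4 nodes, feature dim 3 -> 2; stacking N such layers mixes
# information from up to Nth-order neighbors.
h0 = rng.normal(size=(4, 3))
adj = {0: [1, 2], 1: [0], 2: [0, 3], 3: [2]}
W1 = rng.normal(size=(3, 2))
h1 = gcn_layer(h0, adj, W1)
```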
  • Step S104 The preset aggregation model aggregates the vector expressions of the same node in different heterogeneous subgraphs based on the sample data to obtain the same vector expression of the same node in different heterogeneous subgraphs.
  • the preset aggregation model is based on sample data, using attention mechanism aggregation learning, fully connected aggregation learning, or weighted average aggregation learning to aggregate the vector expressions of the same node in different heterogeneous subgraphs to obtain the same node in different heterogeneous subgraphs The same vector expression of.
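Of the three aggregation options named above, the attention-mechanism variant can be sketched as follows. This is an illustrative softmax-attention fusion with an invented query vector q, not the patent's exact formulation.

```python
import numpy as np

def attention_aggregate(views, q):
    """Fuse one node's vectors from different heterogeneous subgraphs
    (rows of `views`) into a single vector via softmax attention weights
    scored against a learned query vector q."""
    scores = views @ q                       # one score per subgraph view
    scores = scores - scores.max()           # numerical stability
    alpha = np.exp(scores) / np.exp(scores).sum()
    return alpha @ views                     # weighted sum -> unified vector

views = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])  # 3 subgraph views
q = np.array([0.5, 0.5])
unified = attention_aggregate(views, q)
```

A weighted-average aggregation would simply fix `alpha`, and a fully connected aggregation would replace the weighted sum with a learned dense layer over the concatenated views.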
  • Step S105 The preset loss function optimizes the parameters of the model based on the sample data and the same vector expression of the same node.
  • The vector expressions of at least two types of nodes are fused to obtain the low-dimensional vector expression of a virtual request node; the virtual request node is a virtual node constructed from at least two types of nodes through a certain association relationship. According to the low-dimensional vector expression of the virtual request node and the low-dimensional vector expression of another type of node, the association parameters among the at least three types of nodes are determined, and the model parameters are optimized according to the association parameters.
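The virtual-request-node fusion and matching can be sketched as below. The fusion rule (simple averaging) and the cosine matching score are illustrative assumptions; the patent does not fix these exact formulas.

```python
import numpy as np

def virtual_request_node(query_vec, clicked_item_vecs):
    """Fuse a query node's vector with the vectors of items clicked under
    that query into one virtual request node (here: simple averaging,
    an illustrative choice)."""
    stacked = np.vstack([query_vec] + list(clicked_item_vecs))
    return stacked.mean(axis=0)

def match_score(request_vec, ad_vec):
    """Cosine similarity as one possible matching degree between the
    virtual request node and a search-advertisement node."""
    return float(request_vec @ ad_vec /
                 (np.linalg.norm(request_vec) * np.linalg.norm(ad_vec)))

req = virtual_request_node(np.array([1.0, 0.0]), [np.array([0.0, 1.0])])
score = match_score(req, np.array([1.0, 1.0]))
```

Ads whose score (or, equivalently, vector distance) meets the set requirement would then be selected as the recall set.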
  • Step S106 Determine whether the sample data of all batches have been acquired; if not, go to step S107; if so, go to step S108.
  • Step S107 Obtain the sample data of the next batch, and return to step S103.
  • Step S108 Obtain a low-dimensional vector expression of each node in the heterogeneous graph.
  • a node in the heterogeneous graph corresponds to an entity in the sample data.
  • A low-dimensional vector expression of each node in the heterogeneous graph can be obtained.
  • The low-dimensional vector expression of each node in the heterogeneous graph is the same vector expression of the same node in different heterogeneous subgraphs obtained by the aggregation model after the last batch of samples is learned.
  • The matching degree is the association parameter between nodes most recently obtained by the loss function.
  • The machine learning model is used to learn the sample data; the vector expressions of the same node obtained from the learning of each heterogeneous subgraph are fused, and according to the fusion result the parameters of the machine learning model are optimized and used to learn the next batch of samples, realizing iterative learning of the samples and finally obtaining low-dimensional vector expressions for the nodes in the heterogeneous graph.
  • the second embodiment of the present invention provides a specific implementation process of a method for obtaining expressions of relationships between entities.
  • the process of implementing advertisement recall in a search advertisement scenario is taken as an example for description.
  • The implementation principle of the method is shown in FIG. 2 and the flow is shown in FIG. 3, including the following steps:
  • Step S301 Construct a heterogeneous graph.
  • A large-scale heterogeneous graph is constructed for the search recall scenario based on user logs and related commodity and advertisement data. It serves as a rich search interaction graph for the advertisement search scenario, and the constructed heterogeneous graph is used as the graph data input for the subsequent steps, such as the graph data of the heterogeneous graph at the bottom of FIG. 2.
  • the heterogeneous graph includes multiple types of nodes such as Query, Item, and Ad to represent different entities in the search scenario.
  • The heterogeneous graph includes multiple types of edges to represent multiple relationships between entities. The node types and their meanings can be as shown in Table 1 below, and the edge types and their meanings as shown in Table 2 below.
  • the Query node and the Item node are used as user intention nodes to describe the user's personalized search intention
  • the Ad node is the advertisement placed by the advertiser.
  • A user behavior edge represents the user's historical behavior preference. For example, a "click edge" can be created between a Query node and an Item node, or between a Query node and an Ad node, using the number of clicks as the edge weight to indicate clicks between the Query and the Item/Ad. A co-click edge (session edge) can be created to connect a Query with Items or Ads clicked in the same session (time period). A collaborative filtering edge (cf edge) can also be created to represent the collaborative filtering relationship between different nodes.
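Building click edges from logs, with click counts as edge weights, can be sketched as below. The log rows and field layout are hypothetical; the patent only states that click counts become edge weights.

```python
from collections import Counter

# Hypothetical user-log rows: (query, clicked node).
log = [
    ("red shoes", "item:123"),
    ("red shoes", "item:123"),
    ("red shoes", "ad:9"),
    ("phone case", "ad:9"),
]

# Click edge: (Query node, Item/Ad node) with click count as edge weight.
click_edges = Counter(log)
```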
  • User behavior describes a dynamic relationship. Popular nodes (such as high-frequency Query nodes) have more impressions and clicks, and thus denser behavior edges with larger weights; unpopular nodes and new nodes have relatively sparse relationships and smaller edge weights. User behavior edges therefore describe popular nodes better.
  • The content-similarity edge (semantic edge) is used for similarity between the content of nodes. For example, an edge is established between Item nodes with the text similarity of their titles as the weight.
  • the content-similar edges reflect a static relationship between nodes, which is more stable, and can also well describe the relationship between unpopular nodes and new nodes.
  • The attribute-similarity edge represents the overlap of fields between nodes, such as brand and category.
  • Figure 4b is a representation of the constructed heterogeneous graph, where nodes with the same shape represent nodes of the same type, and edges with the same linear shape represent edges of the same type.
  • Step S302 Divide the constructed heterogeneous graph into at least two heterogeneous subgraphs according to the preset meta-path.
  • the meta-path is used to express the structure of the heterogeneous subgraph and the node types and edge types included in the heterogeneous subgraph.
  • The graph data to be learned in this application is essentially a heterogeneous graph; there may be multiple types of nodes and multiple types of edges.
  • Current graph convolutional neural networks (GCN) are only suitable for homogeneous graphs.
  • Learning a heterogeneous graph with a graph convolutional neural network as if it were homogeneous cannot obtain effective low-dimensional vector expressions. Therefore, to realize learning on heterogeneous graphs, some meaningful meta-paths are defined to divide the original large heterogeneous graph into multiple meaningful heterogeneous subgraphs for learning.
  • the defined meta path can be shown in Table 3 below.
  • the constructed heterogeneous graph is split.
  • the heterogeneous graph shown in Figure 4b is split.
  • (FIG. 5 shows meta-paths a through f and the corresponding heterogeneous subgraphs a through f obtained from them.)
  • Meta-path a includes: node Item/Ad – co-click edge – node Item/Ad – attribute-similar edge – node Item/Ad.
  • Subgraph a is constructed according to meta-path a: the nodes of the corresponding node types in meta-path a (Item and Ad) are obtained from the constructed heterogeneous graph, and the edges that meet the requirements are kept, yielding subgraph a.
  • the construction of heterogeneous subgraphs corresponding to other meta-paths is similar to meta-path a, and will not be repeated here.
  • At the bottom of FIG. 2 is the constructed heterogeneous graph; based on it, the initial vector expression of each node is formed from the node's features.
  • For each specified node, a meta-path containing the specified node is defined, and a heterogeneous subgraph is constructed based on the defined meta-path.
  • Two meta-paths are defined for the search-advertisement node (Ad), which is accordingly split into two heterogeneous subgraphs; four meta-paths are defined for the query-word node (Query), yielding four heterogeneous subgraphs; for the k commodity (Item) nodes 1, 2, ..., k, each commodity node defines two meta-paths, yielding two heterogeneous subgraphs each.
  • Step S303: Obtain a batch of sample data.
  • Sample data related to advertisement search is extracted from user log data, which can come from user historical behavior logs, the commodity basic attribute information table, the advertisement basic attribute information table, the query term basic attribute information table, and so on.
  • The sample data of each batch is input into the machine learning model in turn for training and learning.
  • The learning result of the previous batch optimizes the parameters of the model, and the optimized parameters are used when learning the sample data of the next batch, achieving the effect of iterative learning to obtain the final learning result.
  • Step S304: Preset graph convolution models learn the batch of sample data according to the heterogeneous subgraphs to obtain the vector expressions of the nodes in the heterogeneous subgraphs; one graph convolution model corresponds to one heterogeneous subgraph.
  • That is, each heterogeneous subgraph corresponds to one graph convolutional network model.
  • For example, the two graph convolutional network models in the leftmost group in Figure 2 correspond to the two heterogeneous subgraphs split from the two meta-paths defined for the search advertisement (Ad) node; the four graph convolutional network models in the second group from the left correspond to the four heterogeneous subgraphs split from the four meta-paths defined for the query term (Query) node; and in groups 1, ..., k of graph convolutional network models on the right, the two models in each group correspond to the two heterogeneous subgraphs of the two meta-paths defined for one commodity (Item) node.
  • The sample data is used as input and mapped to the corresponding nodes in the heterogeneous subgraphs for learning.
  • The graph convolutional network models shown in Figure 2 can share weights.
  • Taking one heterogeneous subgraph as an example: traverse the sample data; for the currently traversed piece of sample data, read the recorded entity and find the entity's corresponding node in the heterogeneous graph; from the heterogeneous subgraph that includes the node, read the first- to Nth-order neighbor nodes of the node, where N is a preset positive integer; the preset graph convolution model performs an N-layer convolution operation based on the attribute information of the node and the attribute and structure information of the first- to Nth-order neighbor nodes to obtain the vector expression of the node.
  • The N-layer convolution operation is specifically: for a node in the heterogeneous subgraph, obtain its neighbor nodes up to order N, then perform the convolution operation layer by layer. For each (N-1)th-order neighbor node, convolve the vector expressions of the Nth-order neighbor nodes connected to it to obtain its neighbor low-dimensional vector expression, then combine this neighbor expression with the original low-dimensional vector expression of the (N-1)th-order neighbor node to obtain its new low-dimensional vector expression. And so on down the layers: the vector expressions of the second-order neighbor nodes connected to a first-order neighbor node are convolved to obtain the neighbor low-dimensional vector expression of that first-order neighbor node, which is combined with its original low-dimensional vector expression to obtain its new low-dimensional vector expression; finally, the low-dimensional vector expressions of the node's first-order neighbor nodes are convolved to obtain the node's neighbor low-dimensional vector expression, which is combined with the node's original low-dimensional vector expression to obtain the node's new low-dimensional vector expression.
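The layer-by-layer convolve-then-combine procedure above can be sketched as a recursive neighborhood aggregation; the mean aggregation and the tanh combination are illustrative assumptions, not the patent's stated operators:

```python
import numpy as np

def convolve_node(node, features, neighbors, depth):
    """Compute a node's low-dimensional expression by aggregating its
    neighbors' expressions down `depth` convolution layers.
    features: node -> initial vector; neighbors: node -> list of nodes."""
    if depth == 0 or not neighbors.get(node):
        # Leaf of the recursion, or an isolated node: keep the original vector.
        return features[node]
    # New expressions of the immediate neighbors, computed one layer deeper.
    nbr_vecs = [convolve_node(n, features, neighbors, depth - 1)
                for n in neighbors[node]]
    nbr_agg = np.mean(nbr_vecs, axis=0)       # "convolve" neighbor expressions
    return np.tanh(features[node] + nbr_agg)  # combine with the node's own vector

features = {1: np.ones(4), 2: np.full(4, 0.5), 3: np.full(4, -0.5)}
neighbors = {1: [2, 3], 2: [1], 3: []}
vec = convolve_node(1, features, neighbors, depth=2)  # 2-layer convolution
```

With depth=2, node 1's expression depends on its first-order neighbors 2 and 3, whose own expressions were first refreshed from their neighbors, mirroring the N-layer scheme in the text.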
  • The principle of learning sample data based on a heterogeneous subgraph is shown in Fig. 6; a graph convolutional network can be constructed as shown there.
  • For example, the first-order neighbor nodes of node 1 in subgraph a are 2, 3, 4, and 6, and its second-order neighbor nodes are 1, 2, 3, 4, and 10.
  • The second-order neighbor nodes 1, 2, 3, 4, and 10 pass through the graph convolution layer to obtain the neighbor low-dimensional vector expressions of the first-order neighbor nodes 2, 3, 4, and 6; these are spliced and non-linearly transformed to obtain the final low-dimensional vector expressions of nodes 2, 3, 4, and 6, which are then used as input to the next graph convolution layer, spliced with the original low-dimensional vector expression of node 1, and transformed to obtain the final low-dimensional vector expression of node 1's second-order graph convolutional network.
  • The final low-dimensional vector expressions of other nodes are obtained in a manner similar to that of node 1 and will not be repeated here.
  • An isolated node such as node 8, which has no neighbor nodes, retains its original vector expression. In a similar way, the final low-dimensional vector expression of each node in each heterogeneous subgraph can be obtained.
  • Although the meta-path-based graph convolution advertisement recall scheme can effectively handle the advertisement recall scenario using the graph convolution method, there remains the problem of computation cost: the number of neighbor nodes of a node grows exponentially with the number of graph convolution layers.
  • For example, node 1 has 3 first-order neighbors and 9 second-order neighbors.
  • To reduce the amount of calculation, the hierarchical neighbors can be sampled in a beam-search manner, reducing the neighbor space complexity from O(n^k) to O(kn).
  • When learning the sample data based on a heterogeneous subgraph that contains many nodes, the neighbor nodes can be sampled, and the convolution calculation is performed on the sampled neighbor nodes.
  • Taking one heterogeneous subgraph as an example: traverse the sample data; for the currently traversed piece of sample data, read the recorded entity and find the entity's corresponding node in the heterogeneous graph; from the heterogeneous subgraph that includes the node, read the first- to Nth-order neighbor nodes of the node, where N is a preset positive integer; sample the same-order neighbor nodes according to the weights of the edges between nodes and a preset number, obtaining the sampled first- to Nth-order neighbor nodes; the preset graph convolution model then performs the N-layer convolution operation based on the attribute information of the node and the attribute and structure information of the sampled first- to Nth-order neighbor nodes.
  • Specifically, the sum of a neighbor node's edge weights is used as its weight, and weighted sampling is performed over the neighbors.
  • The principle of sampling based on edge weights is shown in FIG. 7.
  • The original convolution structure of node 1 is shown on the left of FIG. 7, and the weight of each edge is shown as the number labeling that edge in the figure.
  • At each layer, weighted sampling can be performed based on the node weights w to obtain k sampled nodes, where the weight can be expressed as:

        w_i^l = Σ_{j ∈ I_v} w_{ij}

    where w_i^l represents the weight of node v_i at the current layer l, j ∈ I_v indexes the upper-layer nodes that share an edge with node v_i, l denotes the layer, and i and j are node sequence numbers.
  • Layer-by-layer node sampling can reduce the growth of the number of neighbor nodes from exponential to linear while still taking into account all the connection relationships of upper-layer neighbor nodes.
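The weighted layer-wise sampling can be sketched as follows, taking a node's weight as the sum of its edge weights as stated above; sampling with replacement and the fixed seed are implementation simplifications, not part of the patent's scheme:

```python
import random

def weighted_sample(neighbors, edge_weights, k, seed=0):
    """Sample up to k neighbors, weighting each neighbor by the sum of
    its edge weights (an illustrative reading of the sampling scheme).
    Sampling is with replacement for simplicity."""
    rng = random.Random(seed)
    if len(neighbors) <= k:
        return list(neighbors)  # few enough neighbors: keep them all
    weights = [sum(edge_weights[n]) for n in neighbors]  # node weight = Σ edge weights
    return rng.choices(neighbors, weights=weights, k=k)

# Each neighbor's list of incident edge weights (illustrative data).
edge_weights = {"a": [1.0, 2.0], "b": [0.1], "c": [5.0]}
sampled = weighted_sample(["a", "b", "c"], edge_weights, k=2)
```

Capping each layer at k sampled neighbors is what turns the O(n^k) neighbor explosion into the O(kn) growth mentioned above.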
  • Step S305: The preset aggregation model aggregates the vector expressions of the same node in different heterogeneous subgraphs based on the sample data, to obtain a single vector expression for that node across the different heterogeneous subgraphs.
  • The same node may exist in different heterogeneous subgraphs; for example, node 1 exists in subgraphs a, b, c, e, and f, and the convolutional neural networks of the different heterogeneous subgraphs will produce different vector expressions for it.
  • An attention mechanism, a fully connected aggregation mechanism, or a weighted average aggregation mechanism is used to aggregate the vector expressions of the same node from the different heterogeneous subgraphs, and the aggregated weighted result is taken as the node's final low-dimensional vector expression (embedding).
  • For example, the process of aggregating the vector expressions of the same node from different heterogeneous subgraphs includes:
  • The adjusted convolution model is as follows:

        h_{N(v)}^{s_k} = WEIGHTEDMEAN({w_u · h_u : u ∈ N_{s_k}(v)})
        h_v' = σ(W · CONCAT(h_v, h_{N(v)}^{s_k}))

    where WEIGHTEDMEAN represents the weighted average, N_{s_k}(v) represents the neighbors of node v that satisfy meta-path s_k, w represents the weights in the weighted average, CONCAT represents the direct concatenation of the two vectors, W represents the weights to be learned, and σ represents the nonlinear transformation.
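The weighted average aggregation mechanism named above can be sketched as follows; the uniform default weights are an assumption, and the attention variant would learn the weights rather than take them as given:

```python
import numpy as np

def aggregate_subgraph_embeddings(embeddings, weights=None):
    """Weighted-average aggregation of one node's vector expressions from
    several heterogeneous subgraphs (one of the three mechanisms the text
    names; attention-based aggregation would learn the weights)."""
    embs = np.stack(embeddings)
    if weights is None:
        weights = np.ones(len(embeddings))
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                      # normalize to a proper weighted mean
    return (w[:, None] * embs).sum(axis=0)

# Node 1's vector expressions from subgraphs a, b, and c (illustrative data):
per_subgraph = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0])]
final = aggregate_subgraph_embeddings(per_subgraph, weights=[2, 1, 1])
```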
  • Step S306: The preset loss function optimizes the parameters of the model based on the sample data and the single vector expression of each node.
  • In the advertisement search scenario, low-dimensional vector expressions of advertisements, commodities, and query terms can be obtained.
  • The user's current query term and the advertisements or commodities the user previously clicked are together taken as the user's current search request.
  • An attention mechanism is used to aggregate the low-dimensional vector expression of the query term (H_Q) and the multiple low-dimensional vector expressions of the pre-clicked items (H_I1, ..., H_Ik) into the final user search request vector.
  • The advertisements clicked under the current request are regarded as positive examples, and the advertisements not clicked are regarded as negative examples.
  • The sample structure is (request, ad, click-label), consisting of the request, the search advertisement, and the click label, where the request is request = (query, {realtime clicked items}), comprising the query term and multiple real-time clicked commodities.
  • In the loss function, y_i represents the label data, p_i represents the prior probability, v_request and v_ad represent the vector expressions of the virtual request node and the advertisement node, and R(v_request, v_ad) represents the correlation between the vector expressions of the virtual request node and the advertisement node.
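A minimal sketch of such a click loss, assuming binary cross-entropy with p_i = σ(R(v_request, v_ad)) and taking R as the inner product (both are illustrative assumptions, not the patent's stated formula):

```python
import numpy as np

def click_loss(v_request, v_ads, labels):
    """Binary cross-entropy over (request, ad) pairs.
    R(v_request, v_ad) is taken here as the inner product; the exact
    correlation function R used in the patent is an assumption."""
    scores = v_ads @ v_request                 # R(v_request, v_ad) per ad
    p = 1.0 / (1.0 + np.exp(-scores))          # probability p_i via sigmoid
    y = np.asarray(labels, dtype=float)        # click labels y_i (clicked = 1)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

v_request = np.array([0.5, 1.0])               # aggregated request vector
v_ads = np.array([[1.0, 1.0],                  # clicked ad (positive example)
                  [-1.0, -0.5]])               # unclicked ad (negative example)
loss = click_loss(v_request, v_ads, [1, 0])
```

Minimizing this loss pushes the request vector toward clicked advertisements and away from unclicked ones, which is the optimization described in step S306.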
  • Step S307: Determine whether the sample data of all batches has been acquired; if not, go to step S308; if yes, go to step S309.
  • Step S308: Obtain the sample data of the next batch, and return to step S304.
  • Step S309: Obtain the low-dimensional vector expression of each node in the heterogeneous graph.
  • A node in the heterogeneous graph corresponds to an entity in the sample data.
  • In Figure 2, the bottom layer is a schematic representation of the constructed heterogeneous graph, and the four rows of small white squares in the upper layer are the node vectors in the heterogeneous graph.
  • The initial vector expression of each node is obtained and then input into the learning model corresponding to each heterogeneous subgraph.
  • After a batch of sample data is learned, the vector expression of each node in each heterogeneous subgraph is updated according to the learning result, and the vector expressions of the same node across the heterogeneous subgraphs are aggregated to obtain a single aggregated vector expression for that node; the corresponding symbols in Figure 2 denote the aggregated vector expressions of the search advertisement node, the query term node, and each commodity node, respectively.
  • The embodiments of the present invention also provide a system for obtaining expressions of relationships between entities.
  • The system can be deployed in network equipment, in cloud equipment, or in an architecture of server equipment, client equipment, and other devices.
  • The structure of the system is shown in FIG. 8 and includes: a registration device 803, a storage device 801, a computing device 802, and a parameter exchange device 804.
  • The storage device 801 is used to store the data of the heterogeneous subgraphs.
  • The computing device 802 is configured to obtain the data of the heterogeneous subgraphs from the storage device 801 through the registration device 803, and to learn the sample data based on the heterogeneous graph using the above-described method of obtaining relationship expressions between entities, obtaining the low-dimensional vector expression of each node in the heterogeneous graph.
  • The parameter exchange device 804 is used for parameter interaction with the computing device.
  • The computing device 802 obtaining the data of each node and edge from the storage device through the registration device 803 includes:
  • The computing device 802 sends a data query request to the registration device 803, the data query request including the information of the heterogeneous subgraph to be queried; it receives the query result returned by the registration device 803, the query result including the information of the storage device storing the heterogeneous subgraph data; and it obtains the heterogeneous subgraph data from the corresponding storage device 801 according to that storage device information.
  • The storage device 801 may also store the data of each node and edge in the heterogeneous graph as well as the sample data.
  • In that case, the computing device 802 sends a data query request to the registration device 803, the data query request including the information of the nodes and edges to be queried; it receives the query result returned by the registration device 803, the query result including the information of the storage device storing the node and edge data; and it obtains the data of each node and edge from the corresponding storage device 801 according to that storage device information.
  • An embodiment of the present invention also provides an advertisement recall system. As shown in FIG. 9, it includes a system 901 for obtaining relationship expressions between entities and an advertisement recall matching system 902.
  • The system 901 for obtaining expressions of relationships between entities is used to construct a heterogeneous graph for the advertisement search scenario, where the node types in the heterogeneous graph include at least one of advertisements, commodities, and query terms, and the edge types include at least one of click edges, co-click edges, collaborative filtering edges, content-semantically-similar edges, and attribute-similar edges;
  • preset graph convolution models learn a batch of sample data according to the heterogeneous subgraphs to obtain the vector expressions of the nodes in the heterogeneous subgraphs, one graph convolution model corresponding to one heterogeneous subgraph;
  • the preset aggregation model, based on the sample data, aggregates the vector expressions of the same node in different heterogeneous subgraphs to obtain a single vector expression for that node across the different heterogeneous subgraphs;
  • the preset loss function optimizes the parameters of the models based on the sample data and the single vector expression of each node;
  • the next batch of sample data is then acquired and learned, until the sample data of all batches has been learned and the low-dimensional vector expressions of the advertisement nodes, commodity nodes, and query term nodes included in the heterogeneous graph are obtained, a node in the heterogeneous graph corresponding to an entity in the sample data.
  • The advertisement recall matching system 902 is used to use the low-dimensional vector expressions of the query term nodes, commodity nodes, and search advertisement nodes obtained by the system for obtaining relationship expressions between entities to determine the degree of matching between the query term nodes, commodity nodes, and search advertisement nodes, and to select, according to the matching degree, the search advertisements whose match with the commodities and query terms meets the set requirement.
  • The system for obtaining relationship expressions between entities splits the heterogeneous graph according to predefined meta-paths, one meta-path corresponding to one heterogeneous subgraph; a meta-path is used to express the structure of a heterogeneous subgraph and the node types and edge types it includes. Specifically, a meta-path consists of node types and edge types arranged alternately in order, with a node type first and last, and the arrangement order of the node types and edge types expresses the structure of the heterogeneous subgraph.
  • The system's division of the heterogeneous graph into at least two heterogeneous subgraphs according to the preset meta-paths specifically includes: for each of the at least two preset meta-paths, obtaining the nodes of the corresponding types from the heterogeneous graph according to the node types included in the meta-path; obtaining the required edges from the heterogeneous graph according to the types of edges connecting adjacent nodes; the obtained nodes of the corresponding types and the required edges constituting the heterogeneous subgraph corresponding to that meta-path.
  • The system for obtaining relationship expressions between entities learns the sample data according to the heterogeneous subgraphs through the preset graph convolution models to obtain the vector expressions of the nodes in the heterogeneous subgraphs, which specifically includes: the preset graph convolution model obtains the vector expression of each node according to the attribute information of each node in the heterogeneous subgraph and the structure information and attribute information of at least the first-order neighbor nodes of each node in the heterogeneous subgraph.
  • Optionally, the preset graph convolution model performs an N-layer convolution operation according to the attribute information of the node and the attribute and structure information of its first- to Nth-order neighbor nodes to obtain the vector expression of the node.
  • Optionally, when neighbor sampling is used, the preset graph convolution model performs the N-layer convolution operation according to the attribute information of the node and the attribute and structure information of the sampled first- to Nth-order neighbor nodes to obtain the vector expression of the node.
  • The system for obtaining relationship expressions between entities aggregates the vector expressions of the same node in different heterogeneous subgraphs based on the sample data through the preset aggregation model to obtain a single vector expression for that node, which specifically includes: the preset aggregation model, based on the sample data, uses attention-mechanism aggregation learning, fully connected aggregation learning, or weighted average aggregation learning to aggregate the vector expressions of the same node in different heterogeneous subgraphs, obtaining the single vector expression of that node across the different heterogeneous subgraphs.
  • The advertisement recall matching system determining the degree of matching between query term nodes, commodity nodes, and search advertisement nodes includes: constructing a virtual request node from the query term node and the commodity nodes pre-clicked by the user under the same query term, and determining the degree of matching between the query term node, the commodity nodes, and the search advertisement nodes through the vector expression of the virtual request node.
  • The advertisement recall matching system selecting the search advertisements that match the commodities and query terms according to the matching degree includes: selecting the search advertisements whose distance meets the set requirement.
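The distance-based selection can be sketched as follows, assuming cosine distance between the virtual request vector and each advertisement vector; the metric and the threshold are illustrative assumptions, since the text does not fix them:

```python
import numpy as np

def recall_ads(v_request, ad_vectors, threshold):
    """Return ids of ads whose cosine distance to the virtual request
    vector meets the set requirement (here: distance <= threshold)."""
    recalled = []
    for ad_id, v_ad in ad_vectors.items():
        cos = np.dot(v_request, v_ad) / (
            np.linalg.norm(v_request) * np.linalg.norm(v_ad))
        if 1.0 - cos <= threshold:           # cosine distance
            recalled.append(ad_id)
    return recalled

v_request = np.array([1.0, 0.0])             # aggregated virtual request vector
ad_vectors = {"ad1": np.array([0.9, 0.1]),   # close to the request
              "ad2": np.array([-1.0, 0.2])}  # far from the request
hits = recall_ads(v_request, ad_vectors, threshold=0.2)
```

In a production recall system the linear scan would typically be replaced by an approximate nearest-neighbor index, but the selection criterion is the same.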
  • An embodiment of the present invention also provides a computer-readable storage medium on which computer instructions are stored, and when the instructions are executed by a processor, the foregoing method for obtaining expressions of relationships between entities is implemented.
  • An embodiment of the present invention also provides a heterogeneous graph learning device, including: a memory, a processor, and a computer program stored in the memory and runnable on the processor, where the processor implements the above-described method of obtaining relationship expressions between entities when executing the program.
  • Terms such as processing, calculation, operation, determination, and display may refer to actions and/or processes of one or more processing or computing systems or similar devices, which manipulate and transform data represented as physical (e.g., electronic) quantities in the registers or memories of the processing system into other data similarly represented as physical quantities in the memories, registers, or other such information storage, transmission, or display devices of the processing system.
  • Information and signals can be represented using any of a variety of different technologies and methods.
  • the data, instructions, commands, information, signals, bits, symbols, and chips mentioned throughout the above description can be represented by voltage, current, electromagnetic waves, magnetic fields or particles, light fields or particles, or any combination thereof.
  • the steps of the method or algorithm described in combination with the embodiments of this document can be directly embodied as hardware, a software module executed by a processor, or a combination thereof.
  • the software module can be located in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, hard disk, removable disk, CD-ROM or any other form of storage medium known in the art.
  • An exemplary storage medium is connected to the processor, so that the processor can read information from the storage medium and can write information to the storage medium.
  • the storage medium may also be a component of the processor.
  • the processor and the storage medium may be located in the ASIC.
  • the ASIC can be located in the user terminal.
  • the processor and the storage medium may also exist as discrete components in the user terminal.
  • the technology described in this application can be implemented with modules (for example, procedures, functions, etc.) that perform the functions described in this application.
  • These software codes can be stored in a memory unit and executed by a processor.
  • the memory unit may be implemented in the processor or outside the processor. In the latter case, it is communicatively coupled to the processor through various means, which are well known in the art.


Abstract

A method, system, and device for obtaining an expression of a relationship between entities, and an advertisement retrieval system. The method comprises: dividing, according to a meta path, a heterogeneous graph into at least two heterogeneous subgraphs, and obtaining a batch of sample data; learning the sample data according to the heterogeneous subgraphs so as to obtain vector expressions of nodes in the heterogeneous subgraphs; aggregating, on the basis of the sample data, vector expressions of identical nodes in different heterogeneous subgraphs so as to obtain a same vector expression for the identical nodes in the different heterogeneous subgraphs; performing optimization on a parameter of a model on the basis of the sample data and the same vector expression for the identical nodes; and obtaining the next batch of sample data and learning the same until all batches of sample data have been learned, so as to obtain a low-dimensional vector expression of each node in the heterogeneous graph. The method enables learning of complex heterogeneous graphs, ensures high processing speed and high efficiency, and can be used in search advertising to improve the degree of matching for retrieved advertisements.

Description

Method, System and Device for Obtaining Expressions of Relationships Between Entities, and Advertisement Recall System

This application claims priority to the Chinese patent application No. 201910041466.9, filed on January 16, 2019 and entitled "Method, system and device for obtaining expressions of inter-entity relationships, and advertisement recall system", the entire contents of which are incorporated herein by reference.
Technical Field

The present invention relates to the technical field of data mining, and in particular to a method, system and device for obtaining expressions of relationships between entities, and an advertisement recall system.
Background

With the popularization of mobile terminals and application software, service providers in fields such as social networking, e-commerce, logistics, travel, food delivery, and marketing have accumulated massive amounts of business data. Mining the relationships between different business entities based on this data has become an important technical research direction in the field of data mining, and with the improvement of machine processing capabilities, more and more technicians have begun to study how to perform such mining through machine learning technology.
The inventor of the present invention found the following:
At present, learning massive business data through machine learning technology to obtain a graph expressing entities and the relationships between them, that is, performing graph learning on massive business data, has become a preferred technical direction. Simply put, a graph consists of nodes and edges: a node represents an entity, and an edge between nodes represents a relationship between them. A graph generally includes two or more nodes and one or more edges, so a graph can also be understood as consisting of a set of nodes and a set of edges, usually expressed as G(V, E), where G denotes the graph, V the set of nodes in G, and E the set of edges in G. Graphs can be divided into homogeneous graphs and heterogeneous graphs, where a heterogeneous graph is one whose nodes are of different types (the edge types may be the same or different) or whose edges are of different types (the node types may be the same or different). Therefore, when the entities are of many types and need to be expressed by multiple types of nodes, or the relationships between entities are not unique and need to be expressed by multiple types of edges, it is preferable to express the entities and their relationships through a heterogeneous graph. However, when the numbers of nodes and edges included in the heterogeneous graph are very large, the graph becomes extremely complex and the amount of data becomes very large. Therefore, reducing the complexity and data volume of heterogeneous graphs has become a technical problem faced by those skilled in the art.
Summary of the Invention

In view of the above problems, the present invention is proposed to provide a method, system and device for obtaining expressions of relationships between entities, and an advertisement recall system, which overcome or at least partially solve the above problems.
An embodiment of the present invention provides an advertisement recall system, including a system for obtaining expressions of relationships between entities and an advertisement recall matching system.
The system for obtaining expressions of relationships between entities is configured to construct a heterogeneous graph for an advertisement search scenario, where the node types in the heterogeneous graph include at least one of: advertisement, commodity, and query word, and the edge types include at least one of: click edges, co-click edges, collaborative-filtering edges, content-semantic-similarity edges, and attribute-similarity edges;
divide the pre-built heterogeneous graph into at least two heterogeneous subgraphs according to predefined meta-paths, where a meta-path expresses the structure of a heterogeneous subgraph and the node types and edge types the subgraph includes;
obtain one batch of sample data;
have preset graph convolution models learn the batch of sample data according to the heterogeneous subgraphs to obtain vector expressions of the nodes in the heterogeneous subgraphs, one graph convolution model corresponding to one heterogeneous subgraph;
have a preset aggregation model aggregate, based on the sample data, the vector expressions of the same node in different heterogeneous subgraphs to obtain a single vector expression of that node across the different heterogeneous subgraphs;
have a preset loss function optimize the parameters of the models based on the sample data and the single vector expression of the same node; and
continue obtaining the next batch of sample data for learning until all batches of sample data have been learned, thereby obtaining low-dimensional vector expressions of the advertisement nodes, commodity nodes, and query-word nodes included in the heterogeneous graph, where one node in the heterogeneous graph corresponds to one entity in the sample data.
The advertisement recall matching system is configured to use the low-dimensional vector expressions of the query-word nodes, commodity nodes, and search-advertisement nodes obtained by the system for obtaining expressions of relationships between entities to determine the degree of matching among query-word nodes, commodity nodes, and search-advertisement nodes, and to select, according to the degree of matching, search advertisements whose degree of matching with the commodity and the query word meets a set requirement.
In an optional embodiment, one meta-path corresponds to one heterogeneous subgraph, and "a meta-path expresses the structure of a heterogeneous subgraph and the node types and edge types the subgraph includes" specifically means: one meta-path expresses the structure of one heterogeneous subgraph and the node types and edge types that subgraph includes;
and dividing the heterogeneous graph into at least two heterogeneous subgraphs according to the predefined meta-paths specifically includes:
splitting the heterogeneous graph into at least two heterogeneous subgraphs according to at least two predefined meta-paths.
In an optional embodiment, the system for obtaining expressions of relationships between entities having the preset graph convolution model learn the sample data according to a heterogeneous subgraph to obtain the vector expressions of the nodes in the heterogeneous subgraph specifically includes:
the preset graph convolution model obtaining the vector expressions of the nodes according to the attribute information of each node of the heterogeneous subgraph and the structure information and attribute information of at least the first-order neighbor nodes of each node in the heterogeneous subgraph.
In an optional embodiment, the system for obtaining expressions of relationships between entities having the preset aggregation model aggregate, based on the sample data, the vector expressions of the same node in different heterogeneous subgraphs to obtain a single vector expression of that node specifically includes:
the preset aggregation model, based on the sample data, aggregating the vector expressions of the same node in different heterogeneous subgraphs using attention-mechanism aggregation learning, fully-connected aggregation learning, or weighted-average aggregation learning, to obtain a single vector expression of that node across the different heterogeneous subgraphs.
In an optional embodiment, the advertisement recall matching system determining the degree of matching among query-word nodes, commodity nodes, and search-advertisement nodes includes:
using an attention mechanism, a fully-connected aggregation mechanism, or a weighted-average aggregation mechanism to pool the low-dimensional vector expression of a query-word node with the low-dimensional vector expressions of the commodity nodes previously clicked by the user under the same query word, to obtain a low-dimensional vector expression of a virtual request node, where the virtual request node is a virtual node constructed from the query-word node and the commodity nodes previously clicked by the user under that query word; and
determining the degree of matching among the query-word node, the commodity nodes, and the search-advertisement nodes according to the low-dimensional vector expression of the virtual request node and the low-dimensional vector expressions of the search-advertisement nodes.
In an optional embodiment, the advertisement recall matching system selecting, according to the degree of matching, search advertisements whose degree of matching with the commodity and the query word meets a set requirement includes:
selecting search advertisements for which the cosine distance between the low-dimensional vector expression of the virtual request node and the low-dimensional vector expression of the search-advertisement node meets the set requirement.
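The virtual-request-node construction and cosine-based selection can be sketched as follows. Mean pooling stands in for the pooling mechanism (attention, fully-connected, and weighted-average are the named alternatives), and all names are illustrative assumptions:

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two vectors (higher = closer)."""
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

def recall_ads(query_vec, clicked_item_vecs, ad_vecs, top_k=2):
    """Pool the query-word vector with previously clicked item vectors into a
    virtual request node, then rank ads by cosine similarity to it.
    ad_vecs: list of (ad_name, vector) pairs."""
    request = np.mean([query_vec, *clicked_item_vecs], axis=0)  # virtual request node
    ranked = sorted(ad_vecs, key=lambda kv: cosine(request, kv[1]), reverse=True)
    return [name for name, _ in ranked[:top_k]]
```

In practice the "set requirement" could be a top-k cutoff as above, or a similarity threshold.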
An embodiment of the present invention further provides a method for obtaining expressions of relationships between entities, including:
dividing a pre-built heterogeneous graph into at least two heterogeneous subgraphs according to predefined meta-paths, where a meta-path expresses the structure of a heterogeneous subgraph and the node types and edge types the subgraph includes;
obtaining one batch of sample data;
having preset graph convolution models learn the batch of sample data according to the heterogeneous subgraphs to obtain vector expressions of the nodes in the heterogeneous subgraphs, one graph convolution model corresponding to one heterogeneous subgraph;
having a preset aggregation model aggregate, based on the sample data, the vector expressions of the same node in different heterogeneous subgraphs to obtain a single vector expression of that node;
having a preset loss function optimize the parameters of the models based on the sample data and the single vector expression of the same node; and
continuing to obtain the next batch of sample data for learning until all batches of sample data have been learned, thereby obtaining one low-dimensional vector expression for each node in the heterogeneous graph, where one node in the heterogeneous graph corresponds to one entity in the sample data.
In an optional embodiment, one meta-path corresponds to one heterogeneous subgraph, and "a meta-path expresses the structure of a heterogeneous subgraph and the node types and edge types the subgraph includes" specifically means: one meta-path expresses the structure of one heterogeneous subgraph and the node types and edge types that subgraph includes;
and dividing the heterogeneous graph into at least two heterogeneous subgraphs according to the predefined meta-paths specifically includes:
splitting the heterogeneous graph into at least two heterogeneous subgraphs according to at least two predefined meta-paths.
In an optional embodiment, "one meta-path expresses the structure of one heterogeneous subgraph and the node types and edge types that subgraph includes" specifically means:
a meta-path includes node types and edge types arranged alternately in order, with a node type in the first and last positions, and the arrangement order of the node types and edge types expresses the structure of the heterogeneous subgraph; and
splitting the heterogeneous graph into at least two heterogeneous subgraphs according to at least two predefined meta-paths specifically includes:
for each of the at least two predefined meta-paths, obtaining nodes of the corresponding types from the heterogeneous graph according to the node types the meta-path includes; obtaining qualifying edges from the heterogeneous graph according to the types of the edges connecting adjacent nodes; and composing, from the obtained nodes of the corresponding types and the qualifying edges, the heterogeneous subgraph corresponding to that meta-path.
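The meta-path-based splitting just described can be sketched as follows. This is one simplified reading of the claim; the data layout and all names are assumptions:

```python
# A meta-path alternates node types and edge types, starting and ending with a
# node type, e.g. ("query", "click", "item", "co_click", "ad").

def split_by_metapath(nodes, edges, metapath):
    """Keep only the nodes whose type appears at an even position of the
    meta-path and the edges whose type appears at an odd position."""
    node_types = set(metapath[0::2])  # even positions: node types
    edge_types = set(metapath[1::2])  # odd positions: edge types
    sub_nodes = {n: t for n, t in nodes.items() if t in node_types}
    sub_edges = [
        (u, v, t) for (u, v, t) in edges
        if t in edge_types and u in sub_nodes and v in sub_nodes
    ]
    return sub_nodes, sub_edges

nodes = {"q1": "query", "i1": "item", "ad1": "ad"}
edges = [("q1", "i1", "click"), ("i1", "ad1", "co_click")]
sub_n, sub_e = split_by_metapath(nodes, edges, ("query", "click", "item"))
# sub_n keeps only query/item nodes; sub_e keeps only "click" edges
```

Each predefined meta-path would be run through this step once, yielding one heterogeneous subgraph per meta-path.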
In an optional embodiment, the preset graph convolution model learning the sample data according to a heterogeneous subgraph to obtain the vector expressions of the nodes in the heterogeneous subgraph specifically includes:
the preset graph convolution model learning the sample data according to the attribute information of each node of the heterogeneous subgraph and the structure information and attribute information of at least the first-order neighbor nodes of each node in the heterogeneous subgraph, to obtain the vector expressions of the nodes in the heterogeneous subgraph.
In an optional embodiment, the preset graph convolution model learning the sample data according to the attribute information of each node of the heterogeneous subgraph and the structure information and attribute information of at least the first-order neighbor nodes of each node, to obtain the vector expression of each node in the heterogeneous subgraph, specifically includes:
traversing the sample data; for the currently traversed piece of sample data, reading the entity it records and finding the node corresponding to that entity in the heterogeneous graph;
reading, from the heterogeneous subgraph that includes the node, the first-order to Nth-order neighbor nodes of the node, where N is a preset positive integer; and
the preset graph convolution model performing an N-layer convolution operation according to the attribute information of the node and the attribute information and structure information of the first- to Nth-order neighbor nodes, to obtain the vector expression of the node.
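One possible reading of the N-layer convolution over a node and its first- to Nth-order neighborhood is sketched below. The mean aggregator, the tanh activation, and the weight shapes are assumptions — the claim does not fix a particular aggregator:

```python
import numpy as np

def gcn_forward(feats, neighbors, weights):
    """feats: {node: attribute vector of length d};
    neighbors: {node: [neighbor, ...]} (the subgraph's structure information);
    weights: one (d_in x d_out) matrix per layer, so len(weights) == N.
    Each layer mixes a node's own vector with the mean of its neighbors'."""
    h = dict(feats)
    for W in weights:
        new_h = {}
        for v, hv in h.items():
            nbrs = neighbors.get(v, [])
            agg = np.mean([h[u] for u in nbrs], axis=0) if nbrs else np.zeros_like(hv)
            new_h[v] = np.tanh((hv + agg) @ W)  # combine self + neighborhood
        h = new_h
    return h
```

After N layers, each node's output vector has absorbed information from its neighbors up to N hops away, which matches the first- to Nth-order neighborhood in the text.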
In an optional embodiment, the preset graph convolution model learning the sample data according to the attribute information of each node of the heterogeneous subgraph and the structure information and attribute information of at least the first-order neighbor nodes of each node, to obtain the vector expression of each node in the heterogeneous graph, specifically includes:
traversing the sample data; for the currently traversed piece of sample data, reading the entity it records and finding the node corresponding to that entity in the heterogeneous graph;
reading, from the heterogeneous subgraph that includes the node, the first-order to Nth-order neighbor nodes of the node, where N is a preset positive integer;
sampling, among the first- to Nth-order neighbor nodes of the node, a preset number of neighbor nodes of each order according to the weights of the edges between the nodes, to obtain sampled first- to Nth-order neighbor nodes; and
the preset graph convolution model performing an N-layer convolution operation according to the attribute information of the node and the attribute information and structure information of the sampled first- to Nth-order neighbor nodes, to obtain the vector expression of the node.
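The weighted neighbor-sampling step might be sketched as follows. Sampling with probability proportional to edge weight is one plausible reading of "according to the weights of the edges"; the exact rule and the with-replacement choice are assumptions:

```python
import random

def sample_neighbors(nbr_weights, k, rng=random.Random(0)):
    """nbr_weights: {neighbor: edge_weight}; returns up to k neighbors,
    drawn with probability proportional to edge weight (with replacement,
    as is common in neighborhood-sampling schemes)."""
    if len(nbr_weights) <= k:
        return list(nbr_weights)          # fewer neighbors than the preset count
    population = list(nbr_weights)
    weights = [nbr_weights[n] for n in population]
    return rng.choices(population, weights=weights, k=k)
```

Capping each order of the neighborhood at a preset count k is what keeps the N-layer computation from growing exponentially with N.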
In an optional embodiment, the preset aggregation model aggregating, based on the sample data, the vector expressions of the same node in different heterogeneous subgraphs to obtain a single vector expression of that node specifically includes:
the preset aggregation model, based on the sample data, aggregating the vector expressions of the same node in different heterogeneous subgraphs using an attention mechanism, a fully-connected aggregation mechanism, or a weighted-average aggregation mechanism, to obtain a single vector expression of that node across the different heterogeneous subgraphs.
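Two of the three aggregation choices named above can be illustrated with minimal variants; the parameterisations shown (a fixed weight list, a single learned query vector) are assumptions:

```python
import numpy as np

def weighted_average(vectors, weights):
    """Weighted-average aggregation of one node's per-subgraph vectors."""
    w = np.array(weights, dtype=float)
    w = w / w.sum()
    return np.sum([wi * v for wi, v in zip(w, vectors)], axis=0)

def attention_fuse(vectors, query):
    """Attention-style aggregation: score each per-subgraph vector against a
    learned query vector, softmax the scores, and mix."""
    scores = np.array([v @ query for v in vectors])
    alpha = np.exp(scores - scores.max())   # numerically stable softmax
    alpha = alpha / alpha.sum()
    return np.sum([a * v for a, v in zip(alpha, vectors)], axis=0)
```

A fully-connected variant would instead concatenate the per-subgraph vectors and pass them through a learned dense layer; in all three cases the output is one vector per node.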
An embodiment of the present invention further provides a system for obtaining expressions of relationships between entities, including a registration device, a storage device, a computing device, and a parameter exchange device;
the storage device is configured to store the data of the heterogeneous subgraphs;
the computing device is configured to obtain the data of the heterogeneous subgraphs from the storage device through the registration device, and to learn the sample data based on the heterogeneous graph using the above method for obtaining expressions of relationships between entities, to obtain a low-dimensional vector expression of each node in the heterogeneous graph; and
the parameter exchange device is configured to exchange parameters with the computing device.
The beneficial effects of the above technical solutions provided by the embodiments of the present invention include at least the following:
Based on the heterogeneous subgraphs obtained by splitting the heterogeneous graph, graph convolution models learn the sample data; the vector expressions of the same node obtained from the different heterogeneous subgraphs are fused; and the parameters of the machine-learning models are optimized according to the fusion result and used to learn the next batch of samples, realizing iterative learning over the samples and finally obtaining a low-dimensional vector expression of each node in the heterogeneous graph. This reduces the amount of data processed during heterogeneous graph learning, avoids the explosive growth of training parameters and the exponential growth of the number of neighbor nodes with the number of layers during heterogeneous graph processing, and improves the speed and efficiency of heterogeneous graph learning. When this heterogeneous graph learning method is applied to advertisement search scenarios, mining the entity relationships in the scenario makes it possible to use a large amount of information to recall advertisements accurately and improve the quality of advertisement recall; with all advertisements as candidates, enough advertisements can be recalled under any traffic; and through the vector-based approach, advertisement rewriting and advertisement screening can be completed in a single step.
Other features and advantages of the present invention will be set forth in the following description and will in part become apparent from the description or be understood by practicing the present invention. The objectives and other advantages of the present invention can be realized and obtained through the structures particularly pointed out in the written description, the claims, and the accompanying drawings.
The technical solutions of the present invention are further described in detail below through the accompanying drawings and embodiments.
Brief description of the drawings
The accompanying drawings are provided for a further understanding of the present invention and constitute a part of the specification; together with the embodiments of the present invention, they serve to explain the present invention and do not limit it. In the drawings:
Fig. 1 is a flowchart of the method for obtaining expressions of relationships between entities in Embodiment 1 of the present invention;
Fig. 2 is a schematic diagram of the principle of the method for obtaining expressions of relationships between entities in Embodiment 2 of the present invention;
Fig. 3 is a flowchart of the method for obtaining expressions of relationships between entities in Embodiment 2 of the present invention;
Fig. 4a is an example diagram of a heterogeneous graph constructed in Embodiment 2 of the present invention;
Fig. 4b is another example diagram of a heterogeneous graph constructed in Embodiment 2 of the present invention;
Fig. 5 is an example diagram of splitting a heterogeneous graph into heterogeneous subgraphs in Embodiment 2 of the present invention;
Fig. 6 is an example diagram of the convolutional network model of a heterogeneous subgraph in Embodiment 2 of the present invention;
Fig. 7 is an example diagram of neighbor-node sampling in Embodiment 2 of the present invention;
Fig. 8 is a schematic structural diagram of the system for obtaining expressions of relationships between entities in an embodiment of the present invention;
Fig. 9 is a schematic structural diagram of the advertisement recall system in an embodiment of the present invention.
Detailed description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the present disclosure, it should be understood that the present disclosure may be implemented in various forms and should not be limited by the embodiments set forth herein. Rather, these embodiments are provided so that the present disclosure will be understood more thoroughly and its scope fully conveyed to those skilled in the art.
In the prior art, when a heterogeneous graph is learned, the training parameters grow exponentially, and neighbor sampling also grows exponentially with the number of layers, so that devices cannot support computation of such a large order of magnitude. To solve this problem, the embodiments of the present invention provide a method for obtaining expressions of relationships between entities that solves the above problems well, effectively reduces the amount of data processing during heterogeneous graph learning, and achieves fast processing and high efficiency.
Graph learning is widely applied to mining various data relationships in the real world; for example, in search advertising platforms it is used to mine the correlation between search requests and advertisements as well as the click-through rate (CTR). That is, the method of the present invention can be used in the field of advertisement search for the recall of search advertisements. A search advertisement is an advertisement for which an advertiser determines relevant keywords according to the content and characteristics of its products or services, writes the advertisement content, sets its own price, and places the advertisement in the search results corresponding to those keywords. Search advertisement recall refers to selecting the most relevant advertisements from a massive advertisement collection through a certain algorithm or model.
Existing search-advertisement recall technologies either screen "high-quality" advertisements based on the degree of matching between the query word and the advertiser's bid word (bidword), the advertiser's keyword purchase price, and users' statistical preferences for advertisements, or incorporate each user's historical behavior data to perform personalized matching and recall of advertisements.
In studying the prior art, the inventors found that existing recall technologies either emphasize only the degree of matching between advertisements and query words, or emphasize only improving the revenue of recalled advertisements, and lack an integrated model that takes both into account. Since the quality of advertisement recall is crucial to search-advertisement revenue and user experience, the inventors provide a graph-learning technique used in the advertisement recall process to obtain expressions of relationships between entities, which can yield a recall set containing more high-quality advertisements that users care more about.
The method and system for obtaining expressions of relationships between entities, and the specific implementation for the advertisement recall system, are described in detail below through specific embodiments.
Embodiment 1
Embodiment 1 of the present invention provides a method for obtaining expressions of relationships between entities. Its flow, shown in Fig. 1, includes the following steps:
Step S101: divide a pre-built heterogeneous graph into at least two heterogeneous subgraphs according to predefined meta-paths, where a meta-path expresses the structure of a heterogeneous subgraph and the node types and edge types the subgraph includes.
One meta-path corresponds to one heterogeneous subgraph; that is, one meta-path expresses the structure of one heterogeneous subgraph and the node types and edge types that subgraph includes. Specifically, a meta-path includes node types and edge types arranged alternately in order, with a node type in the first and last positions, and the arrangement order of the node types and edge types expresses the structure of the heterogeneous subgraph.
Splitting the heterogeneous graph into at least two heterogeneous subgraphs according to the predefined meta-paths specifically includes splitting the heterogeneous graph according to at least two predefined meta-paths. Specifically, for each of the at least two predefined meta-paths, nodes of the corresponding types are obtained from the heterogeneous graph according to the node types the meta-path includes; qualifying edges are obtained from the heterogeneous graph according to the types of the edges connecting adjacent nodes; and the obtained nodes of the corresponding types and the qualifying edges compose the heterogeneous subgraph corresponding to that meta-path.
Step S102: obtain one batch of sample data.
The sample data can be divided into multiple batches and learned batch by batch based on the heterogeneous subgraphs.
Step S103: the preset graph convolution models learn the batch of sample data (i.e., a sample data set) according to the heterogeneous subgraphs to obtain vector expressions of the nodes in the heterogeneous subgraphs, one graph convolution model corresponding to one heterogeneous subgraph.
In this step, the preset graph convolution model learns the sample data according to the attribute information of each node of the heterogeneous subgraph and the structure information and attribute information of at least the first-order neighbor nodes of each node, to obtain the vector expressions of the nodes in the heterogeneous graph. There are two optional cases:
One is to learn the sample data based on all nodes in the heterogeneous subgraph, including:
traversing the sample data; for the currently traversed piece of sample data, reading the entity it records and finding the node corresponding to the entity in the heterogeneous graph;
reading, from the heterogeneous subgraph that includes the node, the first-order to Nth-order neighbor nodes of the node, where N is a preset positive integer; and
the preset graph convolution model performing an N-layer convolution operation according to the attribute information of the node and the attribute information and structure information of the first- to Nth-order neighbor nodes, to obtain the vector expression of the node.
The other is to learn the sample data based on a sampled subset of the nodes in the heterogeneous subgraph, including:
traversing the sample data; for the currently traversed piece of sample data, reading the entity it records and finding the node corresponding to the entity in the heterogeneous graph;
reading, from the heterogeneous subgraph that includes the node, the first-order to Nth-order neighbor nodes of the node, where N is a preset positive integer;
sampling, among the first- to Nth-order neighbor nodes of the node, a preset number of neighbor nodes of each order according to the weights of the edges between the nodes, to obtain sampled first- to Nth-order neighbor nodes; and
the preset graph convolution model performing an N-layer convolution operation according to the attribute information of the node and the attribute information and structure information of the sampled first- to Nth-order neighbor nodes, to obtain the vector expression of the node.
步骤S104:预设的聚合模型基于样本数据,对不同异构子图中相同节点的向量表达进行聚合,得到不同异构子图中相同节点的同一个向量表达。Step S104: The preset aggregation model aggregates the vector expressions of the same node in different heterogeneous subgraphs based on the sample data to obtain the same vector expression of the same node in different heterogeneous subgraphs.
预设的聚合模型基于样本数据,使用进行注意力机制聚合学习或者全连接聚合学习 或者加权平均聚合学习对不同异构子图中相同节点的向量表达进行聚合,得到不同异构子图中相同节点的同一个向量表达。The preset aggregation model is based on sample data, using attention mechanism aggregation learning, fully connected aggregation learning, or weighted average aggregation learning to aggregate the vector expressions of the same node in different heterogeneous subgraphs to obtain the same node in different heterogeneous subgraphs The same vector expression of.
步骤S105:预设的损失函数基于样本数据和相同节点的同一个向量表达对模型的参数进行优化。Step S105: The preset loss function optimizes the parameters of the model based on the sample data and the same vector expression of the same node.
得到不同异构子图中相同节点的同一个向量表达后,使用至少两种类型的相同节点的向量表达进行汇聚,得到虚拟请求节点的低维向量表达;虚拟请求节点为通过具有一定关联关系的至少两种类型的节点构建出的虚拟节点;根据虚拟请求节点的低维向量表达与另一种类型的节点的低维向量表达,确定至少三种类型的节点之间的关联参数,根据关联参数对模型参数进行优化。After obtaining the same vector expression of the same node in different heterogeneous subgraphs, the vector expressions of at least two types of the same node are used to converge to obtain the low-dimensional vector expression of the virtual request node; the virtual request node is through a certain association relationship A virtual node constructed by at least two types of nodes; according to the low-dimensional vector expression of the virtual request node and the low-dimensional vector expression of another type of node, determine the associated parameters between at least three types of nodes, and according to the associated parameters Optimize model parameters.
步骤S106:是否所有批次的样本数据已经获取完毕,若否,执行步骤107;若是执行步骤S108。Step S106: Whether the sample data of all batches have been acquired, if not, go to step 107; if so, go to step S108.
步骤107:获取下一个批次的样本数据,并返回执行步骤S103。Step 107: Obtain sample data of the next batch, and return to step S103.
从而实现继续获取下一个批次的样本数据进行学习,直至所有批次的样本数据学习完毕。In this way, the sample data of the next batch will continue to be obtained for learning until the sample data of all batches have been learned.
步骤S108:得到异构图中每个节点的一个低维向量表达。异构图中的一个节点对应样本数据中的一个实体。Step S108: Obtain a low-dimensional vector expression of each node in the heterogeneous graph. A node in the heterogeneous graph corresponds to an entity in the sample data.
After all batches of samples have been learned, a low-dimensional vector expression of each node in the heterogeneous graph is obtained. This expression is the unified vector expression of the same node across the different heterogeneous subgraphs, as produced by the aggregation model after the last batch of samples is learned.
After all batches have been learned, the degree of matching between nodes of different types is also obtained; this matching degree is the inter-node association parameter most recently produced by the loss function.
In the above method of this embodiment, a machine learning model learns the sample data on the heterogeneous subgraphs obtained by splitting the heterogeneous graph. The vector expressions of the same node learned from the different subgraphs are fused, and the fusion result is used to optimize the parameters of the machine learning model for the next batch of samples, realizing iterative learning over the samples. A low-dimensional vector expression is finally obtained for each node in the heterogeneous graph. This reduces the amount of data processed during heterogeneous-graph learning, avoids the explosive growth of training parameters and the exponential growth of the number of neighbor nodes with the number of layers, and improves the speed and efficiency of heterogeneous-graph learning.
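As a rough sketch of the batchwise training loop described in steps S103 to S108, the following hypothetical Python outline shows how per-subgraph models, an aggregation step, and a loss-driven parameter update could fit together; the names `subgraph_models`, `aggregate`, `loss_fn`, and `optimize` are illustrative stand-ins, not the patent's actual implementation:

```python
def train(batches, subgraph_models, aggregate, loss_fn, optimize):
    """Iterate over batches: learn per subgraph, fuse same-node vectors,
    then optimize model parameters for the next batch (steps S103-S108)."""
    node_embeddings = {}
    for batch in batches:
        # One graph-convolution model per heterogeneous subgraph (step S104-like).
        per_subgraph = {}
        for sg_id, model in subgraph_models.items():
            per_subgraph[sg_id] = model.learn(batch)
        # Fuse the same node's vectors from the different subgraphs.
        node_embeddings = aggregate(per_subgraph)
        # Loss over sample data + fused expressions drives the update (step S105).
        grad = loss_fn(batch, node_embeddings)
        optimize(subgraph_models, grad)
    # After the last batch: one low-dimensional vector per node (step S108).
    return node_embeddings
```

The aggregation and loss functions are deliberately left abstract here, since the patent allows several fusion mechanisms (attention, fully connected, weighted average).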
Embodiment 2
Embodiment 2 of the present invention provides a specific implementation of the method for obtaining expressions of relationships between entities, described using the process of implementing advertisement recall in a search-advertising scenario as an example. The implementation principle of the method is shown in Figure 2, and its flow, shown in Figure 3, includes the following steps:
步骤S301:构造异构图。Step S301: Construct a heterogeneous graph.
Taking the advertisement search scenario as an example, a large-scale heterogeneous graph is constructed for the search-recall scenario from user logs and related product and advertisement data. It serves as a rich search-interaction graph for the scenario and is used as the subsequent graph-data input, such as the graph data of the heterogeneous graph at the bottom of Figure 2.
An example of the constructed heterogeneous graph is shown in Figure 4a. It contains multiple types of nodes, such as Query, Item, and Ad, representing the different entities in the search scenario, and multiple types of edges, representing the various relationships between entities. The node types and their meanings can be as shown in Table 1 below, and the edge types and their meanings as shown in Table 2 below.
Table 1

Node type | Specific meaning
Item | All products in the advertisement search scenario
Ad | Search advertisements in the advertisement search scenario
Query | User query terms in the advertisement search scenario
Among these, Query nodes and Item nodes are used as user-intention nodes to characterize the user's personalized search intent, and Ad nodes are the advertisements placed by advertisers.
Table 2

Edge type | Specific meaning
Click edge (click) | Clicks between a Query node and an Item/Ad node, with the click count as the edge weight
Co-click edge (session) | Items/Ads clicked together under the same Query within the same session
Collaborative-filtering edge (cf) | Collaborative-filtering relationship between different nodes
Content-similarity edge (semantic) | Content similarity between nodes, e.g. the text similarity of Item titles
Attribute-similarity edge (domain) | Degree of overlap between nodes in domains such as brand and category
Among these edge types:
User-behavior edges represent the user's historical behavior preferences. For example, a click edge can be built between a Query node and an Item node, or between a Query node and an Ad node, using the number of clicks as the edge weight, to represent clicks between the Query and the Item/Ad. As another example, a co-click edge (session edge) can be built to represent Items or Ads clicked together under the same Query in the same session; a collaborative-filtering edge (cf edge) can also be built to represent collaborative-filtering relationships between different nodes. In the advertisement search scenario, user-behavior edges describe a dynamically changing relationship. Popular nodes (such as high-frequency Query nodes) receive more impressions and clicks and therefore have denser edge relationships and larger edge weights, while unpopular and new nodes have relatively sparse edge relationships and smaller edge weights, so user-behavior edges better characterize popular nodes.
Content-similarity edges (semantic edges) describe the similarity between nodes; for example, edges can be built between Item nodes using the text similarity of their titles as the edge weight. Content-similarity edges reflect a static, more stable relationship between nodes, and can also well characterize the relationship between unpopular nodes and new nodes.
Attribute-similarity edges (domain edges) represent the degree of overlap between nodes in domains such as brand and category.
Figure 4b is one representation of the constructed heterogeneous graph, in which nodes of the same shape represent the same node type and edges drawn with the same line style represent the same edge type.
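The typed nodes and weighted, typed edges of Tables 1 and 2 could be held in a structure like the following minimal Python sketch; the class and the example node and edge identifiers are assumptions for illustration only, not the patent's data model:

```python
class HeteroGraph:
    """Toy container for typed nodes and typed, weighted, undirected edges."""

    def __init__(self):
        self.node_type = {}   # node id -> node type (Query / Item / Ad)
        self.edges = {}       # node id -> list of (neighbor, edge type, weight)

    def add_node(self, nid, ntype):
        self.node_type[nid] = ntype
        self.edges.setdefault(nid, [])

    def add_edge(self, u, v, etype, weight):
        self.edges[u].append((v, etype, weight))
        self.edges[v].append((u, etype, weight))  # store both directions

g = HeteroGraph()
g.add_node("q:red dress", "Query")
g.add_node("item:1", "Item")
g.add_node("ad:7", "Ad")
g.add_edge("q:red dress", "item:1", "click", 3)  # click count as edge weight
g.add_edge("item:1", "ad:7", "session", 2)       # co-clicked in one session
```

Edge types and weights are kept on each adjacency entry so that later meta-path filtering and weighted neighbor sampling can read them directly.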
步骤S302:根据预先设定的元路径,将构建的异构图分为至少两个异构子图。其中,元路径用于表达异构子图的结构及异构子图包括的节点类型和边类型。Step S302: Divide the constructed heterogeneous graph into at least two heterogeneous subgraphs according to the preset meta-path. Among them, the meta-path is used to express the structure of the heterogeneous subgraph and the node types and edge types included in the heterogeneous subgraph.
The graph data to be learned in this application is essentially a heterogeneous graph, which may contain multiple types of nodes and multiple types of edges. Current graph convolutional networks (GCNs) apply only to homogeneous graphs, and treating a heterogeneous graph as homogeneous and learning it with a GCN does not yield effective low-dimensional vector expressions. Therefore, to enable learning on heterogeneous graphs, some practically meaningful meta-paths are defined so that the original large heterogeneous graph can be split into multiple meaningful heterogeneous subgraphs for learning.
Taking the advertisement search scenario as an example, the defined meta-paths can be as shown in Table 3 below.
Table 3

No. | Meta-path
a | Item/Ad node - co-click edge - Item/Ad node - attribute-similarity edge - Item/Ad node
b | Item/Ad node - click edge - Query node - click edge - Item/Ad node
c | Query node - click edge - Item/Ad node - co-click edge - Item/Ad node
d | Query node - collaborative-filtering edge - Query node - semantic-similarity edge - Query node
e | Query node - collaborative-filtering edge - Query node - collaborative-filtering edge - Item/Ad node
f | Query node - click edge - Item/Ad node - collaborative-filtering edge - Query node
Based on the defined meta-paths, the constructed heterogeneous graph is split. As shown in Figure 5, the heterogeneous graph of Figure 4b is split; taking the six defined meta-paths as an example, meta-paths a, b, c, d, e, and f yield the six heterogeneous subgraphs a, b, c, d, e, and f.
Take meta-path a as an example: it consists of Item/Ad node - co-click edge - Item/Ad node - attribute-similarity edge - Item/Ad node. When subgraph a is constructed according to meta-path a, the nodes of the corresponding types in the meta-path (Item and Ad) are taken from the constructed heterogeneous graph, and the edges that meet the requirements are retained, giving subgraph a. Heterogeneous subgraphs for the other meta-paths are constructed similarly and are not described one by one here.
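A meta-path-based split like the one described above could, under the same dict-based graph representation assumed in the earlier sketch, look roughly like this; the node and edge data here is a toy example, not taken from the patent:

```python
def split_by_metapath(node_type, edges, allowed_ntypes, allowed_etypes):
    """Keep only the nodes whose type, and edges whose type, the meta-path mentions."""
    keep = {n for n, t in node_type.items() if t in allowed_ntypes}
    sub = {}
    for u in keep:
        sub[u] = [(v, et, w) for (v, et, w) in edges.get(u, [])
                  if v in keep and et in allowed_etypes]
    return sub

# Meta-path a keeps Item/Ad nodes joined by co-click (session) or domain edges.
sub_a = split_by_metapath(
    node_type={"item:1": "Item", "ad:7": "Ad", "q:1": "Query"},
    edges={"item:1": [("ad:7", "session", 2), ("q:1", "click", 3)],
           "ad:7": [("item:1", "session", 2)],
           "q:1": [("item:1", "click", 3)]},
    allowed_ntypes={"Item", "Ad"},
    allowed_etypes={"session", "domain"},
)
```

Each meta-path in Table 3 would map to one `(allowed_ntypes, allowed_etypes)` pair, producing one subgraph per meta-path.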
Referring to Figure 2, at the bottom is the constructed heterogeneous graph; based on it, the initial vector expression of each node is formed from the node's features. For each specified node, meta-paths containing that node are defined, and heterogeneous subgraphs are built from the defined meta-paths. As shown in Figure 2, two meta-paths are defined for the search-advertisement node (Ad), splitting out two heterogeneous subgraphs; four meta-paths are defined for the query-term node (Query), splitting out four heterogeneous subgraphs; and for the k item (Item) nodes 1, 2, ..., k, two meta-paths are defined for each item node, splitting out two heterogeneous subgraphs each.
步骤S303:获取一个批次的样本数据。Step S303: Obtain sample data of a batch.
从用户日志数据中抽取广告搜索相关的样本数据。样本数据可以来源于用户历史行为日志,商品基础属性信息表,广告基础属性信息表,查询词基础属性信息表等。Extract sample data related to advertisement search from user log data. The sample data can come from user historical behavior logs, commodity basic attribute information table, advertisement basic attribute information table, query word basic attribute information table, etc.
Extraction can be performed in multiple batches. Each batch of sample data is fed into the machine learning model in turn for training. The learning result of the previous batch is used to optimize the model parameters, and the optimized parameters are used for learning the next batch of sample data, achieving iterative learning and the final learning result.
步骤S304:预设的图卷积模型按照异构子图,对一个批次的样本数据进行学习,得到异构子图中节点的向量表达,一个图卷积模型对应一个异构子图。Step S304: The preset graph convolution model learns a batch of sample data according to heterogeneous subgraphs to obtain vector expressions of nodes in heterogeneous subgraphs, and one graph convolution model corresponds to one heterogeneous subgraph.
When the sample data is learned on a heterogeneous subgraph, a preset graph convolution model learns from the attribute information of each node in the subgraph and the structure and attribute information of each node's neighbors up to at least the first order, yielding vector expressions of the nodes in the heterogeneous subgraph.
Referring to Figure 2, each heterogeneous subgraph corresponds to one graph convolutional network model. For example, the two models in the leftmost group in Figure 2 correspond to the two heterogeneous subgraphs split from the two meta-paths defined for the search-advertisement node (Ad); the four models in the second group from the left correspond to the four subgraphs split from the four meta-paths defined for the query-term node (Query); and in groups 1, ..., k on the right, the two models of each group correspond to the two subgraphs split from the two meta-paths defined for one item (Item) node. When the sample data is learned on each heterogeneous subgraph, the sample data is mapped as input to the corresponding nodes of that subgraph. The convolutional network models shown in Figure 2 can share weights with one another.
Taking one heterogeneous subgraph as an example: traverse the sample data; for the sample currently traversed, read the entity it records and find the entity's corresponding node in the heterogeneous graph; from the heterogeneous subgraph containing that node, read its first- to Nth-order neighbor nodes, where N is a preset positive integer; the preset graph convolution model then performs an N-layer convolution operation on the node's attribute information and the attribute and structure information of the first- to Nth-order neighbors, obtaining the node's vector expression.
The N-layer convolution operation is as follows. For a node in the heterogeneous subgraph, obtain its neighbors up to order N, then convolve layer by layer. For each (N-1)th-order neighbor, convolve the vector expressions of the Nth-order neighbors connected to it to obtain its neighbor low-dimensional vector expression, then combine this with its original low-dimensional vector expression to obtain its new low-dimensional vector expression. Proceeding likewise layer by layer: convolve the vector expressions of the second-order neighbors connected to each first-order neighbor to obtain that first-order neighbor's neighbor low-dimensional vector expression, and combine it with the first-order neighbor's original low-dimensional vector expression to obtain its new low-dimensional vector expression. Finally, convolve the low-dimensional vector expressions of the node's first-order neighbors to obtain the node's neighbor low-dimensional vector expression, and combine it with the node's original low-dimensional vector expression to obtain the node's new low-dimensional vector expression.
The principle of learning the sample data on one heterogeneous subgraph is shown in Figure 6. Taking subgraph a (corresponding to meta-path a) as an example, for node 1 a graph convolutional network can be constructed as shown in Figure 6. For ease of illustration a two-layer convolution structure is used; in practice it can be extended to more layers. As shown in Figure 6, node 1's first-order neighbors in subgraph a are nodes 2, 3, 4, and 6, and its second-order neighbors are nodes 1, 2, 3, 4, and 10. The second-order neighbors pass through a graph convolution layer to produce the neighbor low-dimensional vector expressions of the first-order neighbors 2, 3, 4, and 6; these are spliced with the low-dimensional vector expressions of nodes 2, 3, 4, and 6 and nonlinearly transformed to obtain those nodes' final low-dimensional vector expressions. These in turn are fed through another graph convolution layer, spliced with node 1's original low-dimensional vector expression, and transformed to obtain the final low-dimensional vector expression of node 1's two-layer graph convolutional network.
In heterogeneous subgraph a, the final low-dimensional vector expressions of the other nodes are obtained in the same way as for node 1 and are not repeated here. Isolated node 8, which has no neighbors, retains its initial vector expression. In a similar manner, the final low-dimensional vector expression of each node in each heterogeneous subgraph can be obtained.
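The layer-by-layer aggregate, splice, and transform scheme of Figure 6 can be sketched in plain Python as follows, using mean aggregation and ReLU as stand-ins for the convolution and the nonlinear transform (both are assumptions, since the patent does not fix these operators); isolated nodes keep their initial expression, as described for node 8:

```python
def mean_vec(vecs):
    """Componentwise mean of a non-empty list of equal-length vectors."""
    return [sum(c) / len(vecs) for c in zip(*vecs)]

def conv_layer(h, adj):
    """One graph-convolution layer. h: node -> vector; adj: node -> neighbor list."""
    out = {}
    for v, hv in h.items():
        nbrs = adj.get(v, [])
        if not nbrs:                       # isolated node keeps its expression
            out[v] = hv
            continue
        nb = mean_vec([h[u] for u in nbrs])      # aggregate neighbor vectors
        concat = hv + nb                         # splice own + neighbor vectors
        out[v] = [max(0.0, x) for x in concat]   # ReLU as nonlinear transform
    return out

h0 = {1: [1.0, -1.0], 2: [2.0, 0.0], 3: [0.0, 4.0], 8: [5.0, 5.0]}
adj = {1: [2, 3], 2: [1], 3: [1]}
h1 = conv_layer(h0, adj)
h2 = conv_layer(h1, adj)   # stack two layers, as in the Figure 6 example
```

Because splicing concatenates vectors, the dimension grows per layer here; a real model would project back down with learned weights.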
Although the meta-path-based graph-convolution advertisement recall scheme can effectively solve the advertisement recall scenario with graph convolution, a computation problem remains. Taking the heterogeneous subgraph of Figure 6 as an example, the number of a node's neighbors grows exponentially with the number of graph convolution layers: node 1 has 3 first-order neighbors and 9 second-order neighbors. In real scenarios a node may have thousands of first-order neighbors, and as the number of layers increases, directly computing convolutions over such massive numbers of nodes is practically infeasible. Therefore, hierarchical neighbors can be sampled in a beam-search manner, reducing the neighbor space complexity from O(n^k) to O(kn).
In an optional embodiment, when the sample data is learned on a heterogeneous subgraph with many nodes, the neighbor nodes can be sampled and the convolution computed over the sampled neighbors. Taking one heterogeneous subgraph as an example: traverse the sample data; for the sample currently traversed, read the entity it records and find the corresponding node in the heterogeneous graph; from the heterogeneous subgraph containing that node, read its first- to Nth-order neighbors, where N is a preset positive integer; sample a preset number of neighbors at each order according to the weights of the edges between nodes, obtaining the sampled first- to Nth-order neighbors; the preset graph convolution model then performs an N-layer convolution operation on the node's attribute information and the sampled neighbors' attribute and structure information, obtaining the node's vector expression.
Referring to the heterogeneous subgraph shown in Figure 6, and taking the two-layer convolution structure as an example, the sum of a neighbor node's edge weights is used as its weight, and weighted neighbor sampling is performed for each node. The principle of sampling by edge weight is shown in Figure 7: the original convolution structure of node 1 is shown on the left of Figure 7, with each edge's weight marked beside it. If k = 2, i.e. only two neighbors are selected per layer for the convolution operation, then for node 1 the first-order neighbors most likely to be selected are nodes 2 and 4, whose weights are 3 and 4. If nodes 2 and 4 are selected as first-order neighbors, the second-order neighbors most likely to be selected are nodes 1 and 10, because only the weights of edges connected to the sampled first-order neighbors 2 and 4 are counted; node 1's weight is then 3 + 4 = 7 and node 10's weight is 7, giving them the highest sampling probability.
When sampling by edge weight, the k nodes with the highest weights are selected. To prevent the neighbor-sampling result from being overly biased toward a few popular nodes, weighted random sampling can instead be performed according to the node weight w to obtain the k sampled nodes; the weight w can be expressed as:
w_{v_i}^{(l)} = Σ_{j=1}^{J} w_{e_{ij}}

where w_{e_{ij}} denotes the edge weight of the edge e between node v_i and the j-th upper-layer node, w_{v_i}^{(l)} denotes the current weight of node v_i at layer l, J denotes the number of upper-layer nodes that have an edge to node v_i, l denotes the l-th layer, and i and j are the indices of the specified nodes.
Layerwise node sampling, while still taking into account all connection relationships with the upper-layer neighbor nodes, reduces the growth of the number of neighbor nodes from exponential to linear.
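The layerwise weighted sampling illustrated by the 3 + 4 = 7 example above can be sketched as a top-k selection over edge weights summed into the already-sampled upper layer; this toy version picks the k heaviest candidates and omits the weighted random variant mentioned for avoiding bias toward popular nodes:

```python
import heapq

def sample_layer(upper, edges, k):
    """upper: sampled nodes of the layer above; edges: u -> [(v, weight)]."""
    weight = {}
    for u in upper:
        for v, w in edges.get(u, []):
            # A candidate's weight is the sum of its edge weights into the
            # sampled upper layer (e.g. node 1: 3 + 4 = 7 in the text).
            weight[v] = weight.get(v, 0) + w
    return heapq.nlargest(k, weight, key=weight.get)

# Toy edge weights loosely modeled on the Figure 7 discussion.
edges = {1: [(2, 3), (3, 1), (4, 4)],
         2: [(1, 3), (10, 3)],
         4: [(1, 4), (10, 4)]}
layer1 = sample_layer([1], edges, k=2)       # first-order sample
layer2 = sample_layer(layer1, edges, k=2)    # second-order sample
```

Sampling k nodes per layer keeps the neighborhood size at roughly k per level instead of growing exponentially with depth.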
Step S305: The preset aggregation model performs aggregate learning on the sample data according to the vector expressions of the nodes in the heterogeneous subgraphs, obtaining a unified vector expression of the same node across different heterogeneous subgraphs.
The same node may exist in multiple heterogeneous subgraphs; for example, node 1 exists in subgraphs a, b, c, e, and f, and the convolutional networks of different subgraphs produce different vector expressions for it, whereas only one unique low-dimensional vector expression per node is needed for the subsequent recall work. Therefore, an attention mechanism, a fully connected aggregation mechanism, or a weighted-average aggregation mechanism is used to aggregate the vector expressions of the same node from the different heterogeneous subgraphs into a single vector expression; the aggregated weighted result is taken as the node's final low-dimensional vector expression (embedding).
The process of aggregating the vector expressions of the same node from different heterogeneous subgraphs includes the following.
According to the node's vector expression in each heterogeneous subgraph and the corresponding learned weight factor, the weight of the node's vector expression in each heterogeneous subgraph is computed. Taking the attention mechanism as an example, the weight α_v^s is computed as:

α_v^s = exp(w_s · h_v^s) / Σ_{s'} exp(w_{s'} · h_v^{s'})

where h_v^s denotes the different vector expressions of the same node v obtained from the multiple heterogeneous subgraphs, and w_s denotes the learned weight factor.

The computed weights are then used to take a weighted sum of the node's vector expressions over the subgraphs, giving the node's aggregated low-dimensional vector expression h_v:

h_v = Σ_{s ∈ S_p^L} α_v^s · h_v^s

where, assuming the type of node v is p, S_p^L denotes the set of meta-paths for node type p at layer L.
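A minimal stand-in for the attention-based fusion could look like this: a softmax over dot-product scores between each subgraph's node vector and a learned weight factor, followed by a weighted sum. The weight factors here are placeholders rather than trained values, and the dot-product scoring is an assumption about the attention form:

```python
import math

def attention_aggregate(vectors, weight_factors):
    """vectors: metapath id -> node vector; weight_factors: metapath id -> learned vector."""
    # Score each subgraph's expression against its learned weight factor.
    scores = {s: sum(a * b for a, b in zip(vectors[s], weight_factors[s]))
              for s in vectors}
    # Softmax over the scores gives the per-subgraph attention weights.
    z = sum(math.exp(x) for x in scores.values())
    alpha = {s: math.exp(x) / z for s, x in scores.items()}
    # Weighted sum of the per-subgraph vectors -> one embedding per node.
    dim = len(next(iter(vectors.values())))
    return [sum(alpha[s] * vectors[s][i] for s in vectors) for i in range(dim)]

h = attention_aggregate(
    {"a": [1.0, 0.0], "b": [0.0, 1.0]},
    {"a": [0.0, 0.0], "b": [0.0, 0.0]})   # equal scores reduce to a plain average
```

With all-zero weight factors the softmax weights are equal, so the fusion degenerates to the weighted-average aggregation mechanism also mentioned in the text.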
When neighbor sampling is added, the convolution model can be adjusted; the adjusted convolution model is as follows:

h_v^{(0)} = x_v

n_v^{(l)} = WEIGHTEDMEAN({w_{uv} · h_u^{(l-1)} : u ∈ N_{s_k}(v)})

h_v^{(l)} = σ(W^{(l)} · CONCAT(h_v^{(l-1)}, n_v^{(l)}))

where h_v^{(0)} denotes the layer-0 low-dimensional vector expression of node v (its initial feature vector x_v), n_v^{(l)} denotes the low-dimensional vector expression aggregated from the neighbors of node v at layer l, WEIGHTEDMEAN denotes the weighted average, N_{s_k}(v) denotes the neighbors of node v that satisfy meta-path s_k, w denotes the weights used in the weighted average, CONCAT denotes the direct concatenation of two vectors, h_v^{(l)} denotes the layer-l low-dimensional vector expression of node v aggregating its own information and its neighbors' information, W denotes the weights to be learned, and σ denotes a nonlinear transformation.
Step S306: The preset loss function optimizes the model parameters based on the sample data and the unified vector expression of each shared node.
Taking the advertisement search scenario as an example, the above steps yield low-dimensional vector expressions of advertisements, items, and query terms. From the extracted sample data, to achieve personalized search recall, the user's current query term together with the advertisements or items the user previously clicked is taken as the user's current search request, and an attention mechanism is used to aggregate the low-dimensional vector expression of the query term (H_Q) and those of the multiple preceding clicks (H_I1, ..., H_Ik) into the final user search-request vector. The cosine distance between the user search-request vector (H_r) and the current advertisement's low-dimensional vector expression (H_ad) is computed, the click state is used as the label data (O_label), and the sigmoid cross entropy is computed as the model's final loss function to train the whole model.
Taking advertisement search as an example, when constructing samples, the advertisements clicked under the current request are treated as positive examples and the advertisements not clicked as negative examples, giving the sample structure (request, ad, click-label), comprising the request, the search advertisement, and the click label, where request = (query, {realtime clicked items/ads}_k), i.e. the query term together with multiple recently clicked items or advertisements.
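Constructing (request, ad, click-label) samples from log rows might look roughly like the following; the field names (`query`, `recent_clicks`, `impressions`) are assumed for illustration, not the patent's actual log schema:

```python
def build_samples(log_rows, k=3):
    """Turn log rows into (request, ad, click-label) triples.
    request = (query, last k clicked items/ads); label 1 = clicked, 0 = not."""
    samples = []
    for row in log_rows:
        request = (row["query"], tuple(row["recent_clicks"][-k:]))
        for ad, clicked in row["impressions"]:
            samples.append((request, ad, 1 if clicked else 0))
    return samples

rows = [{"query": "red dress",
         "recent_clicks": ["item:1", "ad:7"],
         "impressions": [("ad:7", True), ("ad:9", False)]}]
samples = build_samples(rows)
```

Each impression under one request yields one sample, so a single request typically produces both positive and negative examples.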
Using the sigmoid cross entropy as the loss function, the optimization objective of the whole model is expressed as:

p_i = sigmoid(R(v_request, v_ad))

O_label = −Σ_i [ y_i · log(p_i) + (1 − y_i) · log(1 − p_i) ]

where y_i denotes the label data, p_i denotes the prior probability, v_request and v_ad denote the vector expressions of the virtual request node and the advertisement node respectively, and R(v_request, v_ad) denotes the distance metric function between the vector expressions of the virtual request node and the advertisement node.
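The cosine-plus-sigmoid-cross-entropy objective can be illustrated with a small plain-Python sketch; this is a single-sample version with cosine similarity as the assumed distance metric R, not the full batched training objective:

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def sigmoid_xent(v_request, v_ad, y):
    """Sigmoid cross entropy of the request/ad match against the click label y."""
    p = 1.0 / (1.0 + math.exp(-cosine(v_request, v_ad)))   # p_i
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))  # cross-entropy term

loss = sigmoid_xent([1.0, 0.0], [1.0, 0.0], 1)   # aligned vectors, clicked
```

A clicked ad whose vector aligns with the request vector yields a small loss, while a clicked ad pointing the opposite way yields a large one, which is what drives the embedding training.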
Step S307: Check whether all batches of sample data have been processed; if not, go to step S308; if so, go to step S309.
Step S308: Obtain the next batch of sample data and return to step S304.
从而实现继续获取下一个批次的样本数据进行学习,直至所有批次的样本数据学习完毕。In this way, the sample data of the next batch will continue to be obtained for learning until the sample data of all batches have been learned.
步骤S309:得到异构图中每个节点的一个低维向量表达。异构图中的一个节点对应样本数据中的一个实体。Step S309: Obtain a low-dimensional vector expression of each node in the heterogeneous graph. A node in the heterogeneous graph corresponds to an entity in the sample data.
通过重复对所有批次的样本数据进行预设次数的训练,得到异构图中每个节点的一个低维向量表达,异构图中的一个节点对应样本数据中的一个实体。Through repeated training on all batches of sample data, a low-dimensional vector expression of each node in the heterogeneous graph is obtained. A node in the heterogeneous graph corresponds to an entity in the sample data.
如图2所示系统原理，最下边是一个异构图的示意。上边一层的四排白色小方块为异构图中的节点向量初始化，得到各节点的最初的向量表达，然后输入各个异构子图对应的学习模型中，经学习模型对一个批次样本数据进行学习，根据学习结果更新异构子图中各节点的向量表达后，对各异构子图中相同节点的向量表达进行汇聚，得到相同节点的一个汇聚后的向量表达。例如图2中，[式PCTCN2020070249-appb-000020]为搜索广告节点的汇聚后的向量表达，[式PCTCN2020070249-appb-000021]为查询词节点的汇聚后的向量表达，[式PCTCN2020070249-appb-000022]为各商品节点的汇聚后的向量表达。对[式PCTCN2020070249-appb-000023]进行处理得到[式PCTCN2020070249-appb-000024]，由[式PCTCN2020070249-appb-000025]与[式PCTCN2020070249-appb-000026]得到损失函数O_label，使用O_label对各模型的系统参数进行优化，使用参数优化后的系统模型对下一批次的样本数据进行学习；根据学习结果更新异构子图中各节点的向量表达后，对各异构子图中相同节点的向量表达进行汇聚，得到相同节点的一个汇聚后的向量表达，进一步根据汇聚结果得到新的损失函数O_label，并对模型参数进行优化更新后继续学习下一个批次的样本数据，直至所有批次的样本数据都学习完毕，得到异构图中各节点的一个最终的向量表达。As shown in the system schematic of Figure 2, the bottom layer is an illustration of a heterogeneous graph. The four rows of small white squares in the upper layer are the initialized node vectors of the heterogeneous graph, giving each node its initial vector expression; these are then fed into the learning model corresponding to each heterogeneous subgraph. After the learning models learn from one batch of sample data and the vector expression of each node in each heterogeneous subgraph is updated according to the learning result, the vector expressions of the same node in the different heterogeneous subgraphs are aggregated to obtain a single aggregated vector expression for that node. For example, in Figure 2, [formula PCTCN2020070249-appb-000020] is the aggregated vector expression of the search advertisement node, [formula PCTCN2020070249-appb-000021] is the aggregated vector expression of the query term node, and [formula PCTCN2020070249-appb-000022] is the aggregated vector expression of each commodity node. [Formula PCTCN2020070249-appb-000023] is processed to obtain [formula PCTCN2020070249-appb-000024], and the loss function O_label is obtained from [formula PCTCN2020070249-appb-000025] and [formula PCTCN2020070249-appb-000026]. O_label is used to optimize the system parameters of each model, and the parameter-optimized system model then learns from the next batch of sample data; after the vector expression of each node in the heterogeneous subgraphs is updated according to the learning result, the vector expressions of the same node in the different heterogeneous subgraphs are again aggregated to obtain a single aggregated vector expression for that node, a new loss function O_label is obtained from the aggregation result, and the model parameters are optimized and updated before continuing with the next batch of sample data, until all batches of sample data have been learned and a final vector expression of each node in the heterogeneous graph is obtained.
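The batch-wise loop described above (per-subgraph learning, aggregation of shared nodes, loss computation, parameter update, next batch) can be sketched as follows. This is a minimal illustrative sketch with stand-in stub models and a toy mean aggregator; the function names and toy components are assumptions, not the actual graph convolution, aggregation, or loss models of the embodiment.

```python
# Sketch of the training loop: learn per subgraph, aggregate shared nodes,
# compute the loss, update parameters, repeat for each batch and epoch.
def train(batches, subgraph_models, aggregate, loss_fn, optimize, epochs=1):
    embeddings = {}
    for _ in range(epochs):
        for batch in batches:
            per_subgraph = [m(batch) for m in subgraph_models]  # one embedding dict per subgraph
            embeddings = aggregate(per_subgraph)                # merge same-node vectors
            loss = loss_fn(batch, embeddings)
            optimize(loss)                                      # stand-in for a parameter update
    return embeddings

# Toy stand-ins: two "subgraph models" that emit constant vectors,
# a mean aggregator over the per-subgraph embeddings, and a trivial loss.
models = [lambda b: {n: [1.0] for n in b}, lambda b: {n: [3.0] for n in b}]

def mean_agg(dicts):
    keys = set().union(*dicts)
    return {k: [sum(d[k][0] for d in dicts if k in d) /
                sum(1 for d in dicts if k in d)] for k in keys}

losses = []
emb = train([["q1", "a1"]], models, mean_agg,
            lambda b, e: sum(v[0] for v in e.values()), losses.append)
```

With these stubs, each node's two subgraph vectors (1.0 and 3.0) are averaged to 2.0, and one loss value is recorded per batch.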
基于同一发明构思，本发明实施例还提供一种获取实体间关系表达的系统，该系统可以设置在网络中的网络设备、云端的云端设备或者架构的服务器设备、用户端设备等设备中。该系统的结构如图8所示，包括：注册装置803、存储装置801、计算装置802和参数交换装置804。Based on the same inventive concept, an embodiment of the present invention further provides a system for obtaining expressions of relationships between entities. The system may be deployed in devices such as network devices in a network, cloud devices in the cloud, server devices of an architecture, or client devices. The structure of the system is shown in FIG. 8 and includes: a registration device 803, a storage device 801, a computing device 802, and a parameter exchange device 804.
存储装置801,用于存储异构子图的数据;The storage device 801 is used to store data of heterogeneous subgraphs;
计算装置802，用于通过注册装置803从存储装置801获取异构子图的数据，采用上述的获取实体间关系表达的方法基于异构图对样本数据进行学习，得到异构图中每个节点的低维向量表达。The computing device 802 is configured to obtain the data of the heterogeneous subgraphs from the storage device 801 through the registration device 803, and to learn from the sample data based on the heterogeneous graph by using the above-mentioned method for obtaining expressions of relationships between entities, so as to obtain the low-dimensional vector expression of each node in the heterogeneous graph.
参数交换装置804,用于与计算装置进行参数交互。The parameter exchange device 804 is used for parameter interaction with the computing device.
计算装置802通过注册装置803从存储装置获取各节点和边的数据,包括:The computing device 802 obtains the data of each node and edge from the storage device through the registration device 803, including:
计算装置802向注册装置803发送数据查询请求，数据查询请求中包括要查询的异构子图的信息；接收注册装置803返回的查询结果，查询结果中包括存储异构子图数据的存储装置信息；根据存储装置信息向相应的存储装置801获取异构子图的数据。The computing device 802 sends a data query request to the registration device 803, where the data query request includes information about the heterogeneous subgraph to be queried; receives the query result returned by the registration device 803, where the query result includes information about the storage device storing the heterogeneous subgraph data; and obtains the data of the heterogeneous subgraph from the corresponding storage device 801 according to the storage device information.
可选的，上述存储装置801中还可以存储异构图中各节点以及边的数据和样本数据。Optionally, the storage device 801 may also store the data of each node and edge in the heterogeneous graph, as well as the sample data.
计算装置802向注册装置803发送数据查询请求，数据查询请求中包括要查询的节点和边的信息；接收注册装置803返回的查询结果，查询结果中包括存储节点和边的数据的存储装置信息；根据存储装置信息向相应的存储装置801获取各节点和边的数据。The computing device 802 sends a data query request to the registration device 803, where the data query request includes information about the nodes and edges to be queried; receives the query result returned by the registration device 803, where the query result includes information about the storage device storing the data of the nodes and edges; and obtains the data of each node and edge from the corresponding storage device 801 according to the storage device information.
基于同一发明构思,本发明实施例还提供一种广告召回系统,参照图9所示,包括获取实体间关系表达的系统901和广告召回匹配系统902;Based on the same inventive concept, an embodiment of the present invention also provides an advertisement recall system. As shown in FIG. 9, it includes a system 901 for obtaining relationship expressions between entities and an advertisement recall matching system 902;
获取实体间关系表达的系统901，用于构建用于广告搜索场景的异构图，异构图中的所述节点类型包括：广告、商品、查询词中的至少一种，所述边的类型包括点击边、共同点击边、协同过滤边、内容语义相似边和属性相似边中的至少一种；The system 901 for obtaining expressions of relationships between entities is configured to construct a heterogeneous graph for an advertisement search scenario, where the node types in the heterogeneous graph include at least one of advertisements, commodities, and query terms, and the edge types include at least one of click edges, co-click edges, collaborative filtering edges, content-semantic-similarity edges, and attribute-similarity edges;
根据预先定义的元路径,将预先构建的异构图分为至少两个异构子图,所述元路径用于表达异构子图的结构及异构子图包括的节点类型和边类型;Divide the pre-built heterogeneous graph into at least two heterogeneous subgraphs according to a predefined meta-path, where the meta-path is used to express the structure of the heterogeneous subgraph and the types of nodes and edges included in the heterogeneous subgraph;
获取一个批次的样本数据;Obtain a batch of sample data;
预设的图卷积模型按照异构子图,对一个批次的样本数据进行学习,得到异构子图中节点的向量表达,一个图卷积模型对应一个异构子图;The preset graph convolution model learns a batch of sample data according to heterogeneous subgraphs to obtain the vector expression of nodes in heterogeneous subgraphs. A graph convolution model corresponds to a heterogeneous subgraph;
预设的聚合模型基于样本数据,对不同异构子图中相同节点的向量表达进行聚合,得到不同异构子图中相同节点的同一个向量表达;The preset aggregation model is based on sample data and aggregates the vector expressions of the same node in different heterogeneous subgraphs to obtain the same vector expression of the same node in different heterogeneous subgraphs;
预设的损失函数基于所述样本数据和所述相同节点的同一个向量表达对所述模型的参数进行优化;The preset loss function optimizes the parameters of the model based on the same vector expression of the sample data and the same node;
继续获取下一个批次的样本数据进行学习，直至所有批次的样本数据学习完毕，得到所述异构图中包括的广告节点、商品节点、查询词节点的低维向量表达，异构图中的一个节点对应样本数据中的一个实体；Continue to obtain the next batch of sample data for learning until all batches of sample data have been learned, obtaining the low-dimensional vector expressions of the advertisement nodes, commodity nodes, and query term nodes included in the heterogeneous graph, where one node in the heterogeneous graph corresponds to one entity in the sample data;
广告召回匹配系统902，用于使用获取实体间关系表达的系统得到的查询词节点、商品节点和搜索广告节点的低维向量表达，确定查询词节点、商品节点和搜索广告节点之间的匹配程度，根据所述匹配程度选择与商品、查询词匹配程度符合设定要求的搜索广告。The advertisement recall matching system 902 is configured to use the low-dimensional vector expressions of the query term nodes, commodity nodes, and search advertisement nodes obtained by the system for obtaining expressions of relationships between entities to determine the degree of matching among the query term nodes, commodity nodes, and search advertisement nodes, and to select, according to the matching degree, search advertisements whose degree of matching with the commodities and query terms meets the set requirements.
上述系统中，获取实体间关系表达的系统定义的元路径，一条元路径对应一个异构子图，所述元路径用于表达异构子图的结构及异构子图包括的节点类型和边类型具体为：一条元路径用于表达一个异构子图的结构及该异构子图包括的节点类型和边类型；具体为：一条元路径中包括按顺序交替排列的节点类型和边类型，其中，排序在第一位和最后一位的是节点类型，节点类型和边类型的排列顺序表达了异构子图的结构；In the above system, for the meta-paths defined by the system for obtaining expressions of relationships between entities, one meta-path corresponds to one heterogeneous subgraph, and the meta-path being used to express the structure of the heterogeneous subgraph and the node types and edge types included in the heterogeneous subgraph specifically means: one meta-path is used to express the structure of one heterogeneous subgraph and the node types and edge types included in that heterogeneous subgraph; specifically, a meta-path includes node types and edge types alternately arranged in order, where the first and last positions are node types, and the arrangement order of the node types and edge types expresses the structure of the heterogeneous subgraph;
可选的，获取实体间关系表达的系统根据预先设定的元路径，将异构图拆分为至少两个异构子图具体包括：根据预先设定的至少两条元路径，将异构图拆分为至少两个异构子图，具体为针对预先设定的至少两条元路径中的每一条元路径，根据所述元路径中包括节点类型，获取所述异构图中相应类型的节点；按照连接各相邻节点的边的类型，从所述异构图中获取符合要求的边；由获取到的相应类型的节点和符合要求的边，组成该元路径对应的异构子图。Optionally, the system for obtaining expressions of relationships between entities splitting the heterogeneous graph into at least two heterogeneous subgraphs according to preset meta-paths specifically includes: splitting the heterogeneous graph into at least two heterogeneous subgraphs according to at least two preset meta-paths; specifically, for each of the at least two preset meta-paths, obtaining nodes of the corresponding types in the heterogeneous graph according to the node types included in the meta-path; obtaining qualifying edges from the heterogeneous graph according to the types of the edges connecting adjacent nodes; and composing the heterogeneous subgraph corresponding to the meta-path from the obtained nodes of the corresponding types and the qualifying edges.
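As a concrete illustration of splitting a heterogeneous graph by meta-path, the following sketch selects the nodes and edges whose types match an alternating node-type/edge-type sequence. The type names (query, ad, item, click, co_click), the data structures, and the undirected treatment of edges are assumptions for illustration only, not the embodiment's exact representation.

```python
# Hypothetical sketch: extract the subgraph matching one meta-path.
def split_by_metapath(nodes, edges, metapath):
    """nodes: {node_id: node_type}; edges: list of (src, edge_type, dst);
    metapath: alternating [node_type, edge_type, node_type, ...]."""
    node_types = set(metapath[0::2])  # positions 0, 2, 4, ... are node types
    triples = {(metapath[i], metapath[i + 1], metapath[i + 2])
               for i in range(0, len(metapath) - 2, 2)}
    sub_edges = [(s, et, d) for (s, et, d) in edges
                 if (nodes[s], et, nodes[d]) in triples
                 or (nodes[d], et, nodes[s]) in triples]  # edges treated as undirected
    sub_nodes = {n: t for n, t in nodes.items()
                 if t in node_types and any(n in (s, d) for s, _, d in sub_edges)}
    return sub_nodes, sub_edges

nodes = {"q1": "query", "a1": "ad", "a2": "ad", "i1": "item"}
edges = [("q1", "click", "a1"), ("a1", "co_click", "a2"), ("q1", "click", "i1")]
sub_n, sub_e = split_by_metapath(nodes, edges,
                                 ["query", "click", "ad", "co_click", "ad"])
```

Here the query→click→ad→co-click→ad meta-path keeps the click edge to the ad and the co-click edge between ads, while the click edge to the item node falls outside the meta-path and is dropped.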
可选的，获取实体间关系表达的系统通过预设的图卷积模型按照异构子图，对所述样本数据进行学习，得到异构子图中节点的向量表达，具体包括：预设的图卷积模型根据异构子图的每个节点的属性信息及异构子图中每个节点的至少一阶邻居节点的结构信息和属性信息，得到所述异构图中节点的向量表达。Optionally, the system for obtaining expressions of relationships between entities learning from the sample data according to the heterogeneous subgraphs through the preset graph convolution models to obtain the vector expressions of the nodes in the heterogeneous subgraphs specifically includes: the preset graph convolution model obtains the vector expression of each node in the heterogeneous graph according to the attribute information of each node in the heterogeneous subgraph and the structure information and attribute information of the at least first-order neighbor nodes of each node in the heterogeneous subgraph.
可选的，获取实体间关系表达的系统通过预设的图卷积模型根据异构子图的每个节点的属性信息及异构子图中每个节点的至少一阶邻居节点的结构信息和属性信息，得到所述异构图中每个节点的向量表达，具体包括：Optionally, the system for obtaining expressions of relationships between entities obtaining, through the preset graph convolution model, the vector expression of each node in the heterogeneous graph according to the attribute information of each node in the heterogeneous subgraph and the structure information and attribute information of the at least first-order neighbor nodes of each node in the heterogeneous subgraph specifically includes:
遍历样本数据,针对当前遍历到的一条样本数据,读取其记录的实体,并找到所述实体在异构图中对应的节点;Traverse the sample data, read the recorded entity for a piece of sample data currently traversed, and find the corresponding node of the entity in the heterogeneous graph;
从包括该节点的异构子图中,读取所述节点的第一阶至第N阶邻居节点,所述N为预设的正整数;From the heterogeneous subgraph including the node, read the neighboring nodes of the first order to the Nth order of the node, where N is a preset positive integer;
预设的图卷积模型根据所述节点的属性信息和第一至第N阶邻居节点的属性信息和结构信息进行N层卷积运算，得到所述节点的向量表达。The preset graph convolution model performs an N-layer convolution operation according to the attribute information of the node and the attribute information and structure information of the first- to Nth-order neighbor nodes, to obtain the vector expression of the node.
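The N-layer convolution over a node and its first- to Nth-order neighbors can be illustrated with a stripped-down mean-aggregation propagation: after N layers, each node's vector mixes in information from its N-hop neighborhood. This is a minimal sketch under simplifying assumptions; real graph convolution layers would additionally apply learned weight matrices and non-linearities at each layer.

```python
# Minimal N-layer neighborhood aggregation (mean over self + neighbors per layer).
def propagate(features, neighbors, n_layers):
    """features: {node: [float, ...]}; neighbors: {node: [node, ...]}."""
    h = {n: list(v) for n, v in features.items()}
    for _ in range(n_layers):
        new_h = {}
        for node, vec in h.items():
            nbrs = neighbors.get(node, [])
            agg = vec[:]  # include the node's own current representation
            for nb in nbrs:
                agg = [a + b for a, b in zip(agg, h[nb])]
            new_h[node] = [x / (len(nbrs) + 1) for x in agg]
        h = new_h
    return h

feats = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
nbrs = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
h2 = propagate(feats, nbrs, n_layers=2)  # two layers -> 2-hop information reaches each node
```

After two layers, node "a" has absorbed information from "c" even though they are not directly connected, which is the role of stacking N convolution layers over first- to Nth-order neighbors.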
可选的，获取实体间关系表达的系统通过预设的图卷积模型根据异构子图的每个节点的属性信息及异构子图中每个节点的至少一阶邻居节点的结构信息和属性信息，得到所述异构图中每个节点的向量表达，具体包括：Optionally, the system for obtaining expressions of relationships between entities obtaining, through the preset graph convolution model, the vector expression of each node in the heterogeneous graph according to the attribute information of each node in the heterogeneous subgraph and the structure information and attribute information of the at least first-order neighbor nodes of each node in the heterogeneous subgraph specifically includes:
遍历样本数据,针对当前遍历到的一条样本数据,读取其记录的实体,并找到所述实体在异构图中对应的节点;Traverse the sample data, read the recorded entity for a piece of sample data currently traversed, and find the corresponding node of the entity in the heterogeneous graph;
从包括该节点的异构子图中,读取所述节点的第一阶至第N阶邻居节点,所述N为预设的正整数;From the heterogeneous subgraph including the node, read the neighboring nodes of the first order to the Nth order of the node, where N is a preset positive integer;
对所述节点的第一阶至第N阶邻居节点按照节点之间边的权重对同一阶的邻居节点按照预设的个数进行采样，得到采样后的第一至第N阶邻居节点；For the first- to Nth-order neighbor nodes of the node, sample a preset number of neighbor nodes at each order according to the weights of the edges between nodes, to obtain the sampled first- to Nth-order neighbor nodes;
预设的图卷积模型根据所述节点的属性信息和采样后第一至第N阶邻居节点的属性信息和结构信息进行N层卷积运算,得到所述节点的向量表达。The preset graph convolution model performs an N-layer convolution operation according to the attribute information of the node and the attribute information and structure information of the first to Nth-order neighbor nodes after sampling to obtain the vector expression of the node.
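The weighted neighbor sampling described above can be sketched as follows. The names and the use of sampling with replacement via `random.choices` are simplifying assumptions for illustration, not the embodiment's exact sampling procedure.

```python
# Illustrative sketch: sample up to k neighbors, with probability
# proportional to the weight of the connecting edge.
import random

def sample_neighbors(neighbors, weights, k, seed=0):
    """If a node has no more than k neighbors, keep them all; otherwise
    draw k of them weighted by edge weight (with replacement, for simplicity)."""
    if len(neighbors) <= k:
        return list(neighbors)
    rng = random.Random(seed)
    return rng.choices(neighbors, weights=weights, k=k)

# neighbor "d" has a much heavier edge, so it is sampled more often
sampled = sample_neighbors(["a", "b", "c", "d"], [1, 1, 1, 10], k=2)
```

Fixing the per-order sample size bounds the cost of the N-layer convolution, since the full neighborhood of a high-degree node would otherwise grow exponentially with N.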
可选的,获取实体间关系表达的系统通过预设的聚合模型基于样本数据,对不同异构子图中相同节点的向量表达进行聚合,得到不同异构子图中相同节点的同一个向量表达,具体包括:Optionally, the system for obtaining expressions of relationships between entities aggregates vector expressions of the same node in different heterogeneous subgraphs based on sample data through a preset aggregation model to obtain the same vector expression of the same node in different heterogeneous subgraphs , Specifically including:
所述预设的聚合模型基于所述样本数据，使用注意力机制聚合学习或者全连接聚合学习或者加权平均聚合学习对不同异构子图中相同节点的向量表达进行聚合，得到不同异构子图中相同节点的同一个向量表达。The preset aggregation model, based on the sample data, uses attention-mechanism aggregation learning, fully connected aggregation learning, or weighted-average aggregation learning to aggregate the vector expressions of the same node in different heterogeneous subgraphs, to obtain one identical vector expression of the same node across the different heterogeneous subgraphs.
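Of the three aggregation options, the attention-mechanism variant can be sketched as a dot-product score followed by a softmax-weighted sum over the same node's per-subgraph vectors. The query vector and all names here are illustrative assumptions; a trained model would learn the attention parameters rather than fix them.

```python
# Minimal attention-style aggregation of one node's vectors from different subgraphs.
import math

def attention_aggregate(vectors, query):
    """vectors: the same node's embedding from each subgraph; query: attention query."""
    scores = [sum(q * x for q, x in zip(query, vec)) for vec in vectors]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]          # numerically stable softmax
    weights = [e / sum(exps) for e in exps]
    dim = len(vectors[0])
    return [sum(w * vec[d] for w, vec in zip(weights, vectors)) for d in range(dim)]

# two subgraph views of the same node, fused into one vector
agg = attention_aggregate([[1.0, 0.0], [0.0, 1.0]], query=[1.0, 1.0])
```

With a symmetric query both views score equally, so the result is their average; an asymmetric learned query would instead emphasize the subgraph most informative for the task.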
可选的,广告召回匹配系统确定查询词节点、商品节点和搜索广告节点之间的匹配程度,包括:Optionally, the advertisement recall matching system determines the degree of matching between query term nodes, product nodes and search advertisement nodes, including:
使用注意力机制或者全连接聚合机制或者加权平均聚合机制对查询词节点的低维向量表达和同查询词下的用户前置点击商品节点的低维向量表达进行汇聚，得到虚拟请求节点的低维向量表达；所述虚拟请求节点为通过查询词节点和同查询词下的用户前置点击的商品节点构建出的虚拟节点；An attention mechanism, a fully connected aggregation mechanism, or a weighted-average aggregation mechanism is used to aggregate the low-dimensional vector expression of the query term node and the low-dimensional vector expressions of the commodity nodes pre-clicked by the user under the same query term, to obtain a low-dimensional vector expression of a virtual request node; the virtual request node is a virtual node constructed from the query term node and the commodity nodes pre-clicked by the user under the same query term;
根据虚拟请求节点的低维向量表达与搜索广告节点的低维向量表达,确定查询词节点、商品节点和搜索广告节点之间的匹配程度。According to the low-dimensional vector expression of the virtual request node and the low-dimensional vector expression of the search advertisement node, the matching degree between the query term node, the product node and the search advertisement node is determined.
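A possible sketch of constructing the virtual request node is a simple weighted-average pooling of the query term vector with the vectors of the user's pre-clicked commodity nodes. The 0.5 mixing weight, function names, and the fallback when there are no clicked items are illustrative assumptions (the embodiment may instead use the attention or fully connected mechanisms mentioned above).

```python
# Weighted-average pooling of a query vector and pre-clicked item vectors
# into one "virtual request" vector.
def build_virtual_request(query_vec, clicked_item_vecs, item_weight=0.5):
    dim = len(query_vec)
    if not clicked_item_vecs:          # no click history: fall back to the query alone
        return list(query_vec)
    item_mean = [sum(v[d] for v in clicked_item_vecs) / len(clicked_item_vecs)
                 for d in range(dim)]
    return [(1 - item_weight) * query_vec[d] + item_weight * item_mean[d]
            for d in range(dim)]

req = build_virtual_request([1.0, 0.0], [[0.0, 1.0], [0.0, 3.0]])
```

The resulting request vector can then be compared against the search advertisement vectors to score candidate advertisements.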
可选的，广告召回匹配系统根据所述匹配程度选择与商品、查询词匹配程度符合设定要求的搜索广告，包括：Optionally, the advertisement recall matching system selecting, according to the matching degree, search advertisements whose degree of matching with the commodities and query terms meets the set requirements includes:
根据所述虚拟请求节点的低维融合信息向量与搜索广告节点的低维融合信息向量的余弦距离,选择距离符合设定要求的搜索广告。According to the cosine distance between the low-dimensional fusion information vector of the virtual request node and the low-dimensional fusion information vector of the search advertisement node, a search advertisement whose distance meets the set requirement is selected.
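The cosine-based selection can be sketched as follows, with a similarity threshold standing in for the "set requirement"; the threshold value and all names are illustrative assumptions (a real system might instead rank and keep the top-k advertisements).

```python
# Select advertisements whose cosine similarity to the virtual request
# vector meets a threshold.
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def recall_ads(request_vec, ad_vecs, threshold=0.8):
    """ad_vecs: {ad_id: vector}; returns ad ids meeting the similarity threshold."""
    return [ad_id for ad_id, vec in ad_vecs.items()
            if cosine(request_vec, vec) >= threshold]

ads = recall_ads([1.0, 0.0], {"ad1": [0.9, 0.1], "ad2": [0.0, 1.0]})
```

Note that a smaller cosine distance corresponds to a larger cosine similarity, so thresholding on similarity is equivalent to selecting advertisements whose distance meets the set requirement.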
本发明实施例还提供一种计算机可读存储介质,其上存储有计算机指令,该指令被处理器执行时实现上述的获取实体间关系表达的方法。An embodiment of the present invention also provides a computer-readable storage medium on which computer instructions are stored, and when the instructions are executed by a processor, the foregoing method for obtaining expressions of relationships between entities is implemented.
本发明实施例还提供一种异构图学习设备,包括:存储器,处理器及存储在存储器上并可在处理器上运行的计算机程序,所述处理器执行所述程序时实现上述的获取实体间关系表达的方法。An embodiment of the present invention also provides a heterogeneous graph learning device, including: a memory, a processor, and a computer program stored in the memory and running on the processor, and the processor implements the above-mentioned acquisition entity when the program is executed. The method of expressing the relationship.
关于上述实施例中的系统,其中各个装置或模块执行操作的具体方式已经在有关该方法的实施例中进行了详细描述,此处将不做详细阐述说明。Regarding the system in the foregoing embodiment, the specific manner in which each device or module performs operations has been described in detail in the embodiment related to the method, and will not be elaborated here.
除非另外具体陈述,术语比如处理、计算、运算、确定、显示等等可以指一个或更多个处理或者计算系统、或类似设备的动作和/或过程,所述动作和/或过程将表示为处理系统的寄存器或存储器内的物理(如电子)量的数据操作和转换成为类似地表示为处理系统的存储器、寄存器或者其他此类信息存储、发射或者显示设备内的物理量的其他数据。信息和信号可以使用多种不同的技术和方法中的任何一种来表示。例如,在贯穿上面的描述中提及的数据、指令、命令、信息、信号、比特、符号和码片可以用电压、电流、电磁波、磁场或粒子、光场或粒子或者其任意组合来表示。Unless specifically stated otherwise, terms such as processing, calculation, operation, determination, display, etc. may refer to one or more actions and/or processes of processing or computing systems, or similar devices, and the actions and/or processes will be expressed as The data manipulation and conversion of physical (such as electronic) quantities in the registers or memory of the processing system becomes other data similarly represented as physical quantities in the memory, registers or other such information storage, transmission or display devices of the processing system. Information and signals can be represented using any of a variety of different technologies and methods. For example, the data, instructions, commands, information, signals, bits, symbols, and chips mentioned throughout the above description can be represented by voltage, current, electromagnetic waves, magnetic fields or particles, light fields or particles, or any combination thereof.
应该明白,公开的过程中的步骤的特定顺序或层次是示例性方法的实例。基于设计偏好,应该理解,过程中的步骤的特定顺序或层次可以在不脱离本公开的保护范围的情况下得到重新安排。所附的方法权利要求以示例性的顺序给出了各种步骤的要素,并且不是要限于所述的特定顺序或层次。It should be understood that the specific order or hierarchy of steps in the disclosed process is an example of an exemplary method. Based on design preferences, it should be understood that the specific order or level of steps in the process can be rearranged without departing from the scope of protection of the present disclosure. The accompanying method claims present elements of the various steps in an exemplary order and are not intended to be limited to the specific order or hierarchy described.
在上述的详细描述中，各种特征一起组合在单个的实施方案中，以简化本公开。不应该将这种公开方法解释为反映了这样的意图，即，所要求保护的主题的实施方案需要比每个权利要求中清楚陈述的特征更多的特征。相反，如所附的权利要求书所反映的那样，发明主题在于少于所公开的单个实施方案的全部特征。因此，所附的权利要求书特此清楚地被并入详细描述中，其中每项权利要求独自作为本发明单独的优选实施方案。In the above detailed description, various features are grouped together in a single embodiment to streamline the present disclosure. This method of disclosure should not be interpreted as reflecting an intent that the claimed subject matter requires more features than are expressly recited in each claim. Rather, as the appended claims reflect, inventive subject matter lies in less than all features of a single disclosed embodiment. Therefore, the appended claims are hereby expressly incorporated into the detailed description, with each claim standing on its own as a separate preferred embodiment of the present invention.
本领域技术人员还应当理解,结合本文的实施例描述的各种说明性的逻辑框、模块、电路和算法步骤均可以实现成电子硬件、计算机软件或其组合。为了清楚地说明硬件和软件之间的可交换性,上面对各种说明性的部件、框、模块、电路和步骤均围绕其功能进行了一般地描述。至于这种功能是实现成硬件还是实现成软件,取决于特定的应用和对整个系统所施加的设计约束条件。熟练的技术人员可以针对每个特定应用,以变通的方式实现所描述的功能,但是,这种实现决策不应解释为背离本公开的保护范围。Those skilled in the art should also understand that the various illustrative logical blocks, modules, circuits, and algorithm steps described in conjunction with the embodiments herein can all be implemented as electronic hardware, computer software, or a combination thereof. In order to clearly illustrate the interchangeability between hardware and software, various illustrative components, blocks, modules, circuits, and steps are described above generally around their functions. As for whether this function is implemented as hardware or as software, it depends on the specific application and the design constraints imposed on the entire system. Skilled technicians can implement the described functions in a flexible manner for each specific application, but this implementation decision should not be interpreted as a departure from the protection scope of the present disclosure.
结合本文的实施例所描述的方法或者算法的步骤可直接体现为硬件、由处理器执行的软件模块或其组合。软件模块可以位于RAM存储器、闪存、ROM存储器、EPROM存储器、EEPROM存储器、寄存器、硬盘、移动磁盘、CD-ROM或者本领域熟知的任何其它形式的存储介质中。一种示例性的存储介质连接至处理器,从而使处理器能够从该存储介质读取信息,且可向该存储介质写入信息。当然,存储介质也可以是处理器的组成部分。处理器和存储介质可以位于ASIC中。该ASIC可以位于用户终端中。当然,处理器和存储介质也可以作为分立组件存在于用户终端中。The steps of the method or algorithm described in combination with the embodiments of this document can be directly embodied as hardware, a software module executed by a processor, or a combination thereof. The software module can be located in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, hard disk, removable disk, CD-ROM or any other form of storage medium known in the art. An exemplary storage medium is connected to the processor, so that the processor can read information from the storage medium and can write information to the storage medium. Of course, the storage medium may also be a component of the processor. The processor and the storage medium may be located in the ASIC. The ASIC can be located in the user terminal. Of course, the processor and the storage medium may also exist as discrete components in the user terminal.
对于软件实现,本申请中描述的技术可用执行本申请所述功能的模块(例如,过程、函数等)来实现。这些软件代码可以存储在存储器单元并由处理器执行。存储器单元可以实现在处理器内,也可以实现在处理器外,在后一种情况下,它经由各种手段以通信方式耦合到处理器,这些都是本领域中所公知的。For software implementation, the technology described in this application can be implemented with modules (for example, procedures, functions, etc.) that perform the functions described in this application. These software codes can be stored in a memory unit and executed by a processor. The memory unit may be implemented in the processor or outside the processor. In the latter case, it is communicatively coupled to the processor through various means, which are well known in the art.
上文的描述包括一个或多个实施例的举例。当然，为了描述上述实施例而描述部件或方法的所有可能的结合是不可能的，但是本领域普通技术人员应该认识到，各个实施例可以做进一步的组合和排列。因此，本文中描述的实施例旨在涵盖落入所附权利要求书的保护范围内的所有这样的改变、修改和变型。此外，就说明书或权利要求书中使用的术语“包含”，该词的涵盖方式类似于术语“包括”，就如同“包括”在权利要求中用作衔接词所解释的那样。此外，使用在权利要求书的说明书中的任何一个术语“或者”是要表示“非排它性的或者”。The foregoing description includes examples of one or more embodiments. Of course, it is impossible to describe every conceivable combination of components or methods in describing the above embodiments, but one of ordinary skill in the art will recognize that the various embodiments can be further combined and permuted. Therefore, the embodiments described herein are intended to cover all such changes, modifications, and variations that fall within the protection scope of the appended claims. Furthermore, with regard to the term "comprising" used in the specification or claims, the word is inclusive in a manner similar to the term "including", as "including" is interpreted when employed as a transitional word in a claim. In addition, any use of the term "or" in the specification of the claims is intended to mean a "non-exclusive or".

Claims (14)

  1. 一种广告召回系统,包括获取实体间关系表达的系统和广告召回匹配系统;An advertisement recall system, including a system for obtaining relationship expressions between entities and an advertisement recall matching system;
    所述获取实体间关系表达的系统，用于构建用于广告搜索场景的异构图，所述异构图中的节点类型包括：广告、商品、查询词中的至少一种，边的类型包括点击边、共同点击边、协同过滤边、内容语义相似边和属性相似边中的至少一种；The system for obtaining expressions of relationships between entities is configured to construct a heterogeneous graph for an advertisement search scenario, where the node types in the heterogeneous graph include at least one of advertisements, commodities, and query terms, and the edge types include at least one of click edges, co-click edges, collaborative filtering edges, content-semantic-similarity edges, and attribute-similarity edges;
    根据预先定义的元路径,将预先构建的异构图分为至少两个异构子图,所述元路径用于表达异构子图的结构及异构子图包括的节点类型和边类型;Divide the pre-built heterogeneous graph into at least two heterogeneous subgraphs according to a predefined meta-path, where the meta-path is used to express the structure of the heterogeneous subgraph and the types of nodes and edges included in the heterogeneous subgraph;
    获取一个批次的样本数据;Obtain a batch of sample data;
    预设的图卷积模型按照异构子图,对一个批次的样本数据进行学习,得到异构子图中节点的向量表达,一个图卷积模型对应一个异构子图;The preset graph convolution model learns a batch of sample data according to heterogeneous subgraphs to obtain the vector expression of nodes in heterogeneous subgraphs. A graph convolution model corresponds to a heterogeneous subgraph;
    预设的聚合模型基于样本数据,对不同异构子图中相同节点的向量表达进行聚合,得到不同异构子图中相同节点的同一个向量表达;The preset aggregation model is based on sample data and aggregates the vector expressions of the same node in different heterogeneous subgraphs to obtain the same vector expression of the same node in different heterogeneous subgraphs;
    预设的损失函数基于所述样本数据和所述相同节点的同一个向量表达对所述模型的参数进行优化;The preset loss function optimizes the parameters of the model based on the same vector expression of the sample data and the same node;
    继续获取下一个批次的样本数据进行学习，直至所有批次的样本数据学习完毕，得到所述异构图中包括的广告节点、商品节点、查询词节点的低维向量表达，异构图中的一个节点对应样本数据中的一个实体；Continue to obtain the next batch of sample data for learning until all batches of sample data have been learned, obtaining the low-dimensional vector expressions of the advertisement nodes, commodity nodes, and query term nodes included in the heterogeneous graph, where one node in the heterogeneous graph corresponds to one entity in the sample data;
    所述广告召回匹配系统，用于使用所述获取实体间关系表达的系统得到的查询词节点、商品节点和搜索广告节点的低维向量表达，确定查询词节点、商品节点和搜索广告节点之间的匹配程度，根据所述匹配程度选择与商品、查询词匹配程度符合设定要求的搜索广告。The advertisement recall matching system is configured to use the low-dimensional vector expressions of the query term nodes, commodity nodes, and search advertisement nodes obtained by the system for obtaining expressions of relationships between entities to determine the degree of matching among the query term nodes, commodity nodes, and search advertisement nodes, and to select, according to the matching degree, search advertisements whose degree of matching with the commodities and query terms meets the set requirements.
  2. 如权利要求1所述的系统，其特征在于，一条元路径对应一个异构子图，所述元路径用于表达异构子图的结构及异构子图包括的节点类型和边类型具体为：一条元路径用于表达一个异构子图的结构及该异构子图包括的节点类型和边类型；The system according to claim 1, wherein one meta-path corresponds to one heterogeneous subgraph, and the meta-path being used to express the structure of the heterogeneous subgraph and the node types and edge types included in the heterogeneous subgraph specifically means: one meta-path is used to express the structure of one heterogeneous subgraph and the node types and edge types included in that heterogeneous subgraph;
    所述根据预先设定的元路径,将异构图拆分为至少两个异构子图具体包括:The splitting the heterogeneous graph into at least two heterogeneous subgraphs according to the preset meta-path specifically includes:
    根据预先设定的至少两条元路径,将异构图拆分为至少两个异构子图。Split the heterogeneous graph into at least two heterogeneous subgraphs according to at least two preset meta-paths.
  3. 如权利要求1所述的系统，其特征在于，所述获取实体间关系表达的系统通过预设的图卷积模型按照异构子图，对所述样本数据进行学习，得到异构子图中节点的向量表达，具体包括：The system according to claim 1, wherein the system for obtaining expressions of relationships between entities learning from the sample data according to the heterogeneous subgraphs through the preset graph convolution models to obtain the vector expressions of the nodes in the heterogeneous subgraphs specifically includes:
    预设的图卷积模型根据异构子图的每个节点的属性信息及异构子图中每个节点的至少一阶邻居节点的结构信息和属性信息，得到所述异构图中节点的向量表达。The preset graph convolution model obtains the vector expression of each node in the heterogeneous graph according to the attribute information of each node in the heterogeneous subgraph and the structure information and attribute information of the at least first-order neighbor nodes of each node in the heterogeneous subgraph.
  4. 如权利要求1所述的系统，其特征在于，所述获取实体间关系表达的系统通过预设的聚合模型基于样本数据，对不同异构子图中相同节点的向量表达进行聚合，得到不同异构子图中相同节点的同一个向量表达，具体包括：The system according to claim 1, wherein the system for obtaining expressions of relationships between entities aggregating, through the preset aggregation model and based on the sample data, the vector expressions of the same node in different heterogeneous subgraphs to obtain one identical vector expression of the same node across the different heterogeneous subgraphs specifically includes:
    所述预设的聚合模型基于所述样本数据，使用注意力机制聚合学习或者全连接聚合学习或者加权平均聚合学习对不同异构子图中相同节点的向量表达进行聚合，得到不同异构子图中相同节点的同一个向量表达。The preset aggregation model, based on the sample data, uses attention-mechanism aggregation learning, fully connected aggregation learning, or weighted-average aggregation learning to aggregate the vector expressions of the same node in different heterogeneous subgraphs, to obtain one identical vector expression of the same node across the different heterogeneous subgraphs.
  5. 如权利要求1所述的系统,其特征在于,所述广告召回匹配系统确定查询词节点、商品节点和搜索广告节点之间的匹配程度,包括:The system of claim 1, wherein the advertisement recall matching system determines the degree of matching among query term nodes, commodity nodes, and search advertisement nodes, comprising:
    使用注意力机制或者全连接聚合机制或者加权平均聚合机制对查询词节点的低维向量表达和同查询词下的用户前置点击商品节点的低维向量表达进行汇聚，得到虚拟请求节点的低维向量表达；所述虚拟请求节点为通过查询词节点和同查询词下的用户前置点击的商品节点构建出的虚拟节点；An attention mechanism, a fully connected aggregation mechanism, or a weighted-average aggregation mechanism is used to aggregate the low-dimensional vector expression of the query term node and the low-dimensional vector expressions of the commodity nodes pre-clicked by the user under the same query term, to obtain a low-dimensional vector expression of a virtual request node; the virtual request node is a virtual node constructed from the query term node and the commodity nodes pre-clicked by the user under the same query term;
    根据虚拟请求节点的低维向量表达与搜索广告节点的低维向量表达,确定查询词节点、商品节点和搜索广告节点之间的匹配程度。According to the low-dimensional vector expression of the virtual request node and the low-dimensional vector expression of the search advertisement node, the matching degree between the query term node, the product node and the search advertisement node is determined.
  6. 如权利要求5所述的系统，其特征在于，所述广告召回匹配系统根据所述匹配程度选择与商品、查询词匹配程度符合设定要求的搜索广告，包括：The system according to claim 5, wherein the advertisement recall matching system selecting, according to the matching degree, search advertisements whose degree of matching with the commodities and query terms meets the set requirements includes:
    根据所述虚拟请求节点的低维向量表达与搜索广告节点的低维向量表达的余弦距离,选择距离符合设定要求的搜索广告。According to the cosine distance between the low-dimensional vector expression of the virtual request node and the low-dimensional vector expression of the search advertisement node, a search advertisement whose distance meets the set requirement is selected.
  7. 一种获取实体间关系表达的方法,其特征在于,包括:A method for obtaining expressions of relationships between entities, characterized in that it includes:
    根据预先定义的元路径,将预先构建的异构图分为至少两个异构子图,所述元路径用于表达异构子图的结构及异构子图包括的节点类型和边类型;Divide the pre-built heterogeneous graph into at least two heterogeneous subgraphs according to a predefined meta-path, where the meta-path is used to express the structure of the heterogeneous subgraph and the types of nodes and edges included in the heterogeneous subgraph;
    获取一个批次的样本数据;Obtain a batch of sample data;
    预设的图卷积模型按照异构子图,对一个批次的样本数据进行学习,得到异构子图中节点的向量表达,一个图卷积模型对应一个异构子图;The preset graph convolution model learns a batch of sample data according to heterogeneous subgraphs to obtain the vector expression of nodes in heterogeneous subgraphs. A graph convolution model corresponds to a heterogeneous subgraph;
    预设的聚合模型基于样本数据,对不同异构子图中相同节点的向量表达进行聚合,得到不同异构子图中相同节点的同一个向量表达;The preset aggregation model is based on sample data and aggregates the vector expressions of the same node in different heterogeneous subgraphs to obtain the same vector expression of the same node in different heterogeneous subgraphs;
    预设的损失函数基于所述样本数据和所述相同节点的同一个向量表达对所述模型的参数进行优化;The preset loss function optimizes the parameters of the model based on the same vector expression of the sample data and the same node;
    继续获取下一个批次的样本数据进行学习，直至所有批次的样本数据学习完毕，得到所述异构图中每个节点的一个低维向量表达，异构图中的一个节点对应样本数据中的一个实体。Continue to obtain the next batch of sample data for learning until all batches of sample data have been learned, obtaining a low-dimensional vector expression of each node in the heterogeneous graph, where one node in the heterogeneous graph corresponds to one entity in the sample data.
  8. The method according to claim 7, wherein one meta-path corresponds to one heterogeneous subgraph, and the meta-path being used to express the structure of a heterogeneous subgraph and the node types and edge types the heterogeneous subgraph comprises is specifically: one meta-path is used to express the structure of one heterogeneous subgraph and the node types and edge types that heterogeneous subgraph comprises;
    the splitting the heterogeneous graph into at least two heterogeneous subgraphs according to the predefined meta-paths specifically comprises:
    splitting the heterogeneous graph into at least two heterogeneous subgraphs according to at least two predefined meta-paths.
  9. The method according to claim 8, wherein the one meta-path being used to express the structure of one heterogeneous subgraph and the node types and edge types that heterogeneous subgraph comprises is specifically:
    one meta-path comprises node types and edge types arranged alternately in order, wherein the types in the first and last positions are node types, and the arrangement order of the node types and edge types expresses the structure of the heterogeneous subgraph;
    the splitting the heterogeneous graph into at least two heterogeneous subgraphs according to the at least two predefined meta-paths specifically comprises:
    for each of the at least two predefined meta-paths: obtaining nodes of the corresponding types from the heterogeneous graph according to the node types included in the meta-path; obtaining qualifying edges from the heterogeneous graph according to the types of the edges connecting adjacent nodes; and composing, from the obtained nodes of the corresponding types and the qualifying edges, the heterogeneous subgraph corresponding to the meta-path.
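The splitting step of claim 9 — keep only the nodes whose types appear at the meta-path's node positions and the edges whose types appear at its edge positions — can be sketched as below. The node and edge type names ("user", "issues", "clicks", etc.) and the tiny graph are invented for illustration; an advertising graph in the patent's setting would be far larger.

```python
# Heterogeneous graph: node -> type, and typed edges (src, edge_type, dst).
node_types = {"u1": "user", "u2": "user", "q1": "query",
              "a1": "ad", "s1": "shop"}
edges = [("u1", "issues", "q1"), ("u2", "issues", "q1"),
         ("q1", "clicks", "a1"), ("a1", "belongs_to", "s1")]

# Meta-path: node types and edge types alternating, with node types
# in the first and last positions, as required by claim 9.
meta_path = ["user", "issues", "query", "clicks", "ad"]

def split_by_meta_path(node_types, edges, meta_path):
    wanted_node_types = set(meta_path[0::2])   # types at even positions
    wanted_edge_types = set(meta_path[1::2])   # types at odd positions
    nodes = {n for n, t in node_types.items() if t in wanted_node_types}
    kept = [(s, et, d) for (s, et, d) in edges
            if et in wanted_edge_types and s in nodes and d in nodes]
    return nodes, kept

sub_nodes, sub_edges = split_by_meta_path(node_types, edges, meta_path)
```

The "shop" node and its "belongs_to" edge fall outside the meta-path, so they are excluded from this subgraph; a second meta-path would carve out a different subgraph from the same heterogeneous graph.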
  10. The method according to claim 7, wherein the learning, by the preset graph convolution model, of the sample data according to the heterogeneous subgraph to obtain the vector expressions of the nodes in the heterogeneous subgraph specifically comprises:
    learning, by the preset graph convolution model, the sample data according to the attribute information of each node in the heterogeneous subgraph and the structure information and attribute information of at least the first-order neighbor nodes of each node in the heterogeneous subgraph, to obtain the vector expressions of the nodes in the heterogeneous subgraph.
  11. The method according to claim 10, wherein the learning, by the preset graph convolution model, of the sample data according to the attribute information of each node in the heterogeneous subgraph and the structure information and attribute information of at least the first-order neighbor nodes of each node, to obtain the vector expression of each node in the heterogeneous subgraph, specifically comprises:
    traversing the sample data, and for a currently traversed piece of sample data, reading the entity it records and finding the node corresponding to that entity in the heterogeneous graph;
    reading, from the heterogeneous subgraph that includes the node, the first-order to Nth-order neighbor nodes of the node, where N is a preset positive integer;
    performing, by the preset graph convolution model, an N-layer convolution operation according to the attribute information of the node and the attribute information and structure information of the first- to Nth-order neighbor nodes, to obtain the vector expression of the node.
  12. The method according to claim 10, wherein the learning, by the preset graph convolution model, of the sample data according to the attribute information of each node in the heterogeneous subgraph and the structure information and attribute information of at least the first-order neighbor nodes of each node, to obtain the vector expression of each node in the heterogeneous graph, specifically comprises:
    traversing the sample data, and for a currently traversed piece of sample data, reading the entity it records and finding the node corresponding to that entity in the heterogeneous graph;
    reading, from the heterogeneous subgraph that includes the node, the first-order to Nth-order neighbor nodes of the node, where N is a preset positive integer;
    sampling, from the first- to Nth-order neighbor nodes of the node, a preset number of neighbor nodes of each order according to the weights of the edges between nodes, to obtain sampled first- to Nth-order neighbor nodes;
    performing, by the preset graph convolution model, an N-layer convolution operation according to the attribute information of the node and the attribute information and structure information of the sampled first- to Nth-order neighbor nodes, to obtain the vector expression of the node.
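The weighted neighbor sampling of claim 12 — drawing a preset number of same-order neighbors with probability proportional to edge weight, so every node contributes a fixed-size neighborhood to the convolution — might look like this. The neighbor IDs, weights, sample size, and fixed seed are made up for illustration.

```python
import random

# Hypothetical first-order neighbors of one node, with edge weights.
neighbors = ["q1", "q2", "q3", "q4"]
weights   = [0.6, 0.2, 0.1, 0.1]

def sample_neighbors(neighbors, weights, k, seed=0):
    """Sample k neighbors (with replacement) with probability proportional
    to edge weight, yielding a fixed-size neighborhood for convolution."""
    rng = random.Random(seed)
    return rng.choices(neighbors, weights=weights, k=k)

sampled = sample_neighbors(neighbors, weights, k=3)
```

Repeating this per order (first through Nth) gives the sampled neighborhoods that the N-layer convolution then consumes.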
  13. The method according to claim 7, wherein the aggregating, by the preset aggregation model based on the sample data, of the vector expressions of the same node in different heterogeneous subgraphs to obtain a single vector expression of that node specifically comprises:
    aggregating, by the preset aggregation model based on the sample data, the vector expressions of the same node in different heterogeneous subgraphs using an attention mechanism, a fully connected aggregation mechanism, or a weighted-average aggregation mechanism, to obtain a single vector expression of that node across the different heterogeneous subgraphs.
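Claim 13 lists an attention mechanism as one aggregation option. A minimal softmax-attention sketch over a node's per-subgraph vectors is shown below; the query vector stands in for a learned attention parameter, and both it and the two-subgraph example vectors are assumptions for illustration only.

```python
import numpy as np

def attention_aggregate(expressions, query):
    """Weight each subgraph's expression of a node by a softmax attention
    score against a query vector, then take the convex combination."""
    scores = expressions @ query               # one score per subgraph
    weights = np.exp(scores - scores.max())    # stable softmax
    weights /= weights.sum()
    return weights @ expressions               # fused vector expression

# Two hypothetical expressions of the same node from two subgraphs.
expressions = np.array([[1.0, 0.0],
                        [0.0, 1.0]])
query = np.array([2.0, 0.0])                   # assumed learned parameter
fused = attention_aggregate(expressions, query)
```

The subgraph whose expression aligns better with the query dominates the fused vector, which is the point of preferring attention over a plain average when some meta-paths are more informative than others.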
  14. A system for obtaining expressions of relationships between entities, characterized in that it comprises: a registration device, a storage device, a computing device, and a parameter exchange device;
    the storage device is configured to store data of the heterogeneous subgraphs;
    the computing device is configured to obtain the data of the heterogeneous subgraphs from the storage device through the registration device, and to learn sample data based on the heterogeneous graph using the method for obtaining expressions of relationships between entities according to any one of claims 7-13, to obtain a low-dimensional vector expression of each node in the heterogeneous graph;
    the parameter exchange device is configured to exchange parameters with the computing device.
PCT/CN2020/070249 2019-01-16 2020-01-03 Method, system, and device for obtaining expression of relationship between entities, and advertisement retrieval system WO2020147594A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910041466.9A CN111444394B (en) 2019-01-16 2019-01-16 Method, system and equipment for obtaining relation expression between entities and advertisement recall system
CN201910041466.9 2019-01-16

Publications (1)

Publication Number Publication Date
WO2020147594A1 true WO2020147594A1 (en) 2020-07-23

Family

ID=71613283

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/070249 WO2020147594A1 (en) 2019-01-16 2020-01-03 Method, system, and device for obtaining expression of relationship between entities, and advertisement retrieval system

Country Status (2)

Country Link
CN (1) CN111444394B (en)
WO (1) WO2020147594A1 (en)


Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112507185B (en) * 2020-10-22 2022-08-19 复旦大学 User portrait determination method and device
CN112214499B (en) 2020-12-03 2021-03-19 腾讯科技(深圳)有限公司 Graph data processing method and device, computer equipment and storage medium
CN112767054A (en) * 2021-01-29 2021-05-07 北京达佳互联信息技术有限公司 Data recommendation method, device, server and computer-readable storage medium
CN112948591B (en) * 2021-02-25 2024-02-09 成都数联铭品科技有限公司 Subgraph matching method and system suitable for directed graph and electronic equipment
CN113268574B (en) * 2021-05-25 2022-12-20 山东交通学院 Graph volume network knowledge base question-answering method and system based on dependency structure
CN113434556B (en) * 2021-07-22 2022-05-31 支付宝(杭州)信息技术有限公司 Data processing method and system
CN113553446B (en) * 2021-07-28 2022-05-24 厦门国际银行股份有限公司 Financial anti-fraud method and device based on heterograph deconstruction

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105630901A (en) * 2015-12-21 2016-06-01 清华大学 Knowledge graph representation learning method
CN106528609A (en) * 2016-09-28 2017-03-22 厦门理工学院 Vector constraint embedded transformation knowledge graph inference method
CN106909622A (en) * 2017-01-20 2017-06-30 中国科学院计算技术研究所 Knowledge mapping vector representation method, knowledge mapping relation inference method and system
US20180341863A1 (en) * 2017-05-27 2018-11-29 Ricoh Company, Ltd. Knowledge graph processing method and device
CN109002488A (en) * 2018-06-26 2018-12-14 北京邮电大学 A kind of recommended models training method and device based on first path context
CN109213801A (en) * 2018-08-09 2019-01-15 阿里巴巴集团控股有限公司 Data digging method and device based on incidence relation

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160189218A1 (en) * 2014-12-30 2016-06-30 Yahoo, Inc. Systems and methods for sponsored search ad matching
CN106155635B (en) * 2015-04-03 2020-09-18 北京奇虎科技有限公司 Data processing method and device
CN107944898A (en) * 2016-10-13 2018-04-20 驰众信息技术(上海)有限公司 The automatic discovery of advertisement putting building information and sort method
CN108763376B (en) * 2018-05-18 2020-09-29 浙江大学 Knowledge representation learning method for integrating relationship path, type and entity description information


Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112380435A (en) * 2020-11-16 2021-02-19 北京大学 Literature recommendation method and recommendation system based on heterogeneous graph neural network
CN112380435B (en) * 2020-11-16 2024-05-07 北京大学 Document recommendation method and system based on heterogeneous graph neural network
CN113094558A (en) * 2021-04-08 2021-07-09 电子科技大学 Network node influence sequencing method based on local structure
CN113094558B (en) * 2021-04-08 2023-10-20 电子科技大学 Network node influence ordering method based on local structure
CN113254580A (en) * 2021-05-24 2021-08-13 厦门大学 Special group searching method and system
CN113254580B (en) * 2021-05-24 2023-10-03 厦门大学 Special group searching method and system
CN113420551A (en) * 2021-07-13 2021-09-21 华中师范大学 Biomedical entity relation extraction method for modeling entity similarity
CN115186086A (en) * 2022-06-27 2022-10-14 长安大学 Literature recommendation method for embedding expected value in heterogeneous environment
CN115186086B (en) * 2022-06-27 2023-08-08 长安大学 Literature recommendation method for embedding expected value in heterogeneous environment
CN117350461A (en) * 2023-12-05 2024-01-05 湖南财信数字科技有限公司 Enterprise abnormal behavior early warning method, system, computer equipment and storage medium
CN117350461B (en) * 2023-12-05 2024-03-19 湖南财信数字科技有限公司 Enterprise abnormal behavior early warning method, system, computer equipment and storage medium

Also Published As

Publication number Publication date
CN111444394B (en) 2023-05-23
CN111444394A (en) 2020-07-24

Similar Documents

Publication Publication Date Title
WO2020147594A1 (en) Method, system, and device for obtaining expression of relationship between entities, and advertisement retrieval system
WO2020147595A1 (en) Method, system and device for obtaining relationship expression between entities, and advertisement recalling system
CN112581191B (en) Training method and device of behavior prediction model
Mao et al. Multiobjective e-commerce recommendations based on hypergraph ranking
Li et al. On both cold-start and long-tail recommendation with social data
US8380723B2 (en) Query intent in information retrieval
JP3389948B2 (en) Display ad selection system
US20140279065A1 (en) Generating Ad Copy
US8660901B2 (en) Matching of advertising sources and keyword sets in online commerce platforms
CN105787767A (en) Method and system for obtaining advertisement click-through rate pre-estimation model
US20100100407A1 (en) Scaling optimization of allocation of online advertisement inventory
TW201537365A (en) Data search processing
US11636394B2 (en) Differentiable user-item co-clustering
US20100318427A1 (en) Enhancing database management by search, personal search, advertising, and databases analysis efficiently using core-set implementations
CN111783963A (en) Recommendation method based on star atlas neural network
CN108960293B (en) CTR (China train reactor) estimation method and system based on FM (frequency modulation) algorithm
Xin et al. ATNN: adversarial two-tower neural network for new item’s popularity prediction in E-commerce
Liang et al. Collaborative filtering based on information-theoretic co-clustering
JP2017201535A (en) Determination device, learning device, determination method, and determination program
CN112446739B (en) Click rate prediction method and system based on decomposition machine and graph neural network
Zeng et al. Collaborative filtering via heterogeneous neural networks
Yang et al. Exploring different interaction among features for CTR prediction
CN114841765A (en) Sequence recommendation method based on meta-path neighborhood target generalization
KR101985603B1 (en) Recommendation method based on tripartite graph
CN114329167A (en) Hyper-parameter learning, intelligent recommendation, keyword and multimedia recommendation method and device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20742058

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20742058

Country of ref document: EP

Kind code of ref document: A1