WO2020147595A1 - Method, system and device for obtaining expressions of relationships between entities, and advertisement recall system

Method, system and device for obtaining expressions of relationships between entities, and advertisement recall system

Info

Publication number
WO2020147595A1
WO2020147595A1 (PCT/CN2020/070250)
Authority
WO
WIPO (PCT)
Prior art keywords
node
positive
vector expression
subgraph
negative
Prior art date
Application number
PCT/CN2020/070250
Other languages
English (en)
French (fr)
Inventor
温世阳
陈怡然
吴文金
林伟
朱晓宇
Original Assignee
阿里巴巴集团控股有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 阿里巴巴集团控股有限公司 (Alibaba Group Holding Limited)
Publication of WO2020147595A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/90 Details of database functions independent of the retrieved data types
    • G06F 16/901 Indexing; Data structures therefor; Storage structures
    • G06F 16/9024 Graphs; Linked lists
    • G06F 16/95 Retrieval from the web
    • G06F 16/953 Querying, e.g. by the use of web search engines
    • G06F 16/9535 Search customisation based on user profiles and personalisation
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning

Definitions

  • the present invention relates to the technical field of data mining, and in particular to a method, system and device for obtaining expressions of relationships between entities, and an advertisement recall system.
  • the inventor of the present invention found:
  • a graph is composed of nodes and edges.
  • a node is used to represent an entity, and the edge between nodes is used to represent the relationship between nodes.
  • a graph generally includes two or more nodes and one or more edges. Therefore, a graph can also be understood as consisting of a set of nodes and a set of edges, usually expressed as G(V, E), where G represents the graph, V represents the set of nodes in graph G, and E is the set of edges in graph G.
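As a concrete illustration of the G(V, E) notation, here is a minimal Python sketch; the node names, edge types and weights are illustrative, not taken from the patent:

```python
# G(V, E): V is the set of nodes, E the set of edges. Each edge carries
# its two endpoint nodes, a type label, and a weight, matching the
# heterogeneous-graph usage described below (representation assumed).
V = {"query1", "item1", "ad1"}
E = [
    ("query1", "item1", "click", 3.0),   # user clicked item1 for query1
    ("item1", "ad1", "semantic", 0.8),   # title-text similarity edge
]
```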
  • Graphs can be divided into homogeneous graphs and heterogeneous graphs.
  • a heterogeneous graph is a graph whose nodes are of different types (the edges may be of the same or different types), or whose edges are of different types (the nodes may be of the same or different types). Therefore, when there are many types of entities that need to be expressed by multiple types of nodes, or when the relationships between entities are not unique and need to be expressed by multiple types of edges, it is preferable to express these entities and their relationships through a heterogeneous graph.
  • when the number of nodes and edges included in the heterogeneous graph is very large, the heterogeneous graph becomes extremely complex and its data volume enormous. Reducing the complexity and data volume of heterogeneous graphs has therefore become a technical problem faced by those skilled in the art.
  • in view of the above problems, the present invention is proposed in order to provide a heterogeneous graph learning method, system and device that overcome the above problems or at least partially solve them.
  • the embodiment of the present invention provides an advertisement recall system, including a system for obtaining relationship expressions between entities and an advertisement recall matching system;
  • the system for obtaining expressions of relationships between entities is used to split a pre-built heterogeneous graph into subgraphs according to edge types, and one subgraph includes one type of edge;
  • the nodes in the heterogeneous graph include at least one of advertisements, commodities, and query terms, and the types of edges include at least one of click edges, co-click edges, collaborative filtering edges, content semantically similar edges, and attribute similar edges;
  • sampling is performed on each subgraph to obtain a sample set for each subgraph, and each sample in the sample set includes a source node, a positive node and at least one negative node;
  • the same batch of sample sets of each subgraph is input into the preset machine learning model for training to obtain, for each sample in each subgraph's sample set, the vector expression of the source node, the vector expression of the positive node and the vector expression of each negative node; based on the obtained vector expressions of the nodes, the parameters in the machine learning model are optimized using a preset loss function;
  • a preset aggregation model performs aggregation learning on the vector expressions of the same source node in different subgraphs to obtain one vector expression of that source node; based on this one vector expression of the source node and the vector expressions of the positive node and of each negative node included in that source node's samples in each subgraph, the parameters of the aggregation model are optimized using the preset loss function;
  • the above process is repeated to train all batches of sample sets a preset number of times, obtaining one low-dimensional vector expression for each node in the heterogeneous graph; a node in the heterogeneous graph corresponds to an entity in the sample data.
  • the advertisement recall matching system is configured to use the low-dimensional vector expressions of query-term nodes, commodity nodes and search-advertisement nodes obtained by the system for obtaining expressions of relationships between entities to determine the degree of matching between query-term nodes, commodity nodes and search-advertisement nodes, and to select, according to the matching degree, search advertisements whose match with the commodity and query terms meets set requirements.
  • the system for obtaining expressions of relationships between entities samples each subgraph to obtain a sample set for each subgraph, including:
  • for each subgraph, a random walk is performed with a selected node as the starting point to obtain at least one node sequence corresponding to each subgraph; using a preset sliding window, the positive sample set corresponding to each subgraph is obtained from the node sequences;
  • a positive sample in the positive sample set includes a source node and a positive node;
  • based on the positive sample set corresponding to each subgraph, one round of negative-node sampling is performed to obtain the sample set corresponding to each subgraph; a sample in the sample set includes a source node, a positive node and at least one negative node, the distribution of the negative nodes is consistent with that of the positive nodes, and the negative nodes are correlated with preset attributes of the source node.
  • the system for obtaining expressions of relationships between entities inputs the same batch of sample sets of each subgraph into a preset machine learning model for training and obtains, for each sample in each subgraph's sample set, the vector expression of the source node, the vector expression of the positive node and the vector expression of each negative node, including:
  • for the same batch of sample sets of each subgraph, the source node, positive node, negative nodes and the attribute information of each node included in each sample are input into the machine learning model;
  • through the embedding layer of the machine learning model, the sparse features of the nodes included in the sample are mapped into dense features;
  • the dense features of the source node are trained by one corresponding machine learning model network to obtain the vector expression of the source node, and the dense features of the positive and negative nodes are trained by another corresponding machine learning model to obtain the vector expressions of the positive node and of each negative node.
  • the system for obtaining expressions of relationships between entities uses a preset loss function to optimize the parameters in the machine learning model based on the obtained vector expressions of the nodes, including:
  • calculating, from the trained vector expressions of the source node, the positive node and each negative node of each sample, the cosine distances between the source node and the positive node and between the source node and each negative node; the preset loss function optimizes the parameters in the machine learning model based on the cosine distances.
  • the system for obtaining expressions of relationships between entities uses a preset aggregation model to perform aggregation learning on the vector expressions of the same source node in different subgraphs and obtain one vector expression of that source node, including:
  • determining, from each vector expression of the source node trained from each subgraph and a corresponding learned weight factor, the weight of the source node's vector expression trained from each subgraph, and performing a weighted summation of those vector expressions with the determined weights to obtain one aggregated vector expression of the source node.
  • the system for obtaining expressions of relationships between entities uses a preset loss function to optimize the parameters in the machine learning model based on the obtained vector expressions of the nodes, including:
  • calculating, from the trained vector expressions of the source node, the positive node and each negative node of each sample, the cosine distances between the source node and the positive node and between the source node and each negative node; the preset loss function optimizes the parameters in the machine learning model based on the cosine distances.
  • the embodiment of the present invention also provides a method for obtaining expressions of relationships between entities, including:
  • splitting a pre-built heterogeneous graph into subgraphs according to edge type, one subgraph including one type of edge;
  • Sampling is performed for each subgraph to obtain the sample set of each subgraph.
  • Each sample in the sample set includes a source node, a positive node and at least one negative node;
  • the same batch of sample sets of each subgraph is input into the preset machine learning model for training to obtain, for each sample in each subgraph's sample set, the vector expression of the source node, the vector expression of the positive node and the vector expression of each negative node; based on the obtained vector expressions of the nodes, the parameters in the machine learning model are optimized using a preset loss function;
  • a preset aggregation model performs aggregation learning on the vector expressions of the same source node in different subgraphs to obtain one vector expression of that source node; based on this one vector expression of the source node and the vector expressions of the positive node and of each negative node included in that source node's samples in each subgraph, the parameters of the aggregation model are optimized using the preset loss function;
  • the above process is repeated to train all batches of sample sets a preset number of times, obtaining one low-dimensional vector expression for each node in the heterogeneous graph; a node in the heterogeneous graph corresponds to an entity in the sample data.
  • sampling each subgraph to obtain the sample set of each subgraph includes:
  • for each subgraph, a random walk is performed with a selected node as the starting point to obtain at least one node sequence corresponding to each subgraph; using a preset sliding window, the positive sample set corresponding to each subgraph is obtained from the node sequences;
  • a positive sample in the positive sample set includes a source node and a positive node;
  • based on the positive sample set corresponding to each subgraph, one round of negative-node sampling is performed to obtain the sample set corresponding to each subgraph; a sample in the sample set includes a source node, a positive node and at least one negative node, the distribution of the negative nodes is consistent with that of the positive nodes, and the negative nodes are correlated with preset attributes of the source node.
  • using a preset sliding window to obtain the positive sample set corresponding to each subgraph from the node sequences specifically includes:
  • for each node in a sequence, according to the preset sliding window size, obtaining the other nodes that fall within the sliding window when that node is in the sliding window, and pairing each of the obtained nodes with that node to form sample pairs, thereby obtaining the positive sample set.
  • negative nodes are sampled from the positive nodes, and at least one negative node is obtained for each pair of source node and positive node; the distribution of the negative nodes is consistent with that of the positive nodes, and the negative nodes are correlated with the source node.
  • sampling negative nodes from the positive nodes and obtaining at least one negative node for each pair of source node and positive node, where the distribution of the negative nodes is consistent with that of the positive nodes and the negative nodes are correlated with the source node, includes:
  • counting the positive-node pairs in the sample set to obtain the category of each positive node and the number of times the same positive node appears in different positive samples as the distribution weight of that positive node;
  • according to the category information of the source node, selecting the positive nodes under that category from the counted positive nodes, determining from the distribution weights the probability of each obtained positive node being used as a negative node, and selecting, according to the probabilities, negative nodes whose correlation with the source node meets the requirements.
  • inputting the same batch of sample sets of each subgraph into a preset machine learning model for training and obtaining, for each sample in each subgraph's sample set, the vector expression of the source node, the vector expression of the positive node and the vector expression of each negative node includes:
  • for the same batch of sample sets of each subgraph, inputting the source node, positive node, negative nodes and the attribute information of each node included in each sample into the machine learning model;
  • mapping, through the embedding layer of the machine learning model, the sparse features of the nodes included in the sample into dense features;
  • training the dense features of the source node through one corresponding machine learning model network to obtain the vector expression of the source node, and training the dense features of the positive and negative nodes through another corresponding machine learning model to obtain the vector expressions of the positive node and of each negative node.
  • optimizing the parameters in the machine learning model using a preset loss function based on the obtained vector expressions of the nodes includes:
  • calculating, from the trained vector expressions of the source node, the positive node and each negative node of each sample, the cosine distances between the source node and the positive node and between the source node and each negative node; the preset loss function optimizes the parameters in the machine learning model based on the cosine distances.
  • the preset aggregation model performing aggregation learning on the vector expressions of the same source node in different subgraphs to obtain one vector expression of that source node includes:
  • determining, from each vector expression of the source node trained from each subgraph and a corresponding learned weight factor, the weight of the source node's vector expression trained from each subgraph, and performing a weighted summation of those vector expressions with the determined weights to obtain one aggregated vector expression of the source node.
  • optimizing the parameters in the machine learning model using a preset loss function based on the obtained vector expressions of the nodes includes:
  • calculating, from the trained vector expressions of the source node, the positive node and each negative node of each sample, the cosine distances between the source node and the positive node and between the source node and each negative node; the preset loss function optimizes the parameters in the machine learning model based on the cosine distances.
  • An embodiment of the present invention also provides a system for obtaining expressions of relationships between entities, including: a registration device, a storage device, a computing device and a parameter exchange device;
  • the storage device is used to store the data of the heterogeneous graph;
  • the computing device is used to obtain the data of the heterogeneous graph from the storage device through the registration device, and to learn the heterogeneous graph using the above method for obtaining expressions of relationships between entities, obtaining the low-dimensional vector expression of each node in the heterogeneous graph;
  • the parameter exchange device is used for parameter interaction with the computing device.
  • when the heterogeneous graph learning method is used in the advertisement search scenario, the entity relationships in the advertisement search scenario are mined, so that a large amount of information can be used to recall advertisements accurately and the quality of the advertisement recall is improved; with all advertisements as candidates, enough advertisements can be recalled under any traffic, and the recall can be accomplished in one step by means of vectors.
  • FIG. 1 is a flowchart of a method for obtaining expressions of relationships between entities in Embodiment 1 of the present invention
  • FIG. 2 is a flowchart of a method for obtaining expressions of relationships between entities in Embodiment 2 of the present invention;
  • FIG. 3 is an example of a heterogeneous graph in an advertising scenario in Embodiment 2 of the present invention;
  • FIG. 4 is a first example of a subgraph constructed based on the heterogeneous graph in Embodiment 2 of the present invention;
  • FIG. 5 is a second example of a subgraph constructed based on the heterogeneous graph in Embodiment 2 of the present invention;
  • FIG. 6 is a third example of a subgraph constructed based on the heterogeneous graph in Embodiment 2 of the present invention;
  • FIG. 7 is a schematic diagram of the model network of a subgraph in Embodiment 2 of the present invention;
  • FIG. 8 is an example diagram of a result of fusion of learning results of multiple subgraphs in Embodiment 2 of the present invention.
  • FIG. 9 is a schematic structural diagram of a system for obtaining expressions of relationships between entities in an embodiment of the present invention.
  • FIG. 10 is a diagram of an example structure of an implementation of the advertisement recall system in the embodiment of the present invention.
  • the embodiment of the present invention provides a method that can well solve the above-mentioned problems: it effectively reduces the amount of data processed when learning a heterogeneous graph to express the relationships between entities, and it offers fast processing speed and high efficiency.
  • Graph learning has a wide range of applications in mining various data relationships in the real world. For example, it is used in search advertising platforms to mine the correlation between search requests and advertisements and the click-through rate (CTR). That is to say, the method of the present invention can be used in the advertisement search field for the recall of search advertisements.
  • Search advertising refers to advertisements for which advertisers determine relevant keywords based on the content and characteristics of their products or services, write the advertising content, and bid independently for placement in the search results corresponding to those keywords.
  • Search ads recall refers to the selection of the most relevant ads from a large collection of ads through a certain algorithm or model.
  • Existing search ad recall technologies may screen "high-quality" advertisements based on the degree of matching between query terms and advertiser bid words, the advertiser's purchase price, and users' statistical preferences for advertisements; or they may incorporate each user's historical behavior data to perform personalized matching recall of advertisements.
  • the inventor found in studying the prior art that the existing recall technology either only emphasizes the matching degree between the advertisement and the query word, or only emphasizes improving the revenue of recalled advertisements, and lacks an integrated model that balances the two. Since the quality of advertisement recall is very important to search advertisement revenue and user experience, the inventor provides a graph learning technology that can be used to obtain expressions of relationships between entities in the advertisement recall process, which can obtain an ad recall collection of higher quality that users care more about.
  • the first embodiment of the present invention provides a method for obtaining expressions of relationships between entities.
  • the process is shown in FIG. 1, and includes the following steps:
  • Step S101 Split the pre-built heterogeneous graph into subgraphs according to edge type, and one subgraph includes one type of edge.
  • a subgraph includes all the nodes of the heterogeneous graph and one type of its edges; a sketch of this split follows.
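A minimal sketch of this splitting step, assuming the plain-tuple edge representation of the earlier sketch; each subgraph keeps the full node set and only the edges of one type:

```python
from collections import defaultdict

def split_by_edge_type(nodes, edges):
    """Split a heterogeneous graph G(V, E) into subgraphs, one per edge
    type; each subgraph keeps all nodes and one type of edge (Step S101).
    Edges are (src, dst, edge_type, weight) tuples."""
    by_type = defaultdict(list)
    for src, dst, etype, weight in edges:
        by_type[etype].append((src, dst, etype, weight))
    return {etype: (set(nodes), es) for etype, es in by_type.items()}
```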
  • Step S102 Sampling is performed for each subgraph to obtain a sample set of each subgraph, and each sample in the sample set includes a source node, a positive node and at least one negative node;
  • for each subgraph, a random walk is performed with a selected node as the starting point to obtain at least one node sequence corresponding to each subgraph; using a preset sliding window, the positive sample set corresponding to each subgraph is obtained from the node sequences;
  • a positive sample in the positive sample set includes a source node and a positive node;
  • based on the positive sample set corresponding to each subgraph, one round of negative-node sampling is performed to obtain the sample set corresponding to each subgraph; a sample in the sample set includes a source node, a positive node and at least one negative node, the distribution of the negative nodes is consistent with that of the positive nodes, and the negative nodes are correlated with preset attributes of the source node.
  • the random walk can use algorithms such as DeepWalk and Node2Vec.
  • using a preset sliding window to obtain the positive sample set corresponding to each subgraph from the node sequences specifically includes: for each node in a sequence, according to the preset sliding window size, obtaining the other nodes that fall within the sliding window when that node is in the window, and pairing each of the obtained nodes with that node to form sample pairs, thereby obtaining the positive sample set.
  • when negative-node sampling is performed, negative nodes are sampled from the positive nodes, and at least one negative node is obtained for each pair of source node and positive node; the distribution of the negative nodes is consistent with that of the positive nodes, and the negative nodes are correlated with the source node.
  • the positive-node pairs in the sample set are counted, and the category of each positive node and the number of times the same positive node appears in different positive samples are obtained as the distribution weight of that positive node; according to the category information of the source node, positive nodes under that category are selected from the counted positive nodes, the probability of each obtained positive node being used as a negative node is determined from the distribution weights, and negative nodes whose correlation with the source node meets the requirements are selected according to these probabilities.
  • for each node in each subgraph, the number of walks starting from that node is set, and the corresponding number of walks is performed for each node to obtain a series of node sequences with the starting node as the source node. Positive sample pairs are extracted from the node sequences to obtain the positive sample set. After the positive sample set is obtained, negative-node sampling is performed according to the negative sampling principles, and at least one negative node is sampled for each positive sample pair, yielding samples that each include one source node, one positive node and at least one negative node; a sketch of this pipeline follows.
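The sampling just described can be summarized by the following skeleton; `random_walk`, `window_pairs` and `sample_negatives` are hypothetical stand-ins for the Node2Vec walk, the sliding-window pair extraction and the principled negative sampling detailed in Embodiment 2:

```python
def build_samples(subgraph, start_nodes, walks_per_node, walk_len,
                  window, k_neg, random_walk, window_pairs,
                  sample_negatives):
    """Skeleton of Step S102: walks, then positive pairs, then negatives.

    The three trailing callables are hypothetical stand-ins, passed in
    so the skeleton stays self-contained."""
    samples = []
    for node in start_nodes:
        for _ in range(walks_per_node):            # per-node walk count
            seq = random_walk(subgraph, node, walk_len)
            for src, pos in window_pairs(seq, window):
                negs = sample_negatives(src, pos, k_neg)
                samples.append((src, pos, negs))   # one training sample
    return samples
```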
  • Step S103 Input the same batch of sample sets of each subgraph into the preset machine learning model for training, and obtain, for each sample in each subgraph's sample set, the vector expression of the source node, the vector expression of the positive node and the vector expression of each negative node.
  • through the embedding layer of the machine learning model, the sparse features of the nodes included in the sample are mapped into dense features;
  • the dense features of the source node are trained by one corresponding machine learning model network to obtain the vector expression of the source node, and the dense features of the positive and negative nodes are trained by another corresponding machine learning model to obtain the vector expressions of the positive node and of each negative node.
  • Step S104 Use a preset loss function to optimize the parameters in the machine learning model based on the obtained vector expression of each node.
  • according to the vector expression of the source node, the vector expression of the positive node and the vector expression of each negative node obtained by training, the cosine distances between the source node and the positive node and between the source node and each negative node are calculated; the preset loss function optimizes the parameters in the machine learning model based on the cosine distances.
  • the parameters of the machine learning model corresponding to each subgraph are optimized according to the learning results of the machine learning model corresponding to each subgraph on a batch of sample data.
  • the machine learning model after parameter optimization is used to learn the next batch of samples, so that the result of the previous batch of sample learning can affect the next batch of sample learning.
  • Step S105 The preset aggregation model performs aggregation learning on the vector expressions of the same source node in different subgraphs to obtain a vector expression of the same source node.
  • the vector expressions of the same source node in different subgraphs are aggregated.
  • according to each vector expression of the source node trained from each subgraph and a corresponding learned weight factor, the weight of the source node's vector expression trained from each subgraph is determined; the determined weights are used to perform a weighted summation of the source node's vector expressions trained from each subgraph, obtaining one aggregated vector expression of the source node.
  • Step S106 Based on the one vector expression of the same source node and the vector expressions of the positive node and of each negative node included in that source node's samples in each subgraph, optimize the parameters of the aggregation model using a preset loss function.
  • according to the vector expression of the source node, the vector expression of the positive node and the vector expression of each negative node obtained by training, the cosine distances between the source node and the positive node and between the source node and each negative node are calculated; the preset loss function optimizes the parameters based on these cosine distances;
  • specifically, the cosine distance between the aggregated source-node vector expression and the positive node's vector expression in the sample and the cosine distances between the source node and each negative node's vector expression are calculated, and the calculated cosine distances are input into the loss function to obtain the optimization objective.
  • for each batch of sample data, the parameters are optimized according to the learning results of the machine learning models corresponding to the different subgraphs.
  • the machine learning model after parameter optimization is used to learn the next batch of samples, so that the result of the previous batch of sample learning can affect the next batch of sample learning.
  • Step S107 Repeat the above process to train the sample sets of all batches for a preset number of times to obtain a low-dimensional vector expression of each node in the heterogeneous graph.
  • a node in the heterogeneous graph corresponds to an entity in the sample data.
  • in the above solution, the subgraphs obtained by splitting are sampled, the sample sets obtained by sampling are trained and learned, and the learning results of the subgraphs are merged to obtain the learning result of the heterogeneous graph, thereby realizing the learning of a complex heterogeneous graph; by learning the subgraphs into which the heterogeneous graph is split, the problem of explosive growth in training parameters and the problem of exponential growth in the number of neighbors as the number of layers increases are effectively avoided, which greatly reduces the amount of data processed in heterogeneous graph learning, brings the computation down to a magnitude that the processing equipment can support, lowers the hardware requirements for heterogeneous graph learning equipment, and greatly improves the speed and efficiency of heterogeneous graph learning.
  • when the heterogeneous graph learning method is used in the advertisement search scenario, the entity relationships in the advertisement search scenario are mined, so that a large amount of information can be used to recall advertisements accurately and the quality of the advertisement recall is improved; with all advertisements as candidates, enough advertisements can be recalled under any traffic, and the recall can be accomplished in one step by means of vectors.
  • the second embodiment of the present invention provides a specific implementation process of a method for obtaining expressions of relationships between entities.
  • the process of implementing advertisement recall in a search advertisement scenario is taken as an example for description.
  • the method flow is shown in FIG. 2 and includes the following steps:
  • Step S201 Construct a heterogeneous graph.
  • in this step, a large-scale heterogeneous graph is constructed for the search recall scenario based on user logs and related commodity and advertisement data, serving as a rich search interaction graph for the advertisement search scenario; the constructed heterogeneous graph is used as the graph data input for the subsequent steps.
  • as shown in FIG. 3, this heterogeneous graph includes several nodes such as Query1, Query2, Item1, Item2, Item3, Item4 and Ad1, together with the edges connecting the different nodes.
  • the heterogeneous graph includes multiple types of nodes, such as Query, Item and Ad, to represent the different entities in the search scenario;
  • the heterogeneous graph includes multiple types of edges to represent the multiple relationships between entities, such as the click relationship between a Query and an Item, or the pre-click relationship between an Item and an Ad.
  • the node type and its specific meaning can be shown in Table 1 below, and the edge type and its meaning can be shown in Table 2 below.
  • the Query node and the Item node are used as user intention nodes to describe the user's personalized search intention
  • the Ad node is the advertisement placed by the advertiser.
  • the user behavior edge represents the user's historical behavior preference. For example, a "click edge" can be created between a Query node and an Item node, or between a Query node and an Ad node, with the number of clicks as the edge weight, to represent the clicks between the Query and the Item/Ad; a co-click edge (session edge) can be established to represent an Item or Ad that is clicked together with a Query in the same session (time period); and a collaborative filtering edge (cf edge) can be established to represent the collaborative filtering relationship between different nodes. In the advertising search scenario, user behavior describes a dynamic relationship.
  • popular nodes (such as high-frequency Query nodes) receive more impressions and clicks and therefore have denser edge relationships and larger edge weights, while unpopular nodes and new nodes have relatively sparse edge relationships and smaller edge weights; user behavior edges therefore describe popular nodes better.
  • the content similarity edge (semantic edge) is used to describe the content similarity between nodes; for example, an edge is established between Item nodes with the text similarity of their titles as the weight;
  • content similarity edges reflect a static relationship between nodes, which is more stable and can also describe well the relationships of unpopular nodes and new nodes.
  • the attribute similarity edge represents the overlap of attribute fields between nodes, such as brand, category and other fields; a sketch of materializing an edge type from logs follows.
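As a simple illustration of how one of these edge types might be materialized, the following sketch builds weighted click edges from click records; the log schema is an assumption, not part of the patent:

```python
from collections import Counter

def click_edges(click_log):
    """Build weighted click edges from (query, item_or_ad) click records;
    the click count becomes the edge weight, as described for the click
    edge above."""
    counts = Counter(click_log)
    return [(q, t, "click", float(c)) for (q, t), c in counts.items()]

# Two clicks of item1 under query1 yield a click edge of weight 2.0:
log = [("query1", "item1"), ("query1", "item1"), ("query1", "ad1")]
print(click_edges(log))
```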
  • Step S202 Split the pre-built heterogeneous graph into subgraphs according to edge type, and one subgraph includes one type of edge;
  • each type of edge in a heterogeneous graph describes one kind of relationship between nodes;
  • for example, the title-text similarity edge between an Item and an Ad describes the semantic similarity between the two, while the click edge indicates that they have been clicked by the same user under the same Query;
  • on the one hand, each type of edge can individually describe part of the relationship between nodes; on the other hand, multiple different edges complement each other to describe a richer and more robust relationship. Therefore, the present invention proposes a subgraph-based solution for heterogeneous graphs, in which each subgraph includes one type of edge.
  • each subgraph contains only one type of edge and can contain all or part of the nodes.
  • three different subgraphs can be constructed according to different edges.
  • a user behavior subgraph can be constructed according to the user's search and click behavior;
  • a text similarity subgraph can be constructed according to the text similarity between the Query, the Item title and the Ad title;
  • a co-occurrence relationship subgraph can be constructed according to the co-occurrence of clicks among Query, Item and Ad.
  • the constructed subgraphs are shown in FIG. 4, FIG. 5 and FIG. 6;
  • the user behavior subgraph shown in FIG. 4 includes user behavior edges and all nodes;
  • the text similarity subgraph shown in FIG. 5 includes content similarity edges and all nodes;
  • the co-occurrence relationship subgraph shown in FIG. 6 includes attribute similarity edges and all nodes.
  • Step S203 Sampling is performed for each subgraph to obtain a sample set of each subgraph.
  • Each sample in the sample set includes a source node, a positive node and at least one negative node.
  • for each subgraph, Node2Vec is used to perform random walks to generate positive sample pairs, and negative nodes are generated following the two negative-sampling principles below, yielding a large number of samples of the form (src_node, pos_node, {neg_node}_K, edge_type), where src_node denotes the source node, pos_node a positive node, {neg_node}_K the K negative nodes obtained by negative sampling, and edge_type the edge type of this subgraph; that is, each sample contains one source node src_node, one positive node pos_node and K negative nodes neg_node.
  • the sampling process for each subgraph can be divided into two stages, the positive sampling stage and the negative sampling stage, wherein:
  • the positive sampling stage generates positive samples through walks.
  • Node2vec Walk is a search method between DFS and BFS, and it has been proven to have a good effect on Network Embedding.
  • for each edge type E_q, Node2vec Walk is used to perform a set number of walks, and each walk yields a node sequence of a set length: v_1 -> v_2 -> ... -> v_l; for each sequence, positive sample pairs are obtained through a sliding window.
  • in the subgraph examples shown in FIG. 4, FIG. 5 and FIG. 6, the graph contains three types of nodes: query is the user's query term, item is the commodity, and ad is the search advertisement. Taking query1 as the starting point and walking along one type of edge through Node2Vec, a node sequence can be obtained, for example: query1->ad1->query2->item1->item2. From this sequence, a series of positive sample pairs (src_node, pos_node) can be obtained through the sliding window.
  • if the size of the sliding window is set to 3, then when node query1 is in the sliding window, the nodes that can appear in the same window are ad1 and query2, so the sample pairs (query1, ad1) and (query1, query2) are obtained; when node ad1 is in the sliding window, the nodes that can appear in the same window are query1, query2 and item1, so the sample pairs (ad1, query1), (ad1, query2) and (ad1, item1) are obtained, and so on;
  • in total, the following positive sample pairs can be obtained: (query1, ad1), (query1, query2), (ad1, query1), (ad1, query2), (ad1, item1), (query2, ad1), (query2, query1), (query2, item1), (query2, item2), (item1, query2), (item1, ad1), (item1, item2), (item2, item1), (item2, query2).
  • for each node, a number of walks is also set, i.e., the number of walks starting from that node;
  • in this way, a series of positive sample pairs is obtained according to the above steps; a sliding-window sketch follows.
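A minimal sketch of the sliding-window extraction: two nodes form an ordered positive pair whenever they can appear together in a window of the given size, which reproduces the fourteen pairs enumerated above:

```python
def window_pairs(seq, window=3):
    """Extract ordered positive pairs (src_node, pos_node) from a walk;
    two nodes pair up when their distance in the sequence is small
    enough for both to fit in one sliding window of the given size."""
    pairs = []
    for i, src in enumerate(seq):
        for j in range(max(0, i - window + 1), min(len(seq), i + window)):
            if j != i and (src, seq[j]) not in pairs:
                pairs.append((src, seq[j]))
    return pairs

walk = ["query1", "ad1", "query2", "item1", "item2"]
assert ("query1", "ad1") in window_pairs(walk)
assert len(window_pairs(walk)) == 14   # the fourteen pairs listed above
```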
  • negative sampling is carried out according to two principles: for each positive pair, negative sampling is used to generate K negative nodes, yielding (src_node, pos_node, {neg_node}_K, edge_type).
  • each node has rich attributes (side information) that help describe it, for example an item's price, brand, and so on. Compared with the node ID, these attributes generalize well and help improve the stability of the model. Since the advertisement recall scenario is sensitive to vector distances, the following two negative-sampling principles are proposed:
  • consistency principle: the negative nodes obtained by negative sampling must have a distribution consistent with that of the positive nodes. If positive and negative nodes had different distributions, the model would "lazily" tend to memorize which nodes are positive and which are negative instead of learning the relationships between nodes. Therefore, the Alias Method is used to perform weighted negative sampling, ensuring the same distribution of positive and negative nodes.
  • correlation principle: there should be a weak correlation between the negative node and the source node. If the negative node were completely unrelated to the source node, the model would separate positive and negative samples too easily, and when used online it could not distinguish the best advertisement from the second best. Therefore, category information is used to ensure a weak correlation between the negative sample and the source node.
  • suppose the categories and frequencies of the positive nodes obtained by statistics are as follows: (query1, cate1, 100), (query2, cate2, 200), (item1, cate1, 50), (item2, cate2, 50), (item3, cate1, 100), (item4, cate1, 150), (ad1, cate1, 150).
  • suppose the category of ad1 is cate1 and a negative node is to be sampled for a positive pair whose source node is ad1;
  • sampling is performed among the item nodes of category cate1;
  • the nodes meeting this condition are (item1, cate1, 50), (item3, cate1, 100) and (item4, cate1, 150);
  • since the positive node cannot also serve as a negative node, (item1, cate1, 50) is excluded, and negative sampling is performed from (item3, cate1, 100) and (item4, cate1, 150) with probabilities determined by their distribution weights.
  • in the above, drawing with probabilities computed from the distribution weights reflects the consistency principle, while selecting candidates according to the category of the source node reflects the correlation principle; a sketch of this sampling follows.
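A minimal sketch of this two-principle negative sampling. The patent uses the Alias Method for efficient weighted draws; `random.choices` is used here as a simpler stand-in that realizes the same weighted distribution:

```python
import random

# (node, category, frequency) statistics of positive nodes, as in the
# example above; the frequency serves as the distribution weight.
POS_STATS = [("item1", "cate1", 50), ("item3", "cate1", 100),
             ("item4", "cate1", 150)]

def sample_negatives(src_category, pos_node, k, stats=POS_STATS):
    """Sample K negatives that share the source node's category
    (correlation), drawn with the positive-node distribution weights
    (consistency), and never equal to the positive node itself."""
    cands = [(n, w) for n, c, w in stats
             if c == src_category and n != pos_node]
    nodes, weights = zip(*cands)
    return random.choices(nodes, weights=weights, k=k)

# ad1 (category cate1) with positive node item1: item1 is excluded and
# negatives are drawn from item3/item4 with weights 100/150.
print(sample_negatives("cate1", "item1", k=2))
```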
  • Step S204 Input the sample set of the same batch of each subgraph into a preset machine learning model for training.
  • an example of the machine learning model of a subgraph is shown in FIG. 7;
  • the bottom dashed box in FIG. 7 is an example of a subgraph.
  • the subgraph is walked to obtain the source node (src) and the positive node (pos), and negative nodes (neg_1, neg_2, ..., neg_K) are obtained through negative-node sampling; the data of each node includes its node identifier (node id) and attribute information (attr_1, attr_2, ..., attr_n);
  • the relevant data of the source node, the positive node and the negative nodes are input into a shared layer for training and learning, in which the sparse features of the nodes are mapped into dense features; each node in this layer is learned through the corresponding EMB model.
  • Step S205 Obtain the vector expression of the source node of each sample in the sample set of each subgraph, the vector expression of the positive node, and the vector expression of each negative node.
  • the dense features of the source node pass through one neural network to obtain the source node vector (X_src), while the positive node and the K negative nodes pass through another neural network to obtain the positive node vector (X_pos) and each negative node vector (X_neg1, ..., X_negK) respectively;
  • thus, for each sample, the vector expression of the source node, the vector expression of the positive node and the vector expression of each negative node are obtained.
  • Step S206 Use a preset loss function to optimize the parameters in the machine learning model based on the obtained vector expression of each node.
  • from the vector expression of the source node of the sample (X_src), the vector expression of the positive node (X_pos) and the vector expression of each negative node, the cosine distances between X_src and X_pos and between X_src and each negative node vector are calculated respectively; according to the calculated cosine distances, the optimization objective (O_rel) of the subgraph's machine learning model is obtained;
  • the principle of the parameter optimization is to calculate the cosine distances from the source node to the positive node and to each negative node so that the distance between the source node and the positive node becomes as small as possible; one plausible instantiation is sketched below.
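The patent specifies only that the loss is built on these cosine distances so as to pull the source node toward the positive node and away from the negatives; a sampled-softmax over cosine similarities is one plausible instantiation, sketched here in PyTorch (both the framework and the exact loss form are assumptions):

```python
import torch
import torch.nn.functional as F

def cosine_loss(x_src, x_pos, x_negs):
    """One plausible O_rel: softmax over cosine similarities.

    x_src: (B, D) source vectors; x_pos: (B, D) positive vectors;
    x_negs: (B, K, D) negative vectors."""
    pos_sim = F.cosine_similarity(x_src, x_pos, dim=-1)                # (B,)
    neg_sim = F.cosine_similarity(x_src.unsqueeze(1), x_negs, dim=-1)  # (B, K)
    logits = torch.cat([pos_sim.unsqueeze(1), neg_sim], dim=1)         # (B, 1+K)
    # The positive sits at index 0; minimizing the cross-entropy pulls
    # the source toward the positive and away from the sampled negatives.
    labels = torch.zeros(logits.size(0), dtype=torch.long)
    return F.cross_entropy(logits, labels)
```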
  • node attributes include ID features and other attributes such as title text information, store information, brand information, etc., i.e., (node_id, attr_1, attr_2, ...);
  • the source node, the positive node and the negative nodes first enter a shared layer;
  • this layer is the EMB (embedding lookup) layer, whose purpose is to map sparse ID features into dense features;
  • after the EMB layer, the source node passes through its own DNN network, while the positive and negative nodes share another DNN network; through its DNN network each node obtains a vector expression, denoted X_src, X_pos, X_neg1 and so on.
  • in these networks, FC denotes the fully connected layer, w and b are the fully connected weights and biases that need to be learned, and ELU is the exponential linear unit activation function; the same structure applies to the network shared by all target nodes, i.e., the positive and negative nodes.
  • in total, P*Q+P DNN networks are learned collaboratively at the same time: the P*Q networks are used for source node embedding, and the P networks are used for target node embedding.
  • in the formulas, v ∈ V_P denotes the source node, v' ∈ V_P' denotes a positive node, and v'' denotes a negative node; a sketch of the two-tower structure follows.
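A minimal PyTorch sketch of the shared EMB layer plus the two towers (one DNN for the source node, one shared by the positive and negative target nodes) built from FC + ELU layers; the layer sizes are assumptions, and the attribute (side information) inputs are omitted for brevity:

```python
import torch
import torch.nn as nn

class TwoTower(nn.Module):
    """Shared EMB layer + source-side DNN + target-side DNN (shared by
    positive and negative nodes), as described above."""

    def __init__(self, num_ids, emb_dim=64, out_dim=32):
        super().__init__()
        # EMB layer: maps sparse ID features into dense features.
        self.emb = nn.Embedding(num_ids, emb_dim)
        # FC + ELU stacks for the two towers.
        self.src_dnn = nn.Sequential(
            nn.Linear(emb_dim, 64), nn.ELU(), nn.Linear(64, out_dim))
        self.tgt_dnn = nn.Sequential(
            nn.Linear(emb_dim, 64), nn.ELU(), nn.Linear(64, out_dim))

    def forward(self, src_ids, pos_ids, neg_ids):
        x_src = self.src_dnn(self.emb(src_ids))    # (B, out_dim)
        x_pos = self.tgt_dnn(self.emb(pos_ids))    # (B, out_dim)
        x_negs = self.tgt_dnn(self.emb(neg_ids))   # (B, K, out_dim)
        return x_src, x_pos, x_negs
```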
  • Step S207 The preset aggregation model performs aggregation learning on the vector expressions of the same source node in different subgraphs to obtain a vector expression of the same source node.
  • in this way, the fused vector expression of each node when it serves as the source node is obtained.
  • Step S208 Based on the one vector expression of the same source node and the vector expressions of the positive node and of each negative node included in that source node's samples in each subgraph, optimize the parameters of the aggregation model using a preset loss function.
  • the cosine distances between the fused source node vector and the vector expression of the positive node in the sample (X_pos) and the vector expression of each negative node are calculated respectively;
  • according to the calculated cosine distances, the optimization objective (O_att) of the aggregation model is obtained; this optimization objective is the corresponding O_att.
  • the principle of the parameter optimization is to make the distance between the source node and the positive node as small as possible, by calculating the cosine distances from the source node to the positive node and to each negative node respectively.
  • Step S209 Repeat the above process to train the sample sets of all batches for a preset number of times to obtain a low-dimensional vector expression of each node in the heterogeneous graph, and a node in the heterogeneous graph corresponds to an entity in the sample data.
  • when the next batch of samples is trained, the system parameters updated with the previous batch's results are used, replacing the learning results of the previous batch's sample set, so that earlier learning can affect subsequent learning; the final learning result prevails and thus reflects the characteristics of all samples.
  • for each node, Q low-dimensional vectors can be obtained, one through each type of edge; through the attention mechanism, the weight of each vector is learned automatically, and the Q vectors are fused into one vector.
  • alpha_pq(v) represents the weight of node v of the p-th type on the q-th type of edge;
  • z_pq is a parameter vector that the attention mechanism needs to learn, representing the aggregation weight of the p-th node type on the q-th edge type; if the inner product of z_pq and the node's vector on the q-th edge type is larger, it means that v considers the q-th edge to be informative. In addition, if two nodes have similar vectors, it indicates that they are closely related in the graph and they will have similar weight distributions.
  • as before, v' represents a positive node and v'' represents a negative node; a sketch of the attention fusion follows.
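A minimal PyTorch sketch of this attention fusion for a single node type: the weight of each of the Q per-edge-type vectors comes from a softmax over the inner products with the learned z vectors, and the fused vector is the weighted sum:

```python
import torch
import torch.nn as nn

class SubgraphAttention(nn.Module):
    """Fuse the Q per-edge-type vectors of a node into one vector, with
    weights learned as described for alpha_pq and z_pq (one node type
    is shown, so the p index is dropped)."""

    def __init__(self, num_edge_types, dim):
        super().__init__()
        # One learnable z vector per edge type.
        self.z = nn.Parameter(torch.randn(num_edge_types, dim))

    def forward(self, h):                  # h: (B, Q, D) per-subgraph vectors
        scores = (h * self.z).sum(-1)      # (B, Q) inner products of z_q, h_q
        alpha = torch.softmax(scores, -1)  # larger score: edge q more informative
        return (alpha.unsqueeze(-1) * h).sum(dim=1)   # (B, D) fused vector
```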
  • embodiments of the present invention also provide a system for obtaining expressions of relationships between entities;
  • the system can be deployed on network devices in a network, on cloud devices in the cloud, or on server devices, client devices and other equipment in a server-client architecture.
  • the structure of the system is shown in FIG. 9 and includes: a registration device 903, a storage device 901, a computing device 902 and a parameter exchange device 904;
  • the storage device 901 is used to store the data of the heterogeneous graph;
  • the computing device 902 is configured to obtain the data of the heterogeneous graph from the storage device through the registration device 903, and to learn the heterogeneous graph using the above method for obtaining expressions of relationships between entities, obtaining the low-dimensional vector expression of each node in the heterogeneous graph.
  • the parameter exchange device 904 is used for parameter interaction with the computing device 902.
  • the computing device 902 obtains the data of each node and edge from the storage device through the registration device 903, including:
  • the aforementioned storage device 901 stores the data of each node and edge in the heterogeneous graph;
  • the computing device 902 sends a data query request to the registration device 903, the data query request including the information of the nodes and edges to be queried; it receives the query result returned by the registration device 903, the query result including the storage device information indicating where the data of those nodes and edges is stored; and it obtains the data of each node and edge from the corresponding storage device 901 according to the storage device information; a sketch of this lookup flow follows.
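The lookup flow can be sketched as follows; the class and method names are illustrative assumptions, not an API defined by the patent:

```python
class DictStore:
    """Toy storage device backed by a dict (illustrative only)."""
    def __init__(self, data):
        self.data = data
    def read(self, key):
        return self.data[key]

class RegistrationDevice:
    """Maps node/edge keys to the storage device holding their data."""
    def __init__(self, placement):
        self.placement = placement          # key -> storage device
    def locate(self, keys):
        return {k: self.placement[k] for k in keys}

def fetch_graph_data(registry, keys):
    """Computing-device side: ask the registry where each key lives,
    then read the data from the corresponding storage device."""
    return {k: store.read(k) for k, store in registry.locate(keys).items()}
```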
  • an embodiment of the present invention also provides an advertisement recall system. As shown in FIG. 10, it includes a system 101 for obtaining relationship expressions between entities and an advertisement recall matching system 102;
  • the system 101 for obtaining the expression of the relationship between entities is used to split a pre-built heterogeneous graph into subgraphs according to the type of edges, and one subgraph includes one type of edge;
  • the node types in the heterogeneous graph include: at least one of advertisement, commodity and query term; and the edge types include at least one of click edge, co-click edge, collaborative filtering edge, content semantic similarity edge and attribute similarity edge;
  • sampling is performed on each subgraph to obtain a sample set for each subgraph, and each sample in the sample set includes a source node, a positive node and at least one negative node;
  • the same batch of sample sets of each subgraph is input into the preset machine learning model for training to obtain, for each sample in each subgraph's sample set, the vector expression of the source node, the vector expression of the positive node and the vector expression of each negative node; based on the obtained vector expressions of the nodes, the parameters in the machine learning model are optimized using a preset loss function;
  • the preset aggregation model performs aggregation learning on the vector expressions of the same source node in different subgraphs to obtain one vector expression of that source node; based on this one vector expression of the source node and the vector expressions of the positive node and of each negative node included in that source node's samples in each subgraph, the parameters of the aggregation model are optimized using the preset loss function;
  • the above process is repeated to train all batches of sample sets a preset number of times, obtaining one low-dimensional vector expression for each node in the heterogeneous graph; a node in the heterogeneous graph corresponds to an entity in the sample data.
  • the advertisement recall matching system 102 is configured to use the low-dimensional vector expressions of the query-term nodes, commodity nodes and search-advertisement nodes obtained by the system for obtaining expressions of relationships between entities to determine the degree of matching between query-term nodes, commodity nodes and search-advertisement nodes, and to select, according to the matching degree, search advertisements that meet the set requirements.
  • the system 101 for obtaining expressions of relationships between entities samples each subgraph to obtain a sample set for each subgraph, including:
  • for each subgraph, a random walk is performed with a selected node as the starting point to obtain at least one node sequence corresponding to each subgraph; using a preset sliding window, the positive sample set corresponding to each subgraph is obtained from the node sequences;
  • a positive sample in the positive sample set includes a source node and a positive node;
  • based on the positive sample set corresponding to each subgraph, one round of negative-node sampling is performed to obtain the sample set corresponding to each subgraph; a sample in the sample set includes a source node, a positive node and at least one negative node, the distribution of the negative nodes is consistent with that of the positive nodes, and the negative nodes are correlated with preset attributes of the source node.
  • the system 101 for obtaining expressions of relationships between entities inputs the same batch of sample sets of each subgraph into a preset machine learning model for training and obtains, for each sample in each subgraph's sample set, the vector expression of the source node, the vector expression of the positive node and the vector expression of each negative node, including:
  • for the same batch of sample sets of each subgraph, the source node, positive node, negative nodes and the attribute information of each node included in each sample are input into the machine learning model;
  • through the embedding layer of the machine learning model, the sparse features of the nodes included in the sample are mapped into dense features;
  • the dense features of the source node are trained by one corresponding machine learning model network to obtain the vector expression of the source node, and the dense features of the positive and negative nodes are trained by another corresponding machine learning model to obtain the vector expressions of the positive node and of each negative node.
  • the system 101 for obtaining expressions of relationships between entities uses a preset loss function to optimize the parameters in the machine learning model based on the obtained vector expressions of the nodes, including:
  • calculating the cosine distances between the source node and the positive node and each negative node from the trained vector expressions; the preset loss function optimizes the parameters in the machine learning model based on the cosine distances.
  • the system 101 for obtaining expressions of relationships between entities performs aggregation learning, through a preset aggregation model, on the vector expressions of the same source node in different subgraphs to obtain one vector expression of that source node, including: determining the weight of each vector expression of the source node trained from each subgraph from a corresponding learned weight factor, and performing a weighted summation with the determined weights to obtain one aggregated vector expression of the source node;
  • the system 101 for obtaining expressions of relationships between entities uses a preset loss function to optimize the parameters in the machine learning model based on the obtained vector expressions of the nodes, including:
  • calculating the cosine distances between the source node and the positive node and each negative node from the trained vector expressions; the preset loss function optimizes the parameters in the machine learning model based on the cosine distances.
  • the advertisement recall matching system determines the degree of matching between query-term nodes, commodity nodes and search-advertisement nodes, including:
  • constructing a virtual request node, i.e., a virtual node constructed from the query-term node and the commodity nodes that the user pre-clicked under the same query term;
  • determining, based on the virtual request node, the degree of matching between the query-term node, the commodity node and the search-advertisement node.
  • the advertisement recall matching system selects, according to the matching degree, search advertisements matching the commodity and query terms, including:
  • selecting search advertisements whose vector distance meets the set requirement; a sketch of this matching step follows.
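A minimal sketch of this matching step: a virtual request node vector is formed from the query-term vector and the vectors of the pre-clicked commodity nodes (plain averaging is an assumption; the patent does not specify the combination), and advertisements are then ranked by cosine distance:

```python
import numpy as np

def recall_ads(query_vec, clicked_item_vecs, ad_vecs, top_n=10):
    """Rank search ads by cosine distance to a virtual request node
    built from the query-term vector and pre-clicked commodity vectors;
    smaller distance means a better match."""
    request = np.mean([query_vec, *clicked_item_vecs], axis=0)
    request /= np.linalg.norm(request)
    dists = 1.0 - (ad_vecs @ request) / np.linalg.norm(ad_vecs, axis=1)
    return np.argsort(dists)[:top_n]       # indices of best-matching ads
```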
  • An embodiment of the present invention also provides a computer-readable storage medium on which computer instructions are stored, and when the instructions are executed by a processor, the foregoing method for obtaining expressions of relationships between entities is implemented.
  • An embodiment of the present invention also provides a heterogeneous graph learning device, including: a memory, a processor, and a computer program stored in the memory and runnable on the processor; when the processor executes the program, the above method for obtaining expressions of relationships between entities is implemented.
  • terms such as processing, computing, calculating, determining, displaying, etc. may refer to one or more actions and/or processes of a processing or computing system, or similar device, that manipulate and transform data represented as physical (e.g., electronic) quantities in the registers or memory of the processing system into other data similarly represented as physical quantities in the memory, registers, or other such information storage, transmission or display devices of the processing system.
  • Information and signals can be represented using any of a variety of different technologies and methods.
  • the data, instructions, commands, information, signals, bits, symbols, and chips mentioned throughout the above description can be represented by voltage, current, electromagnetic waves, magnetic fields or particles, light fields or particles, or any combination thereof.
  • the steps of the method or algorithm described in combination with the embodiments of this document can be directly embodied as hardware, a software module executed by a processor, or a combination thereof.
  • the software module can be located in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, hard disk, removable disk, CD-ROM or any other form of storage medium known in the art.
  • An exemplary storage medium is connected to the processor, so that the processor can read information from the storage medium and can write information to the storage medium.
  • the storage medium may also be a component of the processor.
  • the processor and the storage medium may be located in the ASIC.
  • the ASIC can be located in the user terminal.
  • the processor and the storage medium may also exist as discrete components in the user terminal.
  • the technology described in this application can be implemented with modules (for example, procedures, functions, etc.) that perform the functions described in this application.
  • These software codes can be stored in a memory unit and executed by a processor.
  • the memory unit may be implemented in the processor or outside the processor. In the latter case, it is communicatively coupled to the processor through various means, which are well known in the art.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Medical Informatics (AREA)
  • Mathematical Physics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Image Analysis (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A method, system and device for obtaining expressions of relationships between entities, and an advertisement recall system. The method includes: splitting a heterogeneous graph into subgraphs according to edge type; sampling the subgraphs to obtain sample sets; inputting the sample sets into a machine learning model to obtain the vector expressions of the source node, the positive node and each negative node of each sample in each subgraph's sample set; optimizing the model parameters based on the obtained vector expressions; aggregating the vector expressions of the same source node in different subgraphs to obtain one vector expression of that source node; optimizing the aggregation model parameters based on the source node's vector expression and the vector expressions of the positive node and each negative node; and repeating the above process to obtain a low-dimensional vector expression of each node in the heterogeneous graph. This enables the learning of complex heterogeneous graphs with fast processing speed and high efficiency, and yields better-matched recalled advertisements when used for advertisement search.

Description

Method, system and device for obtaining expressions of relationships between entities, and advertisement recall system
This application claims priority to Chinese Patent Application No. 201910041481.3, filed on January 16, 2019 and entitled "Method, system and device for obtaining expressions of relationships between entities, and advertisement recall system", the entire contents of which are incorporated herein by reference.
Technical Field
The present invention relates to the technical field of data mining, and in particular to a method, system and device for obtaining expressions of relationships between entities, and an advertisement recall system.
Background
With the popularization of mobile terminals and application software, service providers in fields such as social networking, e-commerce, logistics, travel, food delivery and marketing have accumulated massive amounts of business data. Mining the relationships between different business entities based on this massive business data has become an important technical research direction in the field of data mining. And as machine processing power improves, more and more engineers have begun to study how to perform such mining through machine learning techniques.
The inventor of the present invention found the following:
At present, learning from massive business data through machine learning techniques to obtain a graph that expresses entities and the relationships between them, i.e., performing graph learning on massive business data, has become a preferred technical direction. Simply understood, a graph consists of nodes and edges; one node represents one entity, and an edge between nodes represents the relationship between those nodes. A graph generally includes two or more nodes and one or more edges, so a graph can also be understood as consisting of a set of nodes and a set of edges, usually expressed as G(V, E), where G denotes the graph, V denotes the set of nodes in graph G, and E is the set of edges in graph G. Graphs can be divided into homogeneous graphs and heterogeneous graphs, where a heterogeneous graph is a graph whose nodes are of different types (its edges may be of the same or different types) or whose edges are of different types (its nodes may be of the same or different types). Therefore, when entities are of many types and need to be expressed by multiple types of nodes, or when the relationships between entities are not unique and need to be expressed by multiple types of edges, it is preferable to express these entities and the relationships between them through a heterogeneous graph; and when the number of nodes and edges included in the heterogeneous graph is very large, the heterogeneous graph becomes extremely complex and its data volume enormous. Reducing the complexity and data volume of heterogeneous graphs has therefore become a technical problem faced by those skilled in the art.
Summary of the Invention
In view of the above problems, the present invention is proposed in order to provide a heterogeneous graph learning method, system and device that overcome the above problems or at least partially solve them.
An embodiment of the present invention provides an advertisement recall system, including a system for obtaining expressions of relationships between entities and an advertisement recall matching system;
the system for obtaining expressions of relationships between entities is used to split a pre-built heterogeneous graph into subgraphs according to edge type, one subgraph including one type of edge; the node types in the heterogeneous graph include at least one of advertisement, commodity and query term, and the edge types include at least one of click edge, co-click edge, collaborative filtering edge, content semantic similarity edge and attribute similarity edge;
to sample each subgraph to obtain a sample set for each subgraph, each sample in the sample set including one source node, one positive node and at least one negative node;
to input the same batch of sample sets of each subgraph into a preset machine learning model for training, obtaining, for each sample in each subgraph's sample set, the vector expression of the source node, the vector expression of the positive node and the vector expression of each negative node; and to optimize the parameters in the machine learning model with a preset loss function based on the obtained vector expressions of the nodes;
a preset aggregation model performs aggregation learning on the vector expressions of the same source node in different subgraphs to obtain one vector expression of that source node; based on this one vector expression of the source node and the vector expressions of the positive node and of each negative node included in that source node's samples in each subgraph, the parameters of the aggregation model are optimized with the preset loss function;
the above process is repeated to train all batches of sample sets a preset number of times, obtaining one low-dimensional vector expression for each node in the heterogeneous graph, where one node in the heterogeneous graph corresponds to one entity in the sample data.
The advertisement recall matching system is used to determine, using the low-dimensional vector expressions of the query-term nodes, commodity nodes and search-advertisement nodes obtained by the system for obtaining expressions of relationships between entities, the degree of matching between query-term nodes, commodity nodes and search-advertisement nodes, and to select, according to the matching degree, search advertisements whose match with the commodity and query term meets set requirements.
In some optional embodiments, the system for obtaining expressions of relationships between entities samples each subgraph to obtain each subgraph's sample set by:
performing, for each subgraph, random walks starting from selected nodes to obtain at least one node sequence corresponding to each subgraph; and obtaining, with a preset sliding window, a positive sample set corresponding to each subgraph from the node sequences, one positive sample in the positive sample set including one source node and one positive node;
sampling negative nodes once on the basis of each subgraph's positive sample set to obtain each subgraph's sample set, one sample in the sample set including one source node, one positive node and at least one negative node, the distribution of the negative nodes being consistent with that of the positive nodes, and the negative nodes being correlated with preset attributes of the source node.
In some optional embodiments, the system for obtaining expressions of relationships between entities feeds the same batch of sample sets of every subgraph into the preset machine learning model for training, respectively obtaining the vector expressions of the source node, the positive node and each negative node of every sample in each subgraph's sample set, by:
inputting, for the same batch of sample sets of each subgraph, the source node, positive node and negative nodes included in every sample of the sample set, together with the attribute information of each node, into the machine learning model;
mapping, through the embedding layer of the machine learning model, the sparse features of the nodes included in the samples into dense features;
training the dense features of the source node through one corresponding machine learning model network to obtain the vector expression of the source node, and training the dense features of the positive and negative nodes through another corresponding machine learning model to obtain the vector expressions of the positive node and each negative node.
In some optional embodiments, the system for obtaining expressions of relationships between entities optimizes the parameters of the machine learning model with the preset loss function based on the obtained vector expressions of the nodes by:
computing, from the trained vector expressions of the source node, the positive node and each negative node of every sample, the cosine distances between the source node and the positive node and between the source node and each negative node;
optimizing, by the preset loss function, the parameters of the machine learning model based on the cosine distances.
In some optional embodiments, the system for obtaining expressions of relationships between entities aggregates, through the preset aggregation model, the vector expressions of the same source node in different subgraphs to obtain one vector expression of the same source node by:
determining, from each vector expression of the source node trained on each subgraph and the corresponding learned weight factor, the weight of each vector expression of the source node trained on each subgraph;
weighted-summing, with the determined weights, the vector expressions of the source node trained on each subgraph to obtain one aggregated vector expression of the source node.
In some optional embodiments, the system for obtaining expressions of relationships between entities optimizes the parameters of the machine learning model with the preset loss function based on the obtained vector expressions of the nodes by:
computing, from the trained vector expressions of the source node, the positive node and each negative node of every sample, the cosine distances between the source node and the positive node and between the source node and each negative node;
optimizing, by the preset loss function, the parameters of the machine learning model based on the cosine distances.
An embodiment of the present invention further provides a method for obtaining expressions of relationships between entities, comprising:
splitting a pre-constructed heterogeneous graph into subgraphs by edge type, one subgraph including one type of edge;
sampling each subgraph to obtain a sample set of each subgraph, each sample in a sample set including one source node, one positive node and at least one negative node;
feeding the same batch of sample sets of every subgraph into a preset machine learning model for training, respectively obtaining the vector expression of the source node, the vector expression of the positive node and the vector expression of each negative node of every sample in each subgraph's sample set; and optimizing the parameters of the machine learning model with a preset loss function based on the obtained vector expressions of the nodes;
aggregating, through a preset aggregation model, the vector expressions of the same source node in different subgraphs to obtain one vector expression of the same source node; and optimizing the parameters of the aggregation model with a preset loss function based on the one vector expression of the same source node and the vector expressions of the positive node and each negative node included in that source node's samples in each subgraph;
repeating the above process to train all batches of sample sets a preset number of times, obtaining one low-dimensional vector expression of every node in the heterogeneous graph, one node in the heterogeneous graph corresponding to one entity in the sample data.
In some optional embodiments, sampling each subgraph to obtain each subgraph's sample set includes:
performing, for each subgraph, random walks starting from selected nodes to obtain at least one node sequence corresponding to each subgraph; and obtaining, with a preset sliding window, a positive sample set corresponding to each subgraph from the node sequences, one positive sample in the positive sample set including one source node and one positive node;
sampling negative nodes once on the basis of each subgraph's positive sample set to obtain each subgraph's sample set, one sample in the sample set including one source node, one positive node and at least one negative node, the distribution of the negative nodes being consistent with that of the positive nodes, and the negative nodes being correlated with preset attributes of the source node.
In some optional embodiments, obtaining, with a preset sliding window, a positive sample set corresponding to each subgraph from the node sequences specifically includes:
for each node in a sequence, obtaining, according to the preset size of the sliding window, the other nodes that fall within the sliding window when that node is inside the window, and pairing each of the obtained nodes with that node to form sample pairs, yielding the positive sample set.
In some optional embodiments, negative nodes are sampled from among the positive nodes, obtaining at least one corresponding negative node for every pair of a source node and a positive node, the distribution of the negative nodes being consistent with that of the positive nodes and the negative nodes being correlated with the source node.
In some optional embodiments, sampling negative nodes from among the positive nodes to obtain at least one corresponding negative node for every pair of a source node and a positive node, with the distribution of the negative nodes consistent with that of the positive nodes and the negative nodes correlated with the source node, includes:
collecting statistics on the positive node pairs in the sample set, obtaining the category of each positive node and the number of times the same positive node appears in different positive samples as that positive node's distribution weight;
selecting, according to the category information of the source node, the positive nodes under that category from the counted positive nodes, determining from the distribution weights the probability of taking each obtained positive node as a negative node, and selecting, according to the probabilities, negative nodes whose correlation with the source node meets the requirement.
In some optional embodiments, feeding the same batch of sample sets of every subgraph into the preset machine learning model for training, respectively obtaining the vector expressions of the source node, the positive node and each negative node of every sample in each subgraph's sample set, includes:
inputting, for the same batch of sample sets of each subgraph, the source node, positive node and negative nodes included in every sample of the sample set, together with the attribute information of each node, into the machine learning model;
mapping, through the embedding layer of the machine learning model, the sparse features of the nodes included in the samples into dense features;
training the dense features of the source node through one corresponding machine learning model network to obtain the vector expression of the source node, and training the dense features of the positive and negative nodes through another corresponding machine learning model to obtain the vector expressions of the positive node and each negative node.
In some optional embodiments, optimizing the parameters of the machine learning model with the preset loss function based on the obtained vector expressions of the nodes includes:
computing, from the trained vector expressions of the source node, the positive node and each negative node of every sample, the cosine distances between the source node and the positive node and between the source node and each negative node;
optimizing, by the preset loss function, the parameters of the machine learning model based on the cosine distances.
In some optional embodiments, aggregating, through the preset aggregation model, the vector expressions of the same source node in different subgraphs to obtain one vector expression of the same source node includes:
determining, from each vector expression of the source node trained on each subgraph and the corresponding learned weight factor, the weight of each vector expression of the source node trained on each subgraph;
weighted-summing, with the determined weights, the vector expressions of the source node trained on each subgraph to obtain one aggregated vector expression of the source node.
In some optional embodiments, optimizing the parameters of the machine learning model with the preset loss function based on the obtained vector expressions of the nodes includes:
computing, from the trained vector expressions of the source node, the positive node and each negative node of every sample, the cosine distances between the source node and the positive node and between the source node and each negative node;
optimizing, by the preset loss function, the parameters of the machine learning model based on the cosine distances.
An embodiment of the present invention further provides a system for obtaining expressions of relationships between entities, comprising a registration device, a storage device, a computing device and a parameter exchange device;
the storage device is configured to store the data of the heterogeneous graph;
the computing device is configured to obtain the data of the heterogeneous graph from the storage device through the registration device, and to learn the heterogeneous graph with the above method for obtaining expressions of relationships between entities, obtaining a low-dimensional vector expression of every node in the heterogeneous graph;
the parameter exchange device is configured to exchange parameters with the computing device.
The beneficial effects of the above technical solutions provided by the embodiments of the present invention include at least the following:
The subgraphs obtained by splitting the heterogeneous graph are sampled, the sample sets obtained by sampling are trained, and the learning results of the individual subgraphs are fused into a learning result for the heterogeneous graph, so that learning of complex heterogeneous graphs is achieved. Learning on the subgraphs into which the heterogeneous graph is decomposed effectively avoids the explosive growth of the training parameters as well as the exponential growth of the number of neighbors with the number of layers, greatly reduces the amount of data processed during heterogeneous graph learning, brings the computation down to a scale the processing equipment can support, lowers the hardware requirements on heterogeneous graph learning equipment, and greatly improves the speed and efficiency of heterogeneous graph learning. Used in advertisement search scenarios, this heterogeneous graph learning method mines the entity relationships of the scenario so that a large amount of information is used to perform advertisement recall accurately and to improve recall quality; with all advertisements as candidates, enough advertisements can be recalled under any traffic; and, by working with vectors, advertisement rewriting and advertisement filtering are completed in a single step.
Other features and advantages of the present invention will be set forth in the following description, and will partly become apparent from the description or be understood by implementing the present invention. The objectives and other advantages of the present invention can be realized and obtained through the structures particularly pointed out in the written description, the claims and the accompanying drawings.
The technical solutions of the present invention are described in further detail below through the accompanying drawings and embodiments.
Brief Description of the Drawings
The accompanying drawings are provided for a further understanding of the present invention and constitute a part of the specification; together with the embodiments of the present invention they serve to explain the present invention and do not limit it. In the drawings:
Figure 1 is a flowchart of the method for obtaining expressions of relationships between entities in Embodiment 1 of the present invention;
Figure 2 is a flowchart of the method for obtaining expressions of relationships between entities in Embodiment 2 of the present invention;
Figure 3 is an example of a heterogeneous graph in an advertising scenario in Embodiment 2 of the present invention;
Figure 4 is a first example of a subgraph constructed from the heterogeneous graph in Embodiment 2 of the present invention;
Figure 5 is a second example of a subgraph constructed from the heterogeneous graph in Embodiment 2 of the present invention;
Figure 6 is a third example of a subgraph constructed from the heterogeneous graph in Embodiment 2 of the present invention;
Figure 7 is a schematic diagram of the model network of a subgraph in Embodiment 2 of the present invention;
Figure 8 is an example of fusing the learning results of multiple subgraphs in Embodiment 2 of the present invention;
Figure 9 is a schematic structural diagram of the system for obtaining expressions of relationships between entities in an embodiment of the present invention;
Figure 10 is an example of an implementation structure of the advertisement recall system in an embodiment of the present invention.
Detailed Description of the Embodiments
Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the present disclosure, it should be understood that the present disclosure can be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the present disclosure can be understood more thoroughly and its scope conveyed completely to those skilled in the art.
To solve the problem in the prior art that, when learning heterogeneous graphs, the training parameters grow exponentially and the neighbor sampling also grows exponentially with the number of layers, so that the equipment cannot support computation of such magnitude, the embodiments of the present invention provide a method for obtaining expressions of relationships between entities that solves the above problem well, effectively reducing the amount of data processed during heterogeneous graph learning, with fast processing speed and high efficiency.
Graph learning is widely used for mining all kinds of data relationships in real-world domains, for example in search advertising platforms for mining the correlation between search requests and advertisements as well as the click-through rate (CTR). That is, the method of the present invention can be used in the advertisement search field for the recall of search advertisements. A search advertisement is an advertisement for which an advertiser, according to the content and characteristics of its products or services, determines relevant keywords, writes the advertisement content, sets its own price, and places the advertisement in the search results corresponding to those keywords. Search advertisement recall refers to selecting the most relevant advertisements from a massive advertisement collection through some algorithm or model.
Existing search advertisement recall techniques either screen "high-quality" advertisements based on the degree of matching between the query term and the advertiser's bidword, the price the advertiser paid for the word, and users' statistical preferences for the advertisement; or incorporate each user's historical behavior data to perform personalized matching recall of advertisements.
In studying the prior art, the inventor found that existing recall techniques either focus only on the degree of matching between advertisements and query terms, or only on increasing the revenue of recalled advertisements, and lack an integrated model that takes both into account. Since the quality of advertisement recall is crucial to search advertising revenue and user experience, the inventor provides a graph learning technique that is used in the advertisement recall process to obtain expressions of relationships between entities and can yield a recall set with more high-quality advertisements that users care more about.
The method and system for obtaining expressions of relationships between entities, and their concrete implementation in an advertisement recall system, are described in detail below through specific embodiments.
Embodiment 1
Embodiment 1 of the present invention provides a method for obtaining expressions of relationships between entities, whose flow, shown in Figure 1, includes the following steps:
Step S101: split the pre-constructed heterogeneous graph into subgraphs by edge type, each subgraph including one type of edge.
Because of the complexity of heterogeneous graphs and the enormous size of their data, the amount of data grows exponentially during processing. A heterogeneous graph is therefore split into subgraphs, and the subgraphs are processed instead. The split is made by edge type, one edge type corresponding to one subgraph, while the node types within a subgraph may differ. Preferably, a subgraph includes all the nodes of the heterogeneous graph and one type of edge.
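A minimal sketch of this splitting step is shown below; the tuple-based edge encoding and the function name are illustrative assumptions, not a data format prescribed by this embodiment:

```python
from collections import defaultdict

def split_by_edge_type(nodes, edges):
    """Split a heterogeneous graph into subgraphs, one per edge type.
    nodes: iterable of node ids; edges: iterable of (src, dst, edge_type, weight).
    Each subgraph keeps the full node set (the preferred option above) and
    only the edges of a single type."""
    subgraphs = defaultdict(lambda: {"nodes": set(nodes), "edges": []})
    for src, dst, edge_type, weight in edges:
        subgraphs[edge_type]["edges"].append((src, dst, weight))
    return dict(subgraphs)

# One click edge and one semantic edge yield two subgraphs over the same nodes.
g = split_by_edge_type(
    nodes=["query1", "item1", "ad1"],
    edges=[("query1", "item1", "click", 3), ("item1", "ad1", "semantic", 0.8)],
)
assert set(g) == {"click", "semantic"}
```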
Step S102: sample each subgraph to obtain a sample set of each subgraph, each sample in a sample set including one source node, one positive node and at least one negative node.
For each subgraph, random walks are performed starting from selected nodes, yielding at least one node sequence corresponding to each subgraph; with a preset sliding window, a positive sample set corresponding to each subgraph is obtained from the node sequences, one positive sample in the positive sample set including one source node and one positive node.
Negative nodes are sampled once on the basis of each subgraph's positive sample set, yielding each subgraph's sample set, one sample in the sample set including one source node, one positive node and at least one negative node; the distribution of the negative nodes is consistent with that of the positive nodes, and the negative nodes are correlated with preset attributes of the source node.
The random walk can use learning algorithms such as DeepWalk or node2vec.
Obtaining, with a preset sliding window, a positive sample set corresponding to each subgraph from the node sequences specifically includes: for each node in a sequence, obtaining, according to the preset size of the sliding window, the other nodes that fall within the sliding window when that node is inside the window, and pairing each of the obtained nodes with that node to form sample pairs, yielding the positive sample set.
When sampling negative nodes, they are sampled from among the positive nodes, obtaining at least one corresponding negative node for every pair of a source node and a positive node; the distribution of the negative nodes is consistent with that of the positive nodes, and the negative nodes are correlated with the source node. Specifically, statistics are collected on the positive node pairs in the sample set, obtaining the category of each positive node and the number of times the same positive node appears in different positive samples as that positive node's distribution weight; according to the category information of the source node, the positive nodes under that category are selected from the counted positive nodes, the probability of taking each obtained positive node as a negative node is determined from the distribution weights, and negative nodes whose correlation with the source node meets the requirement are selected according to the probabilities.
For each node in each subgraph, a number of walks starting from that node is set and the corresponding number of walks is performed, yielding a series of node sequences with the starting node as the source node; positive sample pairs are extracted from the resulting node sequences, yielding the positive sample set. After the positive sample set is obtained, negative nodes are sampled according to the negative-node sampling principles, at least one negative node being sampled for each positive sample pair, yielding a sample that includes one source node, one positive node and at least one negative node.
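The walk-and-window stage can be sketched as follows. The uniform random step is a simplification of the node2vec walk (which biases transition probabilities between BFS- and DFS-like behavior), and the helper names are assumptions for illustration; the demonstration sequence mirrors the walk used as an example in Embodiment 2 below:

```python
import random

def random_walk(adj, start, length):
    """One walk of up to `length` nodes over a subgraph given as an adjacency
    dict {node: [neighbor, ...]}. Uniform steps are used for brevity."""
    walk = [start]
    while len(walk) < length:
        neighbors = adj.get(walk[-1])
        if not neighbors:
            break
        walk.append(random.choice(neighbors))
    return walk

def positive_pairs(walk, window):
    """Pair every node with the other nodes that can share a sliding window
    of size `window` with it, following the pairing rule described above."""
    pairs = []
    for i, src in enumerate(walk):
        lo, hi = max(0, i - window + 1), min(len(walk), i + window)
        pairs.extend((src, walk[j]) for j in range(lo, hi) if j != i)
    return pairs

walk = ["query1", "ad1", "query2", "item1", "item2"]
print(positive_pairs(walk, window=3)[:3])
# -> [('query1', 'ad1'), ('query1', 'query2'), ('ad1', 'query1')]
```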
Step S103: feed the same batch of sample sets of every subgraph into the preset machine learning model for training, respectively obtaining, for every sample in each subgraph's sample set, the vector expression of the source node, the vector expression of the positive node and the vector expression of each negative node.
For the same batch of sample sets of each subgraph, the source node, positive node and negative nodes included in every sample of the sample set, together with the attribute information of each node, are input into the machine learning model;
through the embedding layer of the machine learning model, the sparse features of the nodes included in the samples are mapped into dense features;
the dense features of the source node are trained through one corresponding machine learning model network to obtain the vector expression of the source node, and the dense features of the positive and negative nodes are trained through another corresponding machine learning model to obtain the vector expressions of the positive node and each negative node.
Step S104: optimize the parameters of the machine learning model with the preset loss function based on the obtained vector expressions of the nodes.
From the trained vector expressions of the source node, the positive node and each negative node of every sample, the cosine distances between the source node and the positive node and between the source node and each negative node are computed; the preset loss function optimizes the parameters of the machine learning model based on the cosine distances.
For the learning result of each sample in a subgraph, the cosine distance between the vector expression of that sample's source node and the vector expression of the positive node included in the sample, and the cosine distances between that source node and the vector expressions of the negative nodes, are computed; the computed cosine distances are fed into the loss function, yielding the optimization vector.
In this step, the parameters of the machine learning model corresponding to a subgraph are optimized according to that model's learning result on one batch of sample data. The machine learning model with optimized parameters is then used to learn the next batch of samples, so that the learning result of the previous batch influences the learning of the next batch.
Step S105: aggregate, through the preset aggregation model, the vector expressions of the same source node in different subgraphs, obtaining one vector expression of that source node.
In this step the vector expressions of the same source node in different subgraphs are aggregated. From each vector expression of the source node trained on each subgraph and the corresponding learned weight factor, the weight of each vector expression of the source node trained on each subgraph is determined; with the determined weights, the source node's vector expressions trained on the subgraphs are weighted-summed, yielding one aggregated vector expression of the source node.
Step S106: optimize the parameters of the aggregation model with the preset loss function, based on the one vector expression of the source node and the vector expressions of the positive node and each negative node included in that source node's samples in each subgraph.
From the trained vector expressions of the source node, the positive node and each negative node of every sample, the cosine distances between the source node and the positive node and between the source node and each negative node are computed; the preset loss function optimizes the model parameters based on the cosine distances.
When aggregating the vector expressions of the same source node in different subgraphs: for each source node, the samples containing that source node are obtained from each subgraph's sample set; for each obtained sample, the cosine distance between the source node's vector expression and the vector expression of the positive node included in the sample, and the cosine distances between the source node and each negative node's vector expression, are computed; the computed cosine distances are fed into the loss function, yielding the optimization vector.
In this step, the parameters are optimized according to the learning results of the machine learning models corresponding to the different subgraphs on one batch of sample data. The model with optimized parameters is then used to learn the next batch of samples, so that the learning result of the previous batch influences the learning of the next batch.
Step S107: repeat the above process to train all batches of sample sets a preset number of times, obtaining one low-dimensional vector expression of every node in the heterogeneous graph, one node in the heterogeneous graph corresponding to one entity in the sample data.
The different batches of sample sets generated from the heterogeneous graph are trained a preset number of times, and each training result updates the previous one, yielding the final low-dimensional vector expression of every node in the heterogeneous graph.
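The overall loop of steps S102 to S107 can be summarized in the following schematic sketch; all of the callables (`sample_batches`, the per-subgraph models, the aggregator) are assumed interfaces used only to show the order of operations, not components named by this embodiment:

```python
def train(subgraphs, subgraph_models, aggregator, sample_batches, num_rounds):
    """Schematic training loop: per batch, first update every per-subgraph
    model, then the aggregation model, so each batch influences the next."""
    for _ in range(num_rounds):                      # preset number of training passes
        for batch in sample_batches(subgraphs):      # same batch index for every subgraph
            per_edge = {}
            for edge_type, samples in batch.items():
                model = subgraph_models[edge_type]
                vectors = model.forward(samples)     # src/pos/neg vector expressions
                model.optimize(vectors)              # loss over cosine distances (per-subgraph objective)
                per_edge[edge_type] = vectors
            merged = aggregator.merge(per_edge)      # one vector per shared source node
            aggregator.optimize(merged, per_edge)    # loss over cosine distances (aggregation objective)
    return aggregator
```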
In the above method of this embodiment, the subgraphs obtained by splitting the heterogeneous graph are sampled, the sample sets obtained by sampling are trained, and the learning results of the individual subgraphs are fused into a learning result for the heterogeneous graph, so that learning of complex heterogeneous graphs is achieved. Learning on the subgraphs into which the heterogeneous graph is decomposed effectively avoids the explosive growth of the training parameters as well as the exponential growth of the number of neighbors with the number of layers, greatly reduces the amount of data processed during heterogeneous graph learning, brings the computation down to a scale the processing equipment can support, lowers the hardware requirements on heterogeneous graph learning equipment, and greatly improves the speed and efficiency of heterogeneous graph learning. Used in advertisement search scenarios, this heterogeneous graph learning method mines the entity relationships of the scenario so that a large amount of information is used to perform advertisement recall accurately and to improve recall quality; with all advertisements as candidates, enough advertisements can be recalled under any traffic; and, by working with vectors, advertisement rewriting and advertisement filtering are completed in a single step.
Embodiment 2
Embodiment 2 of the present invention provides a concrete implementation of the method for obtaining expressions of relationships between entities, illustrated with the process of advertisement recall in a search advertising scenario; the flow of the method, shown in Figure 2, includes the following steps:
Step S201: construct the heterogeneous graph.
Taking the advertisement search scenario as an example, a large-scale heterogeneous graph is constructed for the search recall scenario from user logs and related product and advertisement data, serving as a rich search interaction graph of the advertisement search scenario; the constructed heterogeneous graph is used as the graph data input for what follows.
An example of the constructed heterogeneous graph is shown in Figure 3. It includes a number of nodes such as Query1, Query2, Item1, Item2, Item3, Item4 and Ad1, together with edges connecting different nodes. The graph includes multiple node types such as Query, Item and Ad to represent the different entities in the search scenario, and multiple edge types to represent the various relationships between entities, such as the click relationship between a query and an item, or an item being a pre-click of an ad. The node types and their meanings can be as shown in Table 1 below, and the edge types and their meanings as shown in Table 2 below.
Table 1
Node type   Meaning
Item        All products in the advertisement search scenario
Ad          Search advertisements in the advertisement search scenario
Query       Users' query terms in the advertisement search scenario
The Query nodes and Item nodes serve as user intent nodes that characterize the user's personalized search intent, while the Ad nodes are the advertisements placed by advertisers.
Table 2
Edge type                               Meaning
User behavior edge (click)              Clicks between Query and Item/Ad nodes, weighted by the number of clicks
User behavior edge (session, co-click)  Items or Ads clicked together under the same query in the same session
User behavior edge (cf)                 Collaborative filtering relationships between nodes
Content similarity edge (semantic)      Text similarity between node contents, e.g. titles
Attribute similarity edge (domain)      Overlap of node attributes in domains such as brand and category
Here:
User behavior edges represent users' historical behavioral preferences. For example, a click edge can be built between a Query node and an Item node, or between a Query node and an Ad node, using the number of clicks as the edge weight, to represent clicks between the Query and the Item/Ad; a co-click edge (session edge) can be built to represent the Items or Ads clicked together under the same Query in the same session; and a collaborative filtering edge (cf edge) can be built to represent collaborative filtering relationships between different nodes. In the advertisement search scenario, user behavior edges characterize a dynamically changing relationship. Popular nodes (e.g., nodes of high-frequency queries) receive more impressions and clicks and therefore have denser edge relationships and larger edge weights, while unpopular and new nodes have relatively sparse edge relationships and smaller edge weights; user behavior edges therefore characterize popular nodes better.
Content similarity edges (semantic edges) are used to characterize the similarity between nodes; for example, edges can be built between Item nodes using the text similarity of their titles as the edge weight. Content similarity edges reflect a static relationship between nodes, are more stable, and also characterize well the relationships between unpopular nodes and new nodes.
Attribute similarity edges (domain edges) represent the degree of overlap between nodes in domains such as brand and category.
Step S202: split the pre-constructed heterogeneous graph into subgraphs by edge type, each subgraph including one type of edge.
Every kind of edge in the heterogeneous graph can serve as one relationship between nodes. For example, the title-text similarity edge between an item and an ad characterizes the semantic similarity between the two, while a click edge indicates that they were clicked by the same user under the same query. On the one hand, each kind of edge on its own characterizes part of the relationship between nodes; on the other hand, multiple different kinds of edges complement one another and characterize richer and more robust relationships. The present invention therefore proposes a subgraph-based solution for heterogeneous graphs, each subgraph including one type of edge.
For the heterogeneous graph, a different subgraph is constructed for each edge type. Each subgraph contains only one kind of edge and may contain all or part of the nodes. In the search advertising scenario, three different subgraphs can be constructed from the different edges. Specifically, a user behavior subgraph can be constructed from users' search and click behavior; a text similarity subgraph from the text similarity between queries, item titles and advertisement titles; and a co-occurrence subgraph from the click co-occurrence relationships of queries, items and ads. The constructed subgraphs are shown in Figures 4, 5 and 6.
The user behavior subgraph of Figure 4 includes the user behavior edges and all the nodes; the text similarity subgraph of Figure 5 includes the content similarity edges and all the nodes; the co-occurrence subgraph of Figure 6 includes the attribute similarity edges and all the nodes.
Step S203: sample each subgraph to obtain a sample set of each subgraph, each sample in a sample set including one source node, one positive node and at least one negative node.
For each subgraph, random walks with node2vec generate positive sample pairs, and negative nodes are generated following the two negative sampling principles, yielding a large number of samples of the form (src_node, pos_node, {neg_node}_K, edge_type), where src_node denotes the source node, pos_node the positive node, {neg_node}_K the K negative nodes obtained by negative sampling, and edge_type the edge type of this subgraph. That is, each sample contains one source node src_node, one positive node pos_node and K negative nodes neg_node.
Sampling each subgraph can be divided into two stages, a positive sampling stage and a negative sampling stage. Specifically:
The positive sampling stage generates positive samples through walks, for which the node2vec walk can be used; walking over the heterogeneous graph generates the positive samples. The node2vec walk is a search strategy between DFS and BFS that has been shown to work very well for network embedding. Given a heterogeneous graph G = (V_P, E_Q), walks start from every node v ∈ V_P. For every edge type E_q, π_q node2vec walks are performed, each walk yielding a sequence of length τ: v_1 -> v_2 -> ... -> v_τ. From each sequence, positive sample pairs are obtained through the sliding window:
(src_node, pos_node, edge_type)
Consider the subgraph examples of Figures 4, 5 and 6, containing three kinds of nodes: query is the user's query term, item is a product, and ad is a search advertisement. Starting from query1 and walking along edges of a given type in node2vec fashion yields a node sequence such as query1 -> ad1 -> query2 -> item1 -> item2. From this sequence, a series of positive sample pairs (src_node, pos_node) is obtained through the sliding window. If the window size is set to 3, then when node query1 is inside the window, the nodes that can appear in the window with it are ad1 and query2, giving the pairs (query1, ad1) and (query1, query2); when node ad1 is inside the window, the nodes that can appear in the window with it are query1, query2 and item1, giving the pairs (ad1, query1), (ad1, query2) and (ad1, item1); and so on. From this node sequence the following positive pairs are obtained: (query1, ad1), (query1, query2), (ad1, query1), (ad1, query2), (ad1, item1), (query2, ad1), (query2, query1), (query2, item1), (query2, item2), (item1, query2), (item1, ad1), (item1, item2), (item2, item1), (item2, query2).
For every node, a number of walks is set, i.e., the number of walks starting from that node. For every sequence obtained after each walk of each node, a series of positive sample pairs is obtained according to the above steps.
The negative sampling stage performs negative sampling according to two principles. For every positive pair, negative sampling generates K negative nodes: (src_node, pos_node, {neg_node}_K, edge_type).
In the advertisement search scenario, every node has rich attributes (side information) that help describe it, for example an item's price and brand. Compared with the node ID, these attributes generalize well and help improve the stability of the model. Since the advertisement recall scenario is sensitive to vector distances, the following two negative sampling principles are proposed:
Consistency principle: the negative nodes obtained by negative sampling must have the same distribution as the positive nodes. If positive and negative nodes had different distributions, the model would "slack off" and tend to memorize which nodes are positive and which negative instead of learning the relationships between nodes. We therefore use the Alias Method for weighted negative sampling, guaranteeing that the positive and negative node distributions agree.
Relevance principle: a negative node should be weakly correlated with the source node. If the negative node were completely unrelated to the source node, the model would distinguish positive from negative samples too easily, and when used online it would then fail to distinguish the best advertisements from the second best. We therefore use category information to guarantee weak correlation between negative samples and the source node.
After walking from every node of the full graph, all positive sample pairs (src_node, pos_node) are obtained. Next, the distribution of all positive nodes pos_node is counted, i.e., the frequency of every positive node. By the distribution consistency principle, negative sampling follows the distribution of pos_node, i.e., samples according to the weights of pos_node. At the same time, by the relevance principle, weak correlation between the source node src_node and the negative nodes is guaranteed; in the e-commerce search advertising scenario, category information can be used to guarantee this weak correlation. For example, suppose the counted categories and frequencies of the positive nodes are: (query1, cate1, 100), (query2, cate2, 200), (item1, cate1, 50), (item2, cate2, 50), (item3, cate1, 100), (item4, cate1, 150), (ad1, cate1, 150).
Then, for the positive pair <ad1, item1>, the category of ad1 is cate1, so by the relevance principle sampling is performed among the item nodes of category cate1, which are (item1, cate1, 50), (item3, cate1, 100) and (item4, cate1, 150). Since the positive node cannot also serve as a negative node, (item1, cate1, 50) is excluded and negative sampling is performed over (item3, cate1, 100) and (item4, cate1, 150). By the distribution consistency principle, item3 is taken as the negative node with probability 100/(100+150) = 0.4 and item4 with probability 150/(100+150) = 0.6.
Selecting by the source node's category above embodies the relevance principle, and computing the probabilities from the weights embodies the consistency principle.
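A hedged sketch of this category-constrained, frequency-weighted negative sampling follows; `random.choices` stands in for the Alias Method mentioned above (the Alias Method gives O(1) draws after preprocessing, with the same sampling behavior), and the `node_stats` layout is an assumed format:

```python
import random

def sample_negatives(src_node, pos_node, node_stats, k):
    """node_stats maps node -> (node_type, category, frequency), where the
    frequency is the node's count over all positive pairs."""
    src_category = node_stats[src_node][1]
    pos_type = node_stats[pos_node][0]
    # Relevance principle: candidates share the source node's category (and
    # the positive node's type); the positive node itself is excluded.
    candidates = [n for n, (ntype, cate, _) in node_stats.items()
                  if ntype == pos_type and cate == src_category and n != pos_node]
    # Consistency principle: draw negatives in proportion to the
    # positive-node distribution.
    weights = [node_stats[n][2] for n in candidates]
    return random.choices(candidates, weights=weights, k=k)

stats = {"query1": ("query", "cate1", 100), "item1": ("item", "cate1", 50),
         "item3": ("item", "cate1", 100), "item4": ("item", "cate1", 150),
         "ad1": ("ad", "cate1", 150)}
# For the pair <ad1, item1>: item3 is drawn with probability 100/250 = 0.4
# and item4 with probability 150/250 = 0.6, matching the worked example above.
print(sample_negatives("ad1", "item1", stats, k=1))
```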
Step S204: feed the same batch of sample sets of every subgraph into the preset machine learning model for training.
When learning the samples, the model network corresponding to the subgraph is used, and the samples together with their attributes are input into the model network for learning.
An example of a subgraph's machine learning model is shown in Figure 7. The dashed box at the bottom of Figure 7 is an example subgraph; walking the subgraph yields the source node (src) and the positive node (pos), and negative node sampling yields the negative nodes (neg_1, neg_2, ..., neg_K). The data obtained for each node includes the node identifier (node_id) and attribute information (attr_1, attr_2, ..., attr_n). The data of the source, positive and negative nodes are input into a shared embedding layer (shared layer) for training, which maps the sparse features of the nodes into dense features; in this layer each node is learned through the corresponding EMB model.
Step S205: respectively obtain, for every sample in each subgraph's sample set, the vector expression of the source node, the vector expression of the positive node and the vector expression of each negative node.
After the dense features of the nodes are obtained, the dense features of the source node pass through one neural network, $f^{(p,q)}$, yielding the source-node vector $X_{src}$, while the positive node and the K negative nodes pass through another neural network, $g^{(p)}$, respectively yielding the positive-node vector $X_{pos}$ and the negative-node vectors $X_{neg_1}, \dots, X_{neg_K}$.
Every sample is learned, yielding the vector expressions of its source node, its positive node and each of its negative nodes.
Step S206: optimize the parameters of the machine learning model with the preset loss function based on the obtained vector expressions of the nodes.
From the sample's source-node vector expression $X_{src}$, positive-node vector expression $X_{pos}$ and each negative-node vector expression $X_{neg_1}, \dots, X_{neg_K}$, the cosine distances between $X_{src}$ and $X_{pos}$ and between $X_{src}$ and each $X_{neg_i}$ are computed, and from these cosine distances the optimization objective of this subgraph's machine learning model, $O_{rel}$, is obtained. The principle of the optimization is to make the source node and the positive node as close as possible by computing the cosine distances of the source node to the positive node and to the negative nodes.
Steps S204 to S206 above learn a subgraph's sample set through the subgraph model network shown in Figure 7:
1) Training samples (src_node, pos_node, {neg_node}_K, edge_type) are obtained through walking and negative sampling, and the node attributes are fetched by interacting with the graph storage engine. Node attributes include the ID feature and other attributes such as title text, shop information and brand information: (node_id, attr_1, attr_2, ..., attr_n).
2) The source, positive and negative nodes enter a shared layer. This layer is the EMB (embedding lookup) layer, whose purpose is to map sparse ID features into dense features.
3) After the EMB layer, the source node passes through its own DNN network, denoted $f^{(p,q)}$, while the positive and negative nodes share another DNN network, denoted $g^{(p)}$. Each node passing through its DNN network yields a vector expression, written $X_{src}$, $X_{pos}$, $X_{neg_1}$, and so on.
4) The cosine distances between the source-node vector expression and the positive and negative nodes are computed, making the distance to the positive node as small as possible. This optimization objective is the corresponding $O_{rel}$.
Steps S204 to S206 above build the subgraph's machine learning model around a relevance objective:
Given a heterogeneous graph G = (V_P, E_Q) containing P node types and Q edge types, for a source node v ∈ V_P of the p-th type and an edge e ∈ E_q of the q-th type, a DNN network $f^{(p,q)}$ is learned:
$$x_v^{(p,q)} = f^{(p,q)}(e_v), \qquad f^{(p,q)}: h \mapsto \mathrm{FC}_{w,b}\big(\mathrm{ELU}(\mathrm{FC}_{w,b}(h))\big)$$
where $e_v$ denotes the dense features of node v from the embedding layer, FC denotes a fully connected layer, the w and b are the fully connected weights and biases to be learned, ELU is the exponential linear unit activation function, and $x_v^{(p,q)}$ is the learned vector expression.
To guarantee that the vector expressions (embeddings) the same node obtains through different edge types map into the same low-dimensional space, all target nodes (i.e., positive and negative nodes) share the same DNN network $g^{(p)}$ across all edge types:
$$y_{v}^{(p)} = g^{(p)}(e_{v}), \qquad g^{(p)}: h \mapsto \mathrm{FC}_{w',b'}\big(\mathrm{ELU}(\mathrm{FC}_{w',b'}(h))\big)$$
where FC denotes a fully connected layer, the w' and b' are the fully connected weights and biases to be learned, ELU is the exponential linear unit activation function, and $y_v^{(p)}$ is the learned vector expression.
Therefore, P*Q + P DNN networks are learned jointly and cooperatively: P*Q networks embed the source nodes, and P networks embed the target nodes.
Given an edge of the q-th type, let v ∈ V_P denote the source node, v′ ∈ V_{P′} the positive node, and v″ the negative nodes. Cosine distance is used to characterize the similarity between nodes, and a softmax cross-entropy loss is used as the relevance objective $O_{rel}$:
$$\cos(v, v') = \frac{x_v^{(p,q)} \cdot y_{v'}^{(p')}}{\lVert x_v^{(p,q)} \rVert\, \lVert y_{v'}^{(p')} \rVert}$$
$$O_{rel} = -\log \frac{\exp(\cos(v, v'))}{\exp(\cos(v, v')) + \sum_{v''} \exp(\cos(v, v''))}$$
where v′ denotes the positive node and v″ the negative nodes.
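The two-tower structure and the relevance objective $O_{rel}$ can be sketched as follows; PyTorch is used for illustration, and the layer sizes, the two-layer depth and the batch shapes are assumptions, since the exact architecture is not fixed by this embodiment:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Tower(nn.Module):
    """FC -> ELU -> FC stack. One such network f embeds source nodes per
    (node type, edge type); one shared network g embeds the pos/neg target
    nodes per node type."""
    def __init__(self, in_dim=32, hidden=64, out_dim=16):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, hidden), nn.ELU(),
                                 nn.Linear(hidden, out_dim))

    def forward(self, x):
        return self.net(x)

emb = nn.Embedding(1000, 32)              # shared EMB layer: sparse id -> dense feature
f_src, g_tgt = Tower(), Tower()           # f^(p,q) and g^(p)

src_ids = torch.randint(0, 1000, (8,))    # a batch of 8 samples
pos_ids = torch.randint(0, 1000, (8,))
neg_ids = torch.randint(0, 1000, (8, 5))  # K = 5 negatives per sample

x_src = f_src(emb(src_ids))               # (8, 16)
x_pos = g_tgt(emb(pos_ids))               # (8, 16)
x_neg = g_tgt(emb(neg_ids))               # (8, 5, 16)

# O_rel: softmax cross-entropy over cosine similarities, positive node = class 0.
cos_pos = F.cosine_similarity(x_src, x_pos, dim=-1).unsqueeze(1)   # (8, 1)
cos_neg = F.cosine_similarity(x_src.unsqueeze(1), x_neg, dim=-1)   # (8, 5)
logits = torch.cat([cos_pos, cos_neg], dim=1)                      # (8, 6)
o_rel = F.cross_entropy(logits, torch.zeros(8, dtype=torch.long))
```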
Step S207: aggregate, through the preset aggregation model, the vector expressions of the same source node in different subgraphs, obtaining one vector expression of that source node.
The principle of fusing the learning results of multiple subgraphs is illustrated in Figure 8: the results of multiple subgraphs can be fused through an attention mechanism. The same node obtains different vector expressions in different subgraphs, and the attention mechanism then fuses the different subgraphs' vector expressions into one unified expression.
As shown in Figure 8, the vector expressions $x_v^{(q_1)}, x_v^{(q_2)}, \dots, x_v^{(Q)}$ obtained for the same node v in the different subgraphs are fused, the attention mechanism yielding the node's fused vector expression $\bar{x}_v$, where $x_v^{(q_1)}$ is the vector expression node v obtains in the subgraph whose edge type is $q_1$, and $x_v^{(q_2)}$ and $x_v^{(Q)}$ are respectively the vector expressions node v obtains in the subgraphs whose edge types are $q_2$ and Q.
After the fusion is performed for every node in the heterogeneous graph, each node's fused vector expression when acting as a source node is obtained.
Step S208: optimize the parameters of the aggregation model with the preset loss function, based on the one vector expression of the source node and the vector expressions of the positive node and each negative node included in that source node's samples in each subgraph.
According to the samples containing the source node, the cosine distances between the fused source-node vector expression $\bar{x}_{src}$ and the positive-node vector expression $X_{pos}$ and each negative-node vector expression $X_{neg_1}, \dots, X_{neg_K}$ in those samples are computed, and from these cosine distances the optimization objective of the aggregation model, $O_{att}$, is obtained. The principle of the optimization is to make the source node and the positive node as close as possible by computing the cosine distances of the source node to the positive node and to the negative nodes.
Step S209: repeat the above process to train all batches of sample sets a preset number of times, obtaining one low-dimensional vector expression of every node in the heterogeneous graph, one node in the heterogeneous graph corresponding to one entity in the sample data.
When the vector expressions of the nodes are obtained by training on the current batch of sample sets, training uses the system parameters updated from the previous batch's results and replaces the learning results of the previous batches; earlier learning thus influences later learning, the final learning result prevails, and the learning result reflects the features of all samples.
Steps S207 and S208 above build the aggregation model around an attention objective:
For a source node v ∈ V_P, Q low-dimensional vectors $x_v^{(1)}, \dots, x_v^{(Q)}$ can be obtained through the individual edge types. Through the attention mechanism, the weight of each vector is learned automatically and the Q vectors are merged into one vector $\bar{x}_v$:
$$\bar{x}_v = \sum_{q=1}^{Q} \lambda_{pq}(v)\, x_v^{(q)}$$
$$\lambda_{pq}(v) = \frac{\exp\!\big(z_{pq} \cdot x_v^{(q)}\big)}{\sum_{q'=1}^{Q} \exp\!\big(z_{pq'} \cdot x_v^{(q')}\big)}$$
where $\lambda_{pq}(v)$ denotes the weight of the node v of the p-th type on the q-th edge type, and $z_{pq}$ is a parameter to be learned by the attention mechanism: a vector representing the aggregation weight of the p-th node type for the q-th edge type. If the inner product of $z_{pq}$ and $x_v^{(q)}$ is large, v considers the q-th edge type informative. Moreover, if two nodes have similar vectors, they are closely related in the graph and will have similar weight distributions.
Cosine distance and a softmax cross-entropy loss are likewise used as the attention objective $O_{att}$:
$$\cos(v, v') = \frac{\bar{x}_v \cdot y_{v'}}{\lVert \bar{x}_v \rVert\, \lVert y_{v'} \rVert}$$
$$O_{att} = -\log \frac{\exp(\cos(v, v'))}{\exp(\cos(v, v')) + \sum_{v''} \exp(\cos(v, v''))}$$
where v′ denotes the positive node and v″ the negative nodes.
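The attention merge and its weights $\lambda_{pq}(v)$ can be sketched as follows, again in PyTorch; treating $z_{pq}$ as a learnable matrix with one row per edge type is an assumption consistent with the formulas above:

```python
import torch
import torch.nn.functional as F

def attention_merge(x_per_edge, z):
    """Merge one node's Q per-edge-type vectors into a single vector.
    x_per_edge: (Q, d) vectors x_v^(q) of node v from the Q subgraphs;
    z: (Q, d) learnable parameters z_pq, one row per edge type.
    lambda_q = softmax over q of z_pq . x_v^(q); output = sum_q lambda_q x_v^(q)."""
    scores = (z * x_per_edge).sum(dim=-1)      # inner products z_pq . x_v^(q)
    lam = F.softmax(scores, dim=-1)            # attention weights lambda_pq(v)
    return (lam.unsqueeze(-1) * x_per_edge).sum(dim=0)

Q, d = 3, 16
x = torch.randn(Q, d)                          # the node's vectors from 3 subgraphs
z = torch.randn(Q, d, requires_grad=True)      # trained through the O_att objective
merged = attention_merge(x, z)                 # one aggregated vector of shape (d,)
```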
Based on the same inventive concept, an embodiment of the present invention further provides a system for obtaining expressions of relationships between entities. The system can be deployed on devices such as network devices in a network, cloud devices, server devices of an architecture, or client devices. Its structure, shown in Figure 9, includes a registration device 903, a storage device 901, a computing device 902 and a parameter exchange device 904.
The storage device 901 is configured to store the data of the heterogeneous graph;
the computing device 902 is configured to obtain the data of the heterogeneous graph from the storage device through the registration device 903 and to learn the heterogeneous graph with the above method for obtaining expressions of relationships between entities, obtaining a low-dimensional vector expression of every node in the heterogeneous graph;
the parameter exchange device 904 is configured to exchange parameters with the computing device 902.
The computing device 902 obtains the data of the nodes and edges from the storage device through the registration device 903 as follows:
the storage device 901 stores the data of the nodes and edges of the heterogeneous graph;
the computing device 902 sends a data query request to the registration device 903, the request including the information of the nodes and edges to be queried; it receives the query result returned by the registration device 903, the result including the information of the storage devices that hold the data of those nodes and edges; and it fetches the data of the nodes and edges from the corresponding storage devices 901 according to that storage device information.
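The query flow between the computing device, the registration device and the storage devices can be illustrated with the following toy sketch; the dictionary-based interfaces are assumptions about the shape of the interaction, not the actual protocol:

```python
class Registry:
    """Toy stand-in for the registration device 903: it answers, for each
    requested node or edge key, which storage device holds its data."""
    def __init__(self, placement):
        self.placement = placement            # key -> storage device name

    def locate(self, keys):
        return {key: self.placement[key] for key in keys}

def fetch(registry, storages, keys):
    """Computing-device flow: send a query for the keys, receive the storage
    locations, then read the node/edge data from those storage devices."""
    located = registry.locate(keys)
    return {key: storages[device][key] for key, device in located.items()}

storages = {"store-a": {"query1": {"type": "query"}},
            "store-b": {"ad1": {"type": "ad"}}}
registry = Registry({"query1": "store-a", "ad1": "store-b"})
print(fetch(registry, storages, ["query1", "ad1"]))
```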
Based on the same inventive concept, an embodiment of the present invention further provides an advertisement recall system, shown in Figure 10, comprising a system 101 for obtaining expressions of relationships between entities and an advertisement recall matching system 102.
The system 101 for obtaining expressions of relationships between entities is configured to split a pre-constructed heterogeneous graph into subgraphs by edge type, one subgraph including one type of edge; the node types in the heterogeneous graph include at least one of advertisement, product and query term, and the edge types include at least one of click edge, co-click edge, collaborative filtering edge, content semantic similarity edge and attribute similarity edge;
sample each subgraph to obtain a sample set of each subgraph, each sample in a sample set including one source node, one positive node and at least one negative node;
feed the same batch of sample sets of every subgraph into a preset machine learning model for training, respectively obtaining the vector expression of the source node, the vector expression of the positive node and the vector expression of each negative node of every sample in each subgraph's sample set; and optimize the parameters of the machine learning model with a preset loss function based on the obtained vector expressions of the nodes;
aggregate, through a preset aggregation model, the vector expressions of the same source node in different subgraphs to obtain one vector expression of the same source node; and optimize the parameters of the aggregation model with a preset loss function based on the one vector expression of the same source node and the vector expressions of the positive node and each negative node included in that source node's samples in each subgraph;
repeat the above process to train all batches of sample sets a preset number of times, obtaining one low-dimensional vector expression of every node in the heterogeneous graph, one node in the heterogeneous graph corresponding to one entity in the sample data.
The advertisement recall matching system 102 is configured to use the low-dimensional vector expressions of the query term nodes, product nodes and search advertisement nodes obtained by the system for obtaining expressions of relationships between entities to determine the degree of matching between query term nodes, product nodes and search advertisement nodes, and to select, according to the degree of matching, search advertisements whose degree of matching with the product and the query term meets a set requirement.
The system 101 for obtaining expressions of relationships between entities samples each subgraph to obtain each subgraph's sample set by:
performing, for each subgraph, random walks starting from selected nodes to obtain at least one node sequence corresponding to each subgraph; and obtaining, with a preset sliding window, a positive sample set corresponding to each subgraph from the node sequences, one positive sample in the positive sample set including one source node and one positive node;
sampling negative nodes once on the basis of each subgraph's positive sample set to obtain each subgraph's sample set, one sample in the sample set including one source node, one positive node and at least one negative node, the distribution of the negative nodes being consistent with that of the positive nodes, and the negative nodes being correlated with preset attributes of the source node.
The system 101 for obtaining expressions of relationships between entities feeds the same batch of sample sets of every subgraph into the preset machine learning model for training, respectively obtaining the vector expressions of the source node, the positive node and each negative node of every sample in each subgraph's sample set, by:
inputting, for the same batch of sample sets of each subgraph, the source node, positive node and negative nodes included in every sample of the sample set, together with the attribute information of each node, into the machine learning model;
mapping, through the embedding layer of the machine learning model, the sparse features of the nodes included in the samples into dense features;
training the dense features of the source node through one corresponding machine learning model network to obtain the vector expression of the source node, and training the dense features of the positive and negative nodes through another corresponding machine learning model to obtain the vector expressions of the positive node and each negative node.
The system 101 for obtaining expressions of relationships between entities optimizes the parameters of the machine learning model with the preset loss function based on the obtained vector expressions of the nodes by:
computing, from the trained vector expressions of the source node, the positive node and each negative node of every sample, the cosine distances between the source node and the positive node and between the source node and each negative node;
optimizing, by the preset loss function, the parameters of the machine learning model based on the cosine distances.
The system 101 for obtaining expressions of relationships between entities aggregates, through the preset aggregation model, the vector expressions of the same source node in different subgraphs to obtain one vector expression of the same source node by:
determining, from each vector expression of the source node trained on each subgraph and the corresponding learned weight factor, the weight of each vector expression of the source node trained on each subgraph;
weighted-summing, with the determined weights, the vector expressions of the source node trained on each subgraph to obtain one aggregated vector expression of the source node.
The system 101 for obtaining expressions of relationships between entities optimizes the parameters of the machine learning model with the preset loss function based on the obtained vector expressions of the nodes by:
computing, from the trained vector expressions of the source node, the positive node and each negative node of every sample, the cosine distances between the source node and the positive node and between the source node and each negative node;
optimizing, by the preset loss function, the parameters of the machine learning model based on the cosine distances.
Optionally, the advertisement recall matching system determines the degree of matching between query term nodes, product nodes and search advertisement nodes by:
aggregating, with an attention mechanism, the low-dimensional vector expression of the query term node and the low-dimensional vector expressions of the product nodes pre-clicked by users under the same query term, obtaining the low-dimensional vector expression of a virtual request node; the virtual request node is a virtual node constructed from the query term node and the product nodes pre-clicked by users under the same query term;
determining the degree of matching between query term nodes, product nodes and search advertisement nodes from the low-dimensional vector expression of the virtual request node and the low-dimensional vector expressions of the search advertisement nodes.
Optionally, the advertisement recall matching system selects, according to the degree of matching, search advertisements whose degree of matching with the product and the query term meets the set requirement by:
selecting, according to the cosine distance between the virtual request node's low-dimensional fused information vector and the search advertisement nodes' low-dimensional fused information vectors, the search advertisements whose distance meets the set requirement.
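A hedged sketch of this recall match follows: the virtual request node is built by a simple attention pooling over the query vector and the pre-clicked item vectors, and ads are then ranked by cosine similarity. The learnable scoring vector `z` and the top-n cutoff are illustrative assumptions; thresholding and business filtering are omitted:

```python
import torch
import torch.nn.functional as F

def recall_ads(query_vec, preclick_item_vecs, ad_vecs, z, top_n):
    """Aggregate the query vector with the user's pre-clicked item vectors
    into a virtual-request vector, then rank ads by cosine similarity."""
    cand = torch.cat([query_vec.unsqueeze(0), preclick_item_vecs], dim=0)  # (1+m, d)
    lam = F.softmax(cand @ z, dim=0)                                       # attention weights
    request = (lam.unsqueeze(-1) * cand).sum(dim=0)                        # virtual request node
    scores = F.cosine_similarity(request.unsqueeze(0), ad_vecs, dim=-1)    # one score per ad
    return scores.topk(top_n).indices

d = 16
top = recall_ads(torch.randn(d), torch.randn(4, d), torch.randn(100, d),
                 z=torch.randn(d), top_n=10)
```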
An embodiment of the present invention further provides a computer-readable storage medium on which computer instructions are stored; when executed by a processor, the instructions implement the above method for obtaining expressions of relationships between entities.
An embodiment of the present invention further provides a heterogeneous graph learning device, comprising a memory, a processor, and a computer program stored in the memory and runnable on the processor; when executing the program, the processor implements the above method for obtaining expressions of relationships between entities.
As for the systems in the above embodiments, the specific manner in which the individual devices and modules perform operations has been described in detail in the embodiments of the method and will not be elaborated here.
Unless specifically stated otherwise, terms such as processing, computing, calculating, determining and displaying may refer to the actions and/or processes of one or more processing or computing systems, or similar devices, that manipulate data represented as physical (e.g., electronic) quantities within the registers or memories of a processing system and transform them into other data similarly represented as physical quantities within the memories, registers or other such information storage, transmission or display devices of the processing system. Information and signals may be represented using any of a variety of different techniques and methods. For example, the data, instructions, commands, information, signals, bits, symbols and chips mentioned throughout the above description may be represented by voltages, currents, electromagnetic waves, magnetic fields or particles, light fields or particles, or any combination thereof.
It should be understood that the specific order or hierarchy of steps in the disclosed processes is an example of exemplary approaches. Based on design preferences, it should be understood that the specific order or hierarchy of steps in the processes may be rearranged without departing from the scope of protection of the present disclosure. The appended method claims present the elements of the various steps in an exemplary order and are not meant to be limited to the specific order or hierarchy presented.
In the above detailed description, various features are grouped together in a single embodiment to simplify the present disclosure. This method of disclosure should not be interpreted as reflecting an intention that embodiments of the claimed subject matter require more features than are expressly recited in each claim. Rather, as the appended claims reflect, the present invention lies in less than all features of a single disclosed embodiment. Therefore, the appended claims are hereby expressly incorporated into the detailed description, with each claim standing on its own as a separate preferred embodiment of the present invention.
Those skilled in the art should further appreciate that the various illustrative logical blocks, modules, circuits and algorithm steps described in connection with the embodiments herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, the various illustrative components, blocks, modules, circuits and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and the design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of protection of the present disclosure.
The steps of the method or algorithm described in connection with the embodiments herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. Of course, the storage medium may also be an integral part of the processor. The processor and the storage medium may reside in an ASIC, and the ASIC may reside in a user terminal. Of course, the processor and the storage medium may also reside as discrete components in a user terminal.
For a software implementation, the techniques described in this application may be implemented with modules (e.g., procedures, functions, and so on) that perform the functions described herein. The software codes may be stored in a memory unit and executed by a processor. The memory unit may be implemented within the processor or external to the processor; in the latter case it is communicatively coupled to the processor via various means, as is known in the art.
What has been described above includes examples of one or more embodiments. It is, of course, not possible to describe every conceivable combination of components or methods for purposes of describing the above embodiments, but one of ordinary skill in the art will recognize that the various embodiments may be further combined and permuted. Accordingly, the embodiments described herein are intended to cover all such alterations, modifications and variations that fall within the scope of protection of the appended claims. Furthermore, with respect to the term "comprising" used in the specification or the claims, the word covers in a manner similar to the term "including", just as "including" is interpreted when employed as a transitional word in a claim. Moreover, any use of the term "or" in the specification or the claims is meant to denote a "non-exclusive or".

Claims (16)

  1. An advertisement recall system, characterized by comprising a system for obtaining expressions of relationships between entities and an advertisement recall matching system;
    the system for obtaining expressions of relationships between entities is configured to split a pre-constructed heterogeneous graph into subgraphs by edge type, one subgraph including one type of edge; the node types in the heterogeneous graph include at least one of advertisement, product and query term, and the edge types include at least one of click edge, co-click edge, collaborative filtering edge, content semantic similarity edge and attribute similarity edge;
    sample each subgraph to obtain a sample set of each subgraph, each sample in a sample set including one source node, one positive node and at least one negative node;
    feed the same batch of sample sets of every subgraph into a preset machine learning model for training, respectively obtaining the vector expression of the source node, the vector expression of the positive node and the vector expression of each negative node of every sample in each subgraph's sample set; and optimize the parameters of the machine learning model with a preset loss function based on the obtained vector expressions of the nodes;
    aggregate, through a preset aggregation model, the vector expressions of the same source node in different subgraphs to obtain one vector expression of the same source node; and optimize the parameters of the aggregation model with a preset loss function based on the one vector expression of the same source node and the vector expressions of the positive node and each negative node included in that source node's samples in each subgraph;
    repeat the above process to train all batches of sample sets a preset number of times, obtaining one low-dimensional vector expression of every node in the heterogeneous graph, one node in the heterogeneous graph corresponding to one entity in the sample data;
    the advertisement recall matching system is configured to use the low-dimensional vector expressions of the query term nodes, product nodes and search advertisement nodes obtained by the system for obtaining expressions of relationships between entities to determine the degree of matching between query term nodes, product nodes and search advertisement nodes, and to select, according to the degree of matching, search advertisements whose degree of matching with the product and the query term meets a set requirement.
  2. The system of claim 1, characterized in that the system for obtaining expressions of relationships between entities samples each subgraph to obtain each subgraph's sample set by:
    performing, for each subgraph, random walks starting from selected nodes to obtain at least one node sequence corresponding to each subgraph; and obtaining, with a preset sliding window, a positive sample set corresponding to each subgraph from the node sequences, one positive sample in the positive sample set including one source node and one positive node;
    sampling negative nodes once on the basis of each subgraph's positive sample set to obtain each subgraph's sample set, one sample in the sample set including one source node, one positive node and at least one negative node, the distribution of the negative nodes being consistent with that of the positive nodes, and the negative nodes being correlated with preset attributes of the source node.
  3. The system of claim 1, characterized in that the system for obtaining expressions of relationships between entities feeds the same batch of sample sets of every subgraph into the preset machine learning model for training, respectively obtaining the vector expressions of the source node, the positive node and each negative node of every sample in each subgraph's sample set, by:
    inputting, for the same batch of sample sets of each subgraph, the source node, positive node and negative nodes included in every sample of the sample set, together with the attribute information of each node, into the machine learning model;
    mapping, through the embedding layer of the machine learning model, the sparse features of the nodes included in the samples into dense features;
    training the dense features of the source node through one corresponding machine learning model network to obtain the vector expression of the source node, and training the dense features of the positive and negative nodes through another corresponding machine learning model to obtain the vector expressions of the positive node and each negative node.
  4. The system of claim 1, characterized in that the system for obtaining expressions of relationships between entities optimizes the parameters of the machine learning model with the preset loss function based on the obtained vector expressions of the nodes by:
    computing, from the trained vector expressions of the source node, the positive node and each negative node of every sample, the cosine distances between the source node and the positive node and between the source node and each negative node;
    optimizing, by the preset loss function, the parameters of the machine learning model based on the cosine distances.
  5. The system of claim 1, characterized in that the system for obtaining expressions of relationships between entities aggregates, through the preset aggregation model, the vector expressions of the same source node in different subgraphs to obtain one vector expression of the same source node by:
    determining, from each vector expression of the source node trained on each subgraph and the corresponding learned weight factor, the weight of each vector expression of the source node trained on each subgraph;
    weighted-summing, with the determined weights, the vector expressions of the source node trained on each subgraph to obtain one aggregated vector expression of the source node.
  6. The system of claim 1, characterized in that the system for obtaining expressions of relationships between entities optimizes the parameters of the machine learning model with the preset loss function based on the obtained vector expressions of the nodes by:
    computing, from the trained vector expressions of the source node, the positive node and each negative node of every sample, the cosine distances between the source node and the positive node and between the source node and each negative node;
    optimizing, by the preset loss function, the parameters of the machine learning model based on the cosine distances.
  7. A method for obtaining expressions of relationships between entities, characterized by comprising:
    splitting a pre-constructed heterogeneous graph into subgraphs by edge type, one subgraph including one type of edge;
    sampling each subgraph to obtain a sample set of each subgraph, each sample in a sample set including one source node, one positive node and at least one negative node;
    feeding the same batch of sample sets of every subgraph into a preset machine learning model for training, respectively obtaining the vector expression of the source node, the vector expression of the positive node and the vector expression of each negative node of every sample in each subgraph's sample set; and optimizing the parameters of the machine learning model with a preset loss function based on the obtained vector expressions of the nodes;
    aggregating, through a preset aggregation model, the vector expressions of the same source node in different subgraphs to obtain one vector expression of the same source node; and optimizing the parameters of the aggregation model with a preset loss function based on the one vector expression of the same source node and the vector expressions of the positive node and each negative node included in that source node's samples in each subgraph;
    repeating the above process to train all batches of sample sets a preset number of times, obtaining one low-dimensional vector expression of every node in the heterogeneous graph, one node in the heterogeneous graph corresponding to one entity in the sample data.
  8. The method of claim 7, characterized in that sampling each subgraph to obtain each subgraph's sample set includes:
    performing, for each subgraph, random walks starting from selected nodes to obtain at least one node sequence corresponding to each subgraph; and obtaining, with a preset sliding window, a positive sample set corresponding to each subgraph from the node sequences, one positive sample in the positive sample set including one source node and one positive node;
    sampling negative nodes once on the basis of each subgraph's positive sample set to obtain each subgraph's sample set, one sample in the sample set including one source node, one positive node and at least one negative node, the distribution of the negative nodes being consistent with that of the positive nodes, and the negative nodes being correlated with preset attributes of the source node.
  9. The method of claim 8, characterized in that obtaining, with a preset sliding window, a positive sample set corresponding to each subgraph from the node sequences specifically includes:
    for each node in a sequence, obtaining, according to the preset size of the sliding window, the other nodes that fall within the sliding window when that node is inside the window, and pairing each of the obtained nodes with that node to form sample pairs, yielding the positive sample set.
  10. The method of claim 8, characterized in that negative nodes are sampled from among the positive nodes, obtaining at least one corresponding negative node for every pair of a source node and a positive node, the distribution of the negative nodes being consistent with that of the positive nodes and the negative nodes being correlated with the source node.
  11. The method of claim 10, characterized in that sampling negative nodes from among the positive nodes to obtain at least one corresponding negative node for every pair of a source node and a positive node, with the distribution of the negative nodes consistent with that of the positive nodes and the negative nodes correlated with the source node, includes:
    collecting statistics on the positive node pairs in the sample set, obtaining the category of each positive node and the number of times the same positive node appears in different positive samples as that positive node's distribution weight;
    selecting, according to the category information of the source node, the positive nodes under that category from the counted positive nodes, determining from the distribution weights the probability of taking each obtained positive node as a negative node, and selecting, according to the probabilities, negative nodes whose correlation with the source node meets the requirement.
  12. The method of claim 7, characterized in that feeding the same batch of sample sets of every subgraph into the preset machine learning model for training, respectively obtaining the vector expressions of the source node, the positive node and each negative node of every sample in each subgraph's sample set, includes:
    inputting, for the same batch of sample sets of each subgraph, the source node, positive node and negative nodes included in every sample of the sample set, together with the attribute information of each node, into the machine learning model;
    mapping, through the embedding layer of the machine learning model, the sparse features of the nodes included in the samples into dense features;
    training the dense features of the source node through one corresponding machine learning model network to obtain the vector expression of the source node, and training the dense features of the positive and negative nodes through another corresponding machine learning model to obtain the vector expressions of the positive node and each negative node.
  13. The method of claim 7, characterized in that optimizing the parameters of the machine learning model with the preset loss function based on the obtained vector expressions of the nodes includes:
    computing, from the trained vector expressions of the source node, the positive node and each negative node of every sample, the cosine distances between the source node and the positive node and between the source node and each negative node;
    optimizing, by the preset loss function, the parameters of the machine learning model based on the cosine distances.
  14. The method of claim 7, characterized in that aggregating, through the preset aggregation model, the vector expressions of the same source node in different subgraphs to obtain one vector expression of the same source node includes:
    determining, from each vector expression of the source node trained on each subgraph and the corresponding learned weight factor, the weight of each vector expression of the source node trained on each subgraph;
    weighted-summing, with the determined weights, the vector expressions of the source node trained on each subgraph to obtain one aggregated vector expression of the source node.
  15. The method of claim 7, characterized in that optimizing the parameters of the machine learning model with the preset loss function based on the obtained vector expressions of the nodes includes:
    computing, from the trained vector expressions of the source node, the positive node and each negative node of every sample, the cosine distances between the source node and the positive node and between the source node and each negative node;
    optimizing, by the preset loss function, the parameters of the machine learning model based on the cosine distances.
  16. A system for obtaining expressions of relationships between entities, characterized by comprising a registration device, a storage device, a computing device and a parameter exchange device;
    the storage device is configured to store the data of the heterogeneous graph;
    the computing device is configured to obtain the data of the heterogeneous graph from the storage device through the registration device, and to learn the heterogeneous graph with the method for obtaining expressions of relationships between entities of any one of claims 7 to 15, obtaining a low-dimensional vector expression of every node in the heterogeneous graph;
    the parameter exchange device is configured to exchange parameters with the computing device.
PCT/CN2020/070250 2019-01-16 2020-01-03 Method, system and device for obtaining expressions of relationships between entities, and advertisement recall system WO2020147595A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910041481.3A CN111444395B (zh) 2019-01-16 2019-01-16 Method, system and device for obtaining expressions of relationships between entities, and advertisement recall system
CN201910041481.3 2019-01-16

Publications (1)

Publication Number Publication Date
WO2020147595A1 true WO2020147595A1 (zh) 2020-07-23

Family

ID=71614354

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/070250 WO2020147595A1 (zh) 2019-01-16 2020-01-03 Method, system and device for obtaining expressions of relationships between entities, and advertisement recall system

Country Status (2)

Country Link
CN (1) CN111444395B (zh)
WO (1) WO2020147595A1 (zh)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112508115A (zh) * 2020-12-15 2021-03-16 Beijing Baidu Netcom Science and Technology Co., Ltd. Method, apparatus, device and computer storage medium for establishing a node representation model
CN112634995A (zh) * 2020-12-21 2021-04-09 Shaoxing Shuhong Technology Co., Ltd. Artificial-intelligence-based method and device for automatic optimization of phenol cracking parameters
CN113033194A (zh) * 2021-03-09 2021-06-25 Beijing Baidu Netcom Science and Technology Co., Ltd. Training method, apparatus, device and storage medium for a semantic representation graph model
CN113673244A (zh) * 2021-01-04 2021-11-19 Tencent Technology (Shenzhen) Co., Ltd. Medical text processing method and apparatus, computer device and storage medium
CN113763014A (zh) * 2021-01-05 2021-12-07 Beijing Wodong Tianjun Information Technology Co., Ltd. Method and apparatus for determining item co-occurrence relationships, and method and apparatus for obtaining a decision model
CN116737745A (zh) * 2023-08-16 2023-09-12 Hangzhou Zhouli Data Technology Co., Ltd. Method and apparatus for updating entity vector representations in a supply chain network graph

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112214499B (zh) * 2020-12-03 2021-03-19 Tencent Technology (Shenzhen) Co., Ltd. Graph data processing method and apparatus, computer device and storage medium
CN112906745B (zh) * 2021-01-21 2022-03-29 Tianjin University Integrity-aware intelligent network training method based on edge collaboration
CN113392289B (zh) * 2021-06-08 2022-11-01 Beijing Sankuai Online Technology Co., Ltd. Search recommendation method and apparatus, and electronic device
CN113761392B (zh) * 2021-09-14 2022-04-12 Shanghai Renyimen Technology Co., Ltd. Content recall method, computing device and computer-readable storage medium
CN113609346B (zh) * 2021-10-08 2022-01-07 Qichacha Technology Co., Ltd. Natural-person name disambiguation method, device and medium based on enterprise association relationships
CN116029891A (zh) * 2022-05-19 2023-04-28 Beijing Baidu Netcom Science and Technology Co., Ltd. Graph data storage, access and processing methods, training method, device and medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140317038A1 (en) * 2013-04-23 2014-10-23 International Business Machines Corporation Predictive and descriptive analysis on relations graphs with heterogeneous entities
CN105843799A (zh) * 2016-04-05 2016-08-10 University of Electronic Science and Technology of China Academic paper tag recommendation method based on a multi-source heterogeneous information graph model
CN107291803A (zh) * 2017-05-15 2017-10-24 Guangdong University of Technology Network representation method fusing multiple types of information
CN108491511A (zh) * 2018-03-23 2018-09-04 Tencent Technology (Shenzhen) Co., Ltd. Graph-data-based data mining method and apparatus, and model training method and apparatus
CN109213801A (zh) * 2018-08-09 2019-01-15 Alibaba Group Holding Limited Association-relationship-based data mining method and apparatus

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE102007044380A1 (de) * 2007-09-17 2009-03-19 Siemens Ag Method for the computer-aided learning of a probabilistic network
US8719447B2 (en) * 2010-05-04 2014-05-06 Aryaka Networks, Inc. Heterogeneous service provider model through pay-for-performance based transit settlements
CN106155635B (zh) * 2015-04-03 2020-09-18 Beijing Qihoo Technology Co., Ltd. Data processing method and apparatus
CN106777339A (zh) * 2017-01-13 2017-05-31 Shenzhen Weiteshi Technology Co., Ltd. Method for identifying authors based on a heterogeneous network embedding model
CN107577710B (zh) * 2017-08-01 2020-06-19 Guangzhou HKUST Fok Ying Tung Research Institute Recommendation method and apparatus based on heterogeneous information networks
CN108763376B (zh) * 2018-05-18 2020-09-29 Zhejiang University Knowledge representation learning method fusing relation paths, types and entity description information

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140317038A1 (en) * 2013-04-23 2014-10-23 International Business Machines Corporation Predictive and descriptive analysis on relations graphs with heterogeneous entities
CN105843799A (zh) * 2016-04-05 2016-08-10 University of Electronic Science and Technology of China Academic paper tag recommendation method based on a multi-source heterogeneous information graph model
CN107291803A (zh) * 2017-05-15 2017-10-24 Guangdong University of Technology Network representation method fusing multiple types of information
CN108491511A (zh) * 2018-03-23 2018-09-04 Tencent Technology (Shenzhen) Co., Ltd. Graph-data-based data mining method and apparatus, and model training method and apparatus
CN109213801A (zh) * 2018-08-09 2019-01-15 Alibaba Group Holding Limited Association-relationship-based data mining method and apparatus

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112508115A (zh) * 2020-12-15 2021-03-16 Beijing Baidu Netcom Science and Technology Co., Ltd. Method, apparatus, device and computer storage medium for establishing a node representation model
CN112508115B (zh) * 2020-12-15 2023-10-24 Beijing Baidu Netcom Science and Technology Co., Ltd. Method, apparatus, device and computer storage medium for establishing a node representation model
CN112634995A (zh) * 2020-12-21 2021-04-09 Shaoxing Shuhong Technology Co., Ltd. Artificial-intelligence-based method and device for automatic optimization of phenol cracking parameters
CN112634995B (zh) * 2020-12-21 2024-05-31 Shaoxing Shuhong Technology Co., Ltd. Artificial-intelligence-based method and device for automatic optimization of phenol cracking parameters
CN113673244A (zh) * 2021-01-04 2021-11-19 Tencent Technology (Shenzhen) Co., Ltd. Medical text processing method and apparatus, computer device and storage medium
CN113673244B (zh) * 2021-01-04 2024-05-10 Tencent Technology (Shenzhen) Co., Ltd. Medical text processing method and apparatus, computer device and storage medium
CN113763014A (zh) * 2021-01-05 2021-12-07 Beijing Wodong Tianjun Information Technology Co., Ltd. Method and apparatus for determining item co-occurrence relationships, and method and apparatus for obtaining a decision model
CN113033194A (zh) * 2021-03-09 2021-06-25 Beijing Baidu Netcom Science and Technology Co., Ltd. Training method, apparatus, device and storage medium for a semantic representation graph model
CN113033194B (zh) * 2021-03-09 2023-10-24 Beijing Baidu Netcom Science and Technology Co., Ltd. Training method, apparatus, device and storage medium for a semantic representation graph model
CN116737745A (zh) * 2023-08-16 2023-09-12 Hangzhou Zhouli Data Technology Co., Ltd. Method and apparatus for updating entity vector representations in a supply chain network graph
CN116737745B (zh) * 2023-08-16 2023-10-31 Hangzhou Zhouli Data Technology Co., Ltd. Method and apparatus for updating entity vector representations in a supply chain network graph

Also Published As

Publication number Publication date
CN111444395B (zh) 2023-05-16
CN111444395A (zh) 2020-07-24

Similar Documents

Publication Publication Date Title
WO2020147595A1 (zh) Method, system and device for obtaining expressions of relationships between entities, and advertisement recall system
WO2020147594A1 (zh) Method, system and device for obtaining expressions of relationships between entities, and advertisement recall system
Xu et al. A novel POI recommendation method based on trust relationship and spatial–temporal factors
Mao et al. Multiobjective e-commerce recommendations based on hypergraph ranking
US8543518B2 (en) Deducing shadow user profiles for ad campaigns
US20160188734A1 (en) Method and apparatus for programmatically synthesizing multiple sources of data for providing a recommendation
US20160283481A1 (en) Method and apparatus for combining text search and recommendation engines
US20160253325A1 (en) Method and apparatus for programmatically adjusting the relative importance of content data as behavioral data changes
US20160189218A1 (en) Systems and methods for sponsored search ad matching
US11436628B2 (en) System and method for automated bidding using deep neural language models
Gu et al. Enhancing session-based social recommendation through item graph embedding and contextual friendship modeling
CN113127754A (zh) 一种基于知识图谱的供应商推荐方法
WO2023142520A1 (zh) 信息推荐方法及装置
Xin et al. ATNN: adversarial two-tower neural network for new item’s popularity prediction in E-commerce
Liang et al. Collaborative filtering based on information-theoretic co-clustering
US10304081B1 (en) Yielding content recommendations based on serving by probabilistic grade proportions
Xie et al. Competitive influence maximization considering inactive nodes and community homophily
Lang et al. POI recommendation based on a multiple bipartite graph network model
Jin et al. IM2Vec: Representation learning-based preference maximization in geo-social networks
Wang et al. Who are the best adopters? User selection model for free trial item promotion
CN116823410B (zh) 数据处理方法、对象处理方法、推荐方法及计算设备
Guo et al. Price-aware enhanced dynamic recommendation based on deep learning
Jiang et al. A fusion recommendation model based on mutual information and attention learning in heterogeneous social networks
WO2023284516A1 (zh) 基于知识图谱的信息推荐方法、装置、设备、介质及产品
Leng et al. Geometric deep learning based recommender system and an interpretable decision support system

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20741399

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20741399

Country of ref document: EP

Kind code of ref document: A1