WO2020147601A1 - System for learning graphs - Google Patents

System for learning graphs

Info

Publication number
WO2020147601A1
Authority
WO
WIPO (PCT)
Prior art keywords
node
graph
nodes
storage
computing
Prior art date
Application number
PCT/CN2020/070416
Other languages
English (en)
French (fr)
Inventor
张研
任毅
杨斯然
陈根宝
魏源
田旭
Original Assignee
阿里巴巴集团控股有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 阿里巴巴集团控股有限公司
Publication of WO2020147601A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/20: Information retrieval of structured data, e.g. relational data
    • G06F 16/24: Querying
    • G06F 16/245: Query processing
    • G06F 16/2458: Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F 16/30: Information retrieval of unstructured textual data
    • G06F 16/33: Querying
    • G06F 16/3331: Query processing
    • G06F 16/36: Creation of semantic tools, e.g. ontology or thesauri
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management


Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention discloses a system for learning graphs. The system includes compute nodes and storage nodes. A storage node is configured to store subgraphs and provide query services to the compute nodes; the subgraphs are obtained by partitioning a graph in advance, and the number of subgraphs obtained by partitioning a graph is greater than or equal to 2. A compute node is configured to send query requests to the storage nodes according to a preset graph learning task, use the graph-related data obtained by the query requests as one of the inputs of the preset graph learning task, and execute the graph learning task. The number of storage nodes is configurable to two or more, and the number of compute nodes is configurable to one or more. Compared with the prior art, the system improves the efficiency of graph learning.

Description

System for learning graphs
This application claims priority to Chinese patent application No. 201910041326.1, filed on January 16, 2019 and entitled "System for learning graphs", the entire contents of which are incorporated herein by reference.
Technical Field
The present invention relates to the field of computer technology, and in particular to a system for learning graphs.
Background
With the spread of mobile terminals and application software, service providers in fields such as social networking, e-commerce, logistics, travel, food delivery, and marketing have accumulated massive amounts of business data. Mining the relationships between business entities from this data has become an important research direction in data mining, and as machine processing power has grown, more and more engineers have studied how to perform such mining with machine learning techniques.
The inventors of the present invention found the following.
At present, learning from massive business data with machine learning techniques to obtain a graph that expresses entities and the relationships between them, that is, performing graph learning on massive business data, has become a preferred technical direction. Put simply, a graph consists of nodes and edges. As shown in Figure 1, each numbered circle represents a node, a node represents an entity, and an edge between two nodes represents the relationship between them. A graph generally contains two or more nodes and one or more edges, so a graph can also be understood as a set of nodes together with a set of edges, usually written G(V, E), where G is the graph, V is the set of nodes in G, and E is the set of edges in G. Graphs can be divided into homogeneous and heterogeneous graphs: a heterogeneous graph is one in which the nodes are of different types (the edge types may be the same or different) or the edges are of different types (the node types may be the same or different). Figure 1 shows a heterogeneous graph, in which edges of the same type are drawn with the same line style and nodes of the same type are drawn with the same geometric shape.
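For concreteness, the G(V, E) notation and the typed nodes and edges of a heterogeneous graph can be sketched in a few lines of Python; the class and method names below (HeteroGraph, add_node, add_edge) are illustrative assumptions, not terminology from this application.

```python
from collections import defaultdict

class HeteroGraph:
    """Minimal heterogeneous graph G(V, E): typed nodes and typed edges."""

    def __init__(self):
        self.node_type = {}           # node id -> node type (V, with types)
        self.adj = defaultdict(list)  # node id -> [(neighbor id, edge type)] (E)

    def add_node(self, nid, ntype):
        self.node_type[nid] = ntype

    def add_edge(self, src, dst, etype):
        # an edge expresses the relationship between two entities
        self.adj[src].append((dst, etype))
        self.adj[dst].append((src, etype))

# e-commerce example from the description: Query, Item and Ad nodes
g = HeteroGraph()
g.add_node(1, "Query"); g.add_node(2, "Item"); g.add_node(3, "Ad")
g.add_edge(1, 2, "query_behavior")    # a query leads to an item
g.add_edge(2, 3, "marketing_effect")  # an item is tied to an ad
```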
The prior art performs graph learning on a single machine: the machine must both store the graph and learn from it on the basis of training data. When the graph has a large number of nodes and edges and/or the training data is large, a single machine suffers from high storage pressure and/or excessively long graph learning time.
Summary of the Invention
In view of the above problems, the present invention is proposed in order to provide a system for learning graphs that overcomes the above problems or at least partially solves them.
The system for learning graphs provided by embodiments of the present invention includes at least compute nodes and storage nodes;
the storage node is configured to store subgraphs and provide query services to the compute nodes, the subgraphs being obtained by partitioning a graph in advance, and the number of subgraphs obtained by partitioning a graph being greater than or equal to 2;
the compute node is configured to send query requests to the storage nodes according to a preset graph learning task, use the graph-related data obtained by the query requests as one of the inputs of the preset graph learning task, and execute the graph learning task;
wherein the number of storage nodes is configurable to two or more, and the number of compute nodes is configurable to one or more.
The beneficial effects of the above technical solution provided by the embodiments of the present invention include at least the following.
Compared with the prior art, in which a single machine performs both graph storage and graph learning, the system provided by the present invention separates graph learning task execution from graph query and storage services by introducing compute nodes and storage nodes. At the same time, the system supports configuring the number of storage nodes, which realizes distributed storage of a graph and solves the technical problem of the high storage pressure a single machine faces when storing a very large graph. In addition, the system supports configuring the number of compute nodes. That number is generally related to the total graph learning time: the longer the total learning time, the more compute nodes can be configured. Compared with a single machine executing the graph learning task, configuring the number of compute nodes therefore improves graph learning efficiency and saves learning time, solving the prior-art problems of low graph learning efficiency and long learning time.
Other features and advantages of the present invention will be set forth in the following description and will in part become apparent from the description or be understood by practicing the invention. The objectives and other advantages of the invention can be realized and obtained through the structures particularly pointed out in the written description, the claims, and the accompanying drawings.
The technical solution of the present invention is described in further detail below through the drawings and embodiments.
Brief Description of the Drawings
The drawings are provided for a further understanding of the present invention and constitute a part of the specification; together with the embodiments of the invention, they serve to explain the invention and do not limit it. In the drawings:
Figure 1 is a schematic diagram of a graph;
Figure 2 is a schematic diagram of the composition of a system for learning graphs provided by Embodiment 1 of the present invention;
Figure 3 is a schematic diagram of the composition of a system for learning graphs provided by Embodiment 2 of the present invention;
Figure 4 is a schematic diagram of the composition of a system for learning graphs provided by Embodiment 3 of the present invention;
Figure 5 is a schematic diagram of the composition of a system for learning graphs provided by Embodiment 4 of the present invention.
Detailed Description
Exemplary embodiments of the present disclosure are described in more detail below with reference to the drawings. Although the drawings show exemplary embodiments of the present disclosure, it should be understood that the disclosure can be implemented in various forms and should not be limited by the embodiments set forth here; rather, these embodiments are provided so that the disclosure will be understood more thoroughly and so that its scope can be conveyed completely to those skilled in the art.
Regarding graphs, it should be noted that a graph is a data structure. In application, entities in a real-world scenario (such as advertisements and commodities) are abstracted into the nodes of a graph, relationships between entities (such as the marketing-effect relationship between a commodity and an advertisement) are treated as its edges, and a mesh (graph) structure is obtained by joining nodes with edges. For example, in e-commerce the nodes of a graph can be Query, Item (product category), Ad (advertisement), and so on, and the edges between nodes can be query behavior relationships, product content relationships, and the like; in travel, the nodes can be queries, locations, routes, and so on, and the edges can be the associations between locations and routes. The nodes and edges of the graph in the present invention can therefore be determined according to the business scenario in which the graph is applied, and the present invention imposes no limitation on them.
Graph learning is meaningful when it is oriented toward a business scenario. Once the entities corresponding to the nodes and the inter-entity relationships corresponding to the edges have been determined on the basis of the business scenario, the graph is given business and technical meaning; executing the graph learning tasks that correspond to the technical and business problems to be solved in that scenario then yields results that solve those problems. For example, graph representation learning can represent a complex graph in low-dimensional, real-valued, dense vector form, giving it representation and reasoning capabilities and making it easier to perform other machine learning tasks.
The above is a brief account of graphs. Addressing the prior-art technical problems of high storage pressure and/or long learning time when graph learning is performed on a single machine, the present invention provides a new system architecture for learning graphs, which may also be called a graph-learning-oriented system framework; the system effectively solves the prior-art problems of high storage pressure and/or an excessively long total graph learning time.
As shown in Figure 2, the system for learning graphs provided by Embodiment 1 of the present invention includes compute nodes (also called learning nodes) and storage nodes;
the storage node is configured to store subgraphs and provide query services to the compute nodes, the subgraphs being obtained by partitioning a graph in advance, and the number of subgraphs obtained by partitioning a graph being greater than or equal to 2;
the compute node is configured to send query requests to the storage nodes according to a preset graph learning task, use the graph-related data obtained by the query requests as one of the inputs of the preset graph learning task, and execute the graph learning task;
wherein the number of storage nodes is configurable to two or more, and the number of compute nodes is configurable to one or more.
The above is the system provided by Embodiment 1 of the present invention. Because a graph is partitioned into at least two subgraphs, the system shown in Figure 2 contains at least two storage nodes. Compared with the prior art, in which a single machine performs both graph storage and graph learning, this system separates graph learning task execution from graph query and storage services by introducing compute nodes and storage nodes. At the same time, it supports configuring the number of storage nodes, realizing distributed storage of a graph and solving the technical problem of the high storage pressure a single machine faces when storing a very large graph. In addition, it supports configuring the number of compute nodes, which is generally related to the total graph learning time: the longer the total learning time, the more compute nodes can be configured. Compared with a single machine executing the graph learning task, configuring the number of compute nodes therefore improves graph learning efficiency and saves learning time, solving the prior-art problems of low learning efficiency and long learning time.
As for the number of storage nodes started when the system provided by the present invention executes a graph learning task, it can be configured as follows, and the following configuration method applies to any embodiment provided by the present invention. Specifically, when a graph is partitioned in advance into n (≥2) subgraphs, n*k storage nodes are configured and enabled, where k (≥1) is the number of replicas of each subgraph and one storage node stores one subgraph or one replica of a subgraph. This replicated configuration ensures that when the graph-related data requested by multiple compute nodes is stored in the same subgraph the system can still respond quickly, and that when one storage node of a subgraph fails the normal progress of the graph learning task is not affected, guaranteeing the reliability of the system.
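As a sketch of this n*k configuration, the placement rule below assigns replica j of subgraph i to storage node i*k + j; the rule itself is an assumption made for illustration, since the application only requires that each of the n*k storage nodes hold one subgraph or one replica.

```python
def assign_storage_nodes(n, k):
    """Map n subgraphs, each with k replicas, onto n*k storage nodes."""
    assert n >= 2 and k >= 1
    placement = {}  # storage node id -> (subgraph id, replica id)
    for i in range(n):
        for j in range(k):
            placement[i * k + j] = (i, j)
    return placement

# 3 subgraphs with 2 replicas each occupy 6 storage nodes
print(assign_storage_nodes(3, 2))
# {0: (0, 0), 1: (0, 1), 2: (1, 0), 3: (1, 1), 4: (2, 0), 5: (2, 1)}
```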
When the graph is stored in this distributed manner, consider the system shown in Figure 2. When the numbers of storage nodes and compute nodes are small, a compute node that requests graph-related data can send its query request to the storage nodes by broadcast; that is, the query request of the compute node is sent to all storage nodes. When the number of storage nodes or compute nodes is large, broadcasting the query request is no longer the preferred method. In that case, to make the compute node's requests for graph-related data more efficient, the correspondence between subgraphs and storage nodes can be stored locally on the compute node, so that before sending a request the compute node can learn from the locally stored correspondence which storage nodes the query request should be sent to (a minimal routing sketch follows the two approaches listed below).
The compute node can obtain this correspondence in the following ways:
1. The compute node actively asks the storage nodes and stores the subgraph-to-storage-node correspondence obtained from the inquiry locally;
2. After storing a subgraph, the storage node actively synchronizes the correspondence to the compute node, and the compute node stores the synchronized correspondence locally.
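A minimal routing sketch based on such a locally stored correspondence is shown below; the addresses and the round-robin choice among replicas are illustrative assumptions, the point being only that the compute node consults the local mapping instead of broadcasting.

```python
import itertools

# locally stored correspondence: subgraph id -> addresses of its storage nodes
subgraph_to_storage = {
    0: ["storage-0:9000", "storage-3:9000"],
    1: ["storage-1:9000", "storage-4:9000"],
    2: ["storage-2:9000", "storage-5:9000"],
}
_replica_cycle = {sg: itertools.cycle(addrs)
                  for sg, addrs in subgraph_to_storage.items()}

def route_query(subgraph_id):
    """Pick one storage node holding the subgraph instead of broadcasting."""
    return next(_replica_cycle[subgraph_id])

print(route_query(1))  # storage-1:9000
print(route_query(1))  # storage-4:9000 (next replica)
```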
The above is the system provided by Embodiment 1 of the present invention. In this system, when there are many storage nodes and compute nodes, each compute node has to store the aforementioned correspondence, which wastes node resources. To improve resource utilization, Embodiment 2 of the present invention provides another system for learning graphs. As shown in Figure 3, the system includes storage nodes, compute nodes, and a registration node, and differs from the system shown in Figure 2 as follows:
the registration node is configured to store the correspondence between subgraphs and storage nodes;
the compute node, according to the preset learning task, first asks the registration node for the relevant storage nodes and then sends query requests to the storage nodes obtained by the inquiry.
The registration node can also obtain the subgraph-to-storage-node correspondence, and store it locally, in the same ways described above for the compute node, which are not repeated here.
The number of registration nodes can be one or more and can be configured according to the learning task.
Note that once a compute node has learned the storage nodes from the registration node, if the compute node can always successfully obtain graph-related data from the corresponding storage nodes, it need not ask the registration node for the storage nodes again.
Whether the correspondence between subgraphs and storage nodes is stored on the registration node or on the compute nodes, when the correspondence changes it must be ensured that the registration node or the compute node can update its stored copy in time.
The above is the system provided by Embodiment 2 of the present invention. As stated earlier, the number of compute nodes is generally related to the total graph learning time, and at least one must be configured. After the system starts, all compute nodes in the system serve the same work objective, so the machine learning models deployed on them are essentially the same. To guarantee the learning effect, the compute nodes need to exchange parameters (a single compute node, of course, involves no parameter exchange). When there are not many compute nodes, one compute node can be chosen to take on the parameter exchange task, or the compute nodes can exchange parameters with one another according to certain rules. When the number of compute nodes is very large, to reduce system complexity, the present invention builds on the systems provided by the foregoing embodiments and provides two further systems for learning graphs, both of which include parameter exchange nodes; the number of parameter exchange nodes is also configurable. Specifically:
one embodiment includes storage nodes, compute nodes, and parameter exchange nodes;
as shown in Figure 4, another embodiment includes storage nodes, compute nodes, a registration node, and parameter exchange nodes.
When the system includes parameter exchange nodes, the compute node further synchronizes the parameters of the graph learning model (machine learning model) that executes the graph learning task on it to the parameter exchange node. The parameter exchange node performs an optimal-parameter computation on the basis of the parameters synchronized from the compute nodes and the locally stored parameters, and sends the computed parameters to the compute nodes; that is, the parameter exchange node is configured to perform the optimal-parameter computation and return the resulting optimal parameters to the compute nodes.
Those skilled in the art will understand that the above flow can also be called parameter exchange (interaction) between the compute nodes and the parameter exchange nodes. Parameter exchange between a parameter exchange node and a compute node can be performed in either of two ways, synchronously or asynchronously.
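The application leaves the "optimal-parameter computation" open; a common concrete choice is parameter averaging, which the following sketch assumes (the class name and the averaging rule are illustrative, not mandated by this application).

```python
import numpy as np

class ParameterExchangeNode:
    """Toy parameter exchange node that averages the parameters synchronized
    by the compute nodes with its locally stored copy (a synchronous round)."""

    def __init__(self, dim):
        self.params = np.zeros(dim)  # locally stored parameters

    def exchange(self, synced):
        # synced: list of parameter vectors synchronized by compute nodes
        stacked = np.vstack([self.params] + list(synced))
        self.params = stacked.mean(axis=0)
        return self.params  # sent back to the compute nodes

node = ParameterExchangeNode(dim=4)
print(node.exchange([np.ones(4), 3 * np.ones(4)]))  # four entries of 1.333...
```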
The composition and working principle of the system provided by the present invention have been introduced above; some technical features of the above system are described below in combination with different scenarios.
In the first scenario, the graph is data that expresses a graph structure, and the subgraphs obtained by partitioning it are the sub-structure data obtained after the graph structure data is partitioned. Such a graph generally still needs to be trained with training data before results that solve the corresponding technical and business problems can be obtained. In this scenario, the compute node must not only use the graph-related data requested from the storage nodes as one of the inputs of the graph learning task but also use the training data as an input of the task.
In the second scenario, the graph is constructed on the basis of training data; such a graph carries not only graph structure data but also training data. For ease of distinction, the present invention calls such a graph a training graph, and partitioning it yields training subgraphs. In this case, the compute node can request graph-related data from the storage nodes through global sampling query requests, neighbor sampling query requests, and feature sampling query requests, and the graph-related data thus requested is the input of the graph learning task.
In both of the above scenarios, the compute node of the present invention obtains graph-related data by sending query requests to the storage nodes; exactly which data it is depends on the scenario, and the present invention imposes no limitation.
For the first scenario, when it is determined from the total duration of the graph learning task that m compute nodes are needed, a batch of training data can be divided evenly into m shards of sub-training data, and each compute node executes the corresponding graph learning task on the basis of one shard. After the m compute nodes have finished learning the same batch of training data, if there are other batches, the next batch is learned in the same way, until all batches have been learned and the final result is obtained.
After the training data has been divided into m shards, each shard can be uploaded to a compute node manually. In addition, an embodiment of the present invention also provides a scheduling node;
the scheduling node is configured to divide each batch of training data into sub-training data according to the configured number of compute nodes and to synchronize the sub-training data to the compute nodes, each compute node receiving one shard.
Any system provided by the embodiments of the present invention may include a scheduling node; Figure 5 is merely an example of a system that includes one.
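The scheduling node's even split of a batch can be sketched as follows; numpy.array_split is used because it tolerates batches that do not divide evenly (shard sizes differ by at most one), and the function name is an assumption of this sketch.

```python
import numpy as np

def split_batch(batch, m):
    """Divide one batch of training data into m shards of sub-training data,
    one shard per compute node."""
    return [shard.tolist() for shard in np.array_split(np.asarray(batch), m)]

print(split_batch(list(range(10)), 3))
# [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9]] -> one shard per compute node
```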
The composition and working principle of the system provided by the present invention have been introduced above. In practical applications, one node of the present invention can be implemented by one machine, one machine can implement several of the nodes in the system, and the system can of course also be implemented by a server cluster; it suffices to ensure that each node in the system has the corresponding capabilities.
How queries for graph-related data are carried out between the storage nodes and the compute nodes is described in detail below. Compared with the prior art, the present invention stores the graph in a distributed manner and separates the compute nodes from the storage nodes, so the compute node of the present invention must obtain graph-related data through the query services provided by the storage nodes. Specifically, to obtain graph-related data suitable for different scenarios, the query services provided by the storage nodes of the present invention include the following.
First, the global sampling query service. The compute node sends a global sampling query request to the storage nodes, and a storage node performs a global sampling query after receiving the request.
Specifically, because the graph is stored in a distributed manner, the client (compute node) obtains from the registration node the total weights of the elements of the given type on the storage nodes; the client (compute node) then determines, according to the distribution weights of the elements across the storage nodes, the number of elements each storage node should sample, and sends a global sampling request to all storage nodes, informing each storage node of the element type to sample and the number of elements to sample. After receiving the query results returned by all storage nodes, the compute node must first merge them, for example by combining the element ids sampled from each storage node according to element type. "Element" is the collective term for nodes and edges.
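The weight-proportional allocation and the merge step of global sampling can be sketched as follows; the weights, the id ranges used to mock per-node sampling, and the rule of giving the rounding remainder to the heaviest node are all assumptions of this sketch.

```python
import random

def global_sample(storage_weights, element_type, total):
    """Allocate `total` samples of one element type across storage nodes in
    proportion to the weights obtained from the registration node, then merge
    the returned element ids by element type."""
    weight_sum = sum(storage_weights.values())
    counts = {s: int(total * w / weight_sum) for s, w in storage_weights.items()}
    # hand the rounding remainder to the heaviest storage node
    counts[max(storage_weights, key=storage_weights.get)] += total - sum(counts.values())

    merged = {element_type: []}
    for storage, count in counts.items():
        # mock of the per-storage-node sampling step
        sampled = random.sample(range(storage * 1000, storage * 1000 + 100), count)
        merged[element_type].extend(sampled)
    return merged

print(global_sample({0: 5.0, 1: 3.0, 2: 2.0}, "Item", total=20))
```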
Second, the neighbor sampling query service. The compute node sends a neighbor sampling query request to the storage nodes, and a storage node performs a neighbor sampling query after receiving the request. The difference between neighbor sampling and global sampling is that in neighbor sampling the compute node must tell the storage node, through the neighbor sampling query request, which nodes (root nodes) need their neighbor nodes queried. The root nodes in neighbor sampling can be preset or can be provided by the global sampling service.
Specifically, because the graph is stored in a distributed manner, the client (compute node) first splits the neighbor query request into multiple sub neighbor sampling query requests according to the root node ids and sends each sub-request to the storage nodes that hold all of its root nodes; after receiving the sampling results from the different storage nodes, the client (compute node) must likewise merge them to obtain the query result, for example by combining the neighbor node ids according to neighbor edge type.
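The split-by-root-id and merge-by-edge-type flow of neighbor sampling can be sketched as below; locate and query_storage stand for the subgraph lookup and the per-storage-node sampling call, and both are assumed hooks rather than interfaces defined in this application.

```python
from collections import defaultdict

def neighbor_sample(root_ids, locate, query_storage):
    """Split a neighbor sampling request by root node id, query each owning
    storage node, and merge the neighbor ids by neighbor edge type."""
    sub_requests = defaultdict(list)
    for rid in root_ids:  # one sub-request per owning storage node
        sub_requests[locate(rid)].append(rid)

    merged = defaultdict(list)  # edge type -> neighbor node ids
    for storage, roots in sub_requests.items():
        for neighbor_id, edge_type in query_storage(storage, roots):
            merged[edge_type].append(neighbor_id)
    return dict(merged)

# toy usage with two storage nodes
locate = lambda rid: rid % 2
query_storage = lambda s, roots: [(rid * 10 + s, "query_behavior") for rid in roots]
print(neighbor_sample([1, 2, 3], locate, query_storage))
```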
Third, the feature query service. The compute node sends a feature query request to the storage nodes, and a storage node performs a feature query after receiving the request.
Specifically, because the graph is stored in a distributed manner, the client (compute node) first splits the feature query request into multiple sub feature query requests according to the pre-specified node/edge ids and sends each sub feature query request to the storage nodes that hold the features of all of the nodes/edges in that sub-request. The client (compute node) must then merge the feature-bearing node/edge lists returned by the different storage nodes.
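The feature query follows the same scatter-gather shape, except that the merge is over id-to-feature mappings; locate and fetch_features are again assumed hooks for illustration.

```python
def feature_query(ids, locate, fetch_features):
    """Split a feature query by node/edge id, send each sub-request to the
    storage node owning those features, and merge the returned mappings."""
    sub_requests = {}
    for eid in ids:
        sub_requests.setdefault(locate(eid), []).append(eid)

    merged = {}  # node/edge id -> feature vector
    for storage, sub_ids in sub_requests.items():
        merged.update(fetch_features(storage, sub_ids))
    return merged

# toy usage: features are just [id] vectors
print(feature_query([7, 8, 9], lambda e: e % 2, lambda s, ids: {e: [float(e)] for e in ids}))
```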
The results returned by the above three query services are not necessarily all graph-related data that can serve directly as input to a graph learning task; what is actually returned depends on the specific business scenario, and the present invention gives no specific description or restriction.
Those skilled in the art should understand that the embodiments of the present invention may be provided as methods, systems, or computer program products. The present invention may therefore take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage and optical storage) containing computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of the methods, devices (systems), and computer program products according to the embodiments of the present invention. It should be understood that every flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce an apparatus for implementing the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.
These computer program instructions can also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in that computer-readable memory produce an article of manufacture including an instruction apparatus that implements the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.
These computer program instructions can also be loaded onto a computer or other programmable data processing device, causing a series of operational steps to be performed on the computer or other programmable device to produce computer-implemented processing, such that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.
Obviously, those skilled in the art can make various changes and variations to the present invention without departing from its spirit and scope. If these modifications and variations of the present invention fall within the scope of the claims of the present invention and their technical equivalents, the present invention is intended to encompass these changes and variations as well.

Claims (9)

  1. A system for learning graphs, characterized in that it comprises:
    compute nodes and storage nodes;
    the storage node, configured to store subgraphs and provide query services to the compute nodes, the subgraphs being obtained by partitioning a graph in advance, and the number of subgraphs obtained by partitioning a graph being greater than or equal to 2;
    the compute node, configured to send query requests to the storage nodes according to a preset graph learning task, use the graph-related data obtained by the query requests as one of the inputs of the preset graph learning task, and execute the graph learning task;
    wherein the number of storage nodes is configurable to two or more, and the number of compute nodes is configurable to one or more.
  2. The system according to claim 1, characterized in that, when the graph is partitioned in advance into n subgraphs, the configuration of the number of storage nodes specifically comprises:
    configuring and enabling n*k storage nodes, where k is the number of replicas of each subgraph, a storage node is used to store one subgraph or one replica of a subgraph, and k is greater than or equal to 1.
  3. The system according to claim 1, characterized in that
    the compute node is further configured to actively ask the storage nodes for the correspondence between subgraphs and storage nodes and to store the correspondence obtained by the inquiry locally;
    or,
    the storage node is further configured to actively synchronize the correspondence between subgraphs and storage nodes to the compute node,
    and the compute node is further configured to store locally the correspondence actively synchronized by the storage node;
    the compute node sending query requests to the storage nodes according to the preset graph learning task specifically comprises:
    according to the preset graph learning task, obtaining, from the locally stored correspondence between subgraphs and storage nodes, the storage nodes to which query requests can be sent, and sending query requests to the storage nodes thus obtained.
  4. The system according to claim 1, characterized in that the system further comprises:
    a registration node, configured to store the correspondence between subgraphs and storage nodes;
    the compute node sending query requests to the storage nodes according to the preset graph learning task specifically comprises:
    according to the preset learning task, asking the registration node for the storage nodes to which query requests can be sent, and then sending query requests to the storage nodes obtained by the inquiry;
    wherein the number of registration nodes is configurable.
  5. The system according to claim 1 or 4, characterized in that the system further comprises parameter exchange nodes;
    the compute node, after executing the graph learning task, further synchronizes the parameters of the graph learning model that executes the graph learning task to the parameter exchange node;
    the parameter exchange node performs an optimal-parameter computation on the basis of the parameters synchronized from the compute node and the locally stored parameters, and sends the computed parameters to the compute node;
    wherein the number of parameter exchange nodes is configurable.
  6. The system according to claim 5, characterized in that
    parameter exchange between the parameter exchange nodes and the compute nodes can be performed synchronously or asynchronously.
  7. The system according to claim 5, characterized in that, when the graph learning task requires training data as input, the system further comprises a scheduling node;
    the scheduling node is configured to divide each batch of training data into sub-training data according to the number of compute nodes and to synchronize the sub-training data to the compute nodes, one shard of sub-training data being allocated to each compute node.
  8. The system according to claim 5, characterized in that, if the compute node sends query requests to two or more storage nodes at the same time,
    the compute node is further configured to merge the graph-related data returned by all the storage nodes and to execute the graph learning task on the basis of the merged graph-related data.
  9. The system according to claim 5, characterized in that the query services of the storage node comprise one or more of: a global sampling query service, a neighbor sampling query service, and a feature sampling query service.
PCT/CN2020/070416 2019-01-16 2020-01-06 System for learning graphs WO2020147601A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910041326.1A CN111444309B (zh) 2019-01-16 2019-01-16 System for learning graphs
CN201910041326.1 2019-01-16

Publications (1)

Publication Number Publication Date
WO2020147601A1 (zh)

Family

ID=71613701

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/070416 WO2020147601A1 (zh) 2019-01-16 2020-01-06 System for learning graphs

Country Status (2)

Country Link
CN (1) CN111444309B (zh)
WO (1) WO2020147601A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114003775A (zh) * 2021-10-29 2022-02-01 支付宝(杭州)信息技术有限公司 Graph data processing and query method and system therefor

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112541038A (zh) * 2020-12-01 2021-03-23 杭州海康威视数字技术股份有限公司 Time-series data management method, system, computing device and storage medium
CN114217743B (zh) * 2021-09-17 2024-05-31 支付宝(杭州)信息技术有限公司 Data access method and device for a distributed graph learning architecture

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9852231B1 (en) * 2014-11-03 2017-12-26 Google Llc Scalable graph propagation for knowledge expansion
CN107733696A (zh) * 2017-09-26 2018-02-23 南京天数信息科技有限公司 Deployment method for an all-in-one machine for machine learning and artificial intelligence applications
CN109194707A (zh) * 2018-07-24 2019-01-11 阿里巴巴集团控股有限公司 Method and device for distributed graph embedding

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101883185B1 (ko) * 2016-08-09 2018-07-30 주식회사 피노텍 Automated robot consultation method and system for consulting customers according to predetermined scenarios using machine learning
CN107885762B (zh) * 2017-09-19 2021-06-11 北京百度网讯科技有限公司 Intelligent big data system and method and device for providing intelligent big data services
CN108564164B (zh) * 2018-01-08 2022-04-29 中山大学 Parallelized deep learning method based on the Spark platform
CN109145121B (zh) * 2018-07-16 2021-10-29 浙江大学 Fast storage and query method for time-varying graph data


Also Published As

Publication number Publication date
CN111444309B (zh) 2023-04-14
CN111444309A (zh) 2020-07-24

Similar Documents

Publication Publication Date Title
US10685283B2 (en) Demand classification based pipeline system for time-series data forecasting
Merla et al. Data analysis using hadoop MapReduce environment
CN109471900B (zh) 图表类数据自定义动作数据交互方法及系统
WO2020147601A1 (zh) 用于对图进行学习的系统
US20180240062A1 (en) Collaborative algorithm development, deployment, and tuning platform
US20180181957A1 (en) Data monetization and exchange platform
US20180011739A1 (en) Data factory platform and operating system
US10896178B2 (en) High performance query processing and data analytics
CN111046237B (zh) 用户行为数据处理方法、装置、电子设备及可读介质
CN111209310B (zh) 基于流计算的业务数据处理方法、装置和计算机设备
CN103186834A (zh) 业务流程配置方法和装置
US9910821B2 (en) Data processing method, distributed processing system, and program
CN109298948B (zh) 分布式计算方法和系统
US20180276508A1 (en) Automated visual information context and meaning comprehension system
WO2022142859A1 (zh) 数据处理方法、装置、计算机可读介质及电子设备
Verma et al. Big Data representation for grade analysis through Hadoop framework
Jayasekara et al. Wihidum: Distributed complex event processing
US11381463B2 (en) System and method for a generic key performance indicator platform
CN105812175B (zh) 一种资源管理方法及资源管理设备
US20240220319A1 (en) Automated visual information context and meaning comprehension system
US20220240440A1 (en) Agricultural tool unit for rapid conversion of a combination seed drill having a trailed or fine-grain seed dispenser to an on-demand supply system and vice versa
CN111382315A (zh) 子图同构匹配结果的合并方法、电子设备及存储介质
CN115361382B (zh) 基于数据群组的数据处理方法、装置、设备和存储介质
CN115378937B (zh) 任务的分布式并发方法、装置、设备和可读存储介质
CN113821313A (zh) 一种任务调度方法、装置及电子设备

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20741400

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20741400

Country of ref document: EP

Kind code of ref document: A1