CN104580476A - Method and device for selecting node in distributed system - Google Patents

Publication number: CN104580476A
Authority: CN
Grant status: Application
Prior art keywords: node, nodes, method, device, none
Application number: CN 201510016624
Other languages: Chinese (zh)
Inventors: 吕信, 郭李明
Original Assignee: 北京京东尚科信息技术有限公司, 北京京东世纪贸易有限公司
Priority date: 2015-01-13 (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date: 2015-01-13
Publication date: 2015-04-29

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L29/00 Arrangements, apparatus, circuits or systems, not covered by a single one of groups H04L1/00 - H04L27/00
    • H04L29/02 Communication control; Communication processing
    • H04L29/06 Communication control; Communication processing characterised by a protocol
    • H04L29/08 Transmission control procedure, e.g. data link level control procedure

Abstract

The invention provides a method and device for selecting a node in a distributed system. By increasing the memory capacity of the None nodes only, they help improve the performance of the whole PrestoDB cluster at low workload and cost. In the method, a designated portion of the nodes in the distributed system are used as candidate None nodes; when the data slices of multiple Source nodes need to be gathered onto one node, a node is chosen from the candidate None nodes, and the data slices of the Source nodes are then gathered onto the selected node.

Description

Method and apparatus for selecting a node in a distributed system

Technical Field

[0001] The present invention relates to the field of computer technology, and in particular to a method and an apparatus for selecting a node in a distributed system.

Background

[0002] With the rise of big data, the volume of business data at Internet companies grows year by year. Major Internet companies are therefore adopting big data technologies internally and building data warehouses for their core business systems. Data warehouses currently fall into two types: offline data warehouses and real-time data warehouses.

[0003] The representative offline data warehouse product is Hive. Because its underlying computing framework is MapReduce, it is suited to offline analysis and computation over very large data sets, but not to data analysis and computation with strict real-time requirements.

[0004] The representative real-time data warehouse product is PrestoDB. Developed by Facebook, it uses a pipelined model for distributed data computation and transfer, and can complete analysis and computation over big data within 100ms-20m, which meets the requirements of real-time data analysis and computation.

[0005] PrestoDB is a memory-based distributed computing framework. When analyzing and computing data, PrestoDB first divides the data into data slices and reads each slice into the memory of a Source node. The data in the memory of each Source node is then gathered over the network onto one None node or onto multiple Fixed nodes; which of the two applies depends on the type of aggregation function. For example, if the query contains an order by clause, all results must be sorted as a whole, so the data in the memory of every Source node must be gathered onto a single None node, where the global sort is performed. If the query contains a group by clause, all results must be grouped, so the data in the memory of every Source node must be gathered onto multiple Fixed nodes, where the grouping is performed.
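The rule in paragraph [0005] can be illustrated with a minimal Python sketch. It is not PrestoDB's planner: the `Target` enum, the `gathering_target` function, and the keyword-matching approach are all assumptions made for illustration, and the handling of a query that contains both clauses is likewise an assumption, since the text does not cover that case.

```python
import re
from enum import Enum


class Target(Enum):
    """Where the Source nodes' data is gathered (illustrative names)."""
    SINGLE_NONE_NODE = "one None node"
    MULTIPLE_FIXED_NODES = "multiple Fixed nodes"
    NO_GATHERING = "no gathering step"


def gathering_target(sql: str) -> Target:
    """Map a query to its gathering target using the two examples in the text.

    Assumption: clauses are detected by a naive keyword match, and a query
    containing both clauses is classified by its global sort.
    """
    text = sql.lower()
    if re.search(r"\border\s+by\b", text):
        # A global sort needs every row in one place: a single None node.
        return Target.SINGLE_NONE_NODE
    if re.search(r"\bgroup\s+by\b", text):
        # Grouping can be spread over several Fixed nodes.
        return Target.MULTIPLE_FIXED_NODES
    return Target.NO_GATHERING
```

The single-node case is the one that concentrates memory pressure, which is the problem the rest of the description addresses.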

[0006] At present, PrestoDB randomly selects one node from the whole cluster as the None node. The algorithm by which PrestoDB selects each kind of node is shown in FIG. 1, which is a schematic diagram of the prior-art flow for selecting nodes in a PrestoDB cluster. As shown in FIG. 1, the type of node to be selected is determined first. If a None node or Fixed nodes are to be selected, they are selected at random from the cluster. If Source nodes are to be selected, it is first determined whether the hardware-aware mode is to be used; if so, nodes are selected according to data locality, otherwise multiple nodes are selected at random as Source nodes. Hardware awareness here means being aware of where the data to be processed resides, and locality means preferring the nodes that hold the data as worker nodes. If the worker node assigned to a task is the node that holds the data to be processed, less time is spent transferring data over the network, and the computing task takes less time. In some cases, therefore, the hardware-aware mode can be used and nodes selected according to the locality principle.
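The prior-art flow described for FIG. 1 might be sketched as follows. This is an illustrative sketch, not PrestoDB source code: nodes are represented as plain strings, the function name and parameters are hypothetical, and the locality rule (nodes holding the data first, then randomly chosen others to fill the count) is one assumed reading of "selected according to data locality".

```python
import random


def select_nodes_prior_art(cluster, node_type, count=1,
                           hardware_aware=False, data_hosts=(), rng=random):
    """Select nodes from `cluster` (a list) the way FIG. 1 is described."""
    if node_type == "None":
        # One node chosen at random from the whole cluster.
        return [rng.choice(cluster)]
    if node_type == "Fixed":
        # Several distinct nodes chosen at random from the whole cluster.
        return rng.sample(cluster, min(count, len(cluster)))
    if node_type == "Source":
        if hardware_aware:
            # Assumed locality rule: prefer nodes that hold the data.
            local = [n for n in cluster if n in data_hosts]
            remote = [n for n in cluster if n not in data_hosts]
            rng.shuffle(remote)
            return (local + remote)[:count]
        return rng.sample(cluster, min(count, len(cluster)))
    raise ValueError(f"unknown node type: {node_type!r}")
```

Because the None node comes from the whole cluster, any machine may end up receiving all Source data, which is why the prior art requires every node to have a large memory.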

[0007] It follows that a node selected as the None node needs a relatively large memory capacity. To ensure that PrestoDB can analyze and compute over large volumes of data, the memory of every node in the cluster would have to be upgraded so that any node can meet the computing requirements when it is selected as the None node. Such an upgrade involves considerable work and cost.

Summary

[0008] In view of this, the present invention provides a method and an apparatus for selecting a node in a distributed system. By increasing the memory capacity of the None nodes only, the performance of the whole PrestoDB cluster is improved at relatively low workload and cost.

[0009] To achieve the above object, according to one aspect of the present invention, a method for selecting a node in a distributed system is provided.

[0010] In the method of the present invention for selecting a node in a distributed system, the distributed system is a PrestoDB cluster, and the method comprises: designating a portion of the nodes in the distributed system as candidate None nodes; and, when the data slices of multiple Source nodes need to be gathered onto one node, selecting one node from the candidate None nodes and then gathering the data slices of the multiple Source nodes onto the selected node.

[0011] Optionally, after the designated portion of the nodes is taken as candidate None nodes, the method further comprises: when the data slices of multiple Source nodes need to be gathered onto multiple Fixed nodes, determining whether the candidate None nodes are currently allowed to serve as candidate Fixed nodes; if so, randomly selecting multiple nodes in the distributed system as Fixed nodes; otherwise, randomly selecting multiple nodes other than the candidate None nodes in the distributed system as Fixed nodes.

[0012] Optionally, after the designated portion of the nodes is taken as candidate None nodes, the method further comprises: when sliced data needs to be stored on Source nodes, determining whether the candidate None nodes are currently allowed to serve as candidate Source nodes; if so, randomly selecting multiple nodes in the distributed system as Source nodes; otherwise, randomly selecting multiple nodes other than the candidate None nodes in the distributed system as Source nodes.

[0013] Optionally, the step of randomly selecting multiple nodes in the distributed system as Source nodes comprises: when the hardware-aware mode is currently in use, selecting multiple nodes in the distributed system as Source nodes according to the locality principle.

[0014] Optionally, the step of randomly selecting multiple nodes other than the candidate None nodes in the distributed system as Source nodes comprises: when the hardware-aware mode is currently in use, selecting multiple nodes other than the candidate None nodes in the distributed system as Source nodes according to the locality principle.

[0015] According to another aspect of the present invention, an apparatus for selecting a node in a distributed system is provided.

[0016] In the apparatus of the present invention for selecting a node in a distributed system, the distributed system is a PrestoDB cluster, and the apparatus comprises: a configuration module for recording the portion of the nodes in the distributed system that are designated as candidate None nodes; and a None node selection module for selecting, when the data slices of multiple Source nodes need to be gathered onto one node, one node from the candidate None nodes as the None node.

[0017] Optionally, the apparatus further comprises a Fixed node selection module for determining, when the data slices of multiple Source nodes need to be gathered onto multiple Fixed nodes, whether the candidate None nodes are currently allowed to serve as candidate Fixed nodes; if so, randomly selecting multiple nodes in the distributed system as Fixed nodes; otherwise, randomly selecting multiple nodes other than the candidate None nodes in the distributed system as Fixed nodes.

[0018] Optionally, the apparatus further comprises a Source node selection module for determining, when sliced data needs to be stored on Source nodes, whether the candidate None nodes are currently allowed to serve as candidate Source nodes; if so, randomly selecting multiple nodes in the distributed system as Source nodes; otherwise, randomly selecting multiple nodes other than the candidate None nodes in the distributed system as Source nodes.

[0019] Optionally, the Source node selection module is further configured to select, when the hardware-aware mode is currently in use, multiple nodes other than the candidate None nodes in the distributed system as Source nodes according to the locality principle.

[0020] Optionally, the Source node selection module is further configured to select, when the hardware-aware mode is currently in use, multiple nodes other than the candidate None nodes in the distributed system as Source nodes according to the locality principle.

[0021] According to the technical solution of the present invention, a portion of the nodes in the PrestoDB cluster are designated as candidate None nodes, which confines the selection of the None node to a defined range. The memory of the nodes in that range can then be upgraded and expanded so that they meet the computing requirements. This approach does not require a memory upgrade of every node in the PrestoDB cluster, so the upgrade workload is relatively low while the performance of the whole PrestoDB cluster is improved.

Brief Description of the Drawings

[0022] The accompanying drawings are provided for a better understanding of the present invention and do not unduly limit it. In the drawings:

[0023] FIG. 1 is a schematic diagram according to an embodiment of the present invention;

[0024] FIG. 2 is a schematic diagram of a method for selecting a node in a distributed system according to an embodiment of the present invention;

[0025] FIG. 3 is a schematic diagram of the main modules of an apparatus for selecting a node in a distributed system according to an embodiment of the present invention.

Detailed Description

[0026] Exemplary embodiments of the present invention are described below with reference to the accompanying drawings. Various details of the embodiments are included to aid understanding and should be regarded as merely exemplary. Those of ordinary skill in the art will therefore recognize that various changes and modifications can be made to the embodiments described here without departing from the scope and spirit of the present invention. Likewise, descriptions of well-known functions and structures are omitted from the following description for clarity and conciseness.

[0027] In the scheme of this embodiment, a portion of the nodes in the PrestoDB cluster are designated in advance as candidate None nodes, and whenever a None node is needed it is selected from among them. A configuration item can also be provided to configure whether the None node is selected from this portion of the nodes or at random from the PrestoDB cluster. When PrestoDB starts, the configuration item is parsed, and a list of the corresponding IP-port pairs is built from its configuration information and used when a None node is allocated. An example of the configuration format is: None gathering node = IP address 1:port 1; IP address 2:port 2. This designates the two nodes with IP addresses address 1 and address 2 as candidate None nodes, on port 1 and port 2 respectively. The configuration item can also specify whether the candidate None nodes are allowed to serve as candidate Fixed nodes, and whether they are allowed to serve as candidate Source nodes. When a node needs to be selected, the flow shown in FIG. 2 can then be followed. FIG. 2 is a schematic diagram of the method for selecting a node in a distributed system according to an embodiment of the present invention. The method can be executed by the Coordinator node in PrestoDB.
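Parsing the right-hand side of the configuration example into a list of IP-port pairs could look like the Python sketch below. The function name `parse_none_candidates` is hypothetical, and the sketch assumes only the `IP:port;IP:port` value format given in the example; the patent does not name the actual configuration key or file.

```python
from typing import List, Tuple


def parse_none_candidates(value: str) -> List[Tuple[str, int]]:
    """Parse 'ip1:port1;ip2:port2' into [(ip1, port1), (ip2, port2)]."""
    pairs = []
    for item in value.split(";"):
        item = item.strip()
        if not item:
            continue  # tolerate empty segments such as a trailing ';'
        host, sep, port = item.rpartition(":")
        if not sep or not host:
            raise ValueError(f"expected IP:port, got {item!r}")
        pairs.append((host, int(port)))
    return pairs
```

For example, `parse_none_candidates("10.0.0.1:8080; 10.0.0.2:8081")` yields two pairs, one per candidate None node, which a coordinator could keep in memory for later selection.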

[0028] Step S21: Determine the type of node to be selected. When sliced data needs to be stored on Source nodes, the nodes to be selected are Source nodes, and the flow proceeds to step S24. When data slices need to be gathered, the type of node to be selected is determined from the type of aggregation function: if a None node is to be selected, the flow proceeds to step S22; if Fixed nodes are to be selected, the flow proceeds to step S23.

[0029] Step S22: Determine whether the None node is to be selected from the designated range. This determination is made according to the configuration item described above. If so, one node is selected from the candidate None nodes recorded in the configuration item as the None node (step S221); otherwise, one node is selected at random as the None node (step S222).
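Step S22 can be sketched in a few lines of Python. The names are hypothetical, and choosing at random among the candidates in step S221 is an assumption: the text only says that one candidate is selected, without stating how.

```python
import random


def select_none_node(cluster, candidates, restrict_to_candidates, rng=random):
    """Pick the None node as in steps S221 and S222.

    `cluster` and `candidates` are lists of node identifiers.
    """
    if restrict_to_candidates:
        # S221: choose among the configured candidate None nodes
        # (random choice is an assumption of this sketch).
        return rng.choice(candidates)
    # S222: choose at random from the whole cluster.
    return rng.choice(cluster)
```

With the restriction enabled, only the candidate machines ever receive the gathered data, so only they need the larger memory.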

[0030] Step S23: Determine whether the candidate None nodes are allowed to serve as candidate Fixed nodes. If so, multiple nodes can be selected at random as Fixed nodes (step S231); otherwise, multiple nodes other than the candidate None nodes in the distributed system are selected at random as Fixed nodes.
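A minimal sketch of step S23 follows, with hypothetical names. Capping the request at the pool size is an assumption added so the sketch does not fail on small clusters; the text does not say what happens when too few nodes are available.

```python
import random


def select_fixed_nodes(cluster, none_candidates, count,
                       allow_none_candidates, rng=random):
    """Pick Fixed nodes at random, optionally excluding candidate None nodes."""
    if allow_none_candidates:
        pool = list(cluster)  # S231: the whole cluster is eligible
    else:
        # Otherwise only nodes outside the candidate None set are eligible.
        pool = [n for n in cluster if n not in none_candidates]
    return rng.sample(pool, min(count, len(pool)))
```

Excluding the candidates keeps the large-memory machines free for gathering work, at the cost of a smaller pool for Fixed nodes.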

[0031] Step S24: Determine whether the hardware-aware mode is used. If so, proceed to step S241; otherwise, proceed to step S242.

[0032] Step S241: Determine whether the candidate None nodes are allowed to serve as candidate Source nodes. If so, multiple nodes can be selected as Source nodes according to the locality principle (step S2411); otherwise, multiple nodes other than the candidate None nodes in the distributed system are selected as Source nodes according to the locality principle (step S2412).

[0033] Step S242: Determine whether the candidate None nodes are allowed to serve as candidate Source nodes. If so, multiple nodes can be selected at random as Source nodes (step S2421); otherwise, multiple nodes other than the candidate None nodes in the distributed system are selected at random as Source nodes (step S2422).
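Steps S24, S241 and S242 combine two independent decisions: which nodes are eligible, and whether locality or random choice is used. The Python sketch below shows one way to express that. All names are hypothetical, and the locality rule (eligible nodes that hold the data first, then random others) is an assumed interpretation, since the text does not define the locality principle in detail.

```python
import random


def select_source_nodes(cluster, none_candidates, count,
                        allow_none_candidates, hardware_aware,
                        data_nodes=(), rng=random):
    """Pick Source nodes following steps S24, S241 and S242."""
    # S241 / S242: decide whether candidate None nodes are eligible.
    if allow_none_candidates:
        pool = list(cluster)
    else:
        pool = [n for n in cluster if n not in none_candidates]
    if hardware_aware:
        # S2411 / S2412: assumed locality rule, data-holding nodes first.
        local = [n for n in pool if n in data_nodes]
        remote = [n for n in pool if n not in data_nodes]
        rng.shuffle(remote)
        return (local + remote)[:count]
    # S2421 / S2422: random choice within the eligible pool.
    return rng.sample(pool, min(count, len(pool)))
```

Note that when candidate None nodes are excluded, a node that holds the data locally is skipped if it is a candidate, so locality is traded for keeping that node free.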

[0034] FIG. 3 is a schematic diagram of the main modules of an apparatus for selecting a node in a distributed system according to an embodiment of the present invention. As shown in FIG. 3, the apparatus 30 of this embodiment mainly comprises a configuration module 31 and a None node selection module 32. The configuration module 31 records the portion of the nodes in the distributed system that are designated as candidate None nodes. The None node selection module 32 selects, when the data slices of multiple Source nodes need to be gathered onto one node, one node from the candidate None nodes as the None node.

[0035] The apparatus 30 may further comprise a Fixed node selection module (not shown in the figure) for determining, when the data slices of multiple Source nodes need to be gathered onto multiple Fixed nodes, whether the candidate None nodes are currently allowed to serve as candidate Fixed nodes; if so, randomly selecting multiple nodes in the distributed system as Fixed nodes; otherwise, randomly selecting multiple nodes other than the candidate None nodes in the distributed system as Fixed nodes.

[0036] The apparatus 30 may further comprise a Source node selection module (not shown in the figure) for determining, when sliced data needs to be stored on Source nodes, whether the candidate None nodes are currently allowed to serve as candidate Source nodes; if so, randomly selecting multiple nodes in the distributed system as Source nodes; otherwise, randomly selecting multiple nodes other than the candidate None nodes in the distributed system as Source nodes.

[0037] The Source node selection module may also be configured to select, when the hardware-aware mode is currently in use, multiple nodes other than the candidate None nodes in the distributed system as Source nodes according to the locality principle. The Source node selection module is further configured to select, when the hardware-aware mode is currently in use, multiple nodes other than the candidate None nodes in the distributed system as Source nodes according to the locality principle.

[0038] According to the technical solution of this embodiment, a portion of the nodes in the PrestoDB cluster are designated as candidate None nodes, which confines the selection of the None node to a defined range. The memory of the nodes in that range can then be upgraded and expanded so that they meet the computing requirements. This approach does not require a memory upgrade of every node in the PrestoDB cluster, so the upgrade workload is relatively low while the performance of the whole PrestoDB cluster is improved.

[0039] The specific embodiments above do not limit the scope of protection of the present invention. Those skilled in the art should understand that, depending on design requirements and other factors, various modifications, combinations, sub-combinations and substitutions can occur. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (10)

  1. A method for selecting a node in a distributed system, the distributed system being a PrestoDB cluster, characterized in that the method comprises: designating a portion of the nodes in the distributed system as candidate None nodes; and, when the data slices of multiple Source nodes need to be gathered onto one node, selecting one node from the candidate None nodes and then gathering the data slices of the multiple Source nodes onto the selected node.
  2. The method according to claim 1, characterized in that, after the designated portion of the nodes is taken as candidate None nodes, the method further comprises: when the data slices of multiple Source nodes need to be gathered onto multiple Fixed nodes, determining whether the candidate None nodes are currently allowed to serve as candidate Fixed nodes; if so, randomly selecting multiple nodes in the distributed system as Fixed nodes; otherwise, randomly selecting multiple nodes other than the candidate None nodes in the distributed system as Fixed nodes.
  3. The method according to claim 1, characterized in that, after the designated portion of the nodes is taken as candidate None nodes, the method further comprises: when sliced data needs to be stored on Source nodes, determining whether the candidate None nodes are currently allowed to serve as candidate Source nodes; if so, randomly selecting multiple nodes in the distributed system as Source nodes; otherwise, randomly selecting multiple nodes other than the candidate None nodes in the distributed system as Source nodes.
  4. The method according to claim 3, characterized in that the step of randomly selecting multiple nodes in the distributed system as Source nodes comprises: when the hardware-aware mode is currently in use, selecting multiple nodes in the distributed system as Source nodes according to the locality principle.
  5. The method according to claim 3 or 4, characterized in that the step of randomly selecting multiple nodes other than the candidate None nodes in the distributed system as Source nodes comprises: when the hardware-aware mode is currently in use, selecting multiple nodes other than the candidate None nodes in the distributed system as Source nodes according to the locality principle.
  6. An apparatus for selecting a node in a distributed system, the distributed system being a PrestoDB cluster, characterized in that the apparatus comprises: a configuration module for recording the portion of the nodes in the distributed system that are designated as candidate None nodes; and a None node selection module for selecting, when the data slices of multiple Source nodes need to be gathered onto one node, one node from the candidate None nodes as the None node.
  7. The apparatus according to claim 6, characterized in that it further comprises a Fixed node selection module for determining, when the data slices of multiple Source nodes need to be gathered onto multiple Fixed nodes, whether the candidate None nodes are currently allowed to serve as candidate Fixed nodes; if so, randomly selecting multiple nodes in the distributed system as Fixed nodes; otherwise, randomly selecting multiple nodes other than the candidate None nodes in the distributed system as Fixed nodes.
  8. The apparatus according to claim 6, characterized in that it further comprises a Source node selection module for determining, when sliced data needs to be stored on Source nodes, whether the candidate None nodes are currently allowed to serve as candidate Source nodes; if so, randomly selecting multiple nodes in the distributed system as Source nodes; otherwise, randomly selecting multiple nodes other than the candidate None nodes in the distributed system as Source nodes.
  9. The apparatus according to claim 8, characterized in that the Source node selection module is further configured to select, when the hardware-aware mode is currently in use, multiple nodes other than the candidate None nodes in the distributed system as Source nodes according to the locality principle.
  10. The apparatus according to claim 8 or 9, characterized in that the Source node selection module is further configured to select, when the hardware-aware mode is currently in use, multiple nodes other than the candidate None nodes in the distributed system as Source nodes according to the locality principle.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201510016624 CN104580476A (en) 2015-01-13 2015-01-13 Method and device for selecting node in distributed system

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN 201510016624 CN104580476A (en) 2015-01-13 2015-01-13 Method and device for selecting node in distributed system
PCT/CN2016/070551 WO2016112831A1 (en) 2015-01-13 2016-01-11 Method and device of selecting distributed system node

Publications (1)

Publication Number Publication Date
CN104580476A (en) 2015-04-29

Family

ID=53095633

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 201510016624 CN104580476A (en) 2015-01-13 2015-01-13 Method and device for selecting node in distributed system

Country Status (2)

Country Link
CN (1) CN104580476A (en)
WO (1) WO2016112831A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016112831A1 (en) * 2015-01-13 2016-07-21 北京京东尚科信息技术有限公司 Method and device of selecting distributed system node

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101102225A (en) * 2007-07-26 2008-01-09 北京航空航天大学 Management method of wireless sensor network nodes
US7710884B2 (en) * 2006-09-01 2010-05-04 International Business Machines Corporation Methods and system for dynamic reallocation of data processing resources for efficient processing of sensor data in a distributed network
CN101924777A (en) * 2009-06-17 2010-12-22 中国移动通信集团公司 Method, system and equipment for searching active nodes in P2P streaming media system
CN103188161A (en) * 2011-12-30 2013-07-03 中国移动通信集团公司 Method and system of distributed data loading scheduling
CN104168332A (en) * 2014-09-01 2014-11-26 广东电网公司信息中心 Load balance and node state monitoring method in high performance computing

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100242042A1 (en) * 2006-03-13 2010-09-23 Nikhil Bansal Method and apparatus for scheduling work in a stream-oriented computer system
CN102572809A (en) * 2010-12-27 2012-07-11 中国移动通信集团公司 Method, system and equipment for selecting gateway nodes
CN104580476A (en) * 2015-01-13 2015-04-29 北京京东尚科信息技术有限公司 Method and device for selecting node in distributed system

Also Published As

Publication number Publication date Type
WO2016112831A1 (en) 2016-07-21 application

Similar Documents

Publication Publication Date Title
US20120221636A1 (en) Method and apparatus for using a shared data store for peer discovery
US20140068602A1 (en) Cloud-Based Middlebox Management System
Costa et al. Camdoop: Exploiting in-network aggregation for big data applications
US20110307899A1 (en) Computing cluster performance simulation using a genetic algorithm solution
US20070266029A1 (en) Recovery segment identification in a computing infrastructure
US20160043901A1 (en) Graceful scaling in software driven networks
US20130151683A1 (en) Load balancing in cluster storage systems
US20130318525A1 (en) Locality-aware resource allocation for cloud computing
US20140089500A1 (en) Load distribution in data networks
US20110029672A1 (en) Selection of a suitable node to host a virtual machine in an environment containing a large number of nodes
Mondal et al. Managing large dynamic graphs efficiently
US20130031545A1 (en) System and method for improving the performance of high performance computing applications on cloud using integrated load balancing
US20100312891A1 (en) Utilizing affinity groups to allocate data items and computing resources
US20140032595A1 (en) Contention-free multi-path data access in distributed compute systems
Liu et al. HSim: a MapReduce simulator in enabling cloud computing
US20140304412A1 (en) Systems and methods for gslb preferred backup list
US20090187588A1 (en) Distributed indexing of file content
US20130332608A1 (en) Load balancing for distributed key-value store
US20120166394A1 (en) Distributed storage system and method for storing objects based on locations
CN102629219A (en) Self-adaptive load balancing method for Reduce ends in parallel computing framework
US20120311295A1 (en) System and method of optimization of in-memory data grid placement
CN103109271A (en) Inter-platform application migration realization method and system
US20120311172A1 (en) Overloading processing units in a distributed environment
US20130227116A1 (en) Determining optimal component location in a networked computing environment
CN102725753A (en) Method and apparatus for optimizing data access, method and apparatus for optimizing data storage

Legal Events

Date Code Title Description
C06 Publication
C10 Entry into substantive examination