CN104243579A - Computational node control method and system applied to water conservancy construction site - Google Patents

Computational node control method and system applied to water conservancy construction site

Info

Publication number
CN104243579A
Authority
CN
Grant status
Application
Patent type
Prior art keywords
computational
node
control
construction
site
Prior art date
Application number
CN 201410465692
Other languages
Chinese (zh)
Inventor
林鹏
李庆斌
高向友
胡森映
Original Assignee
清华大学
Abstract

The invention provides a computational node control method applied to a water conservancy construction site. The method comprises the following steps: periodic polling is used to discover multiple computational nodes available for a computing task; the current computing capacity of each computational node is obtained; the computing task is decomposed, and the decomposed computing task is processed cooperatively by the computational nodes; each computational node sends its processing result to a central control node; and the central control node analyzes the processing result of each computational node in order to control the computational nodes. The method makes full use of the residual computing capacity of the computational nodes on the construction site (such as sensors and data processing units) and can effectively raise the informatization level of the water conservancy construction site. The invention further provides a computational node control system applied to the water conservancy construction site.

Description

Control method and system for computing nodes applied to a water conservancy construction site

Technical Field

[0001] The present invention relates to the technical field of distributed computing, and in particular to a control method and system for computing nodes applied to a water conservancy construction site.

Background

[0002] With the rapid spread of the Internet of Things and sensor networks, sensor networks are increasingly used on construction sites. These sensor networks are widely used to collect temperature, humidity, pressure, personnel location, and other business-related data, and as management becomes digital and information-based, they have also laid a solid foundation for introducing and developing other services. For a long time, however, the individual networks and their nodes have each performed only their own duties in isolation from one another, so the goals of pervasive computing and information fusion cannot be reached. For example, some nodes use modern 32-bit CPUs whose occupancy has long remained below 1%, leaving their computing potential largely untapped, while the central server is overloaded when it encounters large or compute-intensive tasks; its computation cycles become too long, which impairs the efficiency of real-time decision making.

[0003] However, solutions to the above problem are currently very rare. Some only mention simple designs and others merely put forward a few ideas; none provides a complete scheme that can actually be used to develop a distributed computing system.

Summary of the Invention

[0004] The present invention aims to solve, at least to some extent, one of the technical problems in the related art described above.

[0005] To this end, one object of the present invention is to provide a control method for computing nodes applied to a water conservancy construction site. The method makes full use of the remaining computing capacity of the computing nodes on the construction site (such as sensors and data processing units) and can effectively raise the informatization level of the water conservancy construction site.

[0006] Another object of the present invention is to provide a control system for computing nodes applied to a water conservancy construction site.

[0007] To achieve the above objects, an embodiment of a first aspect of the present invention provides a control method for computing nodes applied to a water conservancy construction site, comprising the following steps: using periodic polling to discover a plurality of computing nodes available for a computing task; obtaining the current computing capacity of each of the plurality of computing nodes, decomposing the computing task, and processing the decomposed computing task cooperatively through the plurality of computing nodes; each computing node sending its processing result to a central control node; and the central control node analyzing the processing result of each computing node in order to control the plurality of computing nodes.

[0008] In addition, the control method for computing nodes applied to a water conservancy construction site according to the above embodiment of the present invention may further have the following additional technical features:

[0009] In some examples, using periodic polling to discover a plurality of computing nodes available for the computing task specifically comprises: sending polling requests according to a computing node list and starting a wait timer; each computing node receiving the polling request, estimating its own current computing capacity, and sending it to the central control node, where the estimate is:

[0010] M = N + P1 + P2,

[0011] where M is the current computing capacity of the computing node, N is the current CPU occupancy, P1 is the CPU occupancy over a past period, and P2 is the expected CPU occupancy over a future period; before the wait timer expires, the central control node determining, according to the current computing capacity of each computing node, whether the plurality of computing nodes can complete the computing task; if so, using the plurality of nodes to complete the computing task, and otherwise continuing to send polling requests; and when the wait timer expires, no longer waiting for responses from computing nodes and discarding response messages that arrive after the timeout.

[0012] In some examples, obtaining the current computing capacity of each of the plurality of computing nodes, decomposing the computing task, and processing the decomposed computing task cooperatively through the plurality of computing nodes specifically comprises: letting the plurality of computing nodes number N and decomposing the computing task into m subtasks, where N > m; sending each subtask to the corresponding computing node and starting a timeout timer; periodically determining whether each computing node has failed; and receiving the computation result of each computing node before the timeout timer expires.

[0013] In some examples, the method further comprises: adopting a redundancy strategy, in which the same decomposed subtask may be assigned to multiple computing nodes.

[0014] In some examples, the computing nodes communicate with one another using a communication protocol in XML format.

[0015] According to the control method for computing nodes applied to a water conservancy construction site of the embodiment of the present invention, the central control node initiates periodic polling, potential participating nodes report their remaining computing capacity, the task is decomposed according to the data reported by each node and assigned to designated nodes for computation, the nodes report their computation results, and the information reported by the nodes is finally aggregated into a final result. The method therefore makes full use of the remaining computing capacity of the on-site computing nodes (such as sensors and data processing units) and can effectively raise the informatization level of the water conservancy construction site.

[0016] An embodiment of a second aspect of the present invention provides a control system for computing nodes applied to a water conservancy construction site, comprising: a discovery module configured to discover, by periodic polling, a plurality of computing nodes available for a computing task; an allocation module configured to obtain the current computing capacity of each of the plurality of computing nodes, decompose the computing task, and have the decomposed computing task processed cooperatively by the plurality of computing nodes; a reporting module configured to transmit the processing result of each computing node; and a control module configured to analyze the processing result of each computing node in order to control the plurality of computing nodes.

[0017] In addition, the control system for computing nodes applied to a water conservancy construction site according to the above embodiment of the present invention may further have the following additional technical features:

[0018] In some examples, the discovery module discovering, by periodic polling, a plurality of computing nodes available for the computing task specifically comprises: sending polling requests according to a computing node list and starting a wait timer; each computing node receiving the polling request, estimating its own current computing capacity, and sending it to the control module, where the estimate is:

[0019] M = N + P1 + P2,

[0020] where M is the current computing capacity of the computing node, N is the current CPU occupancy, P1 is the CPU occupancy over a past period, and P2 is the expected CPU occupancy over a future period; before the wait timer expires, the control module determining, according to the current computing capacity of each computing node, whether the plurality of computing nodes can complete the computing task; if so, using the plurality of nodes to complete the computing task, and otherwise continuing to send polling requests; and when the wait timer expires, the discovery module no longer waiting for responses from computing nodes and discarding response messages that arrive after the timeout.

[0021] In some examples, the allocation module obtaining the current computing capacity of each of the plurality of computing nodes, decomposing the computing task, and having the decomposed computing task processed cooperatively by the plurality of computing nodes specifically comprises: letting the plurality of computing nodes number N and decomposing the computing task into m subtasks, where N > m; sending each subtask to the corresponding computing node and starting a timeout timer; periodically determining whether each computing node has failed; and receiving the computation result of each computing node before the timeout timer expires.

[0022] In some examples, the allocation module is further configured to adopt a redundancy strategy, in which the same decomposed subtask may be assigned to multiple computing nodes.

[0023] In some examples, the computing nodes communicate with one another using a communication protocol in XML format.

[0024] According to the control system for computing nodes applied to a water conservancy construction site of the embodiment of the present invention, the central control node initiates periodic polling, potential participating nodes report their remaining computing capacity, the task is decomposed according to the data reported by each node and assigned to designated nodes for computation, the nodes report their computation results, and the information reported by the nodes is finally aggregated into a final result. The system therefore makes full use of the remaining computing capacity of the on-site computing nodes (such as sensors and data processing units) and can effectively raise the informatization level of the water conservancy construction site.

[0025] Additional aspects and advantages of the present invention will be set forth in part in the following description, and in part will become apparent from that description or be learned through practice of the present invention.

Brief Description of the Drawings

[0026] The above and/or additional aspects and advantages of the present invention will become apparent and readily understood from the following description of embodiments taken in conjunction with the accompanying drawings, in which:

[0027] FIG. 1 is a flowchart of a control method for computing nodes applied to a water conservancy construction site according to one embodiment of the present invention;

[0028] FIG. 2 is a schematic diagram of the four stages in which the control method for computing nodes applied to a water conservancy construction site is implemented according to one embodiment of the present invention;

[0029] FIG. 3 is a schematic diagram of the discovery stage according to one embodiment of the present invention;

[0030] FIG. 4 is a schematic diagram of the information model maintained by the central control node according to one embodiment of the present invention; and

[0031] FIG. 5 is a structural block diagram of a control system for computing nodes applied to a water conservancy construction site according to one embodiment of the present invention.

Detailed Description

[0032] Embodiments of the present invention are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals denote, throughout, the same or similar elements or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary; they serve only to explain the present invention and are not to be construed as limiting it.

[0033] The control method and system for computing nodes applied to a water conservancy construction site according to embodiments of the present invention are described below with reference to the drawings.

[0034] FIG. 1 is a flowchart of a control method for computing nodes applied to a water conservancy construction site according to one embodiment of the present invention. As shown in FIG. 1, the control method according to one embodiment of the present invention comprises the following steps:

[0035] Step S101: use periodic polling to discover a plurality of computing nodes available for the computing task.

[0036] Specifically, in some examples, with reference to FIG. 3, this step comprises:

[0037] Step 1: send polling requests according to the computing node list and start a wait timer. In other words, the central control node sends polling requests one by one according to the most recently used (MRU) node list in the computing node table maintained from the previous computation, waits for each computing node (denoted, for example, Nn) to respond, and at the same time starts a wait timer, denoted Tn.

[0038] In some examples, the node table preferably takes the form of a linked list with the following format:

[0039] list head, total count -> node ID, node address, next node -> ... -> list tail.
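As an illustration only, the linked-list node table could be sketched in Python as follows. The class and field names (NodeEntry, NodeTable, node_id, address) are assumptions made for this sketch and are not defined in the specification; moving a node to the head whenever it is used is one plausible way to keep the list in most-recently-used order.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class NodeEntry:
    """One list element: node ID, node address, link to the next node."""
    node_id: int
    address: str
    next: Optional["NodeEntry"] = None


class NodeTable:
    """Singly linked node table whose head records the total node count."""

    def __init__(self) -> None:
        self.head: Optional[NodeEntry] = None
        self.total = 0

    def touch(self, node_id: int, address: str) -> None:
        """Insert a node at the head, or move it there if already present."""
        self.remove(node_id)
        self.head = NodeEntry(node_id, address, self.head)
        self.total += 1

    def remove(self, node_id: int) -> bool:
        """Unlink a node (for example, a failed one); return True if found."""
        prev, cur = None, self.head
        while cur is not None:
            if cur.node_id == node_id:
                if prev is None:
                    self.head = cur.next
                else:
                    prev.next = cur.next
                self.total -= 1
                return True
            prev, cur = cur, cur.next
        return False

    def __iter__(self) -> Iterator[NodeEntry]:
        """Walk the list from the most recently used node to the tail."""
        cur = self.head
        while cur is not None:
            yield cur
            cur = cur.next
```

Iterating over the table then yields nodes in the order in which polling requests would be sent under this assumption.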

[0040] In addition, in some examples, the central control node maintains an information table that records the status of each computing node. The node information includes the correspondence between logical number and physical number: the logical number is the ID of the computing node, and the physical number is the node's actual identifier, for example the MAC address in the case of a network interface card. Accordingly, a computing node whose physical number is no longer valid is deleted from the information table. In a specific example, the general message format is shown in Table 1 below:

[0041]-[0042] Figure CN104243579AD00071

[0043] Table 1

[0044] An actual message as implemented reads as follows:

[0045]-[0061]

<?xml version="1.0" encoding="utf-8"?>
<message>
  <head>
    <message_id>DISCOVERY_REQ</message_id>
    <version>1.0</version>
    <src_node>1</src_node>
    <dest_node>2</dest_node>
    <time_tamp>2013-12-13:00:03:45:234</time_tamp>
    <seq_no>11</seq_no>
  </head>
  <body>
    <broadcast>no</broadcast>
  </body>
  <tail>
    <checksum>123</checksum>
  </tail>
</message>
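The head/body/tail layout of the message above can be assembled programmatically. The following is a minimal Python sketch using the standard library; the function name build_discovery_req and its parameters are hypothetical, and the checksum is passed through as a placeholder because the text does not say how it is computed.

```python
import xml.etree.ElementTree as ET


def build_discovery_req(src: int, dest: int, seq_no: int,
                        timestamp: str, broadcast: bool = False,
                        checksum: str = "123") -> bytes:
    """Assemble a DISCOVERY_REQ message with head, body, and tail."""
    message = ET.Element("message")

    head = ET.SubElement(message, "head")
    for tag, value in (
        ("message_id", "DISCOVERY_REQ"),
        ("version", "1.0"),
        ("src_node", str(src)),
        ("dest_node", str(dest)),
        ("time_tamp", timestamp),  # element name as spelled in the source
        ("seq_no", str(seq_no)),
    ):
        ET.SubElement(head, tag).text = value

    body = ET.SubElement(message, "body")
    ET.SubElement(body, "broadcast").text = "yes" if broadcast else "no"

    tail = ET.SubElement(message, "tail")
    # Placeholder value: the checksum algorithm is not given in the source.
    ET.SubElement(tail, "checksum").text = checksum

    return ET.tostring(message, encoding="utf-8", xml_declaration=True)
```

Calling build_discovery_req(1, 2, 11, "2013-12-13:00:03:45:234") yields a point-to-point request with the same fields as the example message; passing broadcast=True sets the broadcast flag to yes.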

[0062] Further, the central control node maintains additional information; the information model shown in FIG. 4 contains the information listed in Table 2 below:

[0063]-[0064] Figure CN104243579AD00081

[0065] Table 2

[0066] In some examples, the polling request sent is preferably:

[0067]-[0083]

<?xml version="1.0" encoding="utf-8"?>
<message>
  <head>
    <message_id>DISCOVERY_REQ</message_id>
    <src_node>1</src_node>
    <dest_node>2</dest_node>
    <version>1.0</version>
    <time_tamp>2013-12-13:00:03:45:234</time_tamp>
    <seq_no>11</seq_no>
  </head>
  <body>
    <broadcast>no</broadcast>
  </body>
  <tail>
    <checksum>123</checksum>
  </tail>
</message>

[0084] In this step, the wait timer may be set to, for example, 10 s.

[0085] Step 2: each computing node receives the polling request, estimates its own current computing capacity, and sends it to the central control node, where the estimate is:

[0086] M = N + P1 + P2,

[0087] where M denotes the current computing capacity of the computing node, N is the current CPU occupancy, P1 is the CPU occupancy over a past period, and P2 is the expected CPU occupancy over a future period.
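As a small illustration, the estimate can be written as a one-line function. The function name, the percent units, and the range check are assumptions of this sketch; the text does not say how the past and future periods are chosen or how P2 is predicted.

```python
def capacity_indicator(current: float, past: float, expected: float) -> float:
    """Return M = N + P1 + P2 from three CPU occupancy values.

    current  -- N, the current CPU occupancy (assumed to be in percent)
    past     -- P1, the CPU occupancy over a past period
    expected -- P2, the expected CPU occupancy over a future period
    """
    for value in (current, past, expected):
        if not 0.0 <= value <= 100.0:
            raise ValueError("CPU occupancy must be between 0 and 100")
    return current + past + expected
```

For example, occupancies of 1%, 2%, and 3% give M = 6.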

[0088] Further, in some examples, the response message sent has the following format:

[0089]-[0105]

<?xml version="1.0" encoding="utf-8"?>
<message>
  <head>
    <message_id>DISCOVERY_ACK</message_id>
    <src_node>2</src_node>
    <dest_node>1</dest_node>
    <version>1.0</version>
    <time_tamp>2013-12-13:00:03:45:234</time_tamp>
    <seq_no>1</seq_no>
  </head>
  <body>
    <power>234</power>
  </body>
  <tail>
    <checksum>123</checksum>
  </tail>
</message>
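On the receiving side, the central control node needs the responder's ID and its reported capacity from such a response. A minimal Python sketch of that parsing step follows; the function name parse_discovery_ack is hypothetical, and the rule of ignoring responses addressed to another node is an assumption modeled on the ID-mismatch rule that the description states for target computing nodes.

```python
import xml.etree.ElementTree as ET
from typing import Optional, Tuple


def parse_discovery_ack(raw: bytes, my_id: int) -> Optional[Tuple[int, int]]:
    """Return (responding node ID, reported power), or None if unusable."""
    root = ET.fromstring(raw)
    if root.findtext("head/message_id") != "DISCOVERY_ACK":
        return None  # not a discovery response
    if root.findtext("head/dest_node") != str(my_id):
        return None  # addressed to another node: discard
    src = root.findtext("head/src_node")
    power = root.findtext("body/power")
    if src is None or power is None:
        return None  # incomplete message
    return int(src), int(power)
```

Applied to the example response with my_id set to 1, the function returns the pair (2, 234).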

[0106] Step 3: before the wait timer expires, the central control node determines, according to the current computing capacity of each computing node, whether the plurality of computing nodes can complete the computing task.

[0107] Step 4: if so, the plurality of nodes are used to complete the computing task; otherwise, polling requests continue to be sent. That is, based on what each computing node reports, the central control node determines whether the computing nodes participating in this computing task can complete it (taking redundancy into account). If they can, the method enters the assignment stage, that is, step S103 is performed. If they cannot, polling continues so as to enter a wider discovery stage, and the point-to-point message of Step 1 is changed to a broadcast polling request message. In some examples, the message format is as follows:

[0108]-[0124]

<?xml version="1.0" encoding="utf-8"?>
<message>
  <head>
    <message_id>DISCOVERY_REQ</message_id>
    <src_node>1</src_node>
    <dest_node>2</dest_node>
    <version>1.0</version>
    <time_tamp>2013-12-13:00:03:45:234</time_tamp>
    <seq_no>11</seq_no>
  </head>
  <body>
    <broadcast>yes</broadcast>
  </body>
  <tail>
    <checksum>123</checksum>
  </tail>
</message>

[0125] Step 5: when the wait timer expires, responses from computing nodes are no longer awaited, and response messages that arrive after the timeout are discarded.
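Steps 3 to 5 can be sketched as a single discovery round in Python. The sketch below simulates responses as (arrival time in seconds, node ID, reported capacity) tuples rather than using real network I/O. The sufficiency test used here, namely that at least required_nodes nodes answered in time, is a hypothetical stand-in, since the text does not define how the central control node judges whether the nodes can complete the task.

```python
from typing import Dict, Iterable, Tuple

WAIT_TIMER_S = 10.0  # example wait timer value given in the description


def run_discovery_round(responses: Iterable[Tuple[float, int, int]],
                        required_nodes: int,
                        wait_timer_s: float = WAIT_TIMER_S
                        ) -> Tuple[str, Dict[int, int]]:
    """Collect timely responses and decide the next action.

    Returns ("assign", nodes) when enough nodes answered before the wait
    timer expired, otherwise ("broadcast", nodes) to widen the discovery.
    """
    accepted: Dict[int, int] = {}
    for arrival_s, node_id, power in responses:
        if arrival_s >= wait_timer_s:
            continue  # timer expired: discard the late response
        accepted[node_id] = power
    # Hypothetical criterion: enough nodes responded in time.
    if len(accepted) >= required_nodes:
        return "assign", accepted
    return "broadcast", accepted
```

With responses arriving at 1.0 s and 12.5 s and a 10 s timer, only the first is kept; the round proceeds to assignment if one node suffices and falls back to a broadcast request if two are required.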

[0126] Step S102: obtain the current computing capacity of each of the plurality of computing nodes, decompose the computing task, and process the decomposed computing task cooperatively through the plurality of computing nodes.

[0127] Specifically, this step comprises:

[0128] Step A: let the plurality of computing nodes number N, and decompose the computing task into m subtasks, where N > m. For example, the computing nodes are N1, N2, …, Nn, the computing task T is decomposed into m subtasks T1, T2, …, Tm, and N > m; the computing nodes N1, N2, …, Nn then process the subtasks T1, T2, …, Tm correspondingly.

[0129] In some examples, the subtasks in this step are expressed semantically, so that in a heterogeneous environment they are independent of the operating system running on each computing node; the decomposed subtasks are uniformly described in MathML.

[0130] Step B: send each subtask to the corresponding computing node and start a timeout timer.

[0131] Step C: maintain a keep-alive (heartbeat) timer H for each computing node, and periodically determine whether each computing node has failed.

[0132] Step D: before the timeout timer expires, receive the computation result of each computing node.

[0133] It should be noted that, in the above process, for fault tolerance, a target computing node simply discards any message it receives whose ID does not match its own.
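Steps A and C can be illustrated with a short Python sketch. The function names and the keep-alive threshold are assumptions; the pairing of subtask Ti with node Ni follows the example in Step A, and a node is treated as failed here when its last heartbeat is older than the threshold, which is one plausible reading of the keep-alive timer H.

```python
from typing import Dict, List


def assign_subtasks(node_ids: List[int], subtasks: List[str]) -> Dict[str, int]:
    """Pair subtask Ti with node Ni; requires more nodes than subtasks."""
    if len(node_ids) <= len(subtasks):
        raise ValueError("need N > m: more computing nodes than subtasks")
    # zip stops at the shorter list, so surplus nodes stay unassigned.
    return dict(zip(subtasks, node_ids))


def failed_nodes(last_heartbeat_s: Dict[int, float],
                 now_s: float, keepalive_s: float) -> List[int]:
    """Return IDs of nodes whose last heartbeat is older than keepalive_s."""
    return sorted(node for node, seen in last_heartbeat_s.items()
                  if now_s - seen > keepalive_s)
```

With three nodes and two subtasks, the third node remains free and could serve as a spare if one of the assigned nodes is later reported by failed_nodes.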

[0134] Step S103: each computing node sends its processing result to the central control node.

[0135] Step S104: the central control node analyzes the processing result of each computing node in order to control the plurality of computing nodes.

[0136] In summary, the method of the present invention makes full use of the remaining computing capacity of the large number of sensors and data processing units (that is, computing nodes) deployed on site, and networks them through a suitable protocol. The protocol can be summarized in four stages: discovery, assignment, reporting, and summarization, as shown in FIG. 2.

[0137] Specifically, in the discovery stage, the central control node initiates periodic polling to discover the computing nodes available for the current computing task. In the assignment stage (allocation stage), with the capacity of the available computing nodes known, the computing task is decomposed and computed cooperatively. In the reporting stage, each computing node that was assigned a subtask reports its computation result to the central control node over the network. A reported result is one of two kinds: success or failure. Because reporting takes place over the network, the central control node must detect failed computing nodes; a computing node that cannot complete its assigned subtask within the specified time does not report. In the summarization stage, the reported results are analyzed and aggregated; subtasks whose results are failures are sent to other valid computing nodes, and their results are awaited. The specific steps are the same as in the assignment stage and constitute a secondary assignment. In more extreme cases, where the secondary assignment also fails to yield a good result, further attempts may be made until the set time or the maximum number of attempts is reached, after which the current computing task is voided.
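The reassignment logic of the summarization stage can be sketched in Python as follows. This is an illustrative sketch only: resolve_failures, the run callback, and the spare node list are hypothetical names, a failed subtask is represented by None, and only the attempt limit is modeled (the time limit mentioned in the text is omitted for brevity).

```python
from typing import Callable, Dict, List, Optional


def resolve_failures(results: Dict[str, Optional[float]],
                     spare_nodes: List[int],
                     run: Callable[[str, int], Optional[float]],
                     max_attempts: int = 3) -> Optional[Dict[str, float]]:
    """Reassign failed subtasks to spare nodes; None means the task is voided."""
    final: Dict[str, float] = {}
    spares = list(spare_nodes)
    for subtask, result in results.items():
        attempts = 0
        # Secondary (and further) assignment of a failed subtask.
        while result is None and attempts < max_attempts and spares:
            node = spares.pop(0)
            result = run(subtask, node)
            attempts += 1
        if result is None:
            return None  # attempts exhausted: void the computing task
        final[subtask] = result
    return final
```

Here run stands for "send the subtask to the given node and wait for its result", returning None on failure or timeout.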

[0138] In some examples, to allow for node failure, the method of the present invention adopts a redundancy strategy in which the same decomposed subtask may be assigned to multiple computing nodes. This also makes it possible to compare the computation results of computing nodes that perform the same task.
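One possible way to compare redundant results is a strict majority vote, sketched below in Python. The voting rule and the function name agreed_result are assumptions of this sketch; the text states only that results of nodes performing the same task can be compared, not how disagreements are settled.

```python
from collections import Counter
from typing import Hashable, List, Optional


def agreed_result(results: List[Hashable]) -> Optional[Hashable]:
    """Return the value reported by a strict majority of replicas, if any."""
    if not results:
        return None
    value, count = Counter(results).most_common(1)[0]
    # Accept the value only when more than half of the replicas agree.
    return value if count * 2 > len(results) else None
```

When no value reaches a majority the function returns None, which a caller could treat like a failed subtask.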

[0139] 在本发明的一个实施例中,各个计算节点之间的通信协议采用XML(ExtensibleMarkup Language,可扩展标记语言)格式并通过大型企业内部进行标准化或者遵循相应的国际国家标准。 [0139] In one embodiment of the present invention, the communication protocol between the respective computing nodes using XML (ExtensibleMarkup Language, extensible Markup Language) format and an internal normalized by large companies or to follow the appropriate international national standards. 具体地说,就现有技术而言,在企业大型多单元分布系统中,一般通过网络套接字(Socket)协议作为应用单元之间进行数据交换的通常方法,基本上采取的是自定义报文格式,无论是定长或者分隔符的,但是,由于这些自定义的信息格式缺乏统一标准,随意性大,通用性,灵活性不足,不能满足企业IT建设周期长以及新技术层出不穷的现实情况的需求。 Specifically, in a conventional technology, in a large multi-unit systems distributed enterprise, typically through the network socket (the Socket) between an application protocol as the conventional method of data exchange unit, taken substantially a custom message text format, either fixed-length or delimited, however, because these custom message formats lack of uniform standards, arbitrary and versatility, lack of flexibility, enterprise IT can not meet the long construction period as well as new technologies emerging realities It needs. 因此,本发明采用标准的XML格式的通信协议来作为应用的数据交换标准。 Accordingly, the present invention employs a standard communication protocol to exchange XML format standard for data applications.

[0140] According to the control method for computing nodes applied to a water conservancy construction site of the embodiment of the present invention, the central control node initiates periodic polling, the potential participating nodes report their respective remaining computing capabilities, the task is decomposed according to the data reported by each node and assigned to designated nodes for computation, the computation results are reported, and finally the information reported by the nodes is aggregated into a final result. The method thus makes full use of the remaining computing capability of the computing nodes on site (such as sensors and data processing units) and can effectively raise the level of informatization of the water conservancy construction site.

[0141] A further embodiment of the present invention provides a control system for computing nodes applied to a water conservancy construction site. As shown in FIG. 5, the control system 500 according to one embodiment of the present invention comprises a discovery module 510, an allocation module 520, a reporting module 530, and a control module 540.

[0142] The discovery module 510 is configured to discover, through periodic polling, multiple computing nodes available for the computing task. In some examples, with reference to FIG. 3, this is summarized in the following steps:

[0143] Step 1: Send polling requests according to the computing node list and start a wait timer. In other words, the central control node (included in the control module 540) sends polling requests one by one according to the most recently used (MRU) node list in the computing node table maintained from the previous computation, waits for each computing node (denoted Nn) to respond, and at the same time starts a wait timer, denoted Tn.

[0144] In some examples, the node table preferably takes the form of a linked list, with the following format:

[0145] List head (total count) -> [node ID, node address, next node] -> ... -> list tail.
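A minimal sketch of such a linked node table, kept in most-recently-used order, is given below. The class and field names (`NodeEntry`, `NodeTable`, `touch`) are hypothetical; the source specifies only that each entry carries a node ID, addressing information, and a link to the next node, and that the head records the total count.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class NodeEntry:
    node_id: int                       # logical node ID
    address: str                       # node addressing information
    next: Optional["NodeEntry"] = None


class NodeTable:
    """Singly linked node table; the head is the most recently used node."""

    def __init__(self) -> None:
        self.head: Optional[NodeEntry] = None
        self.count = 0  # total number of nodes, stored with the list head

    def touch(self, node_id: int, address: str) -> None:
        """Insert a node at the head, or move an existing node there."""
        self.remove(node_id)
        self.head = NodeEntry(node_id, address, self.head)
        self.count += 1

    def remove(self, node_id: int) -> bool:
        """Unlink a node, for example after it has been found to have failed."""
        prev, cur = None, self.head
        while cur is not None:
            if cur.node_id == node_id:
                if prev is None:
                    self.head = cur.next
                else:
                    prev.next = cur.next
                self.count -= 1
                return True
            prev, cur = cur, cur.next
        return False

    def __iter__(self) -> Iterator[NodeEntry]:
        cur = self.head
        while cur is not None:
            yield cur
            cur = cur.next
```

Iterating over the table yields nodes in the order in which polling requests would be sent under this assumption.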

[0146] In addition, in some examples, the central control node maintains an information table that records the status of each computing node, including the correspondence between logical numbers and physical numbers. The logical number is the ID of the computing node; the physical number is the node's actual identifier, for example the MAC address of a network card. A computing node whose physical number has become invalid is deleted from the information table accordingly. In a specific example, the common message format is shown in Table 1:

[0147] [Figure CN104243579AD00121: Table 1, common message format (image not reproduced)]

[0148] Table 1

[0149] The content of a message in an actual implementation is as follows:

[0150] <?xml version="1.0" encoding="utf-8"?>

[0151] <message>

[0152]   <head>

[0153]     <message_id>DISCOVERY_REQ</message_id>

[0154]     <version>1.0</version>

[0155]     <src_node>1</src_node>

[0156]     <dest_node>2</dest_node>

[0157]     <time_tamp>2013-12-13:00:03:45:234</time_tamp>

[0158]     <seq_no>11</seq_no>

[0159]   </head>

[0160]   <body>

[0161]     <broadcast>no</broadcast>

[0162]   </body>

[0163]   <tail>

[0164]     <checksum>123</checksum>

[0165]   </tail>

[0166] </message>
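As a non-authoritative sketch, a message of this shape could be produced with Python's standard `xml.etree.ElementTree` module. The function name `build_discovery_request` is hypothetical, the element names follow the underscore spelling used in the broadcast example of this description, and the checksum is a placeholder because the source does not say how it is computed.

```python
import xml.etree.ElementTree as ET


def build_discovery_request(src: int, dest: int, seq_no: int,
                            timestamp: str, broadcast: bool = False) -> bytes:
    """Serialize a DISCOVERY_REQ polling message as UTF-8 XML."""
    message = ET.Element("message")
    head = ET.SubElement(message, "head")
    for tag, text in [
        ("message_id", "DISCOVERY_REQ"),
        ("version", "1.0"),
        ("src_node", str(src)),
        ("dest_node", str(dest)),
        ("time_tamp", timestamp),  # element name kept as spelled in the source
        ("seq_no", str(seq_no)),
    ]:
        ET.SubElement(head, tag).text = text
    body = ET.SubElement(message, "body")
    ET.SubElement(body, "broadcast").text = "yes" if broadcast else "no"
    tail = ET.SubElement(message, "tail")
    # Placeholder value: the checksum algorithm is not given in the source.
    ET.SubElement(tail, "checksum").text = "123"
    return ET.tostring(message, encoding="utf-8", xml_declaration=True)
```

Calling `build_discovery_request(1, 2, 11, "2013-12-13:00:03:45:234")` yields a point-to-point request; passing `broadcast=True` sets the body flag to `yes`.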

[0167] Further, the central control node maintains additional information, as shown in the information model diagram of FIG. 4, comprising the information content shown in Table 2:

[0168] [Figure CN104243579AD00131: Table 2, information maintained by the central control node (image not reproduced)]

[0170] Table 2

[0171] In some examples, the polling request sent is preferably:

[0172] <?xml version="1.0" encoding="utf-8"?>

[0173] <message>

[0174]   <head>

[0175]     <message_id>DISCOVERY_REQ</message_id>

[0176]     <src_node>1</src_node>

[0177]     <dest_node>2</dest_node>

[0178]     <version>1.0</version>

[0179]     <time_tamp>2013-12-13:00:03:45:234</time_tamp>

[0180]     <seq_no>11</seq_no>

[0181]   </head>

[0182]   <body>

[0183]     <broadcast>no</broadcast>

[0184]   </body>

[0185]   <tail>

[0186]     <checksum>123</checksum>

[0187]   </tail>

[0188] </message>

[0189] In this step, the wait timer may be set to, for example, 10 s.

[0190] Step 2: Each computing node receives the polling request, estimates its current computing capability, and sends it to the control module 540 (which includes the central control node). Specifically:

[0191] M = N + P1 + P2,

[0192] where M is the current computing capability of the computing node, N is the current CPU occupancy, P1 is the CPU occupancy over a past period, and P2 is the expected CPU occupancy over a future period.
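A small worked sketch of this formula follows. The source does not say how P1 and P2 are obtained; here P1 is assumed to be the mean of recent occupancy samples and P2 is passed in as a caller-supplied forecast. The function name `estimate_capability` is hypothetical.

```python
from statistics import mean
from typing import Sequence


def estimate_capability(current: float, history: Sequence[float],
                        expected: float) -> float:
    """Evaluate M = N + P1 + P2 for one computing node.

    current  -- N, the current CPU occupancy
    history  -- past occupancy samples; their mean is used as P1 (an assumption)
    expected -- P2, the forecast occupancy for the coming period
    """
    p1 = mean(history) if history else 0.0
    return current + p1 + expected
```

For example, with N = 20, past samples of 10 and 30 (so P1 = 20), and P2 = 25, the reported value is M = 20 + 20 + 25 = 65.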

[0193] Further, in some examples, the response message sent has the following format:

[0194] <?xml version="1.0" encoding="utf-8"?>

[0195] <message>

[0196]   <head>

[0197]     <message_id>DISCOVERY_ACK</message_id>

[0198]     <src_node>2</src_node>

[0199]     <dest_node>1</dest_node>

[0200]     <version>1.0</version>

[0201]     <time_tamp>2013-12-13:00:03:45:234</time_tamp>

[0202]     <seq_no>1</seq_no>

[0203]   </head>

[0204]   <body>

[0205]     <power>234</power>

[0206]   </body>

[0207]   <tail>

[0208]     <checksum>123</checksum>

[0209]   </tail>

[0210] </message>
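On the receiving side, the central control node might extract the reported capability roughly as sketched below, again using the standard `xml.etree.ElementTree` module. The function name `parse_discovery_ack` is hypothetical, underscore element names are assumed, and checksum verification is omitted because its algorithm is not given.

```python
import xml.etree.ElementTree as ET
from typing import Optional, Tuple


def parse_discovery_ack(raw: bytes, my_id: int) -> Optional[Tuple[int, int]]:
    """Return (responding node ID, reported power), or None if not for us.

    Messages of another type, or addressed to a different node ID,
    are ignored by returning None.
    """
    root = ET.fromstring(raw)
    if root.findtext("head/message_id") != "DISCOVERY_ACK":
        return None
    if int(root.findtext("head/dest_node", "-1")) != my_id:
        return None
    return int(root.findtext("head/src_node")), int(root.findtext("body/power"))
```

A malformed message that lacks `src_node` or `power` would raise an exception in this sketch; a production receiver would need to handle that case.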

[0211] Step 3: Before the wait timer expires, the control module 540 determines, from the current computing capability of each computing node, whether the multiple computing nodes can complete the computing task.

[0212] Step 4: If they can, the multiple nodes are used to complete the computing task; otherwise polling requests continue to be sent. That is, based on what each computing node has reported, the central control node determines whether the computing nodes participating in the current computing task can complete it (taking redundancy into account). If so, the assignment phase begins and the computing task is allocated. If not, polling continues in order to enter a wider discovery phase, and the point-to-point message of Step 1 is changed to a broadcast polling request message. In some examples, the message format is as follows:

[0213] <?xml version="1.0" encoding="utf-8"?>

[0214] <message>

[0215]   <head>

[0216]     <message_id>DISCOVERY_REQ</message_id>

[0217]     <src_node>1</src_node>

[0218]     <dest_node>2</dest_node>

[0219]     <version>1.0</version>

[0220]     <time_tamp>2013-12-13:00:03:45:234</time_tamp>

[0221]     <seq_no>11</seq_no>

[0222]   </head>

[0223]   <body>

[0224]     <broadcast>yes</broadcast>

[0225]   </body>

[0226]   <tail>

[0227]     <checksum>123</checksum>

[0228]   </tail>

[0229] </message>
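The decision made in Steps 3 and 4 might be sketched as follows. The source does not define how sufficiency is judged; the rule used here (total reported capability, divided by a redundancy factor, compared against a required amount) is purely an assumption, as is the name `next_discovery_action`.

```python
from typing import Dict


def next_discovery_action(reported: Dict[int, float], required: float,
                          redundancy: int = 1) -> str:
    """Return "assign" when reported capability suffices, else "broadcast".

    reported   -- node ID -> capability value from each DISCOVERY_ACK
    required   -- capability the task is assumed to need
    redundancy -- number of nodes each subtask is assigned to
    """
    usable = sum(reported.values()) / max(redundancy, 1)
    return "assign" if usable >= required else "broadcast"
```

When the result is `"broadcast"`, the controller would resend the polling request with the body flag set to `yes`, as in the message above.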

[0230] Step 5: When the wait timer expires, the discovery module 510 stops waiting for responses from computing nodes and discards response messages that arrive after the timeout.
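The wait-timer behaviour of Step 5 can be illustrated with a small filter over timestamped arrivals. The function name `collect_responses` and the tuple layout are hypothetical; the 10-second default follows the example value given for the wait timer.

```python
from typing import Dict, Iterable, Tuple


def collect_responses(arrivals: Iterable[Tuple[float, int, float]],
                      started_at: float,
                      wait_seconds: float = 10.0) -> Dict[int, float]:
    """Keep only responses that arrived before the wait timer expired.

    Each arrival is (arrival time in seconds, node ID, reported capability).
    """
    deadline = started_at + wait_seconds
    accepted: Dict[int, float] = {}
    for arrived_at, node_id, power in arrivals:
        if arrived_at <= deadline:
            accepted[node_id] = power
        # Responses after the deadline are dropped, as Step 5 describes.
    return accepted
```

In a live system the arrival times would come from a monotonic clock rather than from a prepared list.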

[0231] The allocation module 520 is configured to obtain the current computing capability of each of the multiple computing nodes, decompose the computing task, and process the decomposed computing task cooperatively through the multiple computing nodes. In some examples, this is summarized in the following steps:

[0232] Step A: Let the number of computing nodes be N and decompose the computing task into m subtasks, where N > m. For example, if the computing nodes are N1, N2, …, Nn and the computing task T is decomposed into m subtasks T1, T2, …, Tm with N > m, then the computing nodes N1, N2, …, Nn process the subtasks T1, T2, …, Tm correspondingly.
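A minimal sketch of the pairing in Step A is shown below. It assumes a simple positional mapping (T1 to N1, T2 to N2, and so on) and treats the remaining nodes as spares; the source does not state what the extra nodes do, and the name `assign_subtasks` is hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def assign_subtasks(nodes: Sequence[str],
                    subtasks: Sequence[str]) -> Tuple[Dict[str, str], List[str]]:
    """Map each subtask to one node and return the unassigned spare nodes."""
    if len(nodes) <= len(subtasks):
        raise ValueError("need more nodes than subtasks (N > m)")
    assignment = dict(zip(subtasks, nodes))   # T1 -> N1, T2 -> N2, ...
    spare = list(nodes[len(subtasks):])       # nodes left over because N > m
    return assignment, spare
```

The spare nodes are natural candidates for redundant copies of a subtask or for reassignment after a failure.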

[0233] In some examples, the subtasks in this step are expressed semantically, so that in a heterogeneous environment they are independent of the operating system on which a computing node runs; the decomposed subtasks are uniformly described in MathML.
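To make the MathML idea concrete, the sketch below encodes a trivial subtask (a sum of numbers) in Content MathML using the standard `apply`, `plus`, and `cn` elements. The choice of a sum as the subtask and the function name `sum_subtask` are illustrative assumptions; the source does not show any MathML payload.

```python
import xml.etree.ElementTree as ET
from typing import Iterable

MATHML_NS = "http://www.w3.org/1998/Math/MathML"


def sum_subtask(values: Iterable[float]) -> str:
    """Describe "add these numbers" as a Content MathML expression."""
    math = ET.Element("math", xmlns=MATHML_NS)
    apply_el = ET.SubElement(math, "apply")
    ET.SubElement(apply_el, "plus")           # the operator to apply
    for value in values:
        ET.SubElement(apply_el, "cn").text = str(value)  # numeric operands
    return ET.tostring(math, encoding="unicode")
```

Because the expression is plain XML, any node with a MathML-aware evaluator could interpret it regardless of its operating system.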

[0234] Step B: Send each subtask to its corresponding computing node and start a timeout timer.

[0235] Step C: Maintain a keep-alive (heartbeat) timer H for each computing node, and periodically determine whether each computing node has failed.

[0236] Step D: Before the timeout timer expires, receive the computation result of each computing node.

[0237] Note that, in the above process, for fault tolerance, a target computing node simply discards any message it receives whose ID does not match its own.
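The keep-alive (heartbeat) check of Step C might look like the following sketch. The class name `HeartbeatMonitor`, the interval, and the rule that a node is considered failed after three missed intervals are all assumptions; the source states only that a heartbeat timer is maintained and failure is checked periodically.

```python
from typing import Dict, List


class HeartbeatMonitor:
    """Track the last heartbeat of each node and flag silent ones."""

    def __init__(self, interval: float, missed_allowed: int = 3) -> None:
        # A node is treated as failed after this many seconds of silence.
        self.timeout = interval * missed_allowed
        self.last_seen: Dict[int, float] = {}

    def beat(self, node_id: int, now: float) -> None:
        """Record a heartbeat received from a node at time `now`."""
        self.last_seen[node_id] = now

    def failed_nodes(self, now: float) -> List[int]:
        """Return the IDs of nodes whose last heartbeat is too old."""
        return sorted(node for node, seen in self.last_seen.items()
                      if now - seen > self.timeout)
```

Subtasks held by nodes returned from `failed_nodes` would then be candidates for the secondary assignment described earlier.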

[0238] The reporting module 530 is configured to report the processing result of each computing node. Specifically, the reporting module 530 reports the processing result of each computing node to the central control node, that is, to the control module 540.

[0239] The control module 540 analyzes the processing result of each computing node in order to control the multiple computing nodes.

[0240] In summary, the system 500 of the present invention makes full use of the remaining computing capability of the large number of sensors and data processing units (that is, computing nodes) deployed on site, networking them through a suitable protocol. The protocol can be summarized in four phases: discovery, assignment, reporting, and summary, as shown in FIG. 2.

[0241] Specifically, in the discovery phase, the central control node initiates periodic polling to discover the computing nodes available for the current computing task. In the assignment (allocation) phase, once the capabilities of the available computing nodes are known, the computing task is decomposed and computed cooperatively. In the reporting phase, each computing node that was assigned a subtask reports its result to the central control node over the network. A reported result is either a success or a failure. Because reporting takes place over the network, the central control node must also detect failed computing nodes: a computing node that cannot complete its assigned subtask within the specified time does not report. In the summary phase, the reported results are analyzed and aggregated; a subtask whose result was a failure is sent to another valid computing node and its result is awaited, following the same steps as the assignment phase (a secondary assignment). In more extreme cases, where the secondary assignment also fails to produce a good result, further attempts may be made until the set time limit or the maximum number of attempts is reached, after which the current computing task is voided.

[0242] In some examples, to account for node failure, the allocation module 520 adopts a redundancy strategy: the same decomposed subtask may be assigned to multiple computing nodes. This also makes it possible to compare the results of computing nodes that execute the same task.

[0243] In one embodiment of the present invention, the communication protocol between the computing nodes uses the XML (Extensible Markup Language) format, standardized within the large enterprise or following the corresponding international or national standards. Specifically, in the prior art, large multi-unit distributed enterprise systems generally exchange data between application units over network sockets, and mostly use custom message formats, whether fixed-length or delimiter-separated. Because such custom formats lack a unified standard, are arbitrary, and are insufficiently general and flexible, they cannot meet the needs of enterprise IT, where construction cycles are long and new technologies keep emerging. The present invention therefore adopts a communication protocol in standard XML format as the data exchange standard for applications.

[0244] According to the control system for computing nodes applied to a water conservancy construction site of the embodiment of the present invention, the central control node initiates periodic polling, the potential participating nodes report their respective remaining computing capabilities, the task is decomposed according to the data reported by each node and assigned to designated nodes for computation, the computation results are reported, and finally the information reported by the nodes is aggregated into a final result. The system thus makes full use of the remaining computing capability of the computing nodes on site (such as sensors and data processing units) and can effectively raise the level of informatization of the water conservancy construction site.

[0245] In the description of the present invention, it should be understood that orientation or positional terms such as "center", "longitudinal", "lateral", "length", "width", "thickness", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", "clockwise", "counterclockwise", "axial", "radial", and "circumferential" are based on the orientations or positional relationships shown in the drawings. They are used only to facilitate and simplify the description of the present invention, and do not indicate or imply that the device or element referred to must have a specific orientation or be constructed and operated in a specific orientation; they therefore cannot be construed as limiting the present invention.

[0246] In addition, the terms "first" and "second" are used for descriptive purposes only and cannot be understood as indicating or implying relative importance or implicitly indicating the number of technical features referred to. A feature defined as "first" or "second" may therefore explicitly or implicitly include at least one such feature. In the description of the present invention, "multiple" means at least two, for example two or three, unless specifically defined otherwise.

[0247] In the present invention, unless otherwise expressly specified and limited, terms such as "mounted", "connected", "coupled", and "fixed" are to be understood broadly. For example, a connection may be fixed, detachable, or integral; it may be mechanical or electrical; it may be direct or indirect through an intermediate medium; and it may be an internal communication between two elements or an interaction between two elements, unless otherwise expressly limited. Those of ordinary skill in the art can understand the specific meanings of these terms in the present invention according to the specific circumstances.

[0248] In the present invention, unless otherwise expressly specified and limited, a first feature being "on" or "under" a second feature may mean that the first and second features are in direct contact, or that they are in indirect contact through an intermediate medium. Moreover, a first feature being "on", "above", or "over" a second feature may mean that the first feature is directly above or obliquely above the second feature, or simply that the first feature is at a higher level than the second feature. A first feature being "under", "below", or "beneath" a second feature may mean that the first feature is directly below or obliquely below the second feature, or simply that the first feature is at a lower level than the second feature.

[0249] In the description of this specification, a description referring to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, such expressions do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in a suitable manner in any one or more embodiments or examples. In addition, provided they do not contradict one another, those skilled in the art may combine the different embodiments or examples described in this specification and the features of those embodiments or examples.

[0250] Although embodiments of the present invention have been shown and described above, it is to be understood that the above embodiments are exemplary and cannot be construed as limiting the present invention. Those of ordinary skill in the art may make changes, modifications, substitutions, and variations to the above embodiments within the scope of the present invention.

Claims (10)

  1. A control method for computing nodes applied to a water conservancy construction site, characterized by comprising the following steps: discovering, by periodic polling, multiple computing nodes available for a computing task; obtaining the current computing capability of each of the multiple computing nodes, decomposing the computing task, and processing the decomposed computing task cooperatively through the multiple computing nodes; sending, by each computing node, its processing result to a central control node; and analyzing, by the central control node, the processing result of each computing node in order to control the multiple computing nodes.
  2. The control method for computing nodes applied to a water conservancy construction site according to claim 1, characterized in that discovering, by periodic polling, multiple computing nodes available for a computing task specifically comprises: sending polling requests according to a computing node list and starting a wait timer; receiving, by each computing node, the polling request, estimating its current computing capability, and sending it to the central control node, specifically: M = N + P1 + P2, where M is the current computing capability of the computing node, N is the current CPU occupancy, P1 is the CPU occupancy over a past period, and P2 is the expected CPU occupancy over a future period; before the wait timer expires, determining, by the central control node and from the current computing capability of each computing node, whether the multiple computing nodes can complete the computing task; if so, using the multiple nodes to complete the computing task, and otherwise continuing to send polling requests; and when the wait timer expires, no longer waiting for responses from computing nodes and discarding response messages that arrive after the timeout.
  3. The control method for computing nodes applied to a water conservancy construction site according to claim 1, characterized in that obtaining the current computing capability of each of the multiple computing nodes, decomposing the computing task, and processing the decomposed computing task cooperatively through the multiple computing nodes specifically comprises: letting the number of computing nodes be N and decomposing the computing task into m subtasks, where N > m; sending each subtask to its corresponding computing node and starting a timeout timer; periodically determining whether each computing node has failed; and before the timeout timer expires, receiving the computation result of each computing node.
  4. The control method for computing nodes applied to a water conservancy construction site according to claim 3, characterized by further comprising: adopting a redundancy strategy in which the same decomposed subtask may be assigned to multiple computing nodes.
  5. The control method for computing nodes applied to a water conservancy construction site according to any one of claims 1 to 4, characterized in that a communication protocol in XML format is used between the computing nodes.
  6. A control system for computing nodes applied to a water conservancy construction site, characterized by comprising: a discovery module configured to discover, by periodic polling, multiple computing nodes available for a computing task; an allocation module configured to obtain the current computing capability of each of the multiple computing nodes, decompose the computing task, and process the decomposed computing task cooperatively through the multiple computing nodes; a reporting module configured to report the processing result of each computing node; and a control module that analyzes the processing result of each computing node in order to control the multiple computing nodes.
  7. The control system for computing nodes applied to a water conservancy construction site according to claim 6, characterized in that the discovery module discovering, by periodic polling, multiple computing nodes available for a computing task specifically comprises: sending polling requests according to a computing node list and starting a wait timer; receiving, by each computing node, the polling request, estimating its current computing capability, and sending it to the control module, specifically: M = N + P1 + P2, where M is the current computing capability of the computing node, N is the current CPU occupancy, P1 is the CPU occupancy over a past period, and P2 is the expected CPU occupancy over a future period; before the wait timer expires, determining, by the control module and from the current computing capability of each computing node, whether the multiple computing nodes can complete the computing task; if so, using the multiple nodes to complete the computing task, and otherwise continuing to send polling requests; and when the wait timer expires, the discovery module no longer waiting for responses from computing nodes and discarding response messages that arrive after the timeout.
  8. The control system for computing nodes applied to a water conservancy construction site according to claim 6, characterized in that the allocation module obtaining the current computing capability of each of the multiple computing nodes, decomposing the computing task, and processing the decomposed computing task cooperatively through the multiple computing nodes specifically comprises: letting the number of computing nodes be N and decomposing the computing task into m subtasks, where N > m; sending each subtask to its corresponding computing node and starting a timeout timer; periodically determining whether each computing node has failed; and before the timeout timer expires, receiving the computation result of each computing node.
  9. The control system for computing nodes applied to a water conservancy construction site according to claim 8, characterized in that the allocation module is further configured to adopt a redundancy strategy in which the same decomposed subtask may be assigned to multiple computing nodes.
  10. The control system for computing nodes applied to a water conservancy construction site according to any one of claims 6 to 9, characterized in that a communication protocol in XML format is used between the computing nodes.
CN 201410465692 2014-09-12 2014-09-12 Computational node control method and system applied to water conservancy construction site CN104243579A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201410465692 CN104243579A (en) 2014-09-12 2014-09-12 Computational node control method and system applied to water conservancy construction site

Publications (1)

Publication Number Publication Date
CN104243579A (en) 2014-12-24

Family

ID=52230907

Country Status (1)

Country Link
CN (1) CN104243579A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100241741A1 (en) * 2005-01-31 2010-09-23 Computer Associates Think, Inc. Distributed computing system having hierarchical organization
CN101072133A (en) * 2007-05-23 2007-11-14 华中科技大学 High-performance computing system based on peer-to-peer network
CN102063327A (en) * 2010-12-15 2011-05-18 中国科学院深圳先进技术研究院 Application service scheduling method with power consumption consciousness for data center
CN102929718A (en) * 2012-09-17 2013-02-13 江苏九章计算机科技有限公司 Distributed GPU (graphics processing unit) computer system based on task scheduling

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
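Hu Min (胡敏): "A Comparison of Several Typical Distributed Computing Technologies" (《对几种典型分布式计算技术的比较》), Computer Knowledge and Technology (《电脑知识与技术》) *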

Similar Documents

Publication Publication Date Title
US9331962B2 (en) Methods and systems for time sensitive networks
US20110151840A1 (en) Enhanced service discovery mechanism in wireless communication system
CN101114867A (en) Multi-channel synchronization transmitting method and system
CN102571452A (en) Multi-node management method and system
US20060212597A1 (en) Multi-stage load distributing apparatus and method, and program
CN103179027A (en) Method and system for realizing compatibility of electrical appliance, and universal peripheral access gateway
JPH05128030A (en) Device for managing resource information
CN102158346A (en) Information acquisition system and method based on cloud computing
CN101183984A (en) Network management system, management method and equipment
CN102291416A (en) A method for client and server-side and two-way synchronization system
CN101635728A (en) Method and system for data synchronization in content distribution network
US20120166556A1 (en) Method, device and system for real-time publish subscribe discovery based on distributed hash table
CN101951369A (en) Batch terminal upgrading method and system based on automatic discovery
CN102170641A (en) Method and system for resource allocation and distribution of M2M business group
CN102724069A (en) Collision detection method, device and network device of dual-master device in thermal staking system
JP2003124875A (en) Information distribution method and system, and multicast server, program, and recording medium
CN101572710A (en) Interprocess communication method and system
CN101945086A (en) Security system access business platform for video type security gateway and information transmission method
CN103442042A (en) Incremental data synchronization method and system
CN102665213A (en) Data direct connection processing method, equipment and system thereof
CN102340436A (en) Cross-network message forwarding method and switch system
CN102196526A (en) Local flow forwarding method of centralized wireless sensor network
WO2004073270A1 (en) Router setting method and router device
CN101304330A (en) Resource allocation method, server, network equipment and network system
CN101521596A (en) Management structure for distributed dynamic self-organizing network

Legal Events

Date Code Title Description
C06 Publication
C10 Entry into substantive examination