WO2022150995A1 - Supercomputer architecture implementation method - Google Patents

Publication number
WO2022150995A1
Authority
WO
WIPO (PCT)
Prior art keywords
node
routing
instruction
supercomputer
sent
Prior art date
Application number
PCT/CN2021/071347
Other languages
French (fr)
Chinese (zh)
Inventor
王志平
Original Assignee
王志平
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 王志平
Priority to PCT/CN2021/071347
Publication of WO2022150995A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/30: Monitoring

Definitions

  • The present invention relates to the field of computer technology, and in particular to a method for implementing a supercomputer architecture.
  • A "supercomputer" (or "mainframe computer") can be understood as a very large computer processing architecture formed by multiple servers connected through a network.
  • A defining characteristic of a "supercomputer" is that it can carry and process more tasks; "super" computing power is one of its important features.
  • The topology that physically makes up the "supercomputer" and the strategy for deploying tasks on its network are the key factors that determine its performance and efficiency.
  • Different "supercomputer" manufacturers differ somewhat in topology and task deployment strategy, but the interconnection between all computing nodes (i.e., individual servers) that make up a "supercomputer" follows the TCP/IP protocol.
  • The existing network topology usually relies on a scheduling server to distribute the tasks of one area to a suitable area according to the load of each area.
  • Existing supercomputers rely on the TCP/IP protocol to build the entire system architecture and on software to implement task scheduling. As a result, synchronous communication (especially real-time synchronization) between different processes anywhere in the "supercomputer" system must be implemented in software over TCP/IP sockets (or another inter-process communication protocol).
  • For short messages (most of which may be only a few bytes to a dozen or so bytes long), synchronization becomes very inefficient. At the same time, synchronous communication based on sockets (or another inter-process communication protocol) has poor timeliness (that is, the latency is too high, which is unfavorable for real-time control). A supercomputer architecture is therefore needed that does not rely on the TCP/IP protocol, schedules task load more reasonably and efficiently under very heavy load, and guarantees the timeliness of processes (tasks).
  • The purpose of the present invention is to provide a supercomputer architecture implementation method that does not rely on the TCP/IP protocol and schedules task load more reasonably and efficiently under very heavy load, thereby ensuring the timeliness of processes.
  • The present invention provides a method for implementing a supercomputer architecture, comprising the steps of:
  • running, on the second node, the process sent by the first node.
  • The process exchange unit is abbreviated as PRU (Process Routing Unit). It provides "routing" functionality but does not need a large, complex network communication protocol to support it. From the perspective of hardware implementation, the PRU no longer has any concrete notion of a "protocol"; instead, the required functions are folded directly into the individual instructions of the PRU instruction set.
  • The PRU is thus a special bus controller that connects all nodes in the "supercomputer" system with high throughput, high efficiency, high scalability, no software, no configuration, low power consumption, and low cost.
  • The first node may be referred to as node S, and the second node as node T.
  • Establishing, through the process exchange unit, the routing path between the first node for sending the process and the second node for receiving the process in the supercomputer specifically includes:
  • establishing a routing path connecting the first node and the second node.
  • The route establishment instruction is sent by "node S" to request that a route be established to a "node T". The instruction does not specify which node is "node T" (in fact, "node S" has no ability to designate a routing target); instead, the PRU decides how to establish the route according to the actual load of the nodes connected to it and the value of the routing time.
  • The route establishment instruction includes the following parameters: the process number parameter (the process number of the driver process); the RTF file number parameter (the file number of the RTF file); the kernel class number parameter (indicates which class of kernel the process served by this route is built for; the kernel class number determines whether the process can be executed on "node T", so the PRU uses it to find a suitable "node T" when establishing the route); and the kernel version number parameter (the version number of the kernel identified by the kernel class number).
  • The code loading space parameter is the maximum storage space required when the execution code of the process served by this route is loaded (installed) into the "target system"; it is used to judge whether the "target system" has enough storage resources to load that execution code.
  • The data load space parameter refers to the needs of the process served by the route established by the current instruction.
  • TTR stands for Time To Routing, i.e., the "routing time".
  • The route establishment feedback instruction is the response to the route establishment instruction; it may be issued by "node T" or by the PRU.
  • The route establishment feedback instruction includes the following parameters: the process number parameter (the driver process number if the instruction is issued by "node T", otherwise it must be 0); the RTF parameter (the RTF file number); the TTR parameter (the number of PRUs traversed during route establishment); the route result parameter (the final result of route establishment, either "route establishment succeeded" or "route establishment failed"; on failure, the "Error" parameter must be parsed to learn the reason); and the route property parameter.
  • When the route result parameter is "route establishment succeeded", the route property represents the load proportion of the "node T" directly or indirectly connected to the current port. The PRU can use this parameter to select the most suitable "node T" among many feedbacks as the best route.
  • When the route result parameter is "route establishment failed", the route property indicates why routing through the current port failed. The possible reasons are: the port is not connected to a "node T"; the hardware resources of "node T" are insufficient; the load of "node T" is too high; neither the hardware resources nor the load of "node T" meet the requirements; the PRU memory is insufficient to cache the code and initial variables; the PRU hardware management resources are insufficient to create an RTF file; both the PRU memory and the PRU hardware management resources are insufficient.
  • Selecting, by the process exchange unit, the node with the smallest routing time as the second node among the nodes that sent the route establishment feedback instruction further includes:
  • selecting the node with the smallest load proportion as the second node.
  • the method when receiving the route establishment instruction sent by the first node, the method further includes:
  • the method further includes:
  • the running of the process sent by the first node through the second node specifically includes:
  • the process sent by the first node is loaded into the second node through the routing and task description file, and the second node executes the process.
  • The Routing & Task File (RTF) is an important file that the PRU uses to establish and carry out routing.
  • It records information about the "process" as well as the routing path information.
  • The PRU carries out routing based on the content (routing information) recorded in the RTF file.
  • The RTF is saved and managed only in the PRU.
  • The RTF is initially created by "node S": before any routing path is established, "node S" must first create an RTF file in the PRU to which it is connected. At this point the RTF file is empty and records nothing. "Node S" then asks the PRU to establish a route.
  • While establishing the routing path for "node S", the PRU writes the corresponding routing information and process information into the RTF file.
  • The RTF file must contain the following main information: "node S", or the number of the port connecting the current PRU to the PRU in the "node S" direction; "node T", or the number of the port connecting the current PRU to the PRU in the "node T" direction; if the current PRU is PRU-S (the process exchange unit at the source node), the process ID in "node S" of the process using this routing path; if the current PRU is PRU-T (the process exchange unit at the target node), the process ID in "node T" of the process using this routing path; the total number of PRUs traversed by the current routing path, i.e., TTR (Time To Routing, the "routing time"); and the routing timestamp, i.e., RTS.
  • After the process sent by the first node has been loaded into the second node through the routing and task description file and the second node has run the process, the method further includes:
  • if the first node responds to the wake-up instruction and communicates with the corresponding second node but the second node does not respond, that is, the process on "node T" has terminated unexpectedly, the first node sends an uninstall process instruction.
  • the method further includes:
  • the routing path between the first node and the second node is deleted.
  • the method further includes:
  • After receiving the install execution code and initialize-and-execute instruction sent by the first node, the second node loads the process execution code, initializes the variables, and starts running the process as soon as loading is complete.
  • the method further includes:
  • the data interaction between the first node and the second node is performed by means of file messages.
  • the method further includes:
  • Communication between processes is encapsulated by the compiler in the form of short messages and/or file messages.
  • When a process in one node needs to be moved to another node to run, it can be transferred directly through the process exchange unit.
  • The transfer does not rely on the TCP/IP protocol, so task load can be scheduled more reasonably and efficiently under very heavy load, which helps ensure the timeliness of processes.
  • FIG. 1 is a schematic overall flowchart of an embodiment of the present invention.
  • FIG. 2 is a schematic diagram of a connection between a process exchange unit and a node according to an embodiment of the present invention.
  • FIG. 3 is another schematic diagram of a connection between a process exchange unit and a node according to an embodiment of the present invention.
  • FIG. 4 is another schematic diagram of a connection between a process exchange unit and a node according to an embodiment of the present invention.
  • FIG. 5 is a schematic structural diagram of a process exchange unit according to an embodiment of the present invention.
  • FIG. 6 is a schematic structural diagram of a supercomputer architecture according to an embodiment of the present invention.
  • FIG. 7 is another schematic structural diagram of a supercomputer architecture according to an embodiment of the present invention.
  • An embodiment of the present invention, as shown in FIG. 1, provides a method for implementing a supercomputer architecture, including the steps:
  • Each port of the process exchange unit can be connected to a corresponding node of the supercomputer, which facilitates the transfer of processes between nodes; and/or a port of the process exchange unit can be connected to a port of another process exchange unit, which facilitates expansion of the supercomputer architecture.
  • The process exchange unit has 8 ports, port 0 to port 7, each of which can be connected externally to a "computing node" or to another PRU.
  • The box contains three modules: PRE, EI, and MI.
  • PRE stands for Process Routing Engine: the routing engine owned independently by each port, which handles PRU (process exchange unit) instructions.
  • EI stands for EFS Interface: this module implements the interface to the EFS module shown in the figure.
  • MI stands for Matrix Interface: this module implements the interface to the Matrix module shown in the figure. The Matrix in the figure is a multi-layer bus controller that interconnects all ports in the PRU.
  • The supercomputer architecture of this solution can be as shown in FIG. 6. It can also be expanded further on this basis.
  • The expansion method can be as shown in FIG. 7, so that the supercomputer architecture keeps growing.
  • the steps further include:
  • The "dynamic deployment" achievable in the present invention means that, regardless of how the user program consumes hardware resources or what its concurrency properties are, the user does not need to perform any "adaptation" for a specific "supercomputer" when submitting a task to it, nor any configuration related to the internal network of the "supercomputer".
  • The system can automatically deploy the user program at process granularity.
  • Deployment at process granularity means that the system can dynamically deploy any child process spawned at any time while the user program is running, according to the current load of the "nodes".
  • The compiler inserts a decision branch into every routine related to process communication and process synchronization, which determines whether the current program is running at the PRU-S (the source node's process exchange unit) or the PRU-T (the target node's process exchange unit). If it is running at the PRU-S, the same process communication and process synchronization routines as in "static deployment" are used; if it is running at the PRU-T, the process communication and process synchronization routines encapsulated with PRU instructions are used.
  • An embodiment of the present invention on the basis of Embodiment 1, establishes a routing path between a first node for sending a process and a second node for receiving a process in a supercomputer by using a process exchange unit, which specifically includes:
  • S22: Broadcast the route establishment instruction to the other nodes of the supercomputer through the process exchange unit, and receive the route establishment feedback instructions that those nodes send in response.
  • The process exchange unit selects, among the nodes that sent the route establishment feedback instruction, the node with the smallest routing time as the second node.
  • Selecting the node with the smallest routing time as the second node further comprises:
  • selecting the node with the smallest load proportion as the second node.
  • An embodiment of the present invention when receiving the route establishment instruction sent by the first node, further includes:
  • Running the process sent by the first node through the second node specifically includes:
  • the process sent by the first node is loaded into the second node through the routing and task description file, and the process is run by the second node.
  • the process sent by the first node is loaded into the second node through the routing and task description file, and after the second node executes the process, the process further includes:
  • the method further includes: receiving the uninstallation process instruction sent by the first node; and deleting the routing path between the first node and the second node.
  • An embodiment of the present invention, after establishing a routing path connecting the first node and the second node, further includes: receiving the install execution code and execute instruction sent by the first node, whereupon the second node loads the process execution code and starts running the process as soon as loading is complete; or receiving the install execution code and initialize-and-execute instruction sent by the first node, whereupon the second node loads the process execution code, initializes the variables, and starts running the process as soon as loading is complete.
  • The method further includes: performing data interaction between the first node and the second node by means of short messages; and/or performing data interaction between the first node and the second node by means of file messages.
  • After the process sent by the first node has been run on the second node, the method further includes: after the process finishes, receiving a no-message return instruction sent by the second node, and deleting the routing information as the no-message return instruction is transmitted; or after the process finishes, receiving a message return instruction sent by the second node, and deleting the routing information as the message return instruction is transmitted.
  • The difference between the message return instruction and the no-message return instruction is that the former carries a short message as it is transmitted.
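
The RTF lifecycle described in the list above (created empty by node S, filled in by the PRU during route establishment, deleted when a return instruction passes through) can be sketched as follows. This is a minimal illustration in Python, not the hardware design of the patent: the names `RtfRecord`, `Pru`, `create_rtf`, `record_route` and `handle_return`, and the choice to key RTF files by file number in a dictionary, are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class RtfRecord:
    """Contents of one Routing & Task File (field names are hypothetical)."""
    port_to_s: Optional[int] = None   # port leading toward node S
    port_to_t: Optional[int] = None   # port leading toward node T
    process_id: Optional[int] = None  # process ID on node S (PRU-S) or node T (PRU-T)
    ttr: Optional[int] = None         # Time To Routing: PRUs traversed by the path
    rts: Optional[int] = None         # routing timestamp


class Pru:
    """Toy model of the RTF storage of one PRU (an assumption, not the patent's design)."""

    def __init__(self) -> None:
        self.rtf_files: Dict[int, RtfRecord] = {}

    def create_rtf(self, file_id: int) -> None:
        # Node S first creates an empty RTF file in the PRU it is attached to.
        self.rtf_files[file_id] = RtfRecord()

    def record_route(self, file_id: int, port_to_s: int, port_to_t: int,
                     process_id: int, ttr: int, rts: int) -> None:
        # The PRU fills in routing and process information during route establishment.
        self.rtf_files[file_id] = RtfRecord(port_to_s, port_to_t, process_id, ttr, rts)

    def handle_return(
        self, file_id: int, short_message: Optional[bytes] = None
    ) -> Tuple[Optional[int], Optional[bytes]]:
        # A return instruction travels back toward node S, and the routing
        # information is deleted as it passes. Only the "message return"
        # variant carries a short message; the no-message variant carries None.
        record = self.rtf_files.pop(file_id)
        return record.port_to_s, short_message


pru = Pru()
pru.create_rtf(7)
pru.record_route(7, port_to_s=0, port_to_t=3, process_id=42, ttr=1, rts=1000)
out_port, payload = pru.handle_return(7, short_message=b"done")
print(out_port, payload, 7 in pru.rtf_files)
```

Calling `handle_return` without `short_message` models the no-message return instruction; in both cases the RTF entry is gone afterwards.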

Abstract

A supercomputer architecture implementation method, which comprises steps of: constructing a process exchange unit connecting various nodes within a supercomputer (S1); establishing a routing path between a first node for sending a process and a second node for receiving a process in the supercomputer by means of the process exchange unit (S2); and running a process sent by the first node by means of the second node (S3). The present method does not need to rely on a TCP/IP protocol, is able to produce a more sensible and more efficient task load scheduling effect for a huge load, and thereby ensures the timeliness of a process.

Description

A supercomputer architecture implementation method
Technical field
The present invention relates to the field of computer technology, and in particular to a method for implementing a supercomputer architecture.
Background
A "supercomputer" (or "mainframe computer") can be understood as a very large computer processing architecture formed by multiple servers connected through a network. A defining characteristic of a "supercomputer" is that it can carry and process more tasks; "super" computing power is one of its important features. The topology that physically makes up the "supercomputer" and the strategy for deploying tasks on its network are the key factors that determine its performance and efficiency. Different "supercomputer" manufacturers differ somewhat in topology and task deployment strategy, but the interconnection between all computing nodes (i.e., individual servers) that make up a "supercomputer" follows the TCP/IP protocol.
As supercomputers develop, their load keeps growing. The existing network topology usually relies on a scheduling server to distribute the tasks of one area to a suitable area according to the load of each area.
However, existing supercomputers rely on the TCP/IP protocol to build the entire system architecture and on software to implement task scheduling. As a result, synchronous communication (especially real-time synchronization) between different processes anywhere in the "supercomputer" system must be implemented in software over TCP/IP sockets (or another inter-process communication protocol). For short messages (most of which may be only a few bytes to a dozen or so bytes long), synchronization becomes very inefficient. At the same time, synchronous communication based on sockets (or another inter-process communication protocol) has poor timeliness (that is, the latency is too high, which is unfavorable for real-time control). A supercomputer architecture is therefore needed that does not rely on the TCP/IP protocol, schedules task load more reasonably and efficiently under very heavy load, and guarantees the timeliness of processes (tasks).
Summary of the invention
The purpose of the present invention is to provide a supercomputer architecture implementation method that does not rely on the TCP/IP protocol and schedules task load more reasonably and efficiently under very heavy load, thereby ensuring the timeliness of processes.
The technical solution provided by the present invention is as follows:
The present invention provides a supercomputer architecture implementation method, comprising the steps of:
constructing a process exchange unit that connects the nodes within the supercomputer;
establishing, through the process exchange unit, a routing path between a first node for sending a process and a second node for receiving the process in the supercomputer;
running, on the second node, the process sent by the first node.
Specifically, the process exchange unit is abbreviated as PRU (Process Routing Unit). It provides "routing" functionality but does not need a large, complex network communication protocol to support it. From the perspective of hardware implementation, the PRU no longer has any concrete notion of a "protocol"; instead, the required functions are folded directly into the individual instructions of the PRU instruction set. Any node in the "supercomputer" that uses the PRU to fulfil its role in the overall system only needs to send the corresponding instructions to the PRU. The PRU is therefore a special bus controller that connects all nodes in the "supercomputer" system with high throughput, high efficiency, high scalability, no software, no configuration, low power consumption, and low cost. The first node may be referred to as node S, and the second node as node T.
By constructing a process exchange unit that connects the nodes within the supercomputer, a process in one node that needs to be moved to another node to run can be transferred directly through the process exchange unit without relying on the TCP/IP protocol. Task load can therefore be scheduled more reasonably and efficiently under very heavy load, which helps ensure the timeliness of processes.
Further, establishing, through the process exchange unit, the routing path between the first node that sends the process and the second node that receives the process specifically includes:
receiving a route establishment instruction sent by the first node;
broadcasting the route establishment instruction to the other nodes of the supercomputer through the process exchange unit, and receiving the route establishment feedback instructions that the other nodes send in response;
selecting, through the process exchange unit, the node with the smallest routing time among the nodes that sent a route establishment feedback instruction as the second node;
establishing a routing path connecting the first node and the second node.
Specifically, the route establishment instruction is sent by node S and requests that a route be established to a node T. The instruction does not specify which node is node T (in fact, node S has no ability to specify a routing target); instead, the PRU decides how to establish the route according to the actual load of the nodes connected to it and the value of the routing time.
In addition, whether the routing path is established successfully or not, a "routing feedback" instruction is returned to node S to report the result. The routing feedback instruction is passed only within the routing network: once it reaches node S, the corresponding port of node S reports it to the relevant driver process in the form of an interrupt.
The route establishment instruction contains the following parameters: a process number parameter (the process number of the driver process); an RTF file number parameter (the file number of the RTF file); a kernel class number parameter (this indicates the kernel type for which the process served by the route was built; the kernel class number determines whether the process can execute on node T, so the PRU uses it to decide how to find a suitable node T with which to establish the route); a kernel version number parameter (the version number of the kernel identified by the kernel class number; only when both the kernel class number and the kernel version number meet the requirements can the process be guaranteed to run correctly on the target system); a code loading space parameter (the maximum storage space required when the execution code of the process served by the route is loaded, that is, installed, onto the target system; it is used to judge whether the target system has enough storage resources to load that execution code); a data loading space parameter (the minimum amount of storage that the process served by the route requires for variable storage; it is used to judge whether the target system has enough storage resources for the minimum running scenario of the process); and a TTR parameter (TTR stands for Time To Routing, that is, the "routing time").
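The parameter list above can be pictured as a small record plus a suitability check. The following is only a minimal sketch: the field names, the `NodeInfo` type, the use of exact equality for the kernel class and version, and the treatment of storage sizes as plain integers are all assumptions made for illustration and are not specified by the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RouteRequest:
    """Parameters of a route establishment instruction (names are hypothetical)."""
    process_id: int      # process number of the driver process on node S
    rtf_id: int          # file number of the RTF file
    kernel_class: int    # kernel type the process was built for
    kernel_version: int  # version of that kernel
    code_space: int      # storage needed to load the execution code
    data_space: int      # storage needed for variable storage
    ttr: int = 0         # routing time (Time To Routing)


@dataclass(frozen=True)
class NodeInfo:
    """Hypothetical description of a candidate node T."""
    kernel_class: int
    kernel_version: int
    free_code_space: int
    free_data_space: int


def can_host(node: NodeInfo, req: RouteRequest) -> bool:
    """Return True if the node meets every constraint carried by the request.

    Assumption: "meets the requirements" is modelled as exact equality of
    kernel class and version, plus enough free storage for code and data.
    """
    return (
        node.kernel_class == req.kernel_class
        and node.kernel_version == req.kernel_version
        and node.free_code_space >= req.code_space
        and node.free_data_space >= req.data_space
    )
```

A node that fails any one of the four checks would be expected to answer with a failure feedback rather than be considered as node T.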
The route establishment feedback instruction is the response to the route establishment instruction and may be issued by node T or by a PRU. It contains the following parameters: a process number parameter (the driver process number if the instruction is issued by node T; otherwise it must be 0); an RTF parameter (the file number of the RTF file); a TTR parameter (the number of PRUs traversed while the route was being established); a routing result parameter (the final result of route establishment, either "route establishment succeeded" or "route establishment failed"; on failure, the "error" parameter must be parsed to learn the reason); and a routing property parameter (when the routing result is "route establishment succeeded", the routing property is the load share of the node T connected directly or indirectly to the current port, and the PRU can use it to select the most suitable node T from among the many feedbacks as the best routing direction; when the routing result is "route establishment failed", the routing property indicates why routing on the current port failed. The possible reasons are: no node T is connected to this port; node T has insufficient hardware resources; node T is overloaded; neither the hardware resources nor the load of node T meets the requirements; the PRU has insufficient memory to cache the code and initial variables; the PRU has insufficient hardware management resources to create the RTF file; both PRU memory and PRU hardware management resources are insufficient).
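One way to picture the feedback instruction is as a record whose routing property is read differently on success and on failure. The sketch below is illustrative only: the text describes a single routing property parameter, whereas this model splits it into two optional fields (`load_share` and `reason`) for clarity, and every identifier is an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class FailureReason(Enum):
    """Failure causes listed for the routing property (labels are paraphrased)."""
    NO_NODE_T = "no node T is connected to this port"
    NODE_RESOURCES = "node T has insufficient hardware resources"
    NODE_LOAD = "node T is overloaded"
    NODE_RESOURCES_AND_LOAD = "node T fails on both resources and load"
    PRU_MEMORY = "PRU memory cannot cache code and initial variables"
    PRU_MANAGEMENT = "PRU management resources cannot create the RTF file"
    PRU_MEMORY_AND_MANAGEMENT = "PRU memory and management resources both insufficient"


@dataclass(frozen=True)
class RouteFeedback:
    process_id: int  # driver process number if issued by node T, otherwise 0
    rtf_id: int      # file number of the RTF file
    ttr: int         # number of PRUs traversed while establishing the route
    success: bool    # routing result
    load_share: Optional[float] = None      # routing property on success
    reason: Optional[FailureReason] = None  # routing property on failure

    def __post_init__(self) -> None:
        # The routing property must match the routing result.
        if self.success and self.load_share is None:
            raise ValueError("a successful feedback needs a load share")
        if not self.success and self.reason is None:
            raise ValueError("a failed feedback needs a failure reason")
```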
Further, selecting, through the process exchange unit, the node with the smallest routing time among the nodes that sent a route establishment feedback instruction as the second node also includes:
when several nodes share the smallest routing time, selecting the node with the smallest load share as the second node.
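The selection rule (smallest routing time first, smallest load share as the tie-breaker) amounts to a lexicographic minimum. The following is a minimal sketch under the assumption that each successful feedback has already been reduced to a hypothetical `Candidate` tuple; the type and function names are not from the text.

```python
from typing import NamedTuple, Optional, Sequence


class Candidate(NamedTuple):
    port: int          # PRU port on which the successful feedback arrived
    ttr: int           # routing time reported in the feedback
    load_share: float  # load share of the node T behind that port


def select_node_t(candidates: Sequence[Candidate]) -> Optional[Candidate]:
    """Pick the candidate with the smallest TTR, breaking ties by load share."""
    if not candidates:
        return None  # no node accepted the request, so no route can be built
    return min(candidates, key=lambda c: (c.ttr, c.load_share))
```

Comparing `(ttr, load_share)` tuples keeps the two criteria in a single expression and makes the priority order explicit.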
Further, receiving the route establishment instruction sent by the first node also includes:
creating, in the process exchange unit, a routing and task description file for recording process information;
establishing the routing path connecting the first node and the second node also includes:
writing routing information into the routing and task description file;
running, on the second node, the process sent by the first node specifically includes:
loading the process sent by the first node onto the second node by means of the routing and task description file, the second node then running the process.
Specifically, the routing and task description file (Routing & Task File, RTF) is the key file that the PRU uses to establish and implement routes. It records process-related information and routing path information, and the PRU's routing relies on this recorded content. RTF files are stored and managed only within PRUs. For any given routing path, the RTF is first created by node S: before a routing path is established, node S must create an RTF file in the PRU to which it is connected. At that point the RTF file is empty. Node S then sends the route establishment instructions to the PRU, and the PRU writes the corresponding routing and process information into the RTF file as it establishes the routing path for node S.
The RTF file must contain the following main information: the port number on which the current PRU connects to node S, or to the PRU in the direction of node S; the port number on which the current PRU connects to node T, or to the PRU in the direction of node T; if the current PRU is PRU-S (the source-side process exchange unit), the process number of the process on node S that uses this routing path; if the current PRU is PRU-T (the target-side process exchange unit), the process number of the process on node T that uses this routing path; the total number of PRUs traversed by the current routing path, that is, the TTR (Time To Routing, the "routing time"); and the routing timestamp, that is, the RTS.
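As an illustration of the contents listed above, an RTF entry held by one PRU could be modelled as a record whose fields start out unset and are filled in as the route is built. This is a sketch only; the field names and the choice of a floating-point timestamp are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RtfRecord:
    """One PRU's view of a route (hypothetical field names)."""
    port_toward_s: Optional[int] = None  # port facing node S or the PRU toward it
    port_toward_t: Optional[int] = None  # port facing node T or the PRU toward it
    process_id_s: Optional[int] = None   # set only when this PRU is PRU-S
    process_id_t: Optional[int] = None   # set only when this PRU is PRU-T
    ttr: Optional[int] = None            # total number of PRUs on the path
    rts: Optional[float] = None          # routing timestamp

    def is_empty(self) -> bool:
        """True for a freshly created RTF that records nothing yet."""
        return all(value is None for value in vars(self).values())
```

A record created by node S before route establishment would satisfy `is_empty()`, matching the description of the initial RTF as an empty file.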
Further, after the process sent by the first node is loaded onto the second node by means of the routing and task description file and run by the second node, the method also includes:
updating the routing timestamp in the routing and task description file whenever an instruction passes along the routing path;
when the difference between the routing timestamp and the current time of the process exchange unit exceeds a predetermined range, sending a wake-up instruction to the first node through the process exchange unit;
if the first node does not respond to the wake-up instruction, meaning that the relevant process on node S has terminated unexpectedly, automatically unloading the process on the second node and deleting the current routing path;
if the first node responds to the wake-up instruction and communicates with the corresponding second node but the second node does not reply, meaning that the relevant process on node T has terminated unexpectedly, sending a process unload instruction from the first node.
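The timestamp check and the two failure cases can be summarised as a small decision function. The sketch below is only one reading of the steps: the `Action` names, the `max_idle` threshold and the two callbacks that stand in for "node S answers the wake-up" and "node T replies to node S" are hypothetical.

```python
from enum import Enum
from typing import Callable


class Action(Enum):
    NONE = "route is healthy, nothing to do"
    UNLOAD_T_AND_DELETE_ROUTE = "unload the process on node T and delete the route"
    S_SENDS_UNLOAD = "node S sends a process unload instruction"


def check_route(rts: float, now: float, max_idle: float,
                s_responds: Callable[[], bool],
                t_responds: Callable[[], bool]) -> Action:
    """Decide what to do with a route based on its routing timestamp (RTS)."""
    if now - rts <= max_idle:
        return Action.NONE                       # recent traffic on the path
    if not s_responds():                         # wake-up sent to node S, no answer
        return Action.UNLOAD_T_AND_DELETE_ROUTE
    if not t_responds():                         # node S contacted node T, no reply
        return Action.S_SENDS_UNLOAD
    return Action.NONE                           # both ends are still alive
```

The callbacks are only invoked once the idle threshold is exceeded, which mirrors the order of the steps: the wake-up instruction is sent first, and node T is probed only if node S answers.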
Further, after the process unload instruction is sent from the first node, the method also includes:
receiving the process unload instruction sent by the first node;
deleting the routing path between the first node and the second node.
Further, after the routing path connecting the first node and the second node is established, the method also includes:
receiving an "install execution code and execute" instruction sent by the first node, whereupon the second node loads the process execution code and starts running the process immediately after loading completes;
or;
receiving an "install execution code, initialize and execute" instruction sent by the first node, whereupon the second node loads the process execution code and the initialization variables and starts running the process immediately after loading completes.
Further, after the routing path connecting the first node and the second node is established, the method also includes:
exchanging data between the first node and the second node by means of short messages;
and/or;
exchanging data between the first node and the second node by means of file messages.
Further, after the process sent by the first node is run on the second node, the method also includes:
after the process finishes running, receiving a "return without message" instruction sent by the second node, and deleting the routing information as that instruction is passed along;
or;
after the process finishes running, receiving a "return with message" instruction sent by the second node, and deleting the routing information as that instruction is passed along.
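The teardown described above, in which routing information is removed while the return instruction travels back toward node S, might be sketched as follows. This is a loose illustration: modelling each PRU's RTF store as a dictionary, keying it by a shared `rtf_id`, and the function name are all assumptions.

```python
from typing import Dict, List, Optional


def propagate_return(path: List[Dict[int, object]], rtf_id: int,
                     message: Optional[str] = None) -> Optional[str]:
    """Carry a return instruction along the path from PRU-T to PRU-S.

    `path` lists the RTF tables of the PRUs in travel order. Each PRU drops
    its entry for the route as the instruction passes through it.
    """
    for rtf_table in path:
        rtf_table.pop(rtf_id, None)  # delete this PRU's routing information
    # A "return with message" delivers the message to node S; a
    # "return without message" delivers nothing.
    return message
```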
Further, before the process exchange unit connecting the nodes within the supercomputer is constructed, the method also includes the steps of:
determining, during software development, whether the user software supports dynamic deployment;
if it does, encapsulating the communication between processes in the form of short messages and/or file messages by means of the compiler.
According to the method for implementing a supercomputer architecture provided by the present invention, a process exchange unit connecting the nodes within the supercomputer is constructed, so that a process on one node that needs to run on another node can be transferred directly through the process exchange unit without relying on the TCP/IP protocol. Task load can therefore be scheduled more reasonably and efficiently under very heavy load, which helps ensure the timeliness of processes.
Description of the Drawings
The above characteristics, technical features, advantages and implementation of the solution are further described below, in a clear and understandable manner, through preferred embodiments with reference to the accompanying drawings.
FIG. 1 is a schematic diagram of the overall flow of an embodiment of the present invention;
FIG. 2 is a schematic diagram of one connection between a process exchange unit and nodes according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of another connection between a process exchange unit and nodes according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of yet another connection between a process exchange unit and nodes according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a process exchange unit according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a supercomputer architecture according to an embodiment of the present invention;
FIG. 7 is another schematic structural diagram of a supercomputer architecture according to an embodiment of the present invention.
Detailed Description
In order to explain the embodiments of the present invention or the technical solutions of the prior art more clearly, specific embodiments of the present invention are described below with reference to the accompanying drawings. Obviously, the drawings described below show only some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings and other embodiments from them without creative effort.
To keep the drawings concise, each figure shows only schematically the parts relevant to the present invention; they do not represent the actual structure of the product. In addition, to keep the drawings concise and easy to understand, where several components in a figure have the same structure or function, only one of them is drawn or labelled. In this document, "one" means not only "exactly one" but also covers the case of "more than one".
Embodiment 1
In one embodiment of the present invention, as shown in FIG. 1, the present invention provides a method for implementing a supercomputer architecture, comprising the steps of:
S1. Construct a process exchange unit that connects the nodes within the supercomputer.
Specifically, the process exchange unit, abbreviated PRU (Process Routing Unit), has a "routing" function but does not need a large and complex network communication protocol to support it. From the hardware implementation point of view, the PRU has no concept of a specific "protocol"; instead, the required functions are built directly into the individual instructions of the PRU instruction set. Any node in the supercomputer that uses the PRU to fulfil its role in the overall system only needs to send the corresponding instructions to the PRU. The PRU is therefore a special bus controller that connects all nodes within the supercomputer system with high throughput, high efficiency, high scalability, no software, no configuration, low power consumption and low cost.
S2. Establish, through the process exchange unit, a routing path between a first node in the supercomputer that sends a process and a second node that receives the process.
The first node may be called node S, and the second node may be called node T.
Preferably, as shown in FIG. 2 to FIG. 4, each port of the process exchange unit can be connected to a corresponding node of the supercomputer, which facilitates the transfer of processes between nodes; and/or a port of the process exchange unit can be connected to a port of another process exchange unit, which makes the supercomputer architecture easy to expand.
As shown in FIG. 5, in this embodiment the process exchange unit has eight ports, port 0 to port 7. Externally, each of these ports can connect to a "computing node" or to another PRU. Each port block contains three modules: PRE, EI and MI. PRE stands for Process Routing Engine, the routing engine owned independently by each port, which executes and responds to the instructions of the PRU instruction set. EI stands for EFS Interface and implements the interface to the EFS module shown in the figure. MI stands for Matrix Interface and implements the interface to the Matrix module shown in the figure. The Matrix in the figure is a multi-layer bus controller that interconnects all ports within the PRU.
S3. Run, on the second node, the process sent by the first node.
By constructing a process exchange unit that connects the nodes within the supercomputer, a process on one node that needs to run on another node can be transferred directly through the process exchange unit without relying on the TCP/IP protocol. Task load can therefore be scheduled more reasonably and efficiently under very heavy load, which helps ensure the timeliness of processes.
Specifically, the supercomputer architecture of this solution can be as shown in FIG. 6. It can also be expanded continuously on that basis, for example in the manner shown in FIG. 7, so that the architecture of the supercomputer keeps growing.
Preferably, before the process exchange unit connecting the nodes within the supercomputer is constructed, the method also includes the steps of:
determining, during software development, whether the user software supports dynamic deployment; if it does, encapsulating the communication between processes in the form of short messages and/or file messages by means of the compiler.
When deploying processes, this solution achieves dynamic deployment. "Dynamic deployment" is defined in contrast to "static deployment". In static deployment, before a user task enters the internal network of the supercomputer, large tasks that consume substantial resources or are highly concurrent must be partitioned, and the network possibly configured, according to the hardware of the supercomputer and the characteristics of its internal network. Static deployment therefore requires user programs to be adapted differently for different supercomputers, which gives poor portability and compatibility, and the adaptation that must be completed before a task is submitted makes the development of very large or highly concurrent tasks more complex and more difficult.
The "dynamic deployment" achieved by the present invention means that, regardless of how much hardware a user program consumes or how concurrent it is, the user does not need to perform any adaptation for a particular supercomputer when submitting a task, and does not need to configure the internal network of the supercomputer at all. After any user program is submitted to the supercomputer, the system can automatically deploy it with process-level precision. "Process-level precision" means that the system can dynamically deploy any child process that the user program creates at any moment during execution, according to the current load of the nodes.
In addition, when process deployment changes from static to dynamic, the methods of inter-process communication and synchronization change. The method of the present invention therefore requires that, during software development, it be determined whether the software supports dynamic deployment. When it does, the compiler automatically encapsulates the inter-process communication and synchronization methods of static deployment in the form of the short messages and/or file messages of the PRU instruction set, so that process communication and synchronization are functionally the same as under static deployment. Clearly, the preparation that the present invention requires of a user program has nothing to do with any particular supercomputer; it is completely independent of the supercomputer hardware and its internal network and introduces no portability problems.
Specifically, if a user program supports dynamic deployment, then when a process is compiled the compiler inserts a branch into every routine related to inter-process communication or synchronization. The branch determines whether the program is currently running in PRU-S (the source-side process exchange unit) or in PRU-T (the target-side process exchange unit). If it is running in PRU-S, it uses the same communication and synchronization routines as under static deployment; if it is running in PRU-T, it uses the communication and synchronization routines encapsulated with PRU instructions. Among the user processes of the supercomputer, a process deployed to run in PRU-T is marked as an "external process"; a process that has not been deployed to another node is a "local process", which is the default marking.
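The compiler-inserted branch can be pictured as a dispatch on the process marking. The sketch below is illustrative only: the `Placement` enum, the `send` wrapper and the two callables that stand in for the ordinary communication routine and the PRU-encapsulated message routine are hypothetical and do not represent an actual compiler output.

```python
from enum import Enum
from typing import Callable, TypeVar

R = TypeVar("R")


class Placement(Enum):
    LOCAL = "local process"        # default marking, not deployed elsewhere
    EXTERNAL = "external process"  # deployed to another node via the PRU


def send(placement: Placement, payload: bytes,
         local_ipc: Callable[[bytes], R],
         pru_message: Callable[[bytes], R]) -> R:
    """Route one inter-process message according to the process marking."""
    if placement is Placement.EXTERNAL:
        # Deployed process: use the short-message or file-message wrapper.
        return pru_message(payload)
    # Local process: use the same routine as under static deployment.
    return local_ipc(payload)
```

Because the branch is taken at run time, the same compiled program can run either locally or on a remote node without being rebuilt.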
Embodiment 2
In one embodiment of the present invention, on the basis of Embodiment 1, establishing, through the process exchange unit, the routing path between the first node that sends the process and the second node that receives the process specifically includes:
S21. Receive a route establishment instruction sent by the first node.
Specifically, the route establishment instruction is sent by node S and requests that a route be established to a node T. The instruction does not specify which node is node T (in fact, node S has no ability to specify a routing target); instead, the PRU decides how to establish the route according to the actual load of the nodes connected to it and the value of the routing time.
In addition, whether the routing path is established successfully or not, a "routing feedback" instruction is returned to node S to report the result. The routing feedback instruction is passed only within the routing network: once it reaches node S, the corresponding port of node S reports it to the relevant driver process in the form of an interrupt.
The route establishment instruction contains the following parameters: a process number parameter (the process number of the driver process); an RTF file number parameter (the file number of the RTF file); a kernel class number parameter (this indicates the kernel type for which the process served by the route was built; the kernel class number determines whether the process can execute on node T, so the PRU uses it to decide how to find a suitable node T with which to establish the route); a kernel version number parameter (the version number of the kernel identified by the kernel class number; only when both the kernel class number and the kernel version number meet the requirements can the process be guaranteed to run correctly on the target system); a code loading space parameter (the maximum storage space required when the execution code of the process served by the route is loaded, that is, installed, onto the target system; it is used to judge whether the target system has enough storage resources to load that execution code); a data loading space parameter (the minimum amount of storage that the process served by the route requires for variable storage; it is used to judge whether the target system has enough storage resources for the minimum running scenario of the process); and a TTR parameter (TTR stands for Time To Routing, that is, the "routing time").
S22. Broadcast the route establishment instruction to the other nodes of the supercomputer through the process exchange unit, and receive the route establishment feedback instructions that those nodes send in response.
The route establishment feedback instruction is the response to the route establishment instruction and may be issued by "node T" or by a PRU. It contains the following parameters: the process number parameter (the driving process number if "node T" issues the instruction; otherwise it must be 0); the RTF parameter (the file number of the RTF file); the TTR parameter (the number of PRUs traversed while the route was being established); the route result parameter (the final outcome of route establishment, either "route establishment succeeded" or "route establishment failed"; on failure, the "error" parameter must be parsed to learn the cause); and the route property parameter. When the route result is "route establishment succeeded", the route property gives the load proportion of the "node T" connected directly or indirectly to the current port, and the PRU can use it to pick the most suitable "node T" from the many feedbacks as the best routing direction. When the route result is "route establishment failed", the route property gives the reason routing failed on the current port, which is one of the following: no "node T" is connected to this port; "node T" has insufficient hardware resources; "node T" is overloaded; neither the hardware resources nor the load of "node T" meets the requirements; the PRU has too little memory to cache the code and initial variables; the PRU has too few hardware management resources to create an RTF file; or the PRU is short of both memory and hardware management resources.
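The feedback instruction described above can be sketched as a record whose route property is read differently depending on the route result. The Python below is a minimal illustration, not a wire format: the names `FailureReason` and `RouteFeedback`, the enum member names, and the decision to split the single route property parameter into two optional fields (`load_proportion` and `failure`) are assumptions made for readability.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class FailureReason(Enum):
    """The seven failure causes listed in the text (member names assumed)."""
    NO_NODE_T_ON_PORT = auto()
    NODE_T_RESOURCES_INSUFFICIENT = auto()
    NODE_T_OVERLOADED = auto()
    NODE_T_RESOURCES_AND_LOAD = auto()
    PRU_MEMORY_INSUFFICIENT = auto()
    PRU_MANAGEMENT_INSUFFICIENT = auto()
    PRU_MEMORY_AND_MANAGEMENT = auto()


@dataclass(frozen=True)
class RouteFeedback:
    process_id: int                       # driving process number if node T sent it, else 0
    rtf_file_id: int                      # file number of the RTF file
    ttr: int                              # number of PRUs traversed
    success: bool                         # route result
    load_proportion: Optional[float] = None   # route property on success
    failure: Optional[FailureReason] = None   # route property on failure

    def __post_init__(self) -> None:
        # Exactly one reading of the route property applies to each result.
        if self.success and (self.load_proportion is None or self.failure is not None):
            raise ValueError("successful feedback needs a load proportion only")
        if not self.success and (self.failure is None or self.load_proportion is not None):
            raise ValueError("failed feedback needs a failure reason only")
```

Splitting the route property into two fields makes an inconsistent feedback record impossible to construct, at the cost of departing from the single-parameter layout the text describes.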
S23. Through the process exchange unit, select, from among the other nodes that sent a route establishment feedback instruction, the node with the smallest routing time as the second node.
Preferably, selecting the node with the smallest routing time as the second node through the process exchange unit further comprises:
When several nodes share the smallest routing time, the node with the smallest load proportion is selected as the second node.
S24. Establish a routing path connecting the first node and the second node.
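The selection rule in S23, with its preferred tie-break, amounts to taking a minimum over the pair (routing time, load proportion). A minimal Python sketch follows. The `Feedback` type and its field names are hypothetical, and discarding failed feedbacks before selection is an assumption; the text states only that the second node is chosen among the nodes that sent feedback.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Feedback:
    node: str               # identifier of the responding node (assumed)
    ttr: int                # routing time: number of PRUs traversed
    success: bool           # route result
    load_proportion: float  # load proportion of the node, meaningful on success


def select_second_node(feedbacks: Iterable[Feedback]) -> Optional[Feedback]:
    """Pick the feedback with the smallest TTR, breaking ties by smallest load."""
    candidates = [f for f in feedbacks if f.success]
    if not candidates:
        return None  # no node can accept the process
    # Tuples compare element by element, so load only matters when TTR is equal.
    return min(candidates, key=lambda f: (f.ttr, f.load_proportion))
```

For example, given successful feedbacks with (TTR, load) of (3, 0.1), (2, 0.7) and (2, 0.4), the third is selected: the two nodes with TTR 2 tie on routing time, and the lighter-loaded one wins.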
Embodiment 3
In one embodiment of the present invention, on the basis of Embodiment 2, receiving the route establishment instruction sent by the first node further comprises:
Creating, in the process exchange unit, a routing and task description file for recording process information.
Specifically, the routing and task description file (Routing & Task File, RTF for short) is an important file that the PRU uses to establish and implement routes. It records information about the "process" and about the routing path. The PRU's implementation of a route depends on the content (routing information) recorded in the RTF file, and the RTF is stored and managed only in the PRU. For a given routing path, the RTF is first created by "node S": before any routing path is established, "node S" must create an RTF file in the PRU to which it is connected. At that point the RTF file is empty and records nothing. "Node S" then sends the route establishment instructions to the PRU, and the PRU writes the corresponding routing information and process information into the RTF file as it establishes the routing path for "node S".
The RTF file must contain the following main information: the port number through which "node S", or a PRU from the direction of "node S", connects to the current PRU; the port number through which "node T", or a PRU from the direction of "node S", connects to the current PRU; if the current PRU is PRU-S (the source-end process exchange unit), the process number of the process in "node S" that uses this routing path; if the current PRU is PRU-T (the target-end process exchange unit), the process number of the process in "node T" that uses this routing path; the total number of PRUs traversed by the current routing path, namely the TTR (Time To Routing, that is, "routing time"); and the routing timestamp, namely the RTS.
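The RTF contents listed above can be pictured as a record that starts empty and is filled in when the PRU establishes the route. The Python sketch below is illustrative only: the class name `RoutingTaskFile`, its field names, the `write_route` method, and the use of a floating-point number for the RTS are assumptions, and the disclosure does not define an on-disk or in-memory layout.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RoutingTaskFile:
    """One RTF as held by a single PRU (layout assumed for illustration)."""
    file_id: int
    source_side_port: Optional[int] = None   # port facing node S or the previous PRU
    target_side_port: Optional[int] = None   # port facing node T or the next PRU
    source_process_id: Optional[int] = None  # set only when this PRU is PRU-S
    target_process_id: Optional[int] = None  # set only when this PRU is PRU-T
    ttr: Optional[int] = None                # total number of PRUs on the path
    rts: Optional[float] = None              # routing timestamp

    def is_empty(self) -> bool:
        """True for a freshly created RTF that records nothing yet."""
        return all(
            value is None
            for value in (
                self.source_side_port,
                self.target_side_port,
                self.source_process_id,
                self.target_process_id,
                self.ttr,
                self.rts,
            )
        )

    def write_route(self, source_side_port: int, target_side_port: int,
                    ttr: int, now: float) -> None:
        """Record routing information when the PRU establishes the path."""
        self.source_side_port = source_side_port
        self.target_side_port = target_side_port
        self.ttr = ttr
        self.rts = now
```

In this sketch, creating `RoutingTaskFile(file_id=3)` corresponds to "node S" creating the empty file, and `write_route` corresponds to the PRU writing routing information during path establishment.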
Establishing the routing path connecting the first node and the second node further comprises:
Writing routing information into the routing and task description file.
Running, by the second node, the process sent by the first node specifically comprises:
Loading the process sent by the first node into the second node by means of the routing and task description file, and running the process on the second node.
Embodiment 4
In one embodiment of the present invention, on the basis of Embodiment 3, after the process sent by the first node has been loaded into the second node by means of the routing and task description file and run by the second node, the method further comprises:
When an instruction is present on the routing path, the routing timestamp in the routing and task description file is updated. When the difference between the routing timestamp and the current time of the process exchange unit exceeds a predetermined range, the process exchange unit sends a wake-up instruction to the first node. If the first node does not respond to the wake-up instruction, meaning that the relevant process on "node S" has terminated unexpectedly, the process in the second node is unloaded automatically and the current routing path is deleted. If the first node responds to the wake-up instruction and communicates with the corresponding second node but the second node does not reply, meaning that the relevant process on "node T" has terminated unexpectedly, the first node sends a process unload instruction.
Preferably, after the first node sends the process unload instruction, the method further comprises: receiving the process unload instruction sent by the first node; and deleting the routing path between the first node and the second node.
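The timestamp check and the two failure branches described in this embodiment can be sketched as a single decision function. The Python below is a minimal illustration under stated assumptions: the `Action` enum, the `check_route` function, the `max_idle` threshold (standing in for the "predetermined range"), and the two callables that model whether each node responds are all hypothetical.

```python
from enum import Enum
from typing import Callable


class Action(Enum):
    NONE = "none"
    UNLOAD_TARGET_AND_DELETE_ROUTE = "unload_target_and_delete_route"
    SOURCE_SENDS_UNLOAD = "source_sends_unload"


def check_route(
    rts: float,
    now: float,
    max_idle: float,
    wake_source: Callable[[], bool],
    probe_target: Callable[[], bool],
) -> Action:
    """Decide what to do with a route given its routing timestamp (RTS).

    wake_source() models sending the wake-up instruction to the first node and
    returns True if it responds; probe_target() models the first node then
    contacting the second node and returns True if it replies.
    """
    if now - rts <= max_idle:
        return Action.NONE  # route was used recently; no wake-up is sent
    if not wake_source():
        # Process on node S terminated unexpectedly.
        return Action.UNLOAD_TARGET_AND_DELETE_ROUTE
    if not probe_target():
        # Process on node T terminated unexpectedly.
        return Action.SOURCE_SENDS_UNLOAD
    return Action.NONE
```

Passing the probes as callables keeps the sketch faithful to the order of events: the wake-up instruction is sent only once the timestamp is stale, and the second node is contacted only after the first node has responded.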
Embodiment 5
In one embodiment of the present invention, on the basis of any of the above embodiments, after the routing path connecting the first node and the second node is established, the method further comprises: receiving an "install execution code and execute" instruction sent by the first node, whereupon the second node loads the process execution code and starts running the process as soon as loading completes; or receiving an "install execution code, initialize, and execute" instruction sent by the first node, whereupon the second node loads the process execution code and the initialization variables and starts running the process as soon as loading completes.
Preferably, after the routing path connecting the first node and the second node is established, the method further comprises: exchanging data between the first node and the second node by means of short messages; and/or exchanging data between the first node and the second node by means of file messages.
Further preferably, after the second node runs the process sent by the first node, the method further comprises: after the process finishes running, receiving a return-without-message instruction sent by the second node, and deleting the routing information as that instruction is passed along; or, after the process finishes running, receiving a return-with-message instruction sent by the second node, and deleting the routing information as that instruction is passed along. The return-with-message instruction differs from the return-without-message instruction in that it carries a short message as it is passed along.
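The clean-up performed by the two return instructions can be sketched as a walk back along the routing path that deletes the routing information at each hop. The Python below is only an illustration: modelling each PRU as a dictionary keyed by RTF file number, using the same file number at every PRU on the path, and the names `PruTable` and `propagate_return` are assumptions not found in this disclosure.

```python
from typing import Dict, List, Optional

# Each PRU is modelled as a table mapping an RTF file number to its record.
PruTable = Dict[int, dict]


def propagate_return(
    path: List[PruTable],
    rtf_file_id: int,
    short_message: Optional[bytes] = None,
) -> Optional[bytes]:
    """Pass a return instruction from PRU-T back to PRU-S, deleting routing info.

    `path` is ordered from PRU-S to PRU-T, so the instruction travels through it
    in reverse. With short_message=None this models the return-without-message
    instruction; otherwise the short message is delivered to the first node.
    """
    for pru_table in reversed(path):
        pru_table.pop(rtf_file_id, None)  # delete this hop's routing information
    return short_message
```

Because deletion happens while the instruction is in transit, no separate tear-down message is needed once the process has finished.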
It should be noted that the above embodiments can be freely combined as required. The above are only preferred embodiments of the present invention. It should be pointed out that those of ordinary skill in the art can make several improvements and refinements without departing from the principle of the present invention, and these improvements and refinements shall also be regarded as falling within the protection scope of the present invention.

Claims (10)

  1. A method for implementing a supercomputer architecture, characterized by comprising the steps of:
    constructing a process exchange unit that connects the nodes within the supercomputer;
    establishing, through the process exchange unit, a routing path between a first node of the supercomputer for sending a process and a second node for receiving the process;
    running, by the second node, the process sent by the first node.
  2. The method for implementing a supercomputer architecture according to claim 1, characterized in that establishing, through the process exchange unit, the routing path between the first node of the supercomputer for sending a process and the second node for receiving the process specifically comprises:
    receiving a route establishment instruction sent by the first node;
    broadcasting the route establishment instruction to the other nodes of the supercomputer through the process exchange unit, and receiving the route establishment feedback instructions sent in response by the other nodes;
    selecting, through the process exchange unit, from among the other nodes that sent the route establishment feedback instruction, the node with the smallest routing time as the second node;
    establishing a routing path connecting the first node and the second node.
  3. The method for implementing a supercomputer architecture according to claim 2, characterized in that selecting, through the process exchange unit, from among the other nodes that sent the route establishment feedback instruction, the node with the smallest routing time as the second node further comprises:
    when several nodes share the smallest routing time, selecting the node with the smallest load proportion as the second node.
  4. The method for implementing a supercomputer architecture according to claim 2, characterized in that receiving the route establishment instruction sent by the first node further comprises:
    creating, in the process exchange unit, a routing and task description file for recording process information;
    establishing the routing path connecting the first node and the second node further comprises:
    writing routing information into the routing and task description file;
    running, by the second node, the process sent by the first node specifically comprises:
    loading the process sent by the first node into the second node by means of the routing and task description file, and running the process on the second node.
  5. The method for implementing a supercomputer architecture according to claim 4, characterized in that, after the process sent by the first node is loaded into the second node by means of the routing and task description file and run by the second node, the method further comprises:
    when an instruction is present on the routing path, updating the routing timestamp in the routing and task description file;
    when the difference between the routing timestamp and the current time of the process exchange unit exceeds a predetermined range, sending a wake-up instruction to the first node through the process exchange unit;
    if the first node does not respond to the wake-up instruction, automatically unloading the process in the second node and deleting the current routing path;
    if the first node responds to the wake-up instruction and communicates with the corresponding second node, but the second node does not reply, sending a process unload instruction by the first node.
  6. The method for implementing a supercomputer architecture according to claim 5, characterized in that, after the first node sends the process unload instruction, the method further comprises:
    receiving the process unload instruction sent by the first node;
    deleting the routing path between the first node and the second node.
  7. The method for implementing a supercomputer architecture according to claim 2, characterized in that, after the routing path connecting the first node and the second node is established, the method further comprises:
    receiving an "install execution code and execute" instruction sent by the first node, the second node loading the process execution code and starting to run the process as soon as loading completes;
    or;
    receiving an "install execution code, initialize, and execute" instruction sent by the first node, the second node loading the process execution code and the initialization variables and starting to run the process as soon as loading completes.
  8. The method for implementing a supercomputer architecture according to claim 2, characterized in that, after the routing path connecting the first node and the second node is established, the method further comprises:
    exchanging data between the first node and the second node by means of short messages;
    and/or;
    exchanging data between the first node and the second node by means of file messages.
  9. The method for implementing a supercomputer architecture according to any one of claims 1-8, characterized in that, after the second node runs the process sent by the first node, the method further comprises:
    after the process finishes running, receiving a return-without-message instruction sent by the second node, and deleting routing information as the return-without-message instruction is passed along;
    or;
    after the process finishes running, receiving a return-with-message instruction sent by the second node, and deleting routing information as the return-with-message instruction is passed along.
  10. The method for implementing a supercomputer architecture according to claim 8, characterized in that, before the process exchange unit that connects the nodes within the supercomputer is constructed, the method further comprises the steps of:
    during software development, determining whether the user software supports dynamic deployment;
    if it does, encapsulating, by the compiler, the communication between processes in the form of short messages and/or file messages.
PCT/CN2021/071347 2021-01-13 2021-01-13 Supercomputer architecture implementation method WO2022150995A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/071347 WO2022150995A1 (en) 2021-01-13 2021-01-13 Supercomputer architecture implementation method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/071347 WO2022150995A1 (en) 2021-01-13 2021-01-13 Supercomputer architecture implementation method

Publications (1)

Publication Number Publication Date
WO2022150995A1 true WO2022150995A1 (en) 2022-07-21

Family

ID=82446358

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/071347 WO2022150995A1 (en) 2021-01-13 2021-01-13 Supercomputer architecture implementation method

Country Status (1)

Country Link
WO (1) WO2022150995A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101986272A (en) * 2010-11-05 2011-03-16 北京大学 Task scheduling method under cloud computing environment
CN102207883A (en) * 2011-06-01 2011-10-05 华中科技大学 Transaction scheduling method of heterogeneous distributed real-time system
US20140164831A1 (en) * 2010-12-23 2014-06-12 Mongodb, Inc. Method and apparatus for maintaining replica sets
CN105718355A (en) * 2016-01-21 2016-06-29 中国人民解放军国防科学技术大学 Online learning-based super computer node active fault-tolerant method

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21918213

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21918213

Country of ref document: EP

Kind code of ref document: A1