CN111913907A

CN111913907A - FPGA clustering method, FPGA chip and FPGA clustering system

Info

Publication number: CN111913907A
Application number: CN202010814108.XA
Authority: CN
Inventors: 杜聚有; 李光源
Original assignee: Shanghai Jucheng Ruixun Technology Co ltd
Current assignee: Shanghai Jucheng Ruixun Technology Co ltd
Priority date: 2020-08-13
Filing date: 2020-08-13
Publication date: 2020-11-10

Abstract

The invention provides an FPGA (field-programmable gate array) clustering method, an FPGA chip and an FPGA cluster system. Each FPGA node of the FPGA cluster comprises a first storage area and a second storage area, and resource information of the second storage area is recorded in the first storage area. The FPGA nodes comprise a master node and a plurality of child nodes. The FPGA clustering method comprises the following steps: when the resources of the second storage area of a first child node are insufficient, the first child node sends a resource allocation request to the master node; the first child node sends an access request to a second child node, and the second child node establishes an access channel with the first child node; the first child node sends a service completion signal to the master node; and each node updates the content of its first storage area according to a preset rule. With this configuration, the resources of the FPGA chips in an entire local area network can be integrated and reasonably allocated, so that these resources are used optimally.

Description

FPGA clustering method, FPGA chip and FPGA clustering system

Technical Field

The invention relates to the field of computers and communication, in particular to an FPGA (field programmable gate array) clustering method, an FPGA chip and an FPGA clustering system.

Background

With the development of computer technology, the CPU alone can no longer meet the performance requirements of high-performance computing software, leaving a gap between demand and performance. Heterogeneous computing, in which a dedicated coprocessor is used to improve processing performance, has become the first choice in the industry. Compared with a GPU (graphics processing unit), an FPGA (field-programmable gate array) has the advantages of low power consumption and low latency; compared with an ASIC, an FPGA is programmable, so the homogeneity of a data center can be maintained. Computer devices with a CPU + FPGA configuration are therefore widely used.

As FPGAs are widely used, a local area network may contain a considerable number of computer devices that use FPGA chips. However, the FPGA chips in these devices perform computation independently. On the one hand, this cannot meet very large computation demands; on the other hand, it wastes resources while chips sit idle. Therefore, how to integrate the resources of the FPGA chips in the whole local area network and allocate them reasonably, so that these resources are used optimally, is an important problem to be solved in the art.

Disclosure of Invention

The invention aims to provide an FPGA (field programmable gate array) clustering method, an FPGA chip and an FPGA clustering system so as to solve the problem of integration of FPGA chip resources in a local area network.

In order to solve the technical problem, the present invention provides an FPGA clustering method, each FPGA node includes a first storage area and a second storage area, resource information of the second storage area is recorded in the first storage area, the FPGA node includes a master node and a plurality of child nodes, the FPGA clustering method includes the following steps:

step S1: when the resources of the second storage area of a first child node are insufficient, the first child node sends a resource allocation request to the master node, the master node allocates the second storage area of a second child node to the first child node according to the information in the first storage area of the master node, and the master node updates the content of its first storage area;

step S2: the first child node sends an access request to the second child node, the second child node establishes an access channel with the first child node, and the second child node updates the content of its first storage area;

step S3: the first child node sends a service completion signal to the master node, and the master node updates the content of its first storage area.

Optionally, step S2 further includes: the second child node sends the updated content of its first storage area to the master node, and the master node updates the content of its own first storage area accordingly.

Optionally, the step S3 further includes: the first sub-node sends a service completion signal to the second sub-node, and the second sub-node updates the content of the first storage area of the second sub-node, or the master node sends a service completion signal to the second sub-node, and the second sub-node updates the content of the first storage area of the second sub-node.

Optionally, the access channel described in step S2 is established using RDMA technology.

Optionally, the first storage area of the master node includes resource information of the second storage areas of all the FPGA nodes; the first storage area of the child node includes resource information of the second storage area of the child node.

Optionally, in one of the FPGA nodes, the resource information includes an address corresponding to the second storage area and a use state of the second storage area.

Optionally, the use state includes:

a first state representing that the indicated second storage area is free;

a second state representing that the indicated second storage area is read by at least one of the FPGA nodes;

a third state representing that the indicated second storage area has been exclusively written by a node and is read by at least one node; and

and a fourth state representing that the indicated second storage area has been exclusively written to by a node.

Optionally, the size of the first storage area is set in advance or automatically adjusted according to an actual operating condition, and the size of the second storage area is set in advance or automatically adjusted according to an actual operating condition.

In order to solve the above technical problem, according to a second aspect of the present invention, there is provided an FPGA chip, including a first logic module and a network card supporting RDMA technology, where the first logic module is in communication connection with the network card to form an FPGA node, and the FPGA node is used in the FPGA clustering method and is in communication connection with other FPGA nodes in an FPGA cluster through the network card.

Optionally, the FPGA chip includes a memory module, and the memory module is divided into a first storage area and a second storage area.

Optionally, the FPGA chip includes a Cache module, configured to buffer the content of the first storage area.

Optionally, the FPGA chip includes a second logic module and a memory module, the second logic module is composed of memory granules or a flash, the second logic module is configured as a first storage area, and the memory module is configured as a second storage area.

In order to solve the technical problem, according to a third aspect of the present invention, there is provided an FPGA cluster system, including two or more FPGA chips, where one of the FPGA chips is configured as a master node, and the remaining FPGA chips are configured as child nodes.

Compared with the prior art, in the FPGA clustering method, the FPGA chip and the FPGA cluster system provided by the invention, each FPGA node comprises a first storage area and a second storage area, and the first storage area records the resource information of the second storage area. When the resources of the second storage area of a first child node are insufficient, the first child node sends a resource allocation request to the master node, and the master node allocates the second storage area of a second child node to the first child node according to the information in the first storage area of the master node and updates the content of its first storage area. Then, the first child node sends an access request to the second child node, the second child node establishes an access channel with the first child node, and the second child node updates the content of its first storage area. Finally, the first child node sends a service completion signal to the master node, and the master node updates the content of its first storage area. With this configuration, the resources of the FPGA chips in an entire local area network can be integrated and reasonably allocated, so that these resources are used optimally.

Drawings

It will be appreciated by those skilled in the art that the drawings are provided for a better understanding of the invention and do not constitute any limitation to the scope of the invention. Wherein:

fig. 1 is a schematic diagram of an FPGA chip and an FPGA cluster system according to a first embodiment of the present invention;

fig. 2 is a schematic flowchart of an FPGA clustering method according to a first embodiment of the present invention;

FIG. 3a is a logic diagram of switching operation states according to a first embodiment of the present invention;

FIG. 3b is a schematic diagram of another operating state switching logic according to the first embodiment of the present invention;

fig. 4 is a schematic diagram of an FPGA chip according to a second embodiment of the present invention.

In the drawings:

1 - FPGA chip; 2 - network bus; 3 - read/write request judgment unit; 4 - read-request upper-limit judgment unit.

Detailed Description

To further clarify the objects, advantages and features of the present invention, a more particular description of the invention will be rendered by reference to specific embodiments thereof which are illustrated in the appended drawings. It is to be noted that the drawings are in greatly simplified form and are not to scale, but are merely intended to facilitate and clarify the explanation of the embodiments of the present invention. Further, the structures illustrated in the drawings are often part of actual structures. In particular, the drawings may have different emphasis points and may sometimes be scaled differently.

As used in this application, the singular forms "a", "an" and "the" include plural referents, the term "or" is generally used in a sense that includes "and/or", the terms "a" and "an" are generally used in a sense that includes "at least one", and the term "at least two" is generally used in a sense that includes "two or more". The terms "first", "second" and "third" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or as implicitly specifying the number of technical features indicated. Thus, a feature defined as "first", "second" or "third" may explicitly or implicitly include one or at least two of such features. "One end" and "the other end", as well as "proximal end" and "distal end", generally refer to two corresponding parts, which include not only the end points. The terms "mounted" and "connected" should be understood broadly: a connection may be fixed, detachable or integral; it may be mechanical or electrical; and it may be direct, indirect through an intervening medium, internal, or in any other relationship. Furthermore, as used in the present invention, the disposition of one element relative to another generally means only that there is a connection, coupling, fit or driving relationship between the two elements, which may be direct or indirect through intermediate elements. It is not to be understood as indicating or implying any spatial positional relationship between the two elements; that is, an element may be in any orientation inside, outside, above, below or to one side of another element, unless the content clearly indicates otherwise. The specific meanings of the above terms in the present invention can be understood by those skilled in the art according to the specific situation.

The following description refers to the accompanying drawings.

[ Example One ]

Referring to fig. 1 to fig. 3b, fig. 1 is a schematic diagram of an FPGA chip and an FPGA cluster system according to a first embodiment of the present invention; fig. 2 is a schematic flowchart of an FPGA clustering method according to a first embodiment of the present invention; fig. 3a is a logic diagram of switching operation states according to a first embodiment of the present invention; fig. 3b is a schematic diagram of another operating state switching logic according to the first embodiment of the present invention.

This embodiment provides an FPGA cluster system. Referring to fig. 1, the FPGA cluster system includes two or more FPGA nodes. Each FPGA node includes a first storage area and a second storage area, resource information of the second storage area is recorded in the first storage area, and the FPGA nodes include a master node and a plurality of child nodes.

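Based on the FPGA cluster system, this embodiment provides an FPGA clustering method. Referring to fig. 2, the FPGA clustering method includes the following steps:

step S1: when the resources of the second storage area of a first child node are insufficient, the first child node sends a resource allocation request to the master node, the master node allocates the second storage area of a second child node to the first child node according to the information in the first storage area of the master node, and the master node updates the content of its first storage area;

step S2: the first child node sends an access request to the second child node, the second child node establishes an access channel with the first child node, and the second child node updates the content of its first storage area;

step S3: the first child node sends a service completion signal to the master node, and the master node updates the content of its first storage area.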

With the configuration, each child node provides the second storage area of the child node as a public resource, so that the potential memory resources of a single child node are greatly increased; meanwhile, the first storage area is used as an assistant, and the resource information of the second storage area is recorded, so that the overall planning of the whole system on the idle resources of each sub-node is facilitated, and the resources can be adaptively and optimally configured. Furthermore, repeated memory registration is avoided, and the load of a host and a coprocessor (FPGA) is reduced; repeated reading, writing and barriers are avoided; the network load is reduced, the CPU load is reduced, and the actual throughput rate of the memory is improved.
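By way of illustration only, the flow of steps S1 to S3 can be sketched in Python. This is a minimal sketch under stated assumptions, not the claimed implementation: the names `Node`, `Master`, `Use` and `run_service` are hypothetical, the first storage area is modelled as a dictionary keyed by (owner, address), and the usage state is reduced to free / in use.

```python
from dataclasses import dataclass, field
from enum import Enum


class Use(Enum):
    # Simplified usage state (assumption: only free / in use).
    FREE = 0
    IN_USE = 1


@dataclass
class Node:
    name: str
    # First storage area, modelled as {(owner, address): state}.
    table: dict = field(default_factory=dict)


class Master(Node):
    def allocate(self, requester: str):
        """S1: pick a free region on another node and mark it in use."""
        for (owner, addr), state in self.table.items():
            if owner != requester and state is Use.FREE:
                self.table[(owner, addr)] = Use.IN_USE
                return owner, addr
        return None

    def complete(self, owner: str, addr: int):
        """S3: the service is finished, so the region is free again."""
        self.table[(owner, addr)] = Use.FREE


def run_service(master: Master, nodes: dict, requester: str):
    grant = master.allocate(requester)               # S1
    if grant is None:
        return None
    owner, addr = grant
    nodes[owner].table[(owner, addr)] = Use.IN_USE   # S2: lender updates its table
    # ... the requester would access the remote region here ...
    master.complete(owner, addr)                     # S3
    nodes[owner].table[(owner, addr)] = Use.FREE     # optional extension of S3
    return owner, addr
```

For example, with a lender `b = Node("B", {("B", 0x1000): Use.FREE})` and a master holding the same entry, `run_service(master, {"B": b}, "A")` returns `("B", 0x1000)` and leaves both tables marked free again.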

Preferably, step S2 further includes: the second child node sends the updated content of its first storage area to the master node, and the master node updates the content of its own first storage area accordingly. In this way, the master node always holds the latest view of the global resource information, which helps it allocate system resources more accurately.

Preferably, the step S3 further includes: the first sub-node sends a service completion signal to the second sub-node, and the second sub-node updates the content of the first storage area of the second sub-node, or the master node sends a service completion signal to the second sub-node, and the second sub-node updates the content of the first storage area of the second sub-node. The method further ensures that the global resource information obtained by the main node is up-to-date, and is beneficial to more accurately distributing the system resources by the main node.

Preferably, the access channel described in step S2 is established using RDMA technology. RDMA (Remote Direct Memory Access) is a technology that transfers data directly between the memory of one node and the memory of another node without the direct involvement of either host's operating system. This configuration increases bandwidth and reduces latency, which benefits cooperative work among a plurality of FPGA chips 1.

Preferably, the first storage area of the master node includes resource information of the second storage areas of all the FPGA nodes; the first storage area of the child node includes resource information of the second storage area of the child node. By the configuration, the main node can reserve all information of the whole situation, so that the allocation of resources can be better planned, and meanwhile, each node only reserves the information of the node, so that the storage space is saved.

In an embodiment, the storage modes of the main node and the child nodes may be different, and the main node may store the data in the first storage area in a coarse-grained summarized mode; the child node may store the data of the first storage area in a fine-grained, more detailed manner. When the main node performs resource planning, under most working conditions, the main node can still make correct judgment according to the data of the main node, and when the data of the main node is not enough to make judgment, a request for detailed information is sent to the corresponding sub-node, and the judgment is made after the data is obtained. So configured, can further save memory space.
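The coarse-grained and fine-grained storage modes described above can be illustrated with a small Python sketch. It rests on assumptions not stated in the text: the summary kept by the master is taken to be a count of free regions per child, the master chooses the lender from that summary alone, and it asks only the chosen child for detailed addresses. The class and method names are hypothetical.

```python
class Child:
    def __init__(self, name, regions):
        # Fine-grained view: one entry per region, True means free.
        self.name = name
        self.regions = dict(regions)

    def summary(self):
        """Coarse-grained digest sent to the master: number of free regions."""
        return sum(1 for free in self.regions.values() if free)

    def free_addresses(self):
        """Detailed answer, returned only when the master asks for it."""
        return sorted(a for a, free in self.regions.items() if free)


class Master:
    def __init__(self, children):
        self.children = {c.name: c for c in children}
        # Coarse-grained first storage area: {child name: free region count}.
        self.free_count = {c.name: c.summary() for c in children}

    def find_region(self, requester):
        # Decide from the summary alone which child can lend memory ...
        for name, count in self.free_count.items():
            if name != requester and count > 0:
                # ... and request details only from that one child.
                addr = self.children[name].free_addresses()[0]
                return name, addr
        return None
```

In this sketch the master stores one integer per child instead of one record per region, which is the storage saving the paragraph refers to.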

Preferably, in one of the FPGA nodes, the resource information includes an address corresponding to the second storage area and a use state of the second storage area. The address is a data format which can facilitate each child node to quickly locate the target second storage area, and is globally unique, and in an embodiment, the second storage area can be identified by using the IP address + the memory address of the child node. The use state can be summarized and divided in advance according to the actual service logic, and is identified by adopting an enumeration value. By the configuration, the child nodes needing resources can quickly locate the target memory, and the judgment of the main node is facilitated.
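A resource record of this kind might be represented as follows. This is a sketch only: the type names, the numeric values of the enumeration, and the assumption of a 32-bit IPv4 address combined with a memory address that fits in 64 bits are illustrative choices, not details given in the text.

```python
from dataclasses import dataclass
from enum import IntEnum
import ipaddress


class UseState(IntEnum):
    # Enumeration values are assumptions; the text only says an enum is used.
    FREE = 0            # first state
    READ = 1            # second state
    WRITTEN_READ = 2    # third state
    WRITTEN = 3         # fourth state


@dataclass(frozen=True)
class RegionId:
    node_ip: ipaddress.IPv4Address   # IP address of the owning child node
    mem_addr: int                    # memory address within that node

    def key(self) -> int:
        """Combine IP (32 bits) and memory address (assumed < 2**64) into one globally unique integer."""
        return (int(self.node_ip) << 64) | self.mem_addr


@dataclass
class ResourceRecord:
    region: RegionId
    state: UseState = UseState.FREE
```

Because the IP address is unique within the local area network, two regions on different nodes never share a key even when their local memory addresses coincide.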

In one embodiment, the usage state includes:

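a first state representing that the indicated second storage area is free;

a second state representing that the indicated second storage area is read by at least one of the FPGA nodes;

a third state representing that the indicated second storage area has been exclusively written by a node and is read by at least one node; and

a fourth state representing that the indicated second storage area has been exclusively written by a node.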

Referring to fig. 3a, when a second storage area is in the first state and a request arrives, the read/write request judgment unit 3 first determines whether the request is a read request or a write request. If it is a read request, the state is updated to the second state; otherwise, the state is updated to the fourth state. Similarly, when a second storage area is in the second state and a request arrives, the read/write request judgment unit 3 determines whether the request is a read request or a write request. If it is a read request, the area stays in the second state; if it is a write request, the state is updated to the fourth state.

Referring to fig. 3b, when a second storage area is in the third state and a request arrives, the read/write request judgment unit 3 determines whether the request is a read request or a write request. If it is a write request, the state is updated to the fourth state. If it is a read request, the read-request upper-limit judgment unit 4 determines whether the read requests have reached the upper limit; if so, the state is updated to the first state, otherwise the area stays in the third state. When a second storage area is in the fourth state and a request arrives, the read/write request judgment unit 3 determines whether the request is a read request or a write request. If it is a read request, the state is updated to the third state; otherwise, the area remains in the fourth state.
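The state switching logic of figs. 3a and 3b, as read from the description above, can be written as a small transition function. The sketch below is illustrative: the names `State` and `next_state` and the boolean flag `read_limit_reached` (standing in for the decision of unit 4) are assumptions.

```python
from enum import Enum


class State(Enum):
    FREE = 1          # first state
    READ = 2          # second state
    WRITTEN_READ = 3  # third state
    WRITTEN = 4       # fourth state


def next_state(state: State, is_read: bool, read_limit_reached: bool = False) -> State:
    """Return the new usage state after a read (is_read=True) or write request."""
    if state is State.FREE:
        # Fig. 3a: read -> second state, write -> fourth state.
        return State.READ if is_read else State.WRITTEN
    if state is State.READ:
        # Fig. 3a: read keeps the second state, write -> fourth state.
        return State.READ if is_read else State.WRITTEN
    if state is State.WRITTEN_READ:
        # Fig. 3b: write -> fourth state; read depends on the upper limit.
        if not is_read:
            return State.WRITTEN
        return State.FREE if read_limit_reached else State.WRITTEN_READ
    # Fourth state: read -> third state, write keeps the fourth state.
    return State.WRITTEN_READ if is_read else State.WRITTEN
```

For instance, a free region that receives a write and then a read moves from the first state to the fourth state and then to the third state.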

First, classifying all possible memory occupation conditions into four states minimizes the storage space required for the state enumeration value (only 2 bits are needed) while still expressing the current resource state clearly, which provides a basis for the master node's decisions. Second, the inventors define the transition logic of each state for each kind of request, which meets the needs of actual operation.
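Since four states fit in 2 bits, the states of four second storage areas can share one byte. The following sketch shows one possible packing; the function names and the little-end-first bit layout are assumptions made for illustration.

```python
def pack_states(states):
    """Pack a sequence of 2-bit state codes (0..3) into bytes, four per byte."""
    out = bytearray((len(states) + 3) // 4)
    for i, s in enumerate(states):
        if not 0 <= s <= 3:
            raise ValueError("state code must fit in 2 bits")
        # Region i occupies bits 2*(i % 4) and 2*(i % 4) + 1 of byte i // 4.
        out[i // 4] |= s << (2 * (i % 4))
    return bytes(out)


def get_state(packed, index):
    """Read back the 2-bit state code of the region at `index`."""
    return (packed[index // 4] >> (2 * (index % 4))) & 0b11
```

With this layout, a table describing 1024 regions needs 256 bytes of state information in addition to the addresses.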

In some embodiments, the size of the first storage area may be set in advance, and the size of the second storage area may also be set in advance. The design has the advantages that the algorithm can be simplified, and the system can be optimized by using the prior knowledge. In other embodiments, the first storage area may be automatically adjusted according to actual operating conditions, and the size of the second storage area may also be automatically adjusted according to actual operating conditions. The design has the advantages of strong adaptability, no need of priori knowledge and application to the working conditions with complicated and variable business logics.

The embodiment also provides an FPGA chip 1. Referring to fig. 1, the FPGA chip 1 includes a first logic module and a network card supporting RDMA technology, the first logic module is in communication connection with the network card to form an FPGA node, and the FPGA node is used in the FPGA clustering method and is in communication connection with other FPGA nodes in the FPGA cluster through the network card. The first logic module is used for implementing the logic and the flow in the method.

In an optional embodiment, the FPGA chip includes a memory module, and the memory module is divided into a first storage area and a second storage area. The first logic module directly communicates with the first storage area of the memory module to access and rewrite the content of the first storage area. The embodiment has the advantages of intuitive logic, less modification to the existing FPGA chip and low cost.

In another optional embodiment, the FPGA chip includes a memory module, and the memory module is divided into a first storage area and a second storage area. The FPGA chip comprises a Cache module used for buffering the content of the first storage area. The Cache module is in direct communication with the first storage area, and the first logic module interacts with the Cache module to read and write the first storage area and does not directly communicate with the first storage area. By the configuration, access delay is reduced, and the running speed of the whole cluster can be increased.
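The arrangement in which the first logic module talks only to the Cache module can be sketched as follows. The text does not specify a caching policy; the sketch assumes a simple write-through cache with no eviction, and the class names and the `reads` counter are hypothetical.

```python
class FirstStorageArea:
    """Stand-in for the resource table kept in the memory module."""

    def __init__(self):
        self.data = {}
        self.reads = 0   # counts accesses that actually reach the memory module

    def read(self, key):
        self.reads += 1
        return self.data.get(key)

    def write(self, key, value):
        self.data[key] = value


class CacheModule:
    """Write-through cache between the first logic module and the table (assumed policy)."""

    def __init__(self, backing):
        self.backing = backing
        self.lines = {}

    def read(self, key):
        # Only a miss goes to the first storage area.
        if key not in self.lines:
            self.lines[key] = self.backing.read(key)
        return self.lines[key]

    def write(self, key, value):
        # Update the cached copy and the first storage area together.
        self.lines[key] = value
        self.backing.write(key, value)
```

Repeated lookups of the same record are then served from the cache, which is where the reduction in access latency would come from.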

Optionally, in the FPGA cluster system, one of the FPGA chips is configured as a master node, and the remaining FPGA chips are configured as child nodes. The design of the main node and the sub-nodes is beneficial to overall planning, and the logic is clear.

In summary, in the FPGA clustering method, the FPGA chip, and the FPGA clustering system provided in this embodiment, each FPGA node includes a first storage area and a second storage area, and resource information of the second storage area is recorded in the first storage area. And each subnode is allowed to access idle resources of other subnodes, and a better self-adaptive distribution effect is achieved by reasonably designing resource distribution logic, so that the problem of integration of FPGA chip resources in a local area network is solved.

[ Example Two ]

Referring to fig. 4, fig. 4 is a schematic diagram of an FPGA chip according to a second embodiment of the present invention.

In this embodiment, the FPGA chip includes a second logic module and a memory module. The second logic module is composed of memory granules or flash memory and is configured as the first storage area, while the memory module is configured as the second storage area. Because the function of the first storage area is implemented by a dedicated logic module, the material and logic of that module can be optimized specifically for this purpose, which further improves the operating speed of the whole cluster and leaves more memory space for the second storage area.

[ Example Three ]

This embodiment provides an FPGA cluster with a tree structure, which is divided into a master node, N1 level-1 child nodes, N2 level-2 child nodes, ..., and Nk level-k child nodes. Each child node belongs to exactly one node at the previous level, but a child node does not necessarily have next-level child nodes belonging to it.

When the resources of a child node are insufficient, it sends a resource request to its node at the previous level. After receiving the request, that node judges whether any of the child nodes at the levels below it has idle resources. If so, the idle resource is fed back directly to the requesting node; the requesting node locates the target resource according to the feedback information and establishes an access channel, and all the nodes on the access path update the content of their first storage areas. If not, the request is forwarded to the next higher level, and so on until it reaches the master node.
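The level-by-level escalation in the tree-structured cluster can be sketched as below. The sketch is illustrative and simplified: `TreeNode` and `request_resource` are hypothetical names, each node's idle resources are reduced to a single boolean, and updating the first storage areas along the access path is represented only by clearing the lender's flag.

```python
class TreeNode:
    def __init__(self, name, free=False, parent=None):
        self.name = name
        self.free = free          # True if this node has an idle second storage area
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def descendants(self):
        """Yield every node below this one, at all lower levels."""
        for child in self.children:
            yield child
            yield from child.descendants()


def request_resource(requester):
    """Escalate one level at a time until an ancestor finds an idle descendant.

    Returns (lender, handling_ancestor), or None if even the master finds nothing.
    """
    ancestor = requester.parent
    while ancestor is not None:
        for node in ancestor.descendants():
            if node is not requester and node.free:
                node.free = False      # the lender records the region as in use
                return node, ancestor
        ancestor = ancestor.parent     # nothing idle below this node: go one level up
    return None
```

A request is thus satisfied by the nearest subtree that has idle resources, and it reaches the master node only when no lower-level node can serve it.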

The configuration is beneficial to the function division of the FPGA chip, the sharing of idle resources always occurs between the FPGA chips with closer functions, and the influence on the whole network is smaller. If necessary, the system can be split into independent clusters which do not interfere with each other.

[ Example Four ]

The embodiment provides an FPGA cluster, and child nodes in the cluster are connected through a P2P technology.

When the resources of a node are insufficient, it sends a resource request to the master node. If the master node does not respond for a long time, the requesting node broadcasts the resource request in the P2P network. After an idle node receives the request, it locks the request and sends the address of its second storage area to the requesting node. An access channel is then established between the two nodes, and both update their first storage areas at the same time.
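The fallback from the master node to a P2P broadcast can be sketched as follows. This is an illustration under assumptions: nodes are modelled as plain dictionaries, a master timeout is represented by an `alive` flag rather than a real timer, and the function names are hypothetical.

```python
def ask_master(master, requester):
    """Return the region granted by the master, or None if it does not answer."""
    if master is None or not master.get("alive", False):
        return None   # stands in for a timeout while waiting for the master
    return master["grants"].get(requester)


def broadcast(peers, requester):
    """P2P fallback: the first idle, unlocked peer locks itself and returns its address."""
    for name, peer in peers.items():
        if name != requester and peer["idle"] and not peer["locked"]:
            peer["locked"] = True   # prevents a second requester from taking the same region
            peer["idle"] = False
            return name, peer["addr"]
    return None


def acquire(master, peers, requester):
    """Try the master first; fall back to the P2P broadcast if it stays silent."""
    grant = ask_master(master, requester)
    return grant if grant is not None else broadcast(peers, requester)
```

In this sketch a failed master no longer blocks allocation, because the requesting node can still obtain a region directly from an idle peer.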

The configuration is beneficial to increasing the robustness of the network, and the paralysis of the whole cluster caused by the single point failure of the main node can be avoided.

It should be noted that, in the present specification, the embodiments are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments may be referred to each other. The above description is only for the purpose of describing the preferred embodiments of the present invention, and is not intended to limit the scope of the present invention, and any variations and modifications made by those skilled in the art based on the above disclosure are within the scope of the appended claims.

Claims

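1. An FPGA clustering method, characterized in that each FPGA node of the FPGA cluster comprises a first storage area and a second storage area, resource information of the second storage area is recorded in the first storage area, the FPGA nodes comprise a master node and a plurality of child nodes, and the FPGA clustering method comprises the following steps:

step S1: when the resources of the second storage area of a first child node are insufficient, the first child node sends a resource allocation request to the master node, the master node allocates the second storage area of a second child node to the first child node according to the information in the first storage area of the master node, and the master node updates the content of its first storage area;

step S2: the first child node sends an access request to the second child node, the second child node establishes an access channel with the first child node, and the second child node updates the content of its first storage area;

step S3: the first child node sends a service completion signal to the master node, and the master node updates the content of its first storage area.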

2. The FPGA clustering method of claim 1, wherein said step S2 further comprises: the second child node sends the updated content of its first storage area to the master node, and the master node updates the content of its own first storage area accordingly.

3. The FPGA clustering method of claim 1, wherein said step S3 further comprises: the first sub-node sends a service completion signal to the second sub-node, and the second sub-node updates the content of the first storage area of the second sub-node, or the master node sends a service completion signal to the second sub-node, and the second sub-node updates the content of the first storage area of the second sub-node.

4. The FPGA clustering method of claim 1 wherein the access channel of step S2 is established using RDMA techniques.

5. The FPGA clustering method according to claim 1, wherein the first storage area of the master node comprises resource information of the second storage areas of all the FPGA nodes; the first storage area of the child node includes resource information of the second storage area of the child node.

6. The FPGA clustering method of claim 1, wherein in one FPGA node, the resource information comprises an address corresponding to the second storage area and a use state of the second storage area.

7. The FPGA clustering method of claim 6, wherein the use state comprises:

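a first state representing that the indicated second storage area is free;

a second state representing that the indicated second storage area is read by at least one of the FPGA nodes;

a third state representing that the indicated second storage area has been exclusively written by a node and is read by at least one node; and

a fourth state representing that the indicated second storage area has been exclusively written by a node.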

8. The FPGA clustering method according to claim 1, wherein the size of the first storage area is preset or automatically adjusted according to actual operating conditions, and the size of the second storage area is preset or automatically adjusted according to actual operating conditions.

9. An FPGA chip comprising a first logic module and an RDMA capable network card, wherein said first logic module is communicatively connected to said network card to form an FPGA node, and wherein said FPGA node is configured to be communicatively connected to other FPGA nodes in an FPGA cluster via said network card according to the FPGA clustering method of any one of claims 1 to 8.

10. The FPGA chip of claim 9 comprising a memory module, said memory module being partitioned into a first storage region and a second storage region.

11. The FPGA chip of claim 10, comprising a Cache module configured to buffer contents of the first storage area.

12. The FPGA chip of claim 9, comprising a second logic module and a memory module, wherein the second logic module is composed of memory granules or flash, the second logic module is configured as a first storage area, and the memory module is configured as a second storage area.

13. An FPGA cluster system, characterized by comprising two or more FPGA chips according to any one of claims 9 to 12, wherein one of the FPGA chips is configured as a master node and the remaining FPGA chips are configured as child nodes.