CN217847021U - AI edge server system architecture with high performance computing power
- Publication number: CN217847021U
- Application number: CN202221565872.9U
- Authority: CN (China)
- Prior art keywords: module, accelerator card, computing node, node module, network interface
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The application discloses a high-performance computing AI edge server system architecture, which improves flexibility of use and meets the growing requirements on data real-time performance and security. The AI edge server system architecture in the present application includes a chassis, a CPU computing node module, an accelerator card computing node module, a routing board and a power supply module. The CPU computing node module includes a PCIe channel. A first accelerator-card hot-plug module is arranged on the accelerator card computing node module, and network-connected accelerator cards are inserted into the first accelerator-card hot-plug module; the first accelerator-card hot-plug module is switched and converged to a network interface of the accelerator card computing node module through a switching module. A second accelerator-card hot-plug module is also arranged on the accelerator card computing node module and is connected to the CPU computing node module through a cable; an accelerator card inserted in the second accelerator-card hot-plug module is converged to the network interface of the accelerator card computing node module through the PCIe channel and the switching module, and that network interface is connected with the network interface of the routing board.
Description
Technical Field
The present application relates to the field of computer technologies, and in particular, to a high-performance computing AI edge server system architecture.
Background
With the proliferation of internet-of-things terminal devices and the growing demand for real-time, secure data processing, edge computing has become crucial in many industry scenarios, such as road management and automatic driving in intelligent transportation, quality inspection and device monitoring in intelligent manufacturing, and disease monitoring and auxiliary diagnosis in intelligent medical care. Edge computing in China is still at an early stage of development; as data grow in volume and complexity, the limited number of accelerator cards in the edge computing servers widely used on the market today limits the computing power available at the edge.
At present, the limited accelerator-card capacity of an edge computing server is addressed by expanding a large number of accelerator cards to raise edge computing power, usually through Peripheral Component Interconnect Express (PCIe) expansion.
This existing expansion mode requires a complex backplane structure: because the PCIe bus rate is very high, many high-speed connectors and signal-reconstruction (retimer) or signal-amplification (redriver) chips are needed, making operation and maintenance difficult and changing requirements hard to meet.
SUMMARY OF THE UTILITY MODEL
To solve the above technical problems, the application provides a high-performance computing AI edge server system architecture. It addresses the difficult operation and maintenance, the limited number of expandable accelerator cards, and the limited computing power of existing standards-based servers, improves flexibility of use, and meets the ever-growing requirements on data real-time performance and security.
The application provides a high-performance computing AI edge server system architecture, comprising:
a chassis, a CPU (central processing unit) computing node module, an accelerator card computing node module, a routing board and a power supply module;
the power supply module is electrically connected with the CPU computing node module, the accelerator card computing node module and the routing board respectively;
the CPU computing node module is arranged on the lower layer of the case and comprises a PCIe channel;
the accelerator card computing node module is installed on the upper layer of the chassis. A first accelerator-card hot-plug module is arranged on the accelerator card computing node module, and network-connected accelerator cards are inserted into the first accelerator-card hot-plug module; a switching module is arranged on the accelerator card, and the first accelerator-card hot-plug module is switched and converged to a network interface of the accelerator card computing node module through the switching module. A second accelerator-card hot-plug module is further arranged on the accelerator card computing node module and is connected to the CPU computing node module through a cable; an accelerator card inserted in the second accelerator-card hot-plug module is converged to the network interface of the accelerator card computing node module through the PCIe channel and the switching module, and the network interface of the accelerator card computing node module is connected with the network interface of the routing board.
Optionally, a PCIe slot is disposed on the CPU computing node module, and the PCIe slot is used for inserting a GPU card.
Optionally, the CPU computing node module further includes a hard disk module for providing a storage function.
Optionally, a fan is further disposed on the CPU computing node module and configured to dissipate heat from the CPU computing node module.
Optionally, the CPU computing node module is provided with a memory for storing data.
Optionally, a midplane is installed on the accelerator card computing node module;
and the network interface of the accelerator card computing node module is connected with the midplane through a connector, and the network interface on the midplane is connected to the network interface of the routing board via a network cable.
Optionally, an OCP3.0 module is disposed on the CPU compute node module.
Optionally, the second accelerator-card hot-plug module includes 18 PCIe-capable accelerator card modules.
Optionally, the chassis is a 4U double-layer chassis.
Optionally, the chassis further includes a chassis upper cover;
the upper cover of the case is arranged at the opening of the case and used for sealing the case.
According to the technical scheme, the embodiment of the application has the following advantages:
in the application, the CPU computing node module includes a PCIe channel. A first accelerator-card hot-plug module is arranged on the accelerator card computing node module, network-connected accelerator cards are inserted into it, a switching module is arranged on the accelerator card, and the first accelerator-card hot-plug module is converged to a network interface of the accelerator card computing node module through the switching module. A second accelerator-card hot-plug module is further arranged on the accelerator card computing node module and connected to the CPU computing node module through a cable; an accelerator card inserted in it is converged to the same network interface through the PCIe channel and the switching module, and that network interface is connected with a network interface of the routing board. The accelerator cards are thus switched by the switching modules, connected over the network, and converged at the network interfaces of the routing board, which can provide two IP addresses for terminal connections. Each accelerator card works independently and can provide multiple signal channels, so AI computing capacity is greatly improved, flexibility of use is increased, operation and maintenance are simple, and the ever-growing requirements on data real-time performance and security are met.
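Purely for orientation, the relationship between the modules described above can be sketched in a few lines of Python. Everything here (class names, attributes, the helper method) is an illustrative assumption, not part of the utility model; the 9-cards-per-switch and 12-interface figures come from the detailed description below.

```python
# Illustrative sketch only: class and attribute names are hypothetical,
# not part of the patented design.
from dataclasses import dataclass, field
from typing import ClassVar, List

@dataclass
class AcceleratorCard:
    card_id: int
    attach: str  # "network" (first hot-plug module) or "pcie" (second, cabled to the CPU node)

@dataclass
class SwitchModule:
    # Per the detailed description, one switching module converges up to 9 cards.
    MAX_CARDS: ClassVar[int] = 9
    cards: List[AcceleratorCard] = field(default_factory=list)

    def attach_card(self, card: AcceleratorCard) -> bool:
        if len(self.cards) < self.MAX_CARDS:
            self.cards.append(card)
            return True
        return False  # switch full: the next card goes to another switch module

@dataclass
class RoutingBoard:
    # All switch modules converge here; the board exposes two IPs to terminals.
    network_interfaces: int = 12
    external_ips: int = 2
```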
Drawings
To illustrate the technical solutions in the present application more clearly, the drawings needed for the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application; those skilled in the art may derive other drawings from them without creative effort.
FIG. 1 is a schematic diagram of a high performance computing AI edge server system architecture of the present application;
FIG. 2 is a schematic top view of a CPU compute node module according to the present application;
FIG. 3 is a schematic top view of an acceleration card compute node module of the present application;
FIG. 4 is a schematic top view of an accelerator card of the present application;
FIG. 5 is a schematic diagram of the power supply lines of the high-performance computing AI edge server system architecture of the present application.
Detailed Description
In the present application, the terms "upper", "lower", "left", "right", "front", "rear", "top", "bottom", "inner", "outer", "middle", "vertical", "horizontal", "lateral", "longitudinal", and the like indicate orientations or positional relationships based on the orientations or positional relationships shown in the drawings, and are used only for explaining relative positional relationships between the respective members or components, and do not particularly limit specific mounting orientations of the respective members or components.
Moreover, some of the above terms may be used to indicate other meanings besides the orientation or positional relationship, for example, the term "on" may also be used to indicate some kind of attachment or connection relationship in some cases. The specific meaning of these terms in this application will be understood by those of ordinary skill in the art as the case may be.
Furthermore, the terms "mounted," "disposed," "provided," "connected," and "connected" are to be construed broadly. For example, it may be a fixed connection, a removable connection, or a unitary construction; can be a mechanical connection, or an electrical connection; may be directly connected, or indirectly connected through intervening media, or may be in internal communication between two devices, elements or components. The specific meaning of the above terms in the present application can be understood by those of ordinary skill in the art as the case may be.
In addition, the structures, proportions and sizes shown in the drawings of the present application are used only for understanding and reading the disclosure, not for limiting the conditions under which the present application can be implemented, and so have no substantive technical significance; any modification of structure, change of proportion, or adjustment of size that does not affect the efficacy or achievable purpose of the application still falls within the scope of the technical content disclosed herein.
The embodiment of the application provides a high-performance computing AI edge server system architecture. It addresses the difficult operation and maintenance, the limited number of expandable accelerator cards, and the limited computing power of existing standards-based servers, improves flexibility of use, and meets the ever-growing requirements on data real-time performance and security.
Typically, edge computing is tied to the internet of things. Internet-of-things devices participate in increasingly demanding processing, so the large amount of data they generate is best handled at the "edge" of the network rather than continuously transferred back and forth to centralized servers. Edge computing is therefore more efficient, has lower latency and faster processing, and scales well when managing large volumes of internet-of-things data. Edge computing in China is still at an early stage of development. As data grow in volume and complexity, the limited number of accelerator cards in the edge computing servers widely used on the market today limits edge computing power. Previously, expanding a large number of accelerator cards through PCIe required a complex backplane structure; the PCIe bus rate is very high, reaching 2.5 Gbps to 16 Gbps per lane, so many high-speed connectors and signal-reconstruction or signal-amplification chips were needed, making operation and maintenance difficult and changing requirements hard to meet. The high-performance computing AI edge server system architecture of the present application effectively solves the above problems.
referring to fig. 1, fig. 1 is a schematic structural diagram of an architecture of a high-performance computing AI edge server system according to the present application, including:
the system comprises a case, a CPU computing node module 1, an accelerator card computing node module 2, a routing board and a power module 11;
the power supply module 11 is respectively electrically connected with the CPU computing node module 1, the accelerator card computing node module 2 and the routing board;
the CPU computing node module 1 is arranged on the lower layer of the chassis, and the CPU computing node module 1 comprises a PCIe channel;
the accelerator card computing node module 2 is installed on the upper layer of the chassis. A first accelerator-card hot-plug module 21 is arranged on the accelerator card computing node module 2, and network-connected accelerator cards are inserted into the first accelerator-card hot-plug module 21; a switching module 24 is arranged on the accelerator card, and the first accelerator-card hot-plug module 21 is switched and converged to a network interface of the accelerator card computing node module 2 through the switching module 24. A second accelerator-card hot-plug module is further arranged on the accelerator card computing node module 2 and connected to the CPU computing node module 1 through a cable; an accelerator card inserted in the second accelerator-card hot-plug module is converged to the network interface of the accelerator card computing node module 2 through a PCIe channel and the switching module 24, and the network interface of the accelerator card computing node module 2 is connected with a network interface of the routing board.
The PCIe standard was developed as network devices demanded ever more bandwidth, flexibility and performance. PCIe (Peripheral Component Interconnect Express) is a high-speed serial computer expansion bus standard. PCIe uses high-speed serial point-to-point dual-simplex high-bandwidth transmission: connected devices are allocated dedicated link bandwidth rather than sharing a bus, and the standard supports active power management, error reporting, end-to-end reliable transmission, hot plugging and quality of service, with a high data transmission rate. A channel (lane) is a path or interface carrying an external signal; one PCIe lane carries one serial signal stream. By analogy with multi-point measurement of force, temperature or humidity, the signal collected on each channel is passed in turn to a signal-conditioning circuit, A/D-converted, and then delivered to the microprocessor. PCIe link widths are typically x1, x4, x8 or x16 lanes.
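To make the bus-rate figures above concrete, the helper below (our own illustration, not part of the utility model) estimates usable one-way link bandwidth from the public PCI-SIG per-lane line rates and encodings.

```python
# Approximate usable PCIe bandwidth; line rates and encodings are public
# PCI-SIG figures, the helper itself is only an illustration.
PCIE_GT_PER_S = {1: 2.5, 2: 5.0, 3: 8.0, 4: 16.0, 5: 32.0}

def pcie_bandwidth_gb_s(gen: int, lanes: int) -> float:
    """One-way bandwidth in GB/s for a PCIe link of the given generation and width."""
    encoding = 8 / 10 if gen <= 2 else 128 / 130  # 8b/10b vs 128b/130b encoding
    return PCIE_GT_PER_S[gen] * encoding * lanes / 8  # gigabits -> gigabytes

print(f"{pcie_bandwidth_gb_s(4, 16):.1f} GB/s")  # Gen4 x16: ~31.5 GB/s each way
print(f"{pcie_bandwidth_gb_s(5, 16):.1f} GB/s")  # Gen5 x16: ~63.0 GB/s each way
```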
Referring to FIG. 2, FIG. 3 and FIG. 4, the power module 11 is connected to the motherboard connector in a hot-plug manner to supply power to the motherboard. Hot plug (hot swap) means inserting or removing modules and boards without shutting off system power and without affecting normal system operation, which improves reliability, speeds maintenance, provides redundancy, and allows timely recovery from failures. The motherboard is provided with PCIe slots 13 for GPU cards; typically 3 PCIe 5.0 slots are arranged, so 3 A100 GPU cards can be added, optimizing the utilization of computing resources. The CPU computing node module 1 further includes a hard disk module 12, consisting of twelve 3.5-inch hard disk modules; the backplane of the hard disk module 12 is connected through a PCIe channel to a SlimSAS interface cable on the motherboard, providing storage and improving the operating performance and flexibility of the computer. A fan 15 is also arranged on the CPU computing node module to dissipate heat and limit the loss of CPU computing performance at high temperature. The CPU computing node module is provided with memory 14, namely 32 DDR5 memory modules for storing data. The CPU computing node module 1 further comprises a motherboard, an OCP 3.0 module, two Sapphire Rapids series processors, a routing board and a group of 1+1 hot-plug power modules 11; the two power modules each carry 50% of the load, and if one power module 11 fails, the other bears the full load, preventing a shutdown caused by a single power module failure. The accelerator card computing node module 2 includes two groups of 25 first accelerator-card hot-plug modules at the front and two groups of 16 first accelerator-card hot-plug modules at the rear, plus a second accelerator-card hot-plug module carrying 18 PCIe-attached accelerator cards, for a total of 100 accelerator cards; the switching module 24 is arranged on the accelerator card, and the module further includes a group of 1+1 hot-plug power modules 11 and cooling fans. Optionally, a midplane 23 is also installed on the accelerator card computing node module 2; the network interface of the accelerator card computing node module 2 is connected to the midplane 23 through a connector, and the network interface on the midplane 23 is connected by network cable to a network interface of the routing board.
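The 1+1 load-sharing behaviour described above can be made concrete with a small sketch; the function is hypothetical and only mirrors the stated behaviour (50% on each unit in normal operation, 100% on the survivor after a failure).

```python
from typing import List

def psu_loads(total_load_w: float, psu_ok: List[bool]) -> List[float]:
    """Split the chassis load evenly across healthy power modules (illustrative)."""
    healthy = psu_ok.count(True)
    if healthy == 0:
        raise RuntimeError("no healthy power module: system would lose power")
    return [total_load_w / healthy if ok else 0.0 for ok in psu_ok]

print(psu_loads(1600.0, [True, True]))   # normal 1+1: each 2000 W unit carries 800 W
print(psu_loads(1600.0, [True, False]))  # failover: the surviving unit carries 1600 W
```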
Referring to FIG. 5, the power supply scheme may be as follows. The 16-card and 25-card groups of first accelerator-card hot-plug modules are directly connected through the network. The 18 second accelerator-card hot-plug modules use PCIe channels on the motherboard of the CPU computing node module, each channel having a different clock sequence number. The 18 second accelerator-card hot-plug modules of the accelerator card computing node module 2 draw their power from the power supply of the CPU computing node module 1: the 2000 W 1+1 redundant hot-plug power supply of the CPU computing node module powers the motherboard and, through cables, the 18 second accelerator-card hot-plug modules and the fans; the 16-card and 25-card groups of first accelerator-card hot-plug modules of the accelerator card computing node module 2 are likewise powered directly by the same 2000 W 1+1 hot-plug power supply.
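The power routing just described amounts to a small tree rooted at the CPU node's 2000 W 1+1 supply; the mapping below is a sketch of that scheme with identifiers of our own choosing.

```python
# Sketch of the described power routing; all identifiers are illustrative.
POWER_TREE = {
    "cpu_node_2000w_1plus1_psu": [
        "cpu_node_motherboard",
        "second_hot_plug_modules_x18",  # fed through cables from the CPU node supply
        "cpu_node_fans",
        "first_hot_plug_group_x16",     # fed directly by the same redundant supply
        "first_hot_plug_group_x25",
    ],
}

for source, consumers in POWER_TREE.items():
    print(source, "->", ", ".join(consumers))
```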
The system comprises accelerator card nodes. The 25-card and 16-card groups of first accelerator-card hot-plug modules of the accelerator card node are directly interconnected through the network. One switching module can switch at most 9 accelerator cards, so the two groups of 25 and two groups of 16 first accelerator-card hot-plug modules are switched and converged into 10 network interfaces through 10 switching modules arranged on the back of the accelerator cards; these are connected to a first midplane through a connector, and the 10 network interfaces on the first midplane are connected by network cables to 10 network interfaces of the routing board. The 18 second accelerator-card hot-plug modules of the upper accelerator card node are connected through MCIO interfaces to the MCIO interfaces on the motherboard of the lower-layer CPU computing node and carry PCIe signals; these are then switched into 2 network interfaces through 2 switching modules and connected to a second midplane through a connector, and the 2 network interfaces on the second midplane are connected to 2 network interfaces of the routing board. Finally, all accelerator cards converge at the routing board through 12 network interfaces in total, and the routing board externally provides 2 IP addresses for connection with terminals. Each accelerator card works independently, and up to 2000 signal channels can be provided, which greatly improves AI computing capacity, increases flexibility of use, keeps operation and maintenance simple, and meets the ever-growing requirements on data real-time performance and security.
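The aggregation arithmetic in this paragraph checks out, as the short computation below shows (a sketch using only the counts stated above).

```python
import math

first_cards = 2 * 25 + 2 * 16                    # 82 network-attached cards (two groups of each size)
second_cards = 18                                # PCIe-attached via MCIO to the CPU node
switches_for_first = math.ceil(first_cards / 9)  # at most 9 cards per switching module -> 10
routing_board_ports = 10 + 2                     # 10 from the first midplane + 2 from the second

print(first_cards + second_cards)                # 100 accelerator cards in total
print(switches_for_first, routing_board_ports)   # 10 switching modules, 12 network interfaces
```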
Optionally, the chassis is a 4U double-layer chassis. In the server field, "U" (short for unit) denotes the thickness of a rack server; the dimensions are standardized by the Electronic Industries Alliance, an American industry group. One U is 4.445 cm (1.75 inches), so 4U is 17.78 cm. In this embodiment, the size of the chassis may be adjusted according to the actual equipment to be installed.
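For completeness, the rack-unit arithmetic in a one-liner (1U = 1.75 inches = 4.445 cm per the EIA rack standard):

```python
U_CM = 4.445  # 1U = 1.75 inches, per the EIA rack standard
print(f"4U chassis height: {4 * U_CM:.2f} cm")  # ~17.78 cm
```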
Optionally, the chassis further comprises a chassis upper cover;
the upper cover of the case is arranged at the opening of the case and used for sealing the case. The situation that dust enters the machine box and causes damage to components in the machine box is reduced.
It should be noted that the above-mentioned disclosure and the detailed description are intended to demonstrate the practical application of the technical solutions provided in the present application, and should not be construed as limiting the scope of the present application. Various modifications, equivalent substitutions, or improvements within the spirit and scope of the application may occur to those skilled in the art. The protection scope of this application is subject to the appended claims.
Claims (10)
1. A high-performance computing AI edge server system architecture, comprising:
a chassis, a CPU computing node module, an accelerator card computing node module, a routing board and a power supply module;
the power supply module is electrically connected with the CPU computing node module, the accelerator card computing node module and the routing board respectively;
the CPU computing node module is arranged on the lower layer of the chassis and comprises a PCIe channel;
the utility model discloses a network interface of accelerator card computing node module, including the host computer, the host computer is provided with accelerator card computing node module, accelerator card hot plug module, switching module, CPU computing node module, PCIe passageway assembles in accelerator card computing node module's network interface, accelerator card computing node module installs the quick-witted case upper strata, be provided with first accelerator card hot plug module on the accelerator card computing node module, the accelerator card hot plug module of placeeing on the first accelerator card hot plug module connects with the network interface of accelerator card computing node module, be provided with second accelerator card hot plug module on the accelerator card computing node module, second accelerator card hot plug module passes through the cable connection, the accelerator card that placeeing on the second accelerator card hot plug module install the PCIe passageway and assemble in through switching module with the network interface of accelerator card computing node module, the network interface of accelerator card computing node module with the network interface connection of routing board.
2. The AI edge server system architecture of claim 1, wherein the CPU computing node module is provided with a PCIe slot for receiving a GPU card.
3. The AI edge server system architecture of claim 1, wherein the CPU computing node module further includes a hard disk module thereon for providing a storage function.
4. The AI edge server system architecture of claim 1, wherein the CPU computing node module is further provided with a fan for dissipating heat from the CPU computing node module.
5. The AI edge server system architecture of claim 1, wherein the CPU computing node module has a memory disposed thereon for storing data.
6. The AI edge server system architecture of claim 1, wherein a midplane is installed on the accelerator card computing node module;
and the network interface of the accelerator card computing node module is connected with the midplane through a connector, and the network interface on the midplane is connected to the network interface of the routing board via a network cable.
7. The AI edge server system architecture of claim 1, wherein an OCP 3.0 module is provided on the CPU computing node module.
8. The AI edge server system architecture of claim 1, wherein the second accelerator-card hot-plug module includes 18 PCIe-capable accelerator card modules.
9. The AI edge server system architecture of any of claims 1-8, wherein the chassis is a 4U double-layer chassis.
10. The AI edge server system architecture of claim 9, wherein the chassis further includes a chassis top cover;
the upper cover of the chassis is arranged at the opening of the chassis and is used for closing the chassis.
Priority Applications (1)
- CN202221565872.9U, priority and filing date 2022-06-22: AI edge server system architecture with high performance computing power
Publications (1)
- CN217847021U, published 2022-11-18
Family
- ID: 84025416
- CN202221565872.9U, filed 2022-06-22, granted as CN217847021U (CN), status: Active
Legal Events
- GR01: Patent grant