CN113741642B - High-density GPU server - Google Patents

Info

Publication number
CN113741642B
Authority
CN
China
Prior art keywords
gpu
density
pcie switch
card
backboard
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110852692.2A
Other languages
Chinese (zh)
Other versions
CN113741642A (en)
Inventor
王安
孔祥涛
刘圣金
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN202110852692.2A
Publication of CN113741642A
Application granted
Publication of CN113741642B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00 Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/16 Constructional details or arrangements
    • G06F1/18 Packaging or power distribution
    • G06F1/183 Internal mounting support structures, e.g. for printed circuit boards, internal connecting means
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00 Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/16 Constructional details or arrangements
    • G06F1/20 Cooling means
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Power Engineering (AREA)
  • Cooling Or The Like Of Electrical Apparatus (AREA)

Abstract

The invention provides a high-density GPU server comprising a server chassis housing in which a heat dissipation module, a GPU module, a middle backboard and a storage network module are arranged in sequence from front to back. The GPU module comprises a transversely arranged GPU board carrying a plurality of PCIE SWITCH chips; each PCIE SWITCH chip is connected with a first high-density connector and a plurality of GPU interfaces, and each GPU interface is connected with a GPU card. The middle backboard is vertically arranged; its front surface is provided with a second high-density connector and its back surface with a third high-density connector, the second high-density connector is connected to the third high-density connector, and each PCIE SWITCH chip is plugged into the second high-density connector through the first high-density connector. The storage network module is plugged into the third high-density connector. The invention increases the deployment density of GPUs in the server chassis housing and improves GPU performance.

Description

High-density GPU server
Technical Field
The invention belongs to the technical field of high-density server design, and particularly relates to a high-density GPU server.
Background
With the emergence of technologies such as artificial intelligence, big data and cloud computing, industry demand for servers keeps growing, as does demand for large-scale AI service deployment. Requirements on GPU computing power and density are rising accordingly, and the core count and computing power of GPUs are increasing along with them. For a given GPU type, designing a higher-density GPU server within an existing form factor is a challenge, because the computing power a server can provide depends on its GPU density.
Current AI server products can be roughly divided into two forms. The first builds the system around an onboard GPU board; in the usual configuration one onboard GPU board carries 8 GPUs, so a 4U server holds 8 GPUs. The second uses industry-standard PCIE GPU cards, and a 4U server can likewise hold 8 GPU cards. Mainstream servers therefore support at most 8 GPU cards in 4U of space, so the GPU density is not high enough. Moreover, for standard cards, such products do not support the GPU direct storage function or the GPU direct RDMA function, so the topology design cannot bring the GPU's performance into full play.
This is a deficiency of the prior art, and therefore, it is highly desirable to provide a high-density GPU server that addresses the above-described deficiencies in the prior art.
Disclosure of Invention
Aiming at the defects that the density of the GPU in the conventional AI server is not high enough and the topology of the GPU cannot fully exert the advantages of the GPU, the invention provides a high-density GPU server to solve the technical problems.
In a first aspect, the invention provides a high-density GPU server, comprising a server chassis housing, wherein a heat dissipation module, a GPU module, a middle backboard and a storage network module are sequentially arranged in the server chassis housing from the front side to the rear side;
the GPU module comprises a GPU board, the GPU board is transversely arranged, a plurality of PCIE SWITCH chips are arranged on the GPU board, each PCIE SWITCH chip is connected with a first high-density connector and a plurality of GPU interfaces, and each GPU interface is connected with a GPU card;
the middle backboard is vertically arranged, and its board surface is parallel to the front side of the server chassis housing; the front surface of the middle backboard is provided with a second high-density connector, the back surface of the middle backboard is provided with a third high-density connector, the second high-density connector is connected to the third high-density connector, and the PCIE SWITCH chip is plugged into the second high-density connector through the first high-density connector;
the storage network module is plugged into the third high-density connector.
Further, the GPU card includes a long side, a short side and a thickness side;
a golden finger is arranged on the side surface formed by the short side and the thickness side, and the golden finger is plugged into the GPU interface;
the side surface formed by the long side and the thickness side is provided with ventilation and heat dissipation holes. The GPU card is a double-width half-length GPU card, that is, its width is twice that of a traditional GPU card and its length is half that of a traditional GPU card. The GPU card and the GPU board are interconnected through the golden finger, which is moved from the bottom edge to the side edge relative to a traditional card. Compared with the traditional golden finger design, this design can also meet the GPU's additional power supply requirement, so the GPU card needs no separate power cables and the deployment density of GPUs on the GPU board is improved. The ventilation and heat dissipation holes of the GPU card are relocated to the side surface formed by the long side and the thickness side, which increases the open area and improves heat dissipation.
Further, the GPU board comprises a first side, a second side, a third side and a fourth side;
ten PCIE SWITCH chips are arranged on the GPU board, the ten PCIE SWITCH chips are distributed in two rows, and the two rows of PCIE SWITCH chips are distributed along the direction from the fourth side edge to the second side edge;
five PCIE SWITCH chips of each row are distributed along the direction from the first side edge to the third side edge;
each PCIE SWITCH chip is connected with two GPU interfaces, and each PCIE SWITCH chip is connected with a first high-density connector;
two GPU interfaces connected with one row of PCIE SWITCH chips close to the second side are arranged at the first side;
two GPU interfaces connected with one row of PCIE SWITCH chips close to the fourth side are arranged between the two rows of PCIE SWITCH chips;
ten first high-density connectors are provided at the edge of the fourth side. Each PCIE SWITCH chip is connected with each of its two GPU interfaces through a PCIE X16 signal line, two PCIE X16 signal lines in total. The invention supports a topology design in which GPU cards, IB cards and SSDs are in a 1:1:1 ratio.
Further, the GPU card is also provided with an NVLink interface, and the NVLink interfaces of adjacent GPU cards are interconnected. The NVLink interface enables high-speed interconnection of adjacent GPU cards on the GPU board, where adjacent GPU cards are the GPU cards connected to the same PCIE SWITCH chip.
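The adjacency rule above (GPU cards are adjacent when they share a PCIE SWITCH chip) can be sketched in a few lines of Python. This is only an illustration: the function name `nvlink_pairs` is hypothetical, and the 1-based numbering in which switch k carries GPU cards 2k-1 and 2k is an assumption borrowed from the interface numbering used in the embodiments.

```python
def nvlink_pairs(num_switches=10):
    """Return (switch, gpu_a, gpu_b) tuples, all 1-based.

    Assumption: switch k carries GPU cards 2k-1 and 2k, so the two
    cards on the same switch form one NVLink-connected pair.
    """
    pairs = []
    for sw in range(1, num_switches + 1):
        gpu_a = 2 * sw - 1
        gpu_b = 2 * sw
        pairs.append((sw, gpu_a, gpu_b))
    return pairs


PAIRS = nvlink_pairs()
```

With ten switch chips this yields ten pairs covering all twenty GPU cards, each card appearing in exactly one pair.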
Further, two board card power supply ports are also arranged on the GPU board, and the two board card power supply ports are respectively arranged on two sides of the first high-density connector;
two groups of PSU connectors are arranged on two sides of the back surface of the middle backboard, and PSU modules are respectively arranged on two sides of the storage network module;
the board card power supply port is connected with the PSU module through the corresponding group of PSU connectors. The middle backboard thus not only interconnects the high-speed signals in the system but also distributes power among the devices.
Further, each group of PSU connectors on the middle backboard comprises two PSU connectors arranged vertically;
the middle backboard is also provided with a heat dissipation opening. Each PSU connector connects to a 54V PSU module.
Further, the heat dissipation module comprises heat dissipation fans arranged in rows and columns. There are ten heat dissipation fans, arranged in two rows of five. The fans are 8086-type fans, which meet the heat dissipation requirement of the system; the upper and lower rows of fans meet the heat dissipation requirements of the upper-layer and lower-layer equipment respectively.
Further, the storage network module comprises a storage unit and a network card unit, the storage unit is arranged at the upper part of the network card unit, the lower part of the network card unit is provided with a main board, and the main board is provided with a CPU;
the storage unit, the network card unit and the CPU are plugged into the third high-density connector.
Further, the storage unit comprises a storage backboard and a plurality of SSDs, wherein the SSDs are inserted into the storage backboard and are longitudinally arranged to form an SSD storage column;
the network card unit comprises a network backboard and a plurality of IB cards; the IB cards are inserted into the network backboard and are longitudinally arranged to form an IB card column;
the storage backboard, the network backboard and the main board are plugged into the back surface of the middle backboard through third high-density connectors, and are parallel to one another and perpendicular to the middle backboard;
the IB card column is arranged above the SSD storage column, and the OCP network card and the CPU are arranged below the IB cards. The SSDs are E1.L SSDs. The storage backboard and the network backboard are PCIE riser cards. Each PCIE SWITCH chip is connected with two IB cards through two PCIE X16 signal lines via the first, second and third high-density connectors; each PCIE SWITCH chip is connected with the CPU through a PCIE X16 signal line via the first, second and third high-density connectors; and each PCIE SWITCH chip is connected with two SSDs through a PCIE X8 signal line via the first, second and third high-density connectors. The IB cards support a 200G rate and are paired with E1.L SSDs, so the invention supports GPU direct storage and GPU direct RDMA. E1.L SSDs replace the traditional NVME hard disk design, which saves design space and improves storage density.
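To make the per-switch wiring easier to check, the following Python sketch tallies the PCIE lanes leaving one PCIE SWITCH chip and the resulting device counts for the whole system. It is a rough illustration under stated assumptions: the `Link` class and all names are hypothetical, the GPU entry uses the two PCIE X16 GPU links and the ten switch chips described earlier, and the SSD entry reads the text as a single X8 link serving two SSDs, which is one possible interpretation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    target: str  # device type at the far end of the link
    width: int   # lane count of one link (X16 -> 16, X8 -> 8)
    count: int   # number of such links per switch chip


# Links of one PCIE SWITCH chip as read from the description.
# Assumption: the SSD connection is a single X8 link for two SSDs.
LINKS_PER_SWITCH = (
    Link("GPU card", 16, 2),
    Link("IB card", 16, 2),
    Link("CPU", 16, 1),
    Link("SSD", 8, 1),
)

NUM_SWITCHES = 10
ENDPOINTS_PER_SWITCH = {"GPU card": 2, "IB card": 2, "SSD": 2}


def lanes_per_switch(links=LINKS_PER_SWITCH):
    """Sum the lanes of every link attached to one switch chip."""
    return sum(link.width * link.count for link in links)


def endpoint_totals(num_switches=NUM_SWITCHES, per_switch=ENDPOINTS_PER_SWITCH):
    """Scale the per-switch endpoint counts to the whole GPU board."""
    return {name: n * num_switches for name, n in per_switch.items()}


LANES = lanes_per_switch()
TOTALS = endpoint_totals()
```

Under this reading each switch chip needs 88 lanes, and the system totals come out at 20 GPU cards, 20 IB cards and 20 SSDs, which matches the 1:1:1 ratio.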
Further, there are two CPUs, the main board is further provided with an OCP network card, and both CPUs are connected with the OCP network card.
Further, the server chassis housing is 4U high. A 4U AI server can support up to 20 full-height half-length double-width GPU cards, and the GPU cards can be interconnected at high speed through the NVLink interface.
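The density gain over the mainstream configuration described in the Background (8 GPU cards in 4U) is simple arithmetic. The short Python sketch below works it out; the function name `gpus_per_rack_unit` is illustrative only.

```python
def gpus_per_rack_unit(gpu_count, height_u):
    """GPU cards per rack unit of chassis height."""
    return gpu_count / height_u


BASELINE = gpus_per_rack_unit(8, 4)    # mainstream 4U server from the Background
PROPOSED = gpus_per_rack_unit(20, 4)   # 4U server described here
GAIN = PROPOSED / BASELINE
```

This gives 2 GPU cards per U for the mainstream configuration and 5 per U for the proposed one, a 2.5-fold increase in the same chassis height.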
The invention has the following advantages:
the high-density GPU server provided by the invention changes the original GPU card structure, which increases the deployment density of GPUs in the server chassis housing and improves GPU performance.
The invention supports up to 20 full-height half-length double-width GPU cards in a 4U AI server, with the GPU cards interconnected at high speed through NVLink interfaces; it supports a topology design in which GPU cards, IB cards and SSDs are in a 1:1:1 ratio; it replaces NVME hard disks with E1.L SSDs, saving storage design space and improving storage density; and it enhances GPU heat dissipation while increasing GPU deployment density.
In addition, the invention has reliable design principle, simple structure and very wide application prospect.
It can be seen that the present invention has outstanding substantial features and significant advances over the prior art, as well as the benefits of its implementation.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required to be used in the description of the embodiments or the prior art will be briefly described below, and it will be obvious to those skilled in the art that other drawings can be obtained from these drawings without inventive effort.
FIG. 1 is a schematic diagram of the overall structure within the server chassis housing of the high-density GPU server of the present invention.
Fig. 2 is a schematic structural diagram of a GPU board of the high density GPU server of the present invention.
Fig. 3 is a schematic structural diagram of a GPU card of the high density GPU server of the present invention.
Fig. 4 is a schematic diagram of the connection of a GPU card of the high-density GPU server to a GPU board according to the present invention.
Fig. 5 is a schematic view of a GPU card heat dissipation structure of a high density GPU server according to the present invention.
Fig. 6 is a schematic diagram of the structure of the back of the middle backboard of the high-density GPU server of the present invention.
Fig. 7 is a schematic diagram of a heat dissipation module of a high-density GPU server according to the present invention.
Fig. 8 is a schematic rear side view of a server chassis housing of the high-density GPU server of the present invention.
Fig. 9 is a schematic diagram of the overall circuit connection of the high-density GPU server of the present invention.
In the figures: 1-server chassis housing; 1.1-front side; 1.2-rear side; 2-heat dissipation module; 3-GPU module; 4-middle backboard; 5-storage network module; 6-GPU board; 6.1-first side; 6.2-second side; 6.3-third side; 6.4-fourth side; 7-first high-density connector; 8-third high-density connector; 9-GPU card; 9.1-long side; 9.2-short side; 9.3-thickness side; 10-golden finger; 11-ventilation and heat dissipation holes; 12-board card power supply port; 13-PSU connector; 14-PSU module; 15-heat dissipation fan; 16-main board; 17-CPU; 17.1-first CPU; 17.2-second CPU; 18-SSD; 19-IB card; 20-OCP network card; SW1-first PCIE SWITCH chip; SW2-second PCIE SWITCH chip; SW3-third PCIE SWITCH chip; SW4-fourth PCIE SWITCH chip; SW5-fifth PCIE SWITCH chip; SW6-sixth PCIE SWITCH chip; SW7-seventh PCIE SWITCH chip; SW8-eighth PCIE SWITCH chip; SW9-ninth PCIE SWITCH chip; SW10-tenth PCIE SWITCH chip; G1-first GPU interface; G2-second GPU interface; G3-third GPU interface; G4-fourth GPU interface; G5-fifth GPU interface; G6-sixth GPU interface; G7-seventh GPU interface; G8-eighth GPU interface; G9-ninth GPU interface; G10-tenth GPU interface; G11-eleventh GPU interface; G12-twelfth GPU interface; G13-thirteenth GPU interface; G14-fourteenth GPU interface; G15-fifteenth GPU interface; G16-sixteenth GPU interface; G17-seventeenth GPU interface; G18-eighteenth GPU interface; G19-nineteenth GPU interface; G20-twentieth GPU interface.
Detailed Description
In order to make the technical solution of the present invention better understood by those skilled in the art, the technical solution of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present invention without making any inventive effort, shall fall within the scope of the present invention.
RDMA is short for Remote Direct Memory Access, a technique created to reduce server-side data-processing latency in network transmission.
PCIE is short for PCI-Express (Peripheral Component Interconnect Express), a high-speed serial computer expansion bus standard.
PCIE SWITCH refers to a PCIE switch, also called a PCIE bridge.
NVLink is a bus and communication protocol developed and introduced by NVIDIA. NVLink uses a point-to-point structure and serial transmission; it is used to connect a CPU and a GPU, and can also be used to interconnect multiple GPUs.
IB card is short for InfiniBand card.
SSD is short for solid-state drive.
EDSFF is a new industry-standard SSD form factor introduced by Intel. It comes mainly in two lengths: the long version, EDSFF 1U Long (E1.L for short), and the short version, EDSFF 1U Short (E1.S for short), which is intended to replace M.2.
Example 1:
as shown in fig. 1, the invention provides a high-density GPU server, which comprises a server chassis housing 1, wherein a heat dissipation module 2, a GPU module 3, a middle backboard 4 and a storage network module 5 are sequentially arranged from the front side 1.1 to the rear side 1.2 in the server chassis housing 1;
the GPU module 3 comprises a GPU board 6, the GPU board 6 is transversely arranged, a plurality of PCIE SWITCH chips are arranged on the GPU board 6, each PCIE SWITCH chip is connected with a first high-density connector 7 and a plurality of GPU interfaces, and each GPU interface is connected with a GPU card 9;
the middle backboard 4 is vertically arranged, and its board surface is parallel to the front side 1.1 of the server chassis housing 1; the front surface of the middle backboard 4 is provided with a second high-density connector, the back surface of the middle backboard 4 is provided with a third high-density connector 8, the second high-density connector is connected to the third high-density connector 8, the PCIE SWITCH chip is plugged into the second high-density connector through the first high-density connector 7, and the storage network module 5 is plugged into the third high-density connector 8.
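The connection chain in Example 1 can be written down as an ordered path, which makes it easy to see how many hops separate any two elements. The Python sketch below is illustrative only; the constant and function names are hypothetical, and a "hop" here simply means one step along the chain described above.

```python
# Front-to-rear module order inside the server chassis housing.
CHASSIS_ORDER = (
    "heat dissipation module",
    "GPU module",
    "middle backboard",
    "storage network module",
)

# Signal path from a GPU card to the storage network module,
# following the connections listed in Example 1.
SIGNAL_PATH = (
    "GPU card",
    "GPU interface",
    "PCIE SWITCH chip",
    "first high-density connector",
    "second high-density connector",
    "third high-density connector",
    "storage network module",
)


def hops_between(a, b, path=SIGNAL_PATH):
    """Number of steps along the path between elements a and b."""
    return abs(path.index(b) - path.index(a))


GPU_TO_STORAGE_NETWORK = hops_between("GPU card", "storage network module")
```

The second and third high-density connectors sit on the front and back of the middle backboard, so every signal between the GPU module and the storage network module crosses the backboard exactly once.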
Example 2:
as shown in fig. 1 and fig. 2, the present invention provides a high-density GPU server, which includes a server chassis housing 1, wherein a heat dissipation module 2, a GPU module 3, a middle backboard 4 and a storage network module 5 are sequentially disposed from a front side 1.1 to a rear side 1.2 in the server chassis housing 1;
the GPU module 3 comprises a GPU board 6, the GPU board 6 is transversely arranged, a plurality of PCIE SWITCH chips are arranged on the GPU board 6, each PCIE SWITCH chip is connected with a first high-density connector 7 and a plurality of GPU interfaces, and each GPU interface is connected with a GPU card 9;
the middle backboard 4 is vertically arranged, and its board surface is parallel to the front side 1.1 of the server chassis housing 1; the front surface of the middle backboard 4 is provided with a second high-density connector, the back surface of the middle backboard 4 is provided with a third high-density connector 8, the second high-density connector is connected to the third high-density connector 8, the PCIE SWITCH chip is plugged into the second high-density connector through the first high-density connector 7, and the storage network module 5 is plugged into the third high-density connector 8;
as shown in fig. 3, 4 and 5, the GPU card 9 comprises a long side 9.1, a short side 9.2 and a thickness side 9.3;
a golden finger 10 is arranged on the side surface formed by the short side 9.2 and the thickness side 9.3, and the golden finger 10 is plugged into the GPU interface;
the side surface formed by the long side 9.1 and the thickness side 9.3 is provided with ventilation and heat dissipation holes 11;
the GPU card 9 is a double-width half-length GPU card, that is, its width is twice that of a traditional GPU card and its length is half that of a traditional GPU card. The GPU card 9 and the GPU board 6 are interconnected through the golden finger 10, which is moved from the bottom edge to the side edge relative to a traditional card. Compared with the traditional golden finger design, this design can also meet the GPU's additional power supply requirement, so the GPU card 9 needs no separate power cables and the deployment density of GPUs on the GPU board is improved. The ventilation and heat dissipation holes 11 of the GPU card are relocated to the side surface formed by the long side 9.1 and the thickness side 9.3, which increases the open area and improves heat dissipation.
Example 3:
as shown in fig. 1, the invention provides a high-density GPU server, which comprises a server chassis housing 1, wherein a heat dissipation module 2, a GPU module 3, a middle backboard 4 and a storage network module 5 are sequentially arranged from the front side 1.1 to the rear side 1.2 in the server chassis housing 1;
the GPU module 3 comprises a GPU board 6, the GPU board 6 is transversely arranged, a plurality of PCIE SWITCH chips are arranged on the GPU board 6, each PCIE SWITCH chip is connected with a first high-density connector 7 and a plurality of GPU interfaces, and each GPU interface is connected with a GPU card 9;
the middle backboard 4 is vertically arranged, and its board surface is parallel to the front side 1.1 of the server chassis housing 1; the front surface of the middle backboard 4 is provided with a second high-density connector, the back surface of the middle backboard 4 is provided with a third high-density connector 8, the second high-density connector is connected to the third high-density connector 8, the PCIE SWITCH chip is plugged into the second high-density connector through the first high-density connector 7, and the storage network module 5 is plugged into the third high-density connector 8;
the GPU card 9 is also provided with an NVLink interface, and the NVLink interfaces of adjacent GPU cards 9 are interconnected; the NVLink interface realizes high-speed interconnection of adjacent GPU cards 9 on the GPU board 6;
the GPU board 6 is also provided with two board card power supply ports 12, and the two board card power supply ports 12 are respectively arranged at two sides of the first high-density connector 7;
as shown in fig. 6, two groups of PSU connectors 13 are arranged on two sides of the back surface of the middle backboard 4, and two PSU modules 14 are respectively arranged on two sides of the storage network module 5;
the board card power supply port 12 is connected with the PSU module 14 through the corresponding group of PSU connectors 13; the middle backboard thus not only interconnects the high-speed signals in the system but also distributes power among the devices;
each group of PSU connectors on the middle backboard 4 comprises two PSU connectors 13 arranged vertically;
the middle backboard 4 is also provided with a heat dissipation opening; each PSU connector 13 connects to a 54V PSU module 14.
Example 4:
as shown in fig. 1, the invention provides a high-density GPU server, which comprises a server chassis housing 1, wherein a heat dissipation module 2, a GPU module 3, a middle backboard 4 and a storage network module 5 are sequentially arranged from the front side 1.1 to the rear side 1.2 in the server chassis housing 1;
the GPU module 3 comprises a GPU board 6, the GPU board 6 is transversely arranged, a plurality of PCIE SWITCH chips are arranged on the GPU board 6, each PCIE SWITCH chip is connected with a first high-density connector 7 and a plurality of GPU interfaces, and each GPU interface is connected with a GPU card 9;
the middle backboard 4 is vertically arranged, and its board surface is parallel to the front side 1.1 of the server chassis housing 1; the front surface of the middle backboard 4 is provided with a second high-density connector, the back surface of the middle backboard 4 is provided with a third high-density connector 8, the second high-density connector is connected to the third high-density connector 8, the PCIE SWITCH chip is plugged into the second high-density connector through the first high-density connector 7, and the storage network module 5 is plugged into the third high-density connector 8;
as shown in fig. 3, 4 and 5, the GPU card 9 comprises a long side 9.1, a short side 9.2 and a thickness side 9.3;
a golden finger 10 is arranged on the side surface formed by the short side 9.2 and the thickness side 9.3, and the golden finger 10 is plugged into the GPU interface;
the side surface formed by the long side edge 9.1 and the thickness side edge 9.3 is provided with ventilation and heat dissipation holes 11;
the GPU board 6 comprises a first side 6.1, a second side 6.2, a third side 6.3 and a fourth side 6.4;
ten PCIE SWITCH chips are arranged on the GPU board 6, namely the first PCIE SWITCH chip SW1, the second PCIE SWITCH chip SW2, the third PCIE SWITCH chip SW3, the fourth PCIE SWITCH chip SW4, the fifth PCIE SWITCH chip SW5, the sixth PCIE SWITCH chip SW6, the seventh PCIE SWITCH chip SW7, the eighth PCIE SWITCH chip SW8, the ninth PCIE SWITCH chip SW9 and the tenth PCIE SWITCH chip SW10; the ten chips are distributed in two rows, and the two rows of PCIE SWITCH chips are distributed along the direction from the fourth side 6.4 to the second side 6.2; the first to fifth PCIE SWITCH chips SW1, SW2, SW3, SW4 and SW5 are disposed in a row near the fourth side 6.4, and the sixth to tenth PCIE SWITCH chips SW6, SW7, SW8, SW9 and SW10 are disposed in a row near the second side 6.2;
the five PCIE SWITCH chips of each row are distributed along the direction from the first side 6.1 to the third side 6.3; the first PCIE SWITCH chip SW1, the second PCIE SWITCH chip SW2, the third PCIE SWITCH chip SW3, the fourth PCIE SWITCH chip SW4 and the fifth PCIE SWITCH chip SW5 are sequentially distributed from the first side 6.1 to the third side 6.3, and the sixth PCIE SWITCH chip SW6, the seventh PCIE SWITCH chip SW7, the eighth PCIE SWITCH chip SW8, the ninth PCIE SWITCH chip SW9 and the tenth PCIE SWITCH chip SW10 are sequentially distributed from the first side 6.1 to the third side 6.3;
each PCIE SWITCH chip is connected with two GPU interfaces, and each PCIE SWITCH chip is connected with a first high-density connector 7; the first PCIE SWITCH chip SW1 is connected with a first GPU interface G1 and a second GPU interface G2, the second PCIE SWITCH chip SW2 is connected with a third GPU interface G3 and a fourth GPU interface G4, the third PCIE SWITCH chip SW3 is connected with a fifth GPU interface G5 and a sixth GPU interface G6, the fourth PCIE SWITCH chip SW4 is connected with a seventh GPU interface G7 and an eighth GPU interface G8, the fifth PCIE SWITCH chip SW5 is connected with a ninth GPU interface G9 and a tenth GPU interface G10, the sixth PCIE SWITCH chip SW6 is connected with an eleventh GPU interface G11 and a twelfth GPU interface G12, the seventh PCIE SWITCH chip SW7 is connected with a thirteenth GPU interface G13 and a fourteenth GPU interface G14, the eighth PCIE SWITCH chip SW8 is connected with a fifteenth GPU interface G15 and a sixteenth GPU interface G16, the ninth PCIE SWITCH chip SW9 is connected with a seventeenth GPU interface G17 and an eighteenth GPU interface G18, and the tenth PCIE SWITCH chip SW10 is connected with a nineteenth GPU interface G19 and a twentieth GPU interface G20;
the two GPU interfaces connected to each PCIE SWITCH chip in the row near the second side 6.2 are arranged at the first side 6.1; that is, the eleventh GPU interface G11, the twelfth GPU interface G12, the thirteenth GPU interface G13, the fourteenth GPU interface G14, the fifteenth GPU interface G15, the sixteenth GPU interface G16, the seventeenth GPU interface G17, the eighteenth GPU interface G18, the nineteenth GPU interface G19 and the twentieth GPU interface G20 are arranged at the first side 6.1;
the two GPU interfaces connected to each PCIE SWITCH chip in the row near the fourth side 6.4 are arranged between the two rows of PCIE SWITCH chips; that is, the first GPU interface G1, the second GPU interface G2, the third GPU interface G3, the fourth GPU interface G4, the fifth GPU interface G5, the sixth GPU interface G6, the seventh GPU interface G7, the eighth GPU interface G8, the ninth GPU interface G9 and the tenth GPU interface G10 are arranged between the two rows of PCIE SWITCH chips;
ten first high-density connectors 7 are provided at the edge of the fourth side 6.4; each PCIE SWITCH chip is connected with each of its two GPU interfaces through a PCIE X16 signal line, two PCIE X16 signal lines in total.
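The switch-to-interface assignment listed in Example 4 follows a regular pattern (SWk carries G(2k-1) and G(2k)), so it can be generated rather than enumerated. The Python sketch below rebuilds the table under that reading; the function name `build_gpu_board`, the dictionary keys and the location strings are hypothetical labels chosen for illustration.

```python
def build_gpu_board(num_switches=10):
    """Map each PCIE SWITCH chip to its row and its two GPU interfaces.

    Per Example 4: SW1-SW5 form the row near the fourth side, SW6-SW10
    the row near the second side; G1-G10 sit between the two rows and
    G11-G20 sit at the first side.
    """
    board = {}
    for k in range(1, num_switches + 1):
        near_fourth_side = k <= num_switches // 2
        board[f"SW{k}"] = {
            "row": "near fourth side" if near_fourth_side else "near second side",
            "gpu_interfaces": (f"G{2 * k - 1}", f"G{2 * k}"),
            "interface_location": (
                "between the two switch rows" if near_fourth_side else "at the first side"
            ),
        }
    return board


BOARD = build_gpu_board()
ALL_INTERFACES = [g for sw in BOARD.values() for g in sw["gpu_interfaces"]]
```

Looking up `BOARD["SW3"]` returns interfaces G5 and G6 in the row near the fourth side, which agrees with the enumeration above; the flattened list contains twenty distinct interfaces.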
Example 5:
as shown in fig. 3, the invention provides a high-density GPU server, which comprises a server casing 1, wherein a heat dissipation module 2, a GPU module 3, a middle backboard 4 and a storage network module 5 are sequentially arranged from a front side 1.1 to a rear side 1.2 in the server casing 1;
the GPU module 3 comprises a GPU board 6, the GPU board 6 is transversely arranged, a plurality of PCIE SWITCH chips are arranged on the GPU board 6, each PCIE SWITCH chip is connected with a first high-density connector 7 and a plurality of GPU interfaces, and each GPU interface is connected with a GPU card 9;
the middle backboard 4 is vertically arranged, and the board surface is parallel to the front side surface 1.1 of the server case shell 1; the front surface of the middle backboard 4 is provided with a second high-density connector, the back surface of the middle backboard 4 is provided with a third high-density connector 8, the second high-density connector is connected to the third high-density connector 8, each PCIE SWITCH chip is plugged into the second high-density connector through the first high-density connector 7, and the storage network module 5 is plugged into the third high-density connector 8;
the GPU card 9 is also provided with an NVLink interface, the NVLink interfaces of adjacent GPU cards 9 are connected through interconnection GPU cards, and the NVLink interfaces realize high-speed interconnection of the adjacent GPU cards on the GPU board;
the GPU board 6 is also provided with two board card power supply ports 12, and the two board card power supply ports 12 are respectively arranged at two sides of the first high-density connector 7;
as shown in fig. 6, two groups of PSU connectors are arranged on two sides of the back surface of the middle backboard 4, and two PSU modules 14 are respectively arranged on two sides of the storage network module 5;
the board card power supply port 12 is connected with the PSU module 14 through a corresponding group of PSU connectors 13; the middle backboard not only interconnects high-speed signals in the system, but also distributes power among all devices;
each group of PSU connectors on the middle backboard 4 comprises two PSU connectors 13 arranged vertically;
the middle backboard 4 is also provided with a heat dissipation opening; each PSU connector is connected to a 54V PSU module 14;
as shown in fig. 7, the heat dissipation module 2 includes heat dissipation fans 15 arranged in rows and columns; there are ten heat dissipation fans 15, distributed in two rows of five; the heat dissipation fans are 8086 type fans, which meets the heat dissipation requirement of the system, and the upper row and the lower row of fans meet the heat dissipation requirements of the upper-layer equipment and the lower-layer equipment respectively;
the storage network module 5 comprises a storage unit and a network card unit, wherein the storage unit is arranged at the upper part of the network card unit, the lower part of the network card unit is provided with a main board 16, and the main board 16 is provided with a CPU 17;
the storage unit, the network card unit and the CPU 17 are plugged into the third high-density connector 8;
the storage unit comprises a storage backboard and a plurality of SSDs 18, wherein the SSDs 18 are inserted on the storage backboard and are longitudinally arranged to form an SSD storage column;
the network card unit comprises a network backboard and a plurality of IB cards 19; the IB cards 19 are inserted into the network backboard and are longitudinally arranged to form an IB card storage column;
the storage backboard, the network backboard and the main board 16 are inserted into the back surface of the middle backboard 4 through the third high-density connector 8, and the storage backboard, the network backboard and the main board 16 are parallel to one another and perpendicular to the middle backboard 4;
as shown in fig. 8, the IB card storage column is provided at the upper part of the SSD storage column, and the CPU 17 is provided at the lower part of the IB cards 19; the SSD 18 is an E1.L type SSD; the storage backboard and the network backboard use PCIE Riser cards;
each PCIE SWITCH chip is connected with two IB cards 19 through two PCIE X16 signal lines via the first high-density connector 7, the second high-density connector and the third high-density connector 8; each PCIE SWITCH chip is connected with the CPU 17 through one PCIE X16 signal line via the first high-density connector 7, the second high-density connector and the third high-density connector 8; each PCIE SWITCH chip is connected with two SSDs 18 through a PCIE X8 signal line via the first high-density connector 7, the second high-density connector and the third high-density connector 8; the IB cards 19 support a 200G rate and, together with the E1.L type SSDs, enable the invention to support GPU direct storage and GPU remote direct reading; the E1.L type SSD replaces the traditional NVME hard disk design, which saves design space and improves storage density; the invention supports a topology in which GPU cards, IB cards and SSDs are in a 1:1:1 ratio.
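The link widths given in this example can be summed into a rough per-chip lane count. The Python sketch below is illustrative only, with hypothetical names. It assumes the two PCIE X16 GPU links per chip described in Example 4 also apply here. Because the text does not make clear whether the PCIE X8 signal line is shared by the two SSDs or provided to each SSD, both readings are computed.

```python
# Hypothetical lane tally for one PCIE SWITCH chip; names are illustrative.
# Each entry: (link description, number of links, lanes per link).
FIXED_LINKS = [
    ("GPU interface", 2, 16),  # assumption: two X16 GPU links, as in Example 4
    ("IB card", 2, 16),        # two PCIE X16 signal lines to two IB cards
    ("CPU", 1, 16),            # one PCIE X16 signal line to the CPU
]

def lanes_per_switch(ssd_lanes_total: int) -> int:
    """Total PCIe lanes terminating on one switch chip.

    ssd_lanes_total is the lane count assumed for the chip's two SSDs
    combined, since the description is ambiguous on this point.
    """
    fixed = sum(count * width for _, count, width in FIXED_LINKS)
    return fixed + ssd_lanes_total

# Reading A: one X8 link shared by the two SSDs (8 lanes in total).
lanes_shared_x8 = lanes_per_switch(8)
# Reading B: one X8 link per SSD (16 lanes in total).
lanes_x8_each = lanes_per_switch(16)

if __name__ == "__main__":
    print("shared X8 reading:", lanes_shared_x8)
    print("X8-per-SSD reading:", lanes_x8_each)
```

Under these assumptions each chip terminates 88 or 96 lanes, depending on the SSD reading. The patent does not name a specific switch part or PCIe generation, so the figure only indicates the port count such a chip would need.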
Example 6:
as shown in fig. 3, the invention provides a high-density GPU server, which comprises a server casing 1, wherein a heat dissipation module 2, a GPU module 3, a middle backboard 4 and a storage network module 5 are sequentially arranged from a front side 1.1 to a rear side 1.2 in the server casing 1; the server case housing 1 is 4U in height;
the GPU module 3 comprises a GPU board 6, the GPU board 6 is transversely arranged, a plurality of PCIE SWITCH chips are arranged on the GPU board 6, each PCIE SWITCH chip is connected with a first high-density connector 7 and a plurality of GPU interfaces, and each GPU interface is connected with a GPU card 9;
the middle backboard 4 is vertically arranged, and the board surface is parallel to the front side surface 1.1 of the server case shell 1; the front surface of the middle backboard 4 is provided with a second high-density connector, the back surface of the middle backboard 4 is provided with a third high-density connector 8, the second high-density connector is connected to the third high-density connector 8, each PCIE SWITCH chip is plugged into the second high-density connector through the first high-density connector 7, and the storage network module 5 is plugged into the third high-density connector 8;
the GPU card 9 is also provided with an NVLink interface, the NVLink interfaces of adjacent GPU cards 9 are connected through interconnection GPU cards, and the NVLink interfaces realize high-speed interconnection of the adjacent GPU cards on the GPU board;
the GPU board 6 is also provided with two board card power supply ports 12, and the two board card power supply ports 12 are respectively arranged at two sides of the first high-density connector 7;
as shown in fig. 6, two groups of PSU connectors are arranged on two sides of the back surface of the middle backboard 4, and two PSU modules 14 are respectively arranged on two sides of the storage network module 5;
the board card power supply port 12 is connected with the PSU module 14 through a corresponding group of PSU connectors 13; the middle backboard not only interconnects high-speed signals in the system, but also distributes power among all devices;
each group of PSU connectors on the middle backboard 4 comprises two PSU connectors 13 arranged vertically;
the middle backboard 4 is also provided with a heat dissipation opening; each PSU connector is connected to a 54V PSU module 14;
as shown in fig. 7, the heat dissipation module 2 includes heat dissipation fans 15 arranged in rows and columns; there are ten heat dissipation fans 15, distributed in two rows of five; the heat dissipation fans are 8086 type fans, which meets the heat dissipation requirement of the system, and the upper row and the lower row of fans meet the heat dissipation requirements of the upper-layer equipment and the lower-layer equipment respectively;
the storage network module 5 comprises a storage unit and a network card unit, wherein the storage unit is arranged at the upper part of the network card unit, the lower part of the network card unit is provided with a main board 16, and the main board 16 is provided with a CPU 17;
the storage unit, the network card unit and the CPU 17 are plugged into the third high-density connector 8;
the storage unit comprises a storage backboard and twenty SSDs 18, wherein the twenty SSDs 18 are inserted on the storage backboard and are longitudinally arranged to form an SSD storage column;
the network card unit comprises a network backboard and twenty IB cards 19; the twenty IB cards 19 are inserted on the network backboard and are longitudinally arranged to form an IB card storage column;
the storage backboard, the network backboard and the main board 16 are inserted into the back surface of the middle backboard 4 through the third high-density connector 8, and the storage backboard, the network backboard and the main board 16 are parallel to one another and perpendicular to the middle backboard 4;
as shown in fig. 8, the IB card storage column is provided at the upper part of the SSD storage column, and the CPU 17 is provided at the lower part of the IB cards 19; the SSD 18 is an E1.L type SSD; the storage backboard and the network backboard use PCIE Riser cards;
as shown in fig. 9, each PCIE SWITCH chip is connected with two IB cards 19 through two PCIE X16 signal lines via the first high-density connector 7, the second high-density connector and the third high-density connector 8; each PCIE SWITCH chip is connected with the CPU 17 through one PCIE X16 signal line via the first high-density connector 7, the second high-density connector and the third high-density connector 8; each PCIE SWITCH chip is connected with two SSDs 18 through a PCIE X8 signal line via the first high-density connector 7, the second high-density connector and the third high-density connector 8; the IB cards 19 support a 200G rate and, together with the E1.L type SSDs, enable the invention to support GPU direct storage and GPU remote direct reading; the E1.L type SSD replaces the traditional NVME hard disk design, which saves design space and improves storage density;
there are two CPUs 17, namely a first CPU 17.1 and a second CPU 17.2; an OCP network card 20 is further arranged on the main board, and the two CPUs 17 are connected through the OCP network card 20; the first CPU 17.1 is connected to the first PCIE SWITCH chip SW1, the second PCIE SWITCH chip SW2, the third PCIE SWITCH chip SW3, the fourth PCIE SWITCH chip SW4 and the fifth PCIE SWITCH chip SW5, and the second CPU 17.2 is connected to the sixth PCIE SWITCH chip SW6, the seventh PCIE SWITCH chip SW7, the eighth PCIE SWITCH chip SW8, the ninth PCIE SWITCH chip SW9 and the tenth PCIE SWITCH chip SW10; the first CPU 17.1 and the second CPU 17.2 are both connected with the OCP network card 20;
the invention supports a topology in which GPU cards, IB cards and SSDs are in a 1:1:1 ratio; the 4U AI server can support up to 20 full-height, half-length, double-width GPU cards, and the GPU cards can be interconnected at high speed through the NVLink interfaces.
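The device counts and the CPU-to-switch split in this example can be cross-checked with a short calculation. The Python sketch below is illustrative only, with hypothetical names. It takes the per-chip counts from the description (two GPU interfaces, two IB cards and two SSDs per PCIE SWITCH chip, with SW1 to SW5 on the first CPU 17.1 and SW6 to SW10 on the second CPU 17.2) and assumes every GPU interface is populated with a GPU card.

```python
from functools import reduce
from math import gcd

# Hypothetical bookkeeping for Example 6; names are illustrative only.
SWITCH_NUMBERS = range(1, 11)  # SW1..SW10
PER_SWITCH = {"GPU card": 2, "IB card": 2, "SSD": 2}

def cpu_for_switch(n: int) -> str:
    """CPU that PCIE SWITCH chip SWn uplinks to, per the description."""
    if not 1 <= n <= 10:
        raise ValueError("this embodiment describes SW1..SW10 only")
    return "CPU 17.1" if n <= 5 else "CPU 17.2"

# System-wide device totals across all ten switch chips.
totals = {kind: count * len(SWITCH_NUMBERS) for kind, count in PER_SWITCH.items()}

# Reduce the totals to their simplest ratio (GPU : IB : SSD).
divisor = reduce(gcd, totals.values())
ratio = tuple(value // divisor for value in totals.values())

# GPU cards reachable below each CPU through its five switch chips.
gpus_per_cpu = {}
for n in SWITCH_NUMBERS:
    cpu = cpu_for_switch(n)
    gpus_per_cpu[cpu] = gpus_per_cpu.get(cpu, 0) + PER_SWITCH["GPU card"]

if __name__ == "__main__":
    print(totals)
    print("ratio GPU:IB:SSD =", ":".join(map(str, ratio)))
    print(gpus_per_cpu)
```

With these inputs the totals are twenty of each device, which agrees with the twenty SSDs 18 and twenty IB cards 19 stated in this example, and each CPU serves ten GPU cards.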
Although the present invention has been described in detail by way of preferred embodiments with reference to the accompanying drawings, the present invention is not limited thereto. Various equivalent modifications and substitutions may be made in the embodiments of the present invention by those skilled in the art without departing from the spirit and scope of the present invention, and it is intended that all such modifications and substitutions be within the scope of the present invention as defined by the appended claims. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. The high-density GPU server is characterized by comprising a server case shell (1), wherein a heat dissipation module (2), a GPU module (3), a middle backboard (4) and a storage network module (5) are sequentially arranged in the server case shell (1) from a front side surface (1.1) to a rear side surface (1.2);
the GPU module (3) comprises a GPU board (6), the GPU board (6) is transversely arranged, a plurality of PCIE SWITCH chips are arranged on the GPU board (6), each PCIE SWITCH chip is connected with a first high-density connector (7) and a plurality of GPU interfaces, and each GPU interface is connected with a GPU card (9);
the middle backboard (4) is vertically arranged, and the board surface is parallel to the front side surface (1.1) of the server case shell (1); the front surface of the middle backboard (4) is provided with a second high-density connector, the back surface of the middle backboard (4) is provided with a third high-density connector (8), the second high-density connector is communicated with the third high-density connector (8), a PCIE SWITCH chip is spliced with the second high-density connector through a first high-density connector (7), and a storage network module (5) is spliced with the third high-density connector (8);
the GPU card (9) comprises a long side (9.1), a short side (9.2) and a thickness side (9.3);
a golden finger (10) is arranged at the side surface formed by the short side edge (9.2) and the thickness edge (9.3), and the golden finger (10) is spliced with the GPU interface;
the storage network module (5) comprises a storage unit and a network card unit, wherein the storage unit comprises a storage backboard and a plurality of SSDs (18), and the SSDs adopt an E1.L type SSD.
2. A high-density GPU server according to claim 1, characterized in that ventilation and heat dissipation holes (11) are provided at the side surface formed by the long side (9.1) and the thickness side (9.3).
3. The high-density GPU server of claim 1, wherein the GPU board (6) comprises a first side (6.1), a second side (6.2), a third side (6.3), and a fourth side (6.4);
ten PCIE SWITCH chips are arranged on the GPU board (6), the ten PCIE SWITCH chips are distributed in two rows, and the two rows of PCIE SWITCH chips are distributed along the direction from the fourth side edge (6.4) to the second side edge (6.2);
five PCIE SWITCH chips of each row are distributed along the directions from the first side edge (6.1) to the third side edge (6.3);
each PCIE SWITCH chip is connected with two GPU interfaces, and each PCIE SWITCH chip is connected with a first high-density connector (7);
two GPU interfaces connected by a row of PCIE SWITCH chips close to the second side (6.2) are arranged at the first side (6.1);
two GPU interfaces connected with one row of PCIE SWITCH chips close to the fourth side (6.4) are arranged between the two rows of PCIE SWITCH chips;
ten first high density connectors (7) are provided at the edge of the fourth side (6.4).
4. The high-density GPU server of claim 1, wherein the GPU card (9) is further provided with an NVLink interface, and the NVLink interfaces of adjacent GPU cards (9) are connected by interconnecting GPU cards.
5. The high-density GPU server according to claim 1, wherein two board card power supply ports (12) are further arranged on the GPU board (6), and the two board card power supply ports (12) are respectively arranged at two sides of the first high-density connector (7);
two groups of PSU connectors are arranged on two sides of the back of the middle back plate (4), and one PSU module (14) is arranged on two sides of the storage network module (5) respectively;
the board card power supply port (12) is connected with the PSU module (14) through a corresponding group of PSU connectors (13).
6. A high-density GPU server according to claim 5, wherein each group of PSU connectors on the midplane (4) comprises two PSU connectors (13) arranged vertically;
the middle backboard (4) is also provided with a heat dissipation opening.
7. A high-density GPU server according to claim 1, wherein the heat dissipating module (2) comprises heat dissipating fans (15) arranged in rows and columns.
8. The high-density GPU server according to claim 7, wherein the storage unit is provided at an upper portion of the network card unit, a motherboard (16) is provided at a lower portion of the network card unit, and a CPU (17) is provided on the motherboard (16);
the storage unit, the network card unit, the CPU (17) and the third high-density connector (8) are connected in an inserting mode.
9. The high-density GPU server of claim 8, wherein the SSDs (18) are plugged onto the storage backboard and arranged longitudinally to form an SSD storage column;
the network card unit comprises a network backboard and a plurality of IB cards (19); the IB cards (19) are inserted on the network backboard and are longitudinally arranged to form an IB card storage column;
the storage backboard, the network backboard and the main board (16) are inserted into the back of the middle backboard (4) through the third high-density connector (8), and the storage backboard, the network backboard and the main board (16) are parallel to one another and perpendicular to the middle backboard (4);
the IB card storage column is arranged at the upper part of the SSD storage column, and the CPU (17) is arranged at the lower part of the IB cards (19).
10. The high-density GPU server according to claim 8, wherein the number of CPUs (17) is two, an OCP network card (20) is further provided on the motherboard, and both CPUs (17) are connected to the OCP network card (20).
CN202110852692.2A 2021-07-27 2021-07-27 High-density GPU server Active CN113741642B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110852692.2A CN113741642B (en) 2021-07-27 2021-07-27 High-density GPU server

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110852692.2A CN113741642B (en) 2021-07-27 2021-07-27 High-density GPU server

Publications (2)

Publication Number Publication Date
CN113741642A CN113741642A (en) 2021-12-03
CN113741642B true CN113741642B (en) 2023-08-11

Family

ID=78729203

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110852692.2A Active CN113741642B (en) 2021-07-27 2021-07-27 High-density GPU server

Country Status (1)

Country Link
CN (1) CN113741642B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114550759B (en) * 2022-02-26 2023-07-14 苏州浪潮智能科技有限公司 Backboard for high-density storage equipment

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109669901A (en) * 2018-12-03 2019-04-23 郑州云海信息技术有限公司 A kind of server

Also Published As

Publication number Publication date
CN113741642A (en) 2021-12-03

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant