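WO2022057464A1 - Flexibly configured multi-computing-node server mainboard structure and program - Google Patents

Flexibly configured multi-computing-node server mainboard structure and program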

Info

Publication number
WO2022057464A1
WO2022057464A1 PCT/CN2021/109444 CN2021109444W WO2022057464A1 WO 2022057464 A1 WO2022057464 A1 WO 2022057464A1 CN 2021109444 W CN2021109444 W CN 2021109444W WO 2022057464 A1 WO2022057464 A1 WO 2022057464A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
processing unit
instruction
module
management controller
Prior art date
Application number
PCT/CN2021/109444
Inventor
魏东
Original Assignee
苏州浪潮智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2020-09-18
Filing date
2021-07-30
Publication date
2022-03-24
Application filed by 苏州浪潮智能科技有限公司
Priority to US18/020,768 (US20230305980A1)
Publication of WO2022057464A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/38 Information transfer, e.g. on bus
    • G06F13/40 Bus structure
    • G06F13/4004 Coupling between buses
    • G06F13/4022 Coupling between buses using switching circuits, e.g. switching matrix, connection or expansion network
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00 Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/16 Constructional details or arrangements
    • G06F1/18 Packaging or power distribution
    • G06F1/183 Internal mounting support structures, e.g. for printed circuit boards, internal connecting means
    • G06F1/184 Mounting of motherboards
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00 Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/16 Constructional details or arrangements
    • G06F1/20 Cooling means
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/38 Information transfer, e.g. on bus
    • G06F13/42 Bus transfer protocol, e.g. handshake; Synchronisation
    • G06F13/4282 Bus transfer protocol, e.g. handshake; Synchronisation on a serial bus, e.g. I2C bus, SPI bus
    • G06F13/4291 Bus transfer protocol, e.g. handshake; Synchronisation on a serial bus, e.g. I2C bus, SPI bus using a clocked protocol
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/30 Authentication, i.e. establishing the identity or authorisation of security principals
    • G06F21/31 User authentication
    • G06F21/34 User authentication involving the use of external additional devices, e.g. dongles or smart cards
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2213/00 Indexing scheme relating to interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F2213/0016 Inter-integrated circuit (I2C)
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2213/00 Indexing scheme relating to interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F2213/0026 PCI express

Definitions

  • the present application relates to the field of object retrieval, and in particular to a flexibly configured multi-computing node server motherboard structure and program.
  • BMC (Baseboard Management Controller) and a CPLD are used to manage whole-machine functions such as heat dissipation and power supply.
  • the BMC commonly uses an I2C bus to connect PCIE devices and obtain whole-machine information, including temperature, device presence, and device number.
  • because the I2C resources of the BMC are limited, an I2C SWITCH is often needed to expand the I2C bus.
  • when too many PCIE devices are installed in the server, address conflicts are likely to occur, and the I2C polling time of the BMC becomes too long for device information to be monitored in real time. Abnormal information often cannot be discovered promptly, so heat dissipation regulation and abnormality alarms cannot be triggered promptly.
  • the clock signal of existing PCIE devices is provided by the PCH (Platform Controller Hub, Intel's integrated south bridge).
  • when the clock resources of the PCH are insufficient, the CPU's CLOCK BUFFER (clock buffer) is used to provide clocks for accelerator cards, NVME, and other devices.
  • the PCIE clock signals therefore all come from the PCH of the motherboard or the CLOCK BUFFER of the motherboard (whose clock source is the PCH), and the PCH requires the motherboard CPU to be in place to work normally; PCIE devices whose communication does not require CPU participation still need the CPU and PCH powered on during operation and cannot work independently.
  • when the onboard system of the motherboard is used only to provide clock signals, the resource utilization of the CPU and PCH is very low, which wastes cost; and when the CPU or PCH of the motherboard fails, PCIE devices whose communication does not require CPU participation cannot work normally.
  • the invention provides a flexibly configured multi-computing-node server motherboard structure and program, aiming to solve the address conflicts that occur when too many devices are installed in the server, and the problem that the I2C polling time of the BMC is too long for device information to be monitored in real time;
  • it also aims to solve the problem that the PCH and CPU of the motherboard must be in place to provide clock signals, so that devices cannot work independently, and the waste of resources caused by running the PCH and CPU only to provide clock signals.
  • the present invention provides a flexibly configured multi-computing-node server motherboard structure, including a processing unit,
  • the processing unit is connected to several PCIE devices through several I2C buses respectively, and the processing unit acquires the data of the PCIE devices in parallel; the processing unit analyzes whether the acquired data is abnormal;
  • the processing unit is connected to the baseboard management controller through I2C. If the data is normal, the processing unit transmits the analyzed data to the baseboard management controller via I2C in a polled manner; if the data is abnormal, the processing unit suspends polled transmission and preferentially transmits the abnormal information to the baseboard management controller;
  • the PCIE device is connected to the PCIEswitch through the PCIE bus, and the PCIEswitch is connected to the CPU through the PCIE bus;
  • the CPU is electrically connected to the PCH, and the CPU is electrically connected to the storage unit.
  • the processing unit is configured with an internal clock module and an external clock module; the external clock module is connected to the clock output of the PCH, the internal clock module and the external clock module are connected to a data selection module, and the output of the data selection module is electrically connected to the PCIE device.
  • the processing unit is configured with an I2C communication protocol; some serial IO ports of the processing unit are connected to the PCIE devices through the I2C bus, and at least one serial IO port of the processing unit is connected to the baseboard management controller; the serial IO ports are connected to the internal storage of the processing unit.
  • the internal storage is configured with a first space dedicated to storing parameter thresholds, which are used to judge whether the PCIE device is operating normally; the internal storage is configured with a second space dedicated to storing the data.
  • the processing unit is configured with a logic operation module connected to the internal storage; the logic operation module obtains the data and the parameter thresholds, performs a logical comparison, and outputs the comparison result, and the processing unit determines whether the data is abnormal according to the comparison result.
  • if the data is normal, the processing unit executes the first instruction, cyclically obtains data from the different storage addresses in the second space in turn, and sends the data to the baseboard management controller in a polled manner.
  • if the data is abnormal, the processing unit stops executing the first instruction through a second instruction, and through the second instruction fetches the abnormal data from the second space where it is stored and sends it to the baseboard management controller; after the baseboard management controller returns a response signal, the processing unit continues to execute the first instruction, cyclically obtaining data from the different storage addresses in the second space in turn and sending the data to the baseboard management controller in a polled manner.
  • the data selection module includes an OR gate whose output end is connected to the PCIE device to provide a clock signal; the two input ends of the OR gate are connected to the output end of the first AND gate and the output end of the second AND gate respectively; one input end of the first AND gate is connected to the output of the inverter; the input of the inverter and one input end of the second AND gate are connected to the control input terminal; the other input end of the first AND gate and the other input end of the second AND gate are connected to the output of the external clock module and the output of the internal clock module respectively.
  • the processing unit is configured with a watchdog module that detects the clock output of the PCH; when the clock output is abnormal or there is no signal, the watchdog module outputs a control signal that makes the data selection module output the signal of the internal clock module.
  • the present invention also provides a program for a flexibly configured multi-computing-node server, applied to the flexibly configured multi-computing-node server motherboard structure.
  • the program includes a first instruction and a second instruction; the first instruction sends the data stored in the internal storage to the baseboard management controller in a polled manner, and the second instruction obtains and analyzes the output of the logic operation module. If the data is abnormal, the second instruction suspends the execution of the first instruction and sends the abnormal data to the baseboard management controller; after the second instruction obtains the response signal returned by the baseboard management controller, it lets the first instruction continue executing.
  • the processing unit receives and analyzes the data of the PCIE devices in parallel, which avoids the address conflicts that arise when the baseboard management controller and an I2Cswitch are used to receive the data of the PCIE devices, and the parallel analysis is fast.
  • when the data is found to be abnormal, the abnormal information is transmitted directly and with priority to the baseboard management controller. Compared with the baseboard management controller obtaining data by polling, the abnormal data is transmitted in time, so the response to abnormalities is faster.
  • the processing unit independently provides a clock to the PCIE device, so that the PCIE device that does not require the CPU to work can obtain a clock signal and work normally when the CPU and PCH are not powered on, thereby reducing energy consumption.
  • FIG. 1 is a schematic diagram of a motherboard structure of a flexibly configured multi-computing node server in an embodiment of the present invention
  • FIG. 2 is a schematic diagram of an architecture of a processing unit in an embodiment of the present invention.
  • FIG. 3 is a schematic diagram of a data selection unit in an embodiment of the present invention.
  • FIG. 4 is a schematic flowchart of a second instruction in an embodiment of the present invention.
  • FIG. 5 is a schematic structural diagram of another flexibly configured multi-computing node server motherboard according to an embodiment of the present invention.
  • 100, processing unit; 101, serial communication module; 102, internal clock module; 103, external clock module; 104, data selection module; 105, internal storage; 106, logic operation module; 107, watchdog module; 200, baseboard management controller; 300, PCIEswitch; 400, CPU; 500, PCH; 600, storage unit.
  • BMC represents the baseboard management controller
  • PCIEdevicex represents the PCIE device.
  • the present invention provides a flexibly configured multi-computing-node server motherboard structure, including a motherboard; a processing unit 100 is configured on the motherboard, and a plurality of serial communication modules 101 are configured on the processing unit 100.
  • each serial communication module 101 is connected to one PCIE device through an I2C bus.
  • the processing unit 100 may be an FPGA chip; the FPGA chip is configured with an I2C communication protocol, some serial IO ports of the FPGA chip are connected to PCIE devices through the I2C bus, and one serial IO port of the FPGA chip is connected to the baseboard management controller 200; the serial IO ports are connected to the internal storage 105 of the FPGA chip.
  • the serial communication modules 101 obtain the data of the PCIE devices in parallel; specifically, each PCIE device sends its data to a serial IO port of the FPGA chip, and the FPGA receives the data and stores it in the second space of the internal storage 105.
  • the internal storage 105 is further configured with a first space, and a parameter threshold is programmed into the first space. The parameter threshold is used to judge that the PCIE device is running normally. Different storage addresses in the second space are mapped one-to-one with different PCIE devices; different storage addresses in the first space are mapped one-to-one with storage addresses in the second space.
  • the processing unit 100 analyzes whether the data acquired by the serial communication modules 101 is abnormal; the processing unit 100 is configured with a logic operation module 106 connected to the internal storage 105; the logic operation module 106 simultaneously reads the data in the second space and the parameter thresholds in the first space, compares all the data against the corresponding parameter thresholds in parallel, and outputs a comparison result, from which the processing unit 100 judges whether the data is abnormal.
  • a specific feasible way is that the logic operation module outputs the first signal when the data is within the parameter threshold range, and the logic operation module outputs the second signal when the data exceeds the parameter threshold range.
  • the processing unit 100 is connected to the baseboard management controller 200 through I2C. Specifically, as shown in FIG. 4, the second instruction sets a conditional structure for judging whether the received signal is the first signal or the second signal.
  • the second instruction is configured with a monitoring input port connected to the output of the logic operation module, and the conditional structure of the second instruction selects between continuing to execute the first instruction and suspending the first instruction to transmit the abnormal data. If the data is normal, the processing unit 100 transmits the analyzed data to the baseboard management controller 200 via I2C in a polled manner; specifically, the processing unit 100 continues to execute the first instruction, cyclically obtains data from the different storage addresses in the second space in turn, and sends the data to the baseboard management controller 200 in a polled manner.
  • the first instruction defines the storage address of the second space, and through a loop structure, the data obtained by polling from the storage address of the second space is transmitted to the baseboard management controller.
  • if the data is abnormal, the processing unit 100 suspends polled transmission and preferentially transmits the abnormal information to the baseboard management controller 200; specifically, the processing unit 100 stops executing the first instruction through the second instruction and executes the second instruction to transmit the abnormal data: it fetches the abnormal data from the second space where it is stored and sends it to the baseboard management controller 200.
  • after the baseboard management controller 200 returns a response signal, the processing unit 100 continues to execute the first instruction, cyclically obtaining data from the different storage addresses in the second space in turn and sending the data to the baseboard management controller 200 in a polled manner.
  • the processing unit 100 is configured with an internal clock module 102 and an external clock module 103; the clock source of the external clock module 103 is the clock output of the PCH 500, and the external clock module frequency-multiplies the clock output of the PCH and outputs the result;
  • the internal clock module 102 is a phase-locked loop configured inside the processing unit 100, which generates a 100 MHz clock signal; the internal clock module 102 and the external clock module 103 are connected to the data selection module 104, and the output of the data selection module 104 is electrically connected to the PCIE device.
  • specifically, as shown in FIG. 3, the data selection module 104 includes an OR gate whose output end is connected to the PCIE device to provide a clock signal; the two input ends of the OR gate are connected to the output end of the first AND gate and the output end of the second AND gate respectively; one input end of the first AND gate is connected to the output of the inverter; the input of the inverter and one input end of the second AND gate are connected to the control input terminal; the other input end of the first AND gate (input terminal 1) is connected to the output of the external clock module, and the other input end of the second AND gate (input terminal 2) is connected to the output of the internal clock module.
  • when the control input terminal is driven high, the data selection module 104 outputs the output of the internal clock module connected to input terminal 2; when the control input terminal is driven low, the data selection module 104 outputs the output of the external clock module connected to input terminal 1.
  • the processing unit 100 is configured with a watchdog module 107; the input of the watchdog module 107 is connected to the clock signal output by the PCH to the processing unit 100, and the output of the watchdog module 107 is connected to the control input terminal; if the clock signal is abnormal or disappears, the watchdog module 107 outputs a high level, so that the internal clock module provides the clock output.
  • the PCIE devices are connected to the PCIEswitch 300 through the PCIE bus, and the PCIEswitch 300 is connected to the CPU 400 through the PCIE bus; the PCIEswitch configures any user-designated PCIE device as a slave device and configures the CPU 400 as the master device, so that the CPU and the PCIE device establish PCIE communication;
  • the PCIEswitch configures any user-designated PCIE device as a slave device and configures another user-designated PCIE device as the master device, so that one PCIE device establishes PCIE communication with another PCIE device.
  • when one PCIE device establishes PCIE communication with another PCIE device without the participation of the CPU and PCH, the PCH 500 and CPU 400 are powered off; the watchdog module 107 then detects that the PCH 500 is no longer outputting a clock signal and outputs a high level, which makes the data selection module 104 select and output the signal of the internal clock module.
  • the CPU 400 is electrically connected to the PCH 500, and the CPU 400 is electrically connected to the storage unit 600.
  • Embodiment 2: the main difference between Embodiment 2 and Embodiment 1 is that the processing unit 100 is connected to the baseboard management controller 200 through a I2C links (a is a positive integer, 1 < a < N), and each I2C link is responsible for transmitting part of the data in the second space.
  • the data stored in the second space is transmitted over multiple I2C links, which avoids the situation in Embodiment 1 where, with a single I2C link, a failure of that link prevents the information of all PCIE devices from being transmitted.
  • the present invention also provides a program for a flexibly configured multi-computing-node server, applied to the flexibly configured multi-computing-node server motherboard structure.
  • the program includes a first instruction and a second instruction; the first instruction sends the data stored in the internal storage to the baseboard management controller in a polled manner, and the second instruction obtains and analyzes the output of the logic operation module. If the data is abnormal, the second instruction suspends the execution of the first instruction and sends the abnormal data to the baseboard management controller; after the second instruction obtains the response signal returned by the baseboard management controller, it lets the first instruction continue executing.
  • the present invention also provides a storage medium of a flexible configuration of the multi-computing node server, which can be externally connected to the processing unit, and the storage medium stores the program of the flexible configuration of the multi-computing node server.
  • any reference signs placed between parentheses shall not be construed as limiting the claim.
  • the word “comprising” does not exclude the presence of elements or steps not listed in a claim.
  • the word “a” or “an” preceding an element does not preclude the presence of a plurality of such elements.
  • the invention can be implemented by means of hardware comprising several different components and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means may be embodied by one and the same item of hardware.
  • the use of the words first, second, and third, etc. do not denote any order. These words can be interpreted as names.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Mathematical Physics (AREA)
  • Human Computer Interaction (AREA)
  • Computer Security & Cryptography (AREA)
  • Power Engineering (AREA)
  • Software Systems (AREA)
  • Hardware Redundancy (AREA)

Abstract

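A flexibly configured multi-computing-node server motherboard structure and program. A processing unit is connected to PCIE devices in parallel through I2C; the processing unit analyzes whether the data acquired from the PCIE devices is abnormal; the processing unit is connected to a baseboard management controller through I2C; if the data is normal, the processing unit transmits the analyzed data to the baseboard management controller via I2C in a polled manner; if the data is abnormal, the processing unit suspends polled transmission and preferentially transmits the abnormal information to the baseboard management controller; the processing unit is configured with an internal clock module and an external clock module, the external clock module is connected to a PCH, the internal clock module and the external clock module are connected to a data selection module, and the output of the data selection module is electrically connected to the PCIE devices; the PCIE devices are connected to a PCIEswitch through a PCIE bus, and the PCIEswitch is electrically connected to a CPU; the CPU is electrically connected to the PCH, and the CPU is electrically connected to a storage unit.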

Description

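Flexibly configured multi-computing-node server mainboard structure and program
This application claims priority to Chinese patent application No. 202010991856.5, entitled "Flexibly configured multi-computing-node server mainboard structure and program", filed with the China Patent Office on September 18, 2020, the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of object retrieval, and in particular to a flexibly configured multi-computing-node server motherboard structure and program.
Background
A BMC (Baseboard Management Controller) and a CPLD are used to manage whole-machine functions such as heat dissipation and power supply. The BMC commonly uses an I2C bus to connect PCIE devices and obtain whole-machine information, including temperature, device presence, and device number.
Because the I2C resources of the BMC are limited, an I2C SWITCH is often needed to expand the I2C bus. When too many PCIE devices are installed in the server, address conflicts are very likely to occur, and the I2C polling time of the BMC becomes too long for device information to be monitored in real time. Abnormal information often cannot be discovered promptly, so heat dissipation regulation and abnormality alarms cannot be triggered promptly.
The clock signal of existing PCIE devices is provided by the PCH (Platform Controller Hub, Intel's integrated south bridge). When the clock resources of the PCH are insufficient, the CPU's CLOCK BUFFER (clock buffer) is used to provide clocks for accelerator cards, NVME, and other devices. The PCIE clock signals therefore all come from the PCH of the motherboard or the CLOCK BUFFER of the motherboard (whose clock source is the PCH), and the PCH requires the motherboard CPU to be in place to work normally. PCIE devices whose communication does not require CPU participation still need the CPU and PCH powered on during operation and cannot work independently. Moreover, when the onboard system of the motherboard is used only to provide clock signals, the resource utilization of the CPU and PCH is very low, which wastes cost. When the CPU or PCH of the motherboard fails, none of the PCIE devices whose communication does not require CPU participation can work normally.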
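Summary of the Invention
The present invention provides a flexibly configured multi-computing-node server motherboard structure and program, aiming to solve the address conflicts that occur when too many devices are installed in a server and the problem that the BMC's I2C polling time is too long for device information to be monitored in real time, as well as the problem that the motherboard's PCH and CPU must be in place to provide clock signals, so that devices cannot work independently, and the waste of resources caused by running the PCH and CPU only to provide clock signals.
To achieve the above object, the present invention provides a flexibly configured multi-computing-node server motherboard structure, including a processing unit,
the processing unit is connected to several PCIE devices through several I2C buses respectively, and the processing unit acquires the data of the PCIE devices in parallel; the processing unit analyzes whether the acquired data is abnormal;
the processing unit is connected to a baseboard management controller through I2C; if the data is normal, the processing unit transmits the analyzed data to the baseboard management controller via I2C in a polled manner; if the data is abnormal, the processing unit suspends polled transmission and preferentially transmits the abnormal information to the baseboard management controller;
the PCIE devices are connected to a PCIEswitch through a PCIE bus, and the PCIEswitch is connected to a CPU through the PCIE bus;
the CPU is electrically connected to a PCH, and the CPU is electrically connected to a storage unit.
Further, the processing unit is configured with an internal clock module and an external clock module; the external clock module is connected to the clock output of the PCH, the internal clock module and the external clock module are connected to a data selection module, and the output of the data selection module is electrically connected to the PCIE devices.
Further, the processing unit is configured with an I2C communication protocol; some serial IO ports of the processing unit are connected to the PCIE devices through the I2C bus, and at least one serial IO port of the processing unit is connected to the baseboard management controller; the serial IO ports are connected to the internal storage of the processing unit.
Further, the internal storage is configured with a first space dedicated to storing parameter thresholds, which are used to judge whether the PCIE device is operating normally; the internal storage is configured with a second space dedicated to storing the data.
Further, the processing unit is configured with a logic operation module connected to the internal storage; the logic operation module obtains the data and the parameter thresholds, performs a logical comparison, and outputs a comparison result, and the processing unit judges whether the data is abnormal according to the comparison result.
Further, if the data is normal, the processing unit executes a first instruction, cyclically obtains data from the different storage addresses in the second space in turn, and sends the data to the baseboard management controller in a polled manner.
Further, if the data is abnormal, the processing unit stops executing the first instruction through a second instruction, and through the second instruction fetches the abnormal data from the second space where it is stored and sends it to the baseboard management controller; after the baseboard management controller returns a response signal, the processing unit continues to execute the first instruction, cyclically obtaining data from the different storage addresses in the second space in turn and sending the data to the baseboard management controller in a polled manner.
Further, the data selection module includes an OR gate whose output end is connected to the PCIE device to provide a clock signal; the two input ends of the OR gate are connected to the output end of a first AND gate and the output end of a second AND gate respectively; one input end of the first AND gate is connected to the output of an inverter; the input of the inverter and one input end of the second AND gate are connected to a control input terminal; the other input end of the first AND gate and the other input end of the second AND gate are connected to the output of the external clock module and the output of the internal clock module respectively.
Further, the processing unit is configured with a watchdog module that detects the clock output of the PCH; when the clock output is abnormal or there is no signal, the watchdog module outputs a control signal that makes the data selection module output the signal of the internal clock module.
The present invention also provides a program for a flexibly configured multi-computing-node server, applied to the flexibly configured multi-computing-node server motherboard structure described above. The program includes a first instruction and a second instruction. The first instruction sends the data stored in the internal storage to the baseboard management controller in a polled manner. The second instruction obtains and analyzes the output of the logic operation module; if the data is abnormal, the second instruction suspends the execution of the first instruction and sends the abnormal data to the baseboard management controller, and after obtaining the response signal returned by the baseboard management controller, the second instruction lets the first instruction continue executing.
The flexibly configured multi-computing-node server motherboard structure and program proposed in this application have the following beneficial effects:
(1) The processing unit receives and analyzes the data of the PCIE devices in parallel, which avoids the address conflicts that arise when a baseboard management controller and an I2Cswitch are used to receive the data of the PCIE devices, and the parallel analysis is fast.
(2) When the data is found to be abnormal, the abnormal information is transmitted directly and with priority to the baseboard management controller. Compared with the baseboard management controller obtaining data by polling, the abnormal data is transmitted in time, so the response to abnormalities is fast.
(3) Having the processing unit analyze the data reduces the resource consumption of the baseboard management controller and saves its I2C interface resources.
(4) Because the processing unit independently provides a clock to the PCIE devices, PCIE devices that do not need the CPU can obtain a clock signal and work normally while the CPU and PCH are not powered on, which reduces energy consumption.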
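Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from the structures shown in these drawings without creative effort.
FIG. 1 is a schematic diagram of a flexibly configured multi-computing-node server motherboard structure in an embodiment of the present invention;
FIG. 2 is a schematic diagram of the architecture of the processing unit in an embodiment of the present invention;
FIG. 3 is a schematic diagram of the data selection unit in an embodiment of the present invention;
FIG. 4 is a schematic flowchart of the second instruction in an embodiment of the present invention;
FIG. 5 is a schematic diagram of another flexibly configured multi-computing-node server motherboard structure in an embodiment of the present invention.
The reference numerals in the figures have the following meanings:
100, processing unit; 101, serial communication module; 102, internal clock module; 103, external clock module; 104, data selection module; 105, internal storage; 106, logic operation module; 107, watchdog module; 200, baseboard management controller; 300, PCIEswitch; 400, CPU; 500, PCH; 600, storage unit.
In the figures, BMC denotes the baseboard management controller and PCIEdevicex denotes a PCIE device.
The realization of the objects, the functional features, and the advantages of the present invention are further described below with reference to the embodiments and the drawings.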
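Detailed Description
It should be understood that the specific embodiments described here are only intended to explain the present invention and are not intended to limit it.
Embodiment 1
Referring to FIG. 1, the present invention provides a flexibly configured multi-computing-node server motherboard structure, including a motherboard. A processing unit 100 is configured on the motherboard, a plurality of serial communication modules 101 are configured on the processing unit 100, and each serial communication module 101 is connected to one PCIE device through an I2C bus. In a specific implementation, the processing unit 100 may be an FPGA chip; the FPGA chip is configured with an I2C communication protocol, some serial IO ports of the FPGA chip are connected to PCIE devices through the I2C bus, and one serial IO port of the FPGA chip is connected to the baseboard management controller 200; the serial IO ports are connected to the internal storage 105 of the FPGA chip.
The serial communication modules 101 obtain the data of the PCIE devices in parallel. Specifically, each PCIE device sends its data to a serial IO port of the FPGA chip, and the FPGA receives the data and stores it in the second space of the internal storage 105. The internal storage 105 is also configured with a first space into which parameter thresholds are programmed; the parameter thresholds are used to judge whether the PCIE devices are operating normally. The storage addresses in the second space are mapped one-to-one to the PCIE devices, and the storage addresses in the first space are mapped one-to-one to the storage addresses in the second space.
The processing unit 100 analyzes whether the data acquired by the serial communication modules 101 is abnormal. The processing unit 100 is configured with a logic operation module 106 connected to the internal storage 105. The logic operation module 106 simultaneously reads the data in the second space and the parameter thresholds in the first space, compares all the data against the corresponding parameter thresholds in parallel, and outputs a comparison result, from which the processing unit 100 judges whether the data is abnormal. One feasible approach is for the logic operation module to output a first signal when the data is within the parameter threshold range and a second signal when the data is outside the parameter threshold range.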
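The threshold comparison described above can be modeled in software as follows. This is only an illustrative sketch: in the patent the comparison runs in parallel in FPGA logic, whereas this Python model iterates sequentially, and the names `Threshold`, `compare_all`, `FIRST_SIGNAL`, and `SECOND_SIGNAL`, the low/high form of a threshold, and the sample values are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Threshold:
    """Hypothetical threshold record: an allowed [low, high] range."""
    low: float
    high: float


FIRST_SIGNAL = 0   # data within the threshold range (normal)
SECOND_SIGNAL = 1  # data outside the threshold range (abnormal)


def compare_all(second_space: List[float], first_space: List[Threshold]) -> List[int]:
    """Compare each reading with the threshold stored at the same address.

    Address i of the first space holds the threshold for address i of the
    second space, mirroring the one-to-one mapping in the description.
    """
    return [
        FIRST_SIGNAL if first_space[addr].low <= value <= first_space[addr].high
        else SECOND_SIGNAL
        for addr, value in enumerate(second_space)
    ]


# Assumed example values: three devices reporting temperatures.
thresholds = [Threshold(0, 85), Threshold(0, 85), Threshold(0, 70)]
readings = [42.0, 91.5, 55.0]
signals = compare_all(readings, thresholds)
abnormal_addresses = [a for a, s in enumerate(signals) if s == SECOND_SIGNAL]
```

Here only the reading at address 1 exceeds its range, so it is the only address flagged with the second signal.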
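The processing unit 100 is connected to the baseboard management controller 200 through I2C. Specifically, referring to FIG. 4, the second instruction sets a conditional structure for judging whether the received signal is the first signal or the second signal. The second instruction is configured with a monitoring input port connected to the output of the logic operation module, and the conditional structure of the second instruction selects between continuing to execute the first instruction and suspending the first instruction to transmit the abnormal data. If the data is normal, the processing unit 100 transmits the analyzed data to the baseboard management controller 200 via I2C in a polled manner; specifically, the processing unit 100 continues to execute the first instruction, cyclically obtains data from the different storage addresses in the second space in turn, and sends the data to the baseboard management controller 200 in a polled manner. The first instruction defines the storage addresses of the second space and, through a loop structure, obtains data from those addresses in a polled manner and transmits it to the baseboard management controller.
If the data is abnormal, the processing unit 100 suspends polled transmission and preferentially transmits the abnormal information to the baseboard management controller 200. Specifically, the processing unit 100 stops executing the first instruction through the second instruction and executes the second instruction to transmit the abnormal data: it fetches the abnormal data from the second space where it is stored and sends it to the baseboard management controller 200. After the baseboard management controller 200 returns a response signal, the processing unit 100 continues to execute the first instruction, cyclically obtaining data from the different storage addresses in the second space in turn and sending the data to the baseboard management controller 200 in a polled manner.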
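The interplay of the first instruction (round-robin polling) and the second instruction (priority transmission of abnormal data, resumed after the BMC responds) can be sketched as a small state machine. This is a hedged model, not the patent's implementation: the class `ProcessingUnit`, the `send` callback standing in for the I2C link to the BMC, the message kinds "POLL" and "ALERT", and the choice to resume polling at the interrupted address are all assumptions.

```python
from typing import Callable, List, Tuple


class ProcessingUnit:
    """Toy model of polled transmission with abnormal-data priority."""

    def __init__(self, second_space: List[float],
                 send: Callable[[str, int, float], bool]) -> None:
        self.second_space = second_space
        self.send = send            # returns True when the BMC acknowledges
        self.cursor = 0             # next address for the polling loop
        self.pending: List[int] = []  # abnormal addresses not yet acknowledged

    def flag_abnormal(self, addr: int) -> None:
        """Called when the comparison logic outputs the second signal."""
        if addr not in self.pending:
            self.pending.append(addr)

    def step(self) -> None:
        """One transmission slot on the link to the BMC."""
        if self.pending:
            # "Second instruction": polling is suspended, abnormal data goes first.
            addr = self.pending[0]
            if self.send("ALERT", addr, self.second_space[addr]):
                self.pending.pop(0)  # response received, polling may resume
            return
        # "First instruction": cycle through the second space in turn.
        self.send("POLL", self.cursor, self.second_space[self.cursor])
        self.cursor = (self.cursor + 1) % len(self.second_space)


log: List[Tuple[str, int]] = []


def fake_bmc(kind: str, addr: int, value: float) -> bool:
    """Stand-in BMC that records each message and always acknowledges."""
    log.append((kind, addr))
    return True


pu = ProcessingUnit([42.0, 91.5, 55.0], fake_bmc)
pu.step()             # polls address 0
pu.flag_abnormal(1)   # comparison logic reports address 1 as abnormal
pu.step()             # alert for address 1 preempts polling
pu.step()             # polling resumes at address 1
pu.step()             # polls address 2
```

In this model an unacknowledged alert is simply retried in the next slot; the patent only states that polling continues after the response signal is returned.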
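Referring to FIG. 2, the processing unit 100 is configured with an internal clock module 102 and an external clock module 103. The clock source of the external clock module 103 is the clock output of the PCH 500, and the external clock module frequency-multiplies the clock output of the PCH and outputs the result. The internal clock module 102 is a phase-locked loop configured inside the processing unit 100, which generates a 100 MHz clock signal. The internal clock module 102 and the external clock module 103 are connected to the data selection module 104, and the output of the data selection module 104 is electrically connected to the PCIE devices. Specifically, referring to FIG. 3, the data selection module 104 includes an OR gate whose output end is connected to the PCIE device to provide a clock signal. The two input ends of the OR gate are connected to the output end of the first AND gate and the output end of the second AND gate respectively. One input end of the first AND gate is connected to the output of the inverter; the input of the inverter and one input end of the second AND gate are connected to the control input terminal. The other input end of the first AND gate (input terminal 1) is connected to the output of the external clock module, and the other input end of the second AND gate (input terminal 2) is connected to the output of the internal clock module. When the control input terminal is driven high, the data selection module 104 outputs the output of the internal clock module connected to input terminal 2; when the control input terminal is driven low, the data selection module 104 outputs the output of the external clock module connected to input terminal 1.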
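The gate network of the data selection module is a standard 2-to-1 multiplexer. The following sketch evaluates it on single-bit values to confirm the behaviour stated above; the function name `data_select` and the use of Python integers 0/1 as logic levels are assumptions made for illustration.

```python
def data_select(control: int, external_clk: int, internal_clk: int) -> int:
    """Evaluate the inverter / two AND gates / OR gate network on 0/1 levels."""
    inverted = control ^ 1            # inverter on the control input
    and1 = inverted & external_clk    # first AND gate (input terminal 1)
    and2 = control & internal_clk     # second AND gate (input terminal 2)
    return and1 | and2                # OR gate output drives the PCIE device clock


# Full truth table as (control, external, internal, output) tuples.
truth_table = [
    (c, e, i, data_select(c, e, i))
    for c in (0, 1) for e in (0, 1) for i in (0, 1)
]
```

With the control input low the output follows the external clock, and with the control input high it follows the internal clock, which matches the selection behaviour in the description.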
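The processing unit 100 is configured with a watchdog module 107. The input of the watchdog module 107 is connected to the clock signal output by the PCH to the processing unit 100, and the output of the watchdog module 107 is connected to the control input terminal. If the clock signal is abnormal or disappears, the watchdog module 107 outputs a high level, so that the internal clock module provides the clock output.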
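The patent does not say how the watchdog decides that the PCH clock has disappeared. One common approach, assumed here purely for illustration, is to count consecutive samples without a level change and raise the control output once a timeout is reached. The function name `watchdog`, the sampling model, and the `timeout` parameter are hypothetical, and detection of an "abnormal" (for example off-frequency) clock is not modeled.

```python
from typing import List


def watchdog(samples: List[int], timeout: int) -> List[int]:
    """Return the control level for each sample of the PCH clock.

    The output goes high (1) once the sampled clock has shown no transition
    for `timeout` consecutive samples, and returns low when edges reappear.
    """
    control: List[int] = []
    idle = 0
    prev = samples[0]
    for level in samples:
        if level != prev:
            idle = 0        # an edge was seen, the clock is alive
        else:
            idle += 1       # no edge since the previous sample
        prev = level
        control.append(1 if idle >= timeout else 0)
    return control


# Assumed example: the clock toggles for a few samples, then sticks high.
pch_samples = [0, 1, 0, 1, 1, 1, 1, 1]
control_levels = watchdog(pch_samples, timeout=3)
```

In this example the control output stays low while the clock toggles and goes high on the third consecutive sample without an edge.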
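The PCIE devices are connected to the PCIEswitch 300 through the PCIE bus, and the PCIEswitch 300 is connected to the CPU 400 through the PCIE bus. The PCIEswitch configures any user-designated PCIE device as a slave device and configures the CPU 400 as the master device, so that the CPU and the PCIE device establish PCIE communication.
The PCIEswitch can also configure any user-designated PCIE device as a slave device and another user-designated PCIE device as the master device, so that one PCIE device establishes PCIE communication with another PCIE device.
When one PCIE device establishes PCIE communication with another PCIE device without the participation of the CPU and PCH, the PCH 500 and CPU 400 are powered off. The watchdog module 107 then detects that the PCH 500 is no longer outputting a clock signal and outputs a high level, which makes the data selection module 104 select and output the signal of the internal clock module.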
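The power-off scenario can be simulated end to end by combining a watchdog with the gate-level selector. This is a sketch under assumptions: the watchdog uses a hypothetical no-edge timeout, clocks are modeled as lists of sampled 0/1 levels, and the names `select` and `simulate` are invented for the example. Glitch-free switching, which real clock multiplexers must handle, is ignored.

```python
from typing import List


def select(control: int, external_clk: int, internal_clk: int) -> int:
    """2-to-1 selector built from an inverter, two AND gates, and an OR gate."""
    return ((control ^ 1) & external_clk) | (control & internal_clk)


def simulate(pch_clk: List[int], internal_clk: List[int], timeout: int) -> List[int]:
    """Return the clock seen by the PCIE device for each sample."""
    device_clk: List[int] = []
    idle = 0
    prev = pch_clk[0]
    for ext, internal in zip(pch_clk, internal_clk):
        idle = 0 if ext != prev else idle + 1   # count samples without an edge
        prev = ext
        control = 1 if idle >= timeout else 0   # watchdog output (assumed rule)
        device_clk.append(select(control, ext, internal))
    return device_clk


# The PCH clock stops after sample 4 (PCH and CPU powered off);
# the internal phase-locked loop keeps toggling.
pch = [0, 1, 0, 1, 0, 0, 0, 0, 0, 0]
internal = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
device_clk = simulate(pch, internal, timeout=2)
```

The device clock follows the PCH clock at first, misses one pulse during the detection window, and then follows the internal clock once the watchdog output goes high.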
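The CPU 400 is electrically connected to the PCH 500, and the CPU 400 is electrically connected to the storage unit 600.
Embodiment 2
Referring to FIG. 5, the main difference between Embodiment 2 and Embodiment 1 is that the processing unit 100 is connected to the baseboard management controller 200 through a I2C links (a is a positive integer, 1 < a < N), and each I2C link is responsible for transmitting part of the data in the second space. Transmitting the data stored in the second space over multiple I2C links avoids the situation in Embodiment 1 where, with a single I2C link, a failure of that link prevents the information of all PCIE devices from being transmitted.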
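How the second space is divided among the I2C links is not specified in the patent. The sketch below assumes a simple interleaved assignment, in which address k is handled by link k mod a; the function name `partition_addresses` and the example sizes are hypothetical.

```python
from typing import List


def partition_addresses(num_addresses: int, num_links: int) -> List[List[int]]:
    """Assign second-space addresses to I2C links in an interleaved fashion.

    Link j transmits addresses j, j + num_links, j + 2 * num_links, and so on,
    so a failure of one link only affects the addresses assigned to it.
    """
    return [list(range(link, num_addresses, num_links)) for link in range(num_links)]


# Assumed example: 8 PCIE device slots spread over 3 I2C links to the BMC.
assignment = partition_addresses(8, 3)
```

With this assignment, losing the second link affects only addresses 1, 4, and 7, while the data at the remaining addresses still reaches the baseboard management controller.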
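The present invention also provides a program for a flexibly configured multi-computing-node server, applied to the flexibly configured multi-computing-node server motherboard structure described above. The program includes a first instruction and a second instruction. The first instruction sends the data stored in the internal storage to the baseboard management controller in a polled manner. The second instruction obtains and analyzes the output of the logic operation module; if the data is abnormal, the second instruction suspends the execution of the first instruction and sends the abnormal data to the baseboard management controller, and after obtaining the response signal returned by the baseboard management controller, the second instruction lets the first instruction continue executing.
The present invention also provides a storage medium for a flexibly configured multi-computing-node server, which can be externally connected to the processing unit and which stores the program for the flexibly configured multi-computing-node server described above.
It should be noted that, in the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention can be implemented by means of hardware comprising several different components and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, and so on does not denote any order. These words can be interpreted as names.
Although preferred embodiments of the present invention have been described, those skilled in the art can make additional changes and modifications to these embodiments once they learn of the basic inventive concept. The appended claims are therefore intended to be interpreted as including the preferred embodiments and all changes and modifications that fall within the scope of the present invention.
Obviously, those skilled in the art can make various changes and variations to the present invention without departing from its spirit and scope. Thus, if these modifications and variations of the present invention fall within the scope of the claims of the present invention and their technical equivalents, the present invention is also intended to include them.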

Claims (10)

  1. 一种灵活配置的多计算节点服务器主板结构,其特征在于,包括处理单元(100),
    所述处理单元(100)通过若干I2C总线分别连接若干PCIE设备,所述处理单元(100)并行获取所述PCIE设备的数据;所述处理单元(100)分析获取的所述数据是否异常;
    所述处理单元(100)通过I2C连接基板管理控制器(200),如果数据正常,所述处理单元(100)轮询地将分析后的数据经I2C传给所述基板管理控制器(200);而如果数据异常,所述处理单元(100)暂停轮询传输信息,优先将异常信息传递给基板管理控制器(200);
    所述PCIE设备连接PCIEswitch(300),所述PCIEswitch(300)连接CPU(400);
    所述CPU(400)电性连接PCH(500),所述CPU(400)电性连接存储单元(600)。
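  2. The flexibly configured multi-computing-node server mainboard structure according to claim 1, characterized in that the processing unit (100) is configured with an internal clock module (102) and an external clock module (103), the external clock module (103) is connected to the clock output of the PCH (500), the internal clock module (102) and the external clock module (103) are connected to a data selection module (104), and the output of the data selection module (104) is electrically connected to the PCIE devices.
  3. The flexibly configured multi-computing-node server mainboard structure according to claim 2, characterized in that the processing unit (100) is configured with the I2C communication protocol, some of the serial IO ports of the processing unit (100) are connected to the PCIE devices through I2C buses, and at least one serial IO port of the processing unit (100) is connected to the baseboard management controller (200); the serial IO ports are connected to an internal storage (105) of the processing unit (100).
  4. The flexibly configured multi-computing-node server mainboard structure according to claim 3, characterized in that the internal storage (105) is configured with a first space dedicated to storing parameter thresholds, the parameter thresholds being used to judge whether the PCIE devices are operating normally; the internal storage (105) is configured with a second space dedicated to storing the data.
  5. The flexibly configured multi-computing-node server mainboard structure according to claim 4, characterized in that the processing unit (100) is configured with a logic operation module (106), the logic operation module (106) is connected to the internal storage (105), the logic operation module (106) obtains the data and the parameter thresholds, performs a logical comparison and outputs a comparison result, and the processing unit judges from the comparison result whether the data is abnormal.
  6. The flexibly configured multi-computing-node server mainboard structure according to claim 5, characterized in that, when the data is normal, the processing unit (100) executes a first instruction, cyclically obtaining data in turn from different storage addresses of the second space and sending the data to the baseboard management controller (200) by polling.
  7. The flexibly configured multi-computing-node server mainboard structure according to claim 6, characterized in that, when the data is abnormal, the processing unit (100) stops execution of the first instruction by means of a second instruction, the processing unit (100) retrieves, by means of the second instruction, the data from the second space in which the abnormal data is stored and sends it to the baseboard management controller (200), the baseboard management controller (200) returns a response signal, and the processing unit (100) resumes execution of the first instruction, cyclically obtaining data in turn from different storage addresses of the second space and sending the data to the baseboard management controller (200) by polling.
  8. The flexibly configured multi-computing-node server mainboard structure according to claim 2, characterized in that the data selection module (104) comprises an OR gate, the output of the OR gate is connected to the PCIE devices to provide a clock signal, the two inputs of the OR gate are connected to the output of a first AND gate and the output of a second AND gate respectively, one input of the first AND gate is connected to the output of an inverter, the input of the inverter and one input of the second AND gate are connected to a control input, and the other input of the first AND gate and the other input of the second AND gate are connected to the output of the external clock module and the output of the internal clock module respectively.
  9. The flexibly configured multi-computing-node server mainboard structure according to claim 8, characterized in that the processing unit (100) is configured with a watchdog module (107), the watchdog module (107) monitors the clock output of the PCH, and when the clock output is abnormal or there is no signal, the watchdog module (107) outputs a control signal that causes the data selection module (104) to output the signal of the internal clock module (102).
  10. A program for a flexibly configured multi-computing-node server, applied to the flexibly configured multi-computing-node server mainboard structure according to any one of claims 1-8, characterized in that the program comprises a first instruction and a second instruction, the first instruction sends the data stored in the internal storage to the baseboard management controller by polling, the second instruction obtains and analyzes the output of the logic operation module, if the data is abnormal the second instruction suspends execution of the first instruction, the second instruction sends the abnormal data to the baseboard management controller, and after obtaining the response signal returned by the baseboard management controller the second instruction causes the first instruction to resume execution.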
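The gate network of claim 8 and the watchdog behaviour of claim 9 can be modelled at the logic level as below. This is an illustrative sketch only: it assumes that "respectively" pairs the first AND gate with the external clock and the second AND gate with the internal clock, and it models an abnormal or missing clock simply as a sample window with no transitions. The function names and sample values are hypothetical.

```python
def select_clock(control, external_clk, internal_clk):
    """Two-input clock selector built from an inverter, two AND gates and an OR gate."""
    first_and = (not control) and external_clk   # inverter output AND external clock
    second_and = control and internal_clk        # control input AND internal clock
    return bool(first_and or second_and)         # OR gate drives the PCIE clock


def watchdog_control(external_samples):
    """Return True (select the internal clock) when the external clock never toggles.

    Assumption: a stuck or absent clock shows up as a window of identical samples.
    """
    return len(set(external_samples)) < 2


healthy = watchdog_control([0, 1, 0, 1])   # external clock is toggling
stuck = watchdog_control([0, 0, 0, 0])     # external clock is missing
```

Under these assumptions a control input of 0 passes the external (PCH) clock through and a control input of 1 passes the internal clock, which matches the fallback described in claim 9.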
PCT/CN2021/109444 2020-09-18 2021-07-30 Flexibly configured multi-computing-node server mainboard structure and program WO2022057464A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US18/020,768 US20230305980A1 (en) 2020-09-18 2021-07-30 Flexibly configured multi-computing-node server mainboard structure and program

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010991856.5A CN111966189B (zh) 2020-09-18 2020-09-18 Flexibly configured multi-computing-node server mainboard structure and program
CN202010991856.5 2020-09-18

Publications (1)

Publication Number Publication Date
WO2022057464A1 true WO2022057464A1 (zh) 2022-03-24

Family

ID=73387292

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/109444 WO2022057464A1 (zh) 2020-09-18 2021-07-30 Flexibly configured multi-computing-node server mainboard structure and program

Country Status (3)

Country Link
US (1) US20230305980A1 (zh)
CN (1) CN111966189B (zh)
WO (1) WO2022057464A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114741347A (zh) * 2022-04-29 2022-07-12 阿里巴巴(中国)有限公司 PCIe card control method and apparatus, and PCIe card
CN116582471A (zh) * 2023-07-14 2023-08-11 珠海星云智联科技有限公司 PCIe device, PCIe data capture system and server

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
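CN111966189B (zh) * 2020-09-18 2022-11-25 苏州浪潮智能科技有限公司 Flexibly configured multi-computing-node server mainboard structure and program
CN113872796B (zh) * 2021-08-26 2024-04-23 浪潮电子信息产业股份有限公司 Server, and method, apparatus, device and medium for acquiring node device information thereof
CN113900982B (zh) * 2021-12-09 2022-03-08 苏州浪潮智能科技有限公司 Communication method, system, device and medium for a distributed heterogeneous acceleration platform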

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105786421A (zh) * 2014-12-25 2016-07-20 中兴通讯股份有限公司 Server display method and apparatus
US20170132166A1 (en) * 2014-06-30 2017-05-11 Sanechips Technology Co.,Ltd. Chip interconnection method, chip and device
CN107038139A (zh) * 2017-04-13 2017-08-11 广东浪潮大数据研究有限公司 Implementation method for a domestic server mainboard based on FT1500A
CN107302465A (zh) * 2017-08-18 2017-10-27 郑州云海信息技术有限公司 Whole-machine management method for a PCIe Switch server
CN109117407A (zh) * 2018-09-27 2019-01-01 郑州云海信息技术有限公司 Management board and server
CN109739794A (zh) * 2018-12-19 2019-05-10 郑州云海信息技术有限公司 System and method for implementing I2C bus expansion using a CPLD
CN111078445A (zh) * 2019-11-15 2020-04-28 苏州浪潮智能科技有限公司 Method and apparatus for detecting the cause of PSU power loss
CN111966189A (zh) * 2020-09-18 2020-11-20 苏州浪潮智能科技有限公司 Flexibly configured multi-computing-node server mainboard structure and program

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106407059A (zh) * 2016-09-28 2017-02-15 郑州云海信息技术有限公司 Server node test system and method
CN211207261U (zh) * 2020-03-13 2020-08-07 深圳市阿普奥云科技有限公司 AI computing server architecture integrating storage and computing

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170132166A1 (en) * 2014-06-30 2017-05-11 Sanechips Technology Co.,Ltd. Chip interconnection method, chip and device
CN105786421A (zh) * 2014-12-25 2016-07-20 中兴通讯股份有限公司 Server display method and apparatus
CN107038139A (zh) * 2017-04-13 2017-08-11 广东浪潮大数据研究有限公司 Implementation method for a domestic server mainboard based on FT1500A
CN107302465A (zh) * 2017-08-18 2017-10-27 郑州云海信息技术有限公司 Whole-machine management method for a PCIe Switch server
CN109117407A (zh) * 2018-09-27 2019-01-01 郑州云海信息技术有限公司 Management board and server
CN109739794A (zh) * 2018-12-19 2019-05-10 郑州云海信息技术有限公司 System and method for implementing I2C bus expansion using a CPLD
CN111078445A (zh) * 2019-11-15 2020-04-28 苏州浪潮智能科技有限公司 Method and apparatus for detecting the cause of PSU power loss
CN111966189A (zh) * 2020-09-18 2020-11-20 苏州浪潮智能科技有限公司 Flexibly configured multi-computing-node server mainboard structure and program

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114741347A (zh) * 2022-04-29 2022-07-12 阿里巴巴(中国)有限公司 PCIe card control method and apparatus, and PCIe card
CN114741347B (zh) * 2022-04-29 2024-02-09 阿里巴巴(中国)有限公司 PCIe card control method and apparatus, and PCIe card
CN116582471A (zh) * 2023-07-14 2023-08-11 珠海星云智联科技有限公司 PCIe device, PCIe data capture system and server
CN116582471B (zh) * 2023-07-14 2023-09-19 珠海星云智联科技有限公司 PCIe device, PCIe data capture system and server

Also Published As

Publication number Publication date
US20230305980A1 (en) 2023-09-28
CN111966189B (zh) 2022-11-25
CN111966189A (zh) 2020-11-20

Similar Documents

Publication Publication Date Title
WO2022057464A1 (zh) Flexibly configured multi-computing-node server mainboard structure and program
US20190220340A1 (en) System and method for remote system recovery
US9819532B2 (en) Multi-service node management system, device and method
US7747881B2 (en) System and method for limiting processor performance
EP3540605A1 (en) Cpld cache application in a multi-master topology system
US20120131249A1 (en) Methods and systems for an interposer board
WO2016107270A1 (zh) Method for managing a device, device, and device management controller
US8531893B2 (en) Semiconductor device and data processor
CN110399034B (zh) Power consumption optimization method for an SoC system, and terminal
US9804980B2 (en) System management through direct communication between system management controllers
US9170976B2 (en) Network efficiency and power savings
US10298479B2 (en) Method of monitoring a server rack system, and the server rack system
JP2020053017A (ja) System and method for hybrid power supply
US6892312B1 (en) Power monitoring and reduction for embedded IO processors
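CN114528234B (zh) Out-of-band management method and apparatus for a multi-way server system
CN111367392A (zh) Dynamic power management system
CN110851337A (zh) High-bandwidth multi-channel multi-DSP computing blade device suitable for the VPX architecture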
US11436182B2 (en) System and method for handling in-band interrupts on inactive I3C channels
US20230367508A1 (en) Complex programmable logic device and communication method
US11907155B2 (en) Bus system connecting slave devices with single-wire data access communication
US11061838B1 (en) System and method for graphics processing unit management infrastructure for real time data collection
JP6030998B2 (ja) 情報処理システム
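CN111459768A (zh) Hard disk management method, apparatus, device and machine-readable storage medium
TWI830573B (zh) Baseboard management control device and control method thereof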
US11989567B2 (en) Automatic systems devices rediscovery

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21868293

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21868293

Country of ref document: EP

Kind code of ref document: A1