CN102346707B - Server system and operation method thereof - Google Patents

Server system and operation method thereof

Publication number
CN102346707B
Authority
CN
China
Prior art keywords
management unit
node management
abstraction layer
hardware
hardware abstraction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201010243788.0A
Other languages
Chinese (zh)
Other versions
CN102346707A (en)
Inventor
赖德贤
陈谕正
龚景富
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Quanta Computer Inc
Original Assignee
Quanta Computer Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Quanta Computer Inc
Priority to CN201010243788.0A
Publication of CN102346707A
Application granted
Publication of CN102346707B
Legal status: Active

Abstract

The invention relates to a server system and an operation method thereof. The operation method comprises the following steps: (A) under the control of a hardware abstraction layer, a plurality of node management units share a hardware resource; (B) when one of the node management units wants to use the hardware resource, that node management unit sends an instruction or data to the hardware abstraction layer, which then uses the hardware resource on its behalf; and (C) if an external instruction is received, the hardware abstraction layer identifies which transmission port of the hardware resource received the external instruction and forwards it to the corresponding node management unit for execution; after the external instruction is executed, the corresponding node management unit returns information to the hardware abstraction layer, and the hardware abstraction layer returns the information through that transmission port to an external system manager.

Description

Server system and operating method thereof
Technical field
The present invention relates to a server system and an operating method thereof.
Background
Blade servers have long been used in a wide range of applications. Typically, many blade servers are integrated into a chassis system to make them easier for users to operate. Blade servers cluster together the main operating circuits of all the computer service systems in a computer service workstation. A system manager is responsible for maintaining and controlling each computer service system inside the workstation, along with its network configuration. In this way, the system manager can maintain and control multiple clustered computer service systems.
At present, servers manage their nodes mainly according to the IPMI (Intelligent Platform Management Interface) specification, using a BMC (Baseboard Management Controller) to perform functions such as node monitoring, logging, and fault recovery. A node here means a computing unit capable of independent operation that includes at least a CPU (central processing unit) and memory. In products currently on the market, a single BMC can manage only a single node and cannot manage multiple nodes at the same time. In addition, in the known art a hardware CMM (Chassis Management Module) is provided in the chassis system to manage the whole chassis system.
With the development of cloud technology, demand for data centers keeps growing, and fitting more nodes into limited machine-room space to raise computing capability has become a development priority.
The present application proposes a server system and an operating method thereof that effectively reduce the number of BMC chips. This frees board space in the server for more nodes, raises computing capability, and lowers server cost.
Summary of the invention
The present invention relates to a server system and an operating method thereof in which multiple node management units of a BMC (each is software that manages one node) share the hardware resource of the BMC through a hardware abstraction layer.
According to one embodiment of the invention, a server system is proposed, comprising: at least one system board, the system board comprising a baseboard management controller and multiple nodes, the baseboard management controller comprising multiple node management units, a hardware abstraction layer, and a hardware resource, wherein the node management units respectively manage the nodes and share the hardware resource under the control of the hardware abstraction layer; a connection port for connecting to an external system manager; and an internal channel connected to the system board and the connection port.
According to another embodiment of the invention, an operating method of a server system is proposed. The server system comprises at least one system board, the system board comprises a baseboard management controller and multiple nodes, the baseboard management controller comprises multiple node management units, a hardware abstraction layer, and a hardware resource, and the node management units respectively manage the nodes. The method comprises: (A) under the control of the hardware abstraction layer, the node management units share the hardware resource; (B) when one of the node management units wants to use the hardware resource, that node management unit sends an instruction or data to the hardware abstraction layer, and the hardware abstraction layer uses the hardware resource on its behalf accordingly; and (C) if an external instruction is received, the hardware abstraction layer identifies which transmission port of the hardware resource received the external instruction and forwards it to a corresponding node management unit for execution; after the external instruction is executed, the corresponding node management unit returns information to the hardware abstraction layer, and the hardware abstraction layer returns the information to an external system manager through that transmission port.
For a better understanding of the above and other aspects of the invention, preferred embodiments are described in detail below with reference to the accompanying drawings:
Brief description of the drawings
Fig. 1 is a schematic diagram of a chassis system according to an embodiment of the invention.
Fig. 2 is a schematic diagram of a BMC according to an embodiment of the invention.
Fig. 3 is a schematic diagram of multiple NMUs sharing the hardware part of the BMC through the HAL.
Fig. 4A ~ Fig. 4C are schematic diagrams of instructions and information being relayed through the HAL according to an embodiment of the invention.
[Description of main reference numerals]
100: chassis system
101: connection port
102: local area network
103: I2C bus
110 ~ 130: system boards
111, 121, 131: BMC
112-1 ~ 112-Y, 122-1 ~ 122-Y, 132-1 ~ 132-Y: nodes
211: HAL
212-1 ~ 212-Y: node management units
221: GPIO pin
222: storage unit
223: serial port
224: sensing unit
225: system interface
226: LAN interface
227: I2C interface
410: system manager
421 ~ 466: steps
Detailed description of the embodiments
In the embodiments of the invention, a single BMC can manage multiple nodes. A HAL (Hardware Abstraction Layer) extends the BMC from single-node management to multi-node management while remaining fully compatible with the IPMI specification. This effectively reduces the number of BMC chips in the chassis system, which lowers cost, saves space, and can lower the internal temperature of the chassis system.
Fig. 1 is a schematic diagram of a chassis system according to an embodiment of the invention. As shown in Fig. 1, chassis system 100 according to the embodiment comprises at least: a connection port 101, a LAN (Local Area Network) 102, an I2C (Inter-Integrated Circuit) bus 103, and multiple system boards. Although Fig. 1 shows chassis system 100 with three system boards 110 ~ 130 as an example, the embodiments of the invention are not limited to this number. System board 110 comprises BMC 111 and nodes 112-1 ~ 112-Y; system board 120 comprises BMC 121 and nodes 122-1 ~ 122-Y; and system board 130 comprises BMC 131 and nodes 132-1 ~ 132-Y. Here, Y is a positive integer.
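The topology of Fig. 1 can be summarized in a small data model. The following is only an illustrative sketch: the class names `SystemBoard` and `Chassis` and the helper `build_chassis` are hypothetical and do not come from the patent; only the reference numerals (111, 112-1, and so on) are taken from the text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SystemBoard:
    """One system board: a single BMC that manages Y nodes."""
    bmc_id: str
    node_ids: List[str] = field(default_factory=list)


@dataclass
class Chassis:
    """Chassis system 100, reduced to its list of system boards."""
    boards: List[SystemBoard] = field(default_factory=list)

    def bmc_count(self) -> int:
        return len(self.boards)

    def node_count(self) -> int:
        return sum(len(board.node_ids) for board in self.boards)


def build_chassis(nodes_per_board: int) -> Chassis:
    """Build the three-board example of Fig. 1 with Y nodes per board."""
    if nodes_per_board < 1:
        raise ValueError("Y must be a positive integer")
    boards = []
    # (BMC numeral, node numeral prefix) pairs as used in the description.
    for bmc, prefix in (("111", "112"), ("121", "122"), ("131", "132")):
        nodes = [f"{prefix}-{i}" for i in range(1, nodes_per_board + 1)]
        boards.append(SystemBoard(bmc_id=bmc, node_ids=nodes))
    return Chassis(boards=boards)
```

With Y = 3, this model has three BMCs for nine nodes, whereas a one-BMC-per-node design would need nine BMC chips.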
Connection port 101 passes instructions, signals, and the like sent by the system manager to the corresponding system board. Likewise, messages sent by a system board are passed back to the system manager through connection port 101.
As shown in Fig. 1, LAN 102 and I2C bus 103 provide paths over which the BMCs of the system boards communicate with one another. In other embodiments of the invention, a BMC may optionally also provide the CMM function.
Fig. 2 is a schematic diagram of a BMC according to an embodiment of the invention. As shown in Fig. 2, the BMC comprises a hardware part and a software part. The software part of the BMC comprises HAL 211 and node management units (NMU, Node Management Unit) 212-1 ~ 212-Y. The hardware part of the BMC comprises GPIO (General Purpose Input/Output) pin 221, storage unit 222, serial port 223, sensing unit 224, system interface (SI) 225, LAN interface 226, and I2C interface 227.
For each node, the BMC can read sensing unit 224 to monitor the node's physical parameters (such as CPU temperature, memory temperature, and voltage). For example, the BMC may have three CPU temperature sensors that respectively sense the CPU temperatures of the three nodes it manages. The BMC controls system power on and off through GPIO pin 221. In addition, the system manager can send IPMI instructions to the BMC through interfaces such as LAN interface 226 or system interface 225 to request that the BMC execute them.
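As a rough illustration of one BMC monitoring and power-controlling several nodes, the sketch below keeps one CPU temperature reading and one power state per node. The class `SimpleBmc`, its method names, and the 85 °C limit are assumptions made for this example; the patent does not specify thresholds or any programming interface.

```python
from typing import Dict, List, Optional


class SimpleBmc:
    """Toy model of a BMC that monitors several nodes (hypothetical API)."""

    def __init__(self, node_ids: List[str]) -> None:
        # Latest CPU temperature per node, standing in for sensing unit 224.
        self.cpu_temp_c: Dict[str, Optional[float]] = {n: None for n in node_ids}
        # Power state per node, standing in for control through GPIO pin 221.
        self.power_on: Dict[str, bool] = {n: False for n in node_ids}

    def update_reading(self, node_id: str, temp_c: float) -> None:
        if node_id not in self.cpu_temp_c:
            raise KeyError(f"unknown node {node_id}")
        self.cpu_temp_c[node_id] = temp_c

    def set_power(self, node_id: str, on: bool) -> None:
        if node_id not in self.power_on:
            raise KeyError(f"unknown node {node_id}")
        self.power_on[node_id] = on

    def overheated_nodes(self, limit_c: float = 85.0) -> List[str]:
        """Return nodes whose last reading exceeds the (assumed) limit."""
        return sorted(
            node for node, temp in self.cpu_temp_c.items()
            if temp is not None and temp > limit_c
        )
```

A caller could, for instance, power off every node returned by `overheated_nodes()`; that policy is also an assumption and not part of the source.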
An NMU is management software that implements the IPMI specification. For example, in BMC 111, NMU 1 ~ NMU 3 can be used to manage nodes 112-1 ~ 112-3, respectively. In the embodiments of the invention, because a single BMC manages multiple nodes, the NMUs must share the hardware part of the BMC, and hardware abstraction layer (HAL) 211 is used to solve this problem. HAL 211 sets up a set of logical (virtual) hardware devices for each NMU and maps them to the physical hardware devices.
Fig. 3 is a schematic diagram of multiple NMUs sharing the hardware part of the BMC through the HAL. As shown in Fig. 3, when an NMU wants to access an SDR (Sensor Data Record), the NMU does not need to know the actual address of the node's SDR in storage unit 222. To read SDR data, the NMU only has to tell HAL 211 which SDR data of its node it wants to read (for example CPU temperature, memory temperature, or applied voltage), and HAL 211 returns that SDR data of the node corresponding to the NMU. SDR 1 ~ SDR 3 denote the SDR data of nodes 1 ~ 3 and correspond to NMU 1 ~ NMU 3, respectively.
Similarly, when an NMU wants to store SDR data, it does not need to know the actual storage address of the node's SDR in storage unit 222. The NMU only has to pass the SDR data to HAL 211, and HAL 211 stores the data in storage unit 222. That is, HAL 211 performs the mapping between the data an NMU wants to store or access and storage unit 222.
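The address mapping performed by the HAL can be sketched as follows. This is a minimal illustration, assuming each NMU gets a fixed-size, contiguous region of the shared storage unit; the patent does not describe the actual layout, and the class name `HardwareAbstractionLayer`, the methods `read_sdr` and `write_sdr`, and the `region_size` parameter are all hypothetical.

```python
from typing import Dict, List, Optional


class HardwareAbstractionLayer:
    """Maps each NMU's logical SDR records onto one shared storage unit."""

    def __init__(self, nmu_ids: List[str], region_size: int = 16) -> None:
        # Stand-in for storage unit 222: physical address -> stored record.
        self._storage: Dict[int, str] = {}
        self._region_size = region_size
        # Assumed layout: NMU k owns addresses [k * region_size, (k + 1) * region_size).
        self._base: Dict[str, int] = {
            nmu: index * region_size for index, nmu in enumerate(nmu_ids)
        }

    def _physical_address(self, nmu_id: str, record_index: int) -> int:
        if not 0 <= record_index < self._region_size:
            raise IndexError("record index outside this NMU's region")
        return self._base[nmu_id] + record_index

    def write_sdr(self, nmu_id: str, record_index: int, data: str) -> None:
        """Store a record for the NMU; the NMU never sees the physical address."""
        self._storage[self._physical_address(nmu_id, record_index)] = data

    def read_sdr(self, nmu_id: str, record_index: int) -> Optional[str]:
        """Return the NMU's own record, or None if nothing was stored."""
        return self._storage.get(self._physical_address(nmu_id, record_index))
```

Because every NMU uses the same logical index space (record 0, record 1, and so on), the NMU software can stay identical for every node; only the HAL knows where each node's records actually live.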
The SEL (System Event Log) stores the system events of a node (such as system exceptions). Similarly, when NMU 1 ~ NMU 3 want to access SEL 1 ~ SEL 3, HAL 211 is responsible for storing to and reading from storage unit 222, as described above. The FRU (Field Replaceable Unit) data records system information of the system board, such as its number and product name. Similarly, when NMU 1 ~ NMU 3 want to access FRU 1 ~ FRU 3, HAL 211 is responsible for accessing storage unit 222, as described above. Moreover, the data-mapping function handled by HAL 211 is not limited to SDR, SEL, and FRU. For the other functions mentioned in the IPMI specification, such as Serial over LAN (SOL), Platform Event Filtering (PEF), Sensor Monitor, and Chassis Control, the NMUs likewise rely on the HAL to perform the corresponding mapping or relaying.
Fig. 4A ~ Fig. 4C are schematic diagrams of instructions and information being relayed through the HAL according to an embodiment of the invention. As shown in Fig. 4A, communication between system manager 410 and HAL 211 is bidirectional, and communication between HAL 211 and the NMUs is also bidirectional.
Fig. 4B is a schematic diagram of system manager 410 sending an IPMI instruction to the BMC through HAL 211. As shown in Fig. 4B, system manager 410 sends an IPMI instruction to HAL 211. HAL 211 then determines whether the IPMI instruction arrived via the system interface (SI) (step 421) or via the LAN interface (step 422). If the IPMI instruction arrived via SI, HAL 211 next determines whether it arrived through the first transmission port SI 1 of the system interface (which corresponds to node 1), the second transmission port SI 2 (which corresponds to node 2), or the third transmission port SI 3 (which corresponds to node 3), as shown in steps 431 ~ 433. That is, in this embodiment the system interface of the BMC has multiple SI transmission ports, three of which connect the BMC to system manager 410. If the IPMI instruction arrived via the LAN interface, HAL 211 next determines whether it arrived through the first transmission port LAN 1 of the LAN interface (which corresponds to node 1), the second transmission port LAN 2 (which corresponds to node 2), or the third transmission port LAN 3 (which corresponds to node 3), as shown in steps 434 ~ 436. That is, in this embodiment the LAN interface of the BMC has multiple LAN transmission ports, three of which connect the BMC to system manager 410. After the determination in steps 431 ~ 436, HAL 211 knows which of NMU 1 ~ NMU 3 the IPMI instruction from system manager 410 is intended for, and HAL 211 then passes the IPMI instruction to that target NMU.
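The two-stage decision of Fig. 4B (first the interface, then the transmission port) amounts to a table lookup. The sketch below is one way to express it under the port-to-node correspondence stated in the text; the function name `route_instruction` and the string labels are hypothetical.

```python
from typing import Dict, Tuple

# (interface, transmission port number) -> target NMU, following the
# correspondence in the description: SI n and LAN n both belong to node n.
PORT_TO_NMU: Dict[Tuple[str, int], str] = {
    ("SI", 1): "NMU1",
    ("SI", 2): "NMU2",
    ("SI", 3): "NMU3",
    ("LAN", 1): "NMU1",
    ("LAN", 2): "NMU2",
    ("LAN", 3): "NMU3",
}


def route_instruction(interface: str, port: int) -> str:
    """Return the NMU that should execute an instruction received on a port."""
    # Stage 1 (steps 421/422): which interface did the instruction arrive on?
    if interface not in ("SI", "LAN"):
        raise ValueError(f"unsupported interface: {interface}")
    # Stage 2 (steps 431 ~ 436): which transmission port of that interface?
    try:
        return PORT_TO_NMU[(interface, port)]
    except KeyError:
        raise ValueError(f"no NMU is mapped to {interface} port {port}") from None
```

Keeping the mapping in a table rather than in nested conditionals makes it straightforward to add ports or further IPMI-supported interfaces, which the description says the HAL is not limited to.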
Fig. 4C is a schematic diagram of the BMC returning information to system manager 410 through HAL 211. After an NMU receives an IPMI instruction sent by system manager 410, the NMU performs the corresponding operation and then returns a response message to system manager 410 through HAL 211. As shown in Fig. 4C, the NMU sends the response message to HAL 211. HAL 211 then determines whether the response message belongs to the system interface (SI) path (step 441) or to the LAN interface path (step 442). In either case, HAL 211 analyzes the received response message and determines which NMU sent it (steps 451 ~ 453 for the system interface and steps 454 ~ 456 for the LAN interface). That is, in this embodiment the system interface of the BMC has multiple SI transmission ports, three of which connect system manager 410 to the BMC, and the LAN interface of the BMC has multiple LAN transmission ports, three of which connect system manager 410 to the BMC. For the system interface, HAL 211 determines that the NMU is responding via the system interface, then determines which NMU sent the response message (steps 451 ~ 453), and returns the response message to system manager 410 through the interface on which the instruction was originally received, in this case SI (steps 461 ~ 463). Similarly, for the LAN interface, HAL 211 determines that the NMU is responding via the LAN interface, then determines which NMU sent the response message (steps 454 ~ 456), and returns the response message to system manager 410 through the original receiving interface, in this case the LAN interface (steps 464 ~ 466).
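The return path of Fig. 4C requires the HAL to send each response back through the port on which the instruction arrived. One simple way to do this, sketched below, is for the HAL to remember the receiving port when it forwards an instruction. This bookkeeping is an assumption for illustration: the patent only says the HAL identifies the sending NMU and uses the original receiving interface. The class `ResponseRouter` and its methods are hypothetical, and the sketch assumes at most one outstanding instruction per NMU.

```python
from typing import Dict, Tuple


class ResponseRouter:
    """Remembers where each instruction came from so the reply can go back."""

    def __init__(self) -> None:
        # NMU name -> (interface, port) of the instruction it is executing.
        self._pending: Dict[str, Tuple[str, int]] = {}

    def forward(self, interface: str, port: int) -> str:
        """Record the receiving port and return the target NMU's name."""
        if interface not in ("SI", "LAN"):
            raise ValueError(f"unsupported interface: {interface}")
        nmu = f"NMU{port}"  # port n corresponds to node n, as in the description
        self._pending[nmu] = (interface, port)
        return nmu

    def reply(self, nmu: str, message: str) -> Tuple[str, int, str]:
        """Return (interface, port, message) for the response from this NMU."""
        if nmu not in self._pending:
            raise KeyError(f"{nmu} has no outstanding instruction")
        interface, port = self._pending.pop(nmu)
        return interface, port, message
```

For example, an instruction received on LAN 2 is forwarded to NMU 2, and the response from NMU 2 is sent back on LAN 2 rather than on SI 2.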
That is, in the embodiments of the invention, when system manager 410 sends an IPMI instruction to the BMC through the LAN interface or the system interface, HAL 211 identifies which transmission port received the IPMI instruction and delivers the instruction to the corresponding NMU for execution. When the NMU finishes executing the instruction, it returns information to HAL 211, and HAL 211 returns this response message to system manager 410 through the original transmission port. The embodiments of the invention are not limited to HAL 211 relaying IPMI instructions only via the LAN interface or the system interface; HAL 211 can also relay IPMI instructions via any other interface supported by the IPMI specification.
In summary, the embodiments of the invention have at least the following advantages: (1) they reduce the number of BMC chips required by high-density servers (such as blade servers), which lowers cost; and (2) they use space effectively, increase the number of nodes and the computing capability of the server, and effectively lower the temperature of the system (because fewer BMC chips are used).
Although the invention has been disclosed above by way of preferred embodiments, they are not intended to limit the invention. Those skilled in the art may make various modifications and variations without departing from the spirit and scope of the invention. Therefore, the scope of protection of the invention is defined by the appended claims.

Claims (8)

1. A server system, comprising:
at least one system board, the system board comprising a baseboard management controller and multiple nodes, the baseboard management controller comprising multiple node management units, a hardware abstraction layer, and a hardware resource, wherein the node management units respectively manage the nodes and, under the control of the hardware abstraction layer, share the hardware resource;
a connection port for connecting to an external system manager; and
an internal channel connected to the system board and the connection port,
wherein the system board further comprises multiple transmission ports through which the baseboard management controller is connected to the external system manager;
if an external instruction is sent to the baseboard management controller through the hardware resource, the hardware abstraction layer identifies which transmission port received the external instruction and forwards the external instruction to a corresponding node management unit for execution; and
after the corresponding node management unit executes the external instruction, the corresponding node management unit returns information to the hardware abstraction layer, so that the information is returned to the external system manager through that transmission port.
2. The server system as claimed in claim 1, wherein the hardware abstraction layer sets up a logical hardware device for each node management unit, the logical hardware device corresponding to the hardware resource.
3. The server system as claimed in claim 2, wherein, when one of the node management units wants to use the hardware resource, that node management unit sends an instruction to the hardware abstraction layer, and the hardware abstraction layer accesses the hardware resource according to the instruction and returns a result to that node management unit.
4. The server system as claimed in claim 2, wherein, when one of the node management units wants to use the hardware resource, that node management unit sends data to the hardware abstraction layer, and the hardware abstraction layer accesses the hardware resource according to the data.
5. An operating method of a server system, the server system comprising at least one system board, the system board comprising a baseboard management controller and multiple nodes, the baseboard management controller comprising multiple node management units, a hardware abstraction layer, and a hardware resource, the node management units respectively managing the nodes, the operating method comprising:
(A) under the control of the hardware abstraction layer, the node management units share the hardware resource;
(B) when one of the node management units wants to use the hardware resource, that node management unit sends an instruction or data to the hardware abstraction layer, and the hardware abstraction layer uses the hardware resource on behalf of that node management unit accordingly; and
(C) if an external instruction is received, the hardware abstraction layer identifies which transmission port of the hardware resource received the external instruction and forwards it to a corresponding node management unit for execution; after the external instruction is executed, the corresponding node management unit returns information to the hardware abstraction layer, and the hardware abstraction layer returns the information to an external system manager through that transmission port.
6. The operating method as claimed in claim 5, wherein step (A) comprises:
the hardware abstraction layer setting up a logical hardware device for each node management unit, the logical hardware device corresponding to the hardware resource.
7. The operating method as claimed in claim 6, wherein step (B) comprises:
when one of the node management units wants to use the hardware resource, that node management unit sending the instruction to the hardware abstraction layer, and the hardware abstraction layer accessing the hardware resource according to the instruction and returning a result to that node management unit.
8. The operating method as claimed in claim 6, wherein step (B) comprises:
when one of the node management units wants to use the hardware resource, that node management unit sending the data to the hardware abstraction layer, and the hardware abstraction layer accessing the hardware resource according to the data.
CN201010243788.0A 2010-07-30 2010-07-30 Server system and operation method thereof Active CN102346707B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201010243788.0A CN102346707B (en) 2010-07-30 2010-07-30 Server system and operation method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201010243788.0A CN102346707B (en) 2010-07-30 2010-07-30 Server system and operation method thereof

Publications (2)

Publication Number Publication Date
CN102346707A CN102346707A (en) 2012-02-08
CN102346707B true CN102346707B (en) 2014-12-17

Family

ID=45545402

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201010243788.0A Active CN102346707B (en) 2010-07-30 2010-07-30 Server system and operation method thereof

Country Status (1)

Country Link
CN (1) CN102346707B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9529583B2 (en) * 2013-01-15 2016-12-27 Intel Corporation Single microcontroller based management of multiple compute nodes
TWI614613B (en) * 2014-09-11 2018-02-11 廣達電腦股份有限公司 Server system and associated control method
CN105988908B (en) * 2015-02-04 2018-11-06 昆达电脑科技(昆山)有限公司 The global data processing system of single BMC multiservers
US10587935B2 (en) * 2015-06-05 2020-03-10 Quanta Computer Inc. System and method for automatically determining server rack weight
CN105099776A (en) * 2015-07-21 2015-11-25 曙光云计算技术有限公司 Cloud server management system
US10116750B2 (en) * 2016-04-01 2018-10-30 Intel Corporation Mechanism for highly available rack management in rack scale environment
CN108337307B (en) * 2018-01-31 2021-06-29 郑州云海信息技术有限公司 Multi-path server and communication method between nodes thereof
CN109271330A (en) * 2018-08-16 2019-01-25 华东计算技术研究所(中国电子科技集团公司第三十二研究所) General BMC system based on integrated information system
CN113970961A (en) * 2021-10-25 2022-01-25 西安超越申泰信息科技有限公司 Method for controlling heat dissipation of BIOS through BMC and server

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030130969A1 (en) * 2002-01-10 2003-07-10 Intel Corporation Star intelligent platform management bus topology
CN1983987A (en) * 2006-05-12 2007-06-20 华为技术有限公司 Monitor of rear card board in intelligent-platform management interface system
US20070233833A1 (en) * 2006-03-29 2007-10-04 Inventec Corporation Data transmission system for electronic devices with server units
CN101056205A (en) * 2007-04-04 2007-10-17 杭州华为三康技术有限公司 A management method, system and device based on ATCA architecture-based server

Also Published As

Publication number Publication date
CN102346707A (en) 2012-02-08

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant