TWI697768B - Reset bmc control method - Google Patents

Reset BMC control method

Info

Publication number
TWI697768B
Authority
TW
Taiwan
Prior art keywords
logic value
baseboard management
control unit
heartbeat signal
management controller
Prior art date
Application number
TW108107557A
Other languages
Chinese (zh)
Other versions
TW202034124A (en
Inventor
張衍輝
陳惠玲
Original Assignee
神雲科技股份有限公司
Priority date
Filing date
Publication date
Application filed by 神雲科技股份有限公司
Priority to TW108107557A
Application granted
Publication of TWI697768B
Publication of TW202034124A

Landscapes

  • Debugging And Monitoring (AREA)

Abstract

A control method for resetting a baseboard management controller (BMC) includes: generating, by each BMC when it has started up and is operating normally, a first heartbeat signal and an I2C heartbeat signal that each toggle between a first logic value and a second logic value; storing, by a control unit, a plurality of health flags respectively corresponding to the BMCs; and generating, by the control unit, a corresponding reset signal based at least on the logic value of each health flag, so as to reset any BMC whose health flag equals the second logic value.

Description

Control method for resetting a baseboard management controller

The present invention relates to a control method, and in particular to a control method for resetting a baseboard management controller.

In a conventional rack device (Rack) with a multi-node design, a user or administrator can learn the system status of the entire rack device through the remote monitoring function supported by its baseboard management controllers (BMCs). However, when at least one of the BMCs in the rack device operates abnormally and the rack device does not use a chassis management controller (CMC) architecture, the user or administrator cannot reset the abnormal BMC by remote control. Instead, someone must be sent to the rack device to press a reset button that corresponds to a reset command for that BMC. How to provide a control method that correctly and efficiently resets a BMC in a non-CMC architecture has therefore become a problem to be solved.

Accordingly, an object of the present invention is to provide a correct and efficient control method for resetting a baseboard management controller.

According to one aspect of the present invention, a control method for resetting a baseboard management controller is provided. The method is applicable to a rack device that includes a plurality of BMCs and a control unit electrically connected to the BMCs, and it includes steps (a) to (e).

In step (a), each BMC, when it has started up and is operating normally, generates a first heartbeat signal and an Inter-Integrated Circuit (I2C) heartbeat signal, each of which toggles between a first logic value and a second logic value.

In step (b), the control unit stores a plurality of health flags respectively corresponding to the BMCs.

In step (c), when the control unit determines that the first heartbeat signal and the I2C heartbeat signal of one of the BMCs are both toggling between the first logic value and the second logic value, it determines that this BMC is operating normally and sets the corresponding health flag to the first logic value.

In step (d), when the control unit determines that the first heartbeat signal or the I2C heartbeat signal of one of the BMCs is not toggling between the first logic value and the second logic value, it determines that this BMC is operating abnormally and sets the corresponding health flag to the second logic value.

In step (e), the control unit generates a corresponding reset signal according to the logic value of each health flag, so as to reset each BMC whose health flag equals the second logic value.
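Steps (b) to (e) can be sketched in a few lines of Python. This is only an illustrative model, not the patented implementation: the encoding of the first logic value as 1 and the second as 0, the function names, and the idea of judging "toggling" from a short list of samples are all assumptions made for the example.

```python
HEALTHY = 1    # assumed encoding of the "first logic value"
UNHEALTHY = 0  # assumed encoding of the "second logic value"


def is_toggling(samples):
    """Return True if consecutive samples show at least one 0<->1 transition."""
    return any(a != b for a, b in zip(samples, samples[1:]))


def health_flag(hb_samples, i2c_hb_samples):
    """Steps (c)/(d): a BMC is healthy only if BOTH heartbeats are toggling."""
    if is_toggling(hb_samples) and is_toggling(i2c_hb_samples):
        return HEALTHY
    return UNHEALTHY


def reset_signals(flags):
    """Step (e): assert a reset for every BMC whose flag is UNHEALTHY."""
    return {bmc: flag == UNHEALTHY for bmc, flag in flags.items()}
```

For instance, a BMC whose I2C heartbeat samples are `[0, 0, 0, 0]` receives the flag `UNHEALTHY` even if its first heartbeat is still toggling, and `reset_signals` then marks only that BMC for reset.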

In some embodiments, in step (e), when the control unit determines that the health flag of an abnormally operating BMC has been equal to the second logic value for a predetermined time, the control unit generates the reset signal for that BMC so as to reset it.
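The "predetermined time" condition behaves like a hold timer on the health flag. A minimal sketch follows, assuming the control unit re-evaluates the flag once per fixed sampling period; the class name `ResetTimer`, the tick-based time unit, and the choice to restart the window after a reset is issued are hypothetical and not taken from the patent.

```python
class ResetTimer:
    """Fire a reset only after the flag has stayed unhealthy for hold_ticks samples."""

    def __init__(self, hold_ticks):
        self.hold_ticks = hold_ticks  # predetermined time, in sampling periods
        self.count = 0                # consecutive unhealthy samples seen so far

    def tick(self, flag_is_unhealthy):
        """Advance one sampling period; return True when a reset should be issued."""
        if not flag_is_unhealthy:
            self.count = 0            # the BMC recovered, so restart the window
            return False
        self.count += 1
        if self.count >= self.hold_ticks:
            self.count = 0            # assumed: start a fresh window after a reset
            return True
        return False
```

With `ResetTimer(3)`, three consecutive unhealthy samples are needed before `tick` returns `True`; a single healthy sample in between clears the count.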

In other embodiments, in step (a), each BMC also generates a presence signal and, when it has started up and is operating normally, generates the first heartbeat signal and the I2C heartbeat signal.

In step (c), the control unit first determines that the logic value of the presence signal equals a preset logic value, and then, when it determines that the first heartbeat signal and the I2C heartbeat signal are both toggling between the first logic value and the second logic value, it determines that the corresponding BMC is operating normally.

In step (d), the control unit first determines that the logic value of the presence signal equals the preset logic value, and then, when it determines that the first heartbeat signal or the I2C heartbeat signal is not toggling between the first logic value and the second logic value, it determines that the corresponding BMC is operating abnormally.
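The presence signal acts as a gate in front of the heartbeat check. The sketch below assumes a preset logic value of 1 and returns `None` when the presence signal does not match, meaning "no judgement is made"; the text does not say what happens in that case, so that behaviour and the name `judge` are assumptions.

```python
PRESENT = 1  # assumed "preset logic value" of the presence signal


def judge(presence, hb_toggling, i2c_hb_toggling):
    """Return 'normal', 'abnormal', or None when the board is not present."""
    if presence != PRESENT:
        return None  # assumed: an empty slot is neither normal nor abnormal
    if hb_toggling and i2c_hb_toggling:
        return "normal"    # step (c): both heartbeats are toggling
    return "abnormal"      # step (d): at least one heartbeat is stuck
```

Gating on presence keeps an empty slot, whose heartbeat lines never toggle, from being treated as a failed BMC.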

In other embodiments, the control unit is a complex programmable logic device (CPLD) that includes a memory.

According to another aspect of the present invention, another control method for resetting a baseboard management controller is provided. The method is applicable to a rack device and a computer terminal, the rack device including a plurality of BMCs and a control unit electrically connected to the BMCs, and it includes steps (a) to (f).

In step (a), each BMC, when it has started up and is operating normally, generates a first heartbeat signal and an Inter-Integrated Circuit (I2C) heartbeat signal, each of which toggles between a first logic value and a second logic value.

In step (b), the control unit stores a plurality of health flags respectively corresponding to the BMCs.

In step (c), when the control unit determines that the first heartbeat signal and the I2C heartbeat signal of one of the BMCs are both toggling between the first logic value and the second logic value, it determines that this BMC is operating normally and sets the corresponding health flag to the first logic value.

In step (d), when the control unit determines that the first heartbeat signal or the I2C heartbeat signal of one of the BMCs is not toggling between the first logic value and the second logic value, it determines that this BMC is operating abnormally and sets the corresponding health flag to the second logic value.

In step (e), one of the normally operating BMCs establishes a connection with the computer terminal to receive a reset command that is generated by a user's input at the computer terminal and that corresponds to the abnormally operating BMC.

In step (f), the control unit generates a corresponding reset signal according to the logic value of each health flag and the reset command, so as to reset the abnormally operating BMC.

In some embodiments, in step (f), the control unit receives the reset command via that normally operating BMC and, when it determines that the health flag of the abnormally operating BMC equals the second logic value, generates the corresponding reset signal to reset the abnormally operating BMC.
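The remote path in steps (e) and (f) amounts to a check performed before a user's reset command is honoured. In the sketch below, the flag encoding (1 for healthy, 0 for unhealthy), the function name `confirm_reset`, and the extra check that the relaying BMC is itself flagged healthy are assumptions added for illustration.

```python
HEALTHY = 1    # assumed encoding of the "first logic value"
UNHEALTHY = 0  # assumed encoding of the "second logic value"


def confirm_reset(target_bmc, relay_bmc, health_flags):
    """Return True if the control unit should assert reset for target_bmc.

    target_bmc: the BMC named in the user's reset command
    relay_bmc:  the normally operating BMC that forwarded the command
    """
    relay_ok = health_flags.get(relay_bmc) == HEALTHY       # assumed sanity check
    target_bad = health_flags.get(target_bmc) == UNHEALTHY  # check in step (f)
    return relay_ok and target_bad
```

A command that names a BMC whose flag is still healthy is ignored, which protects against a mistyped target at the computer terminal.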

In other embodiments, in step (a), each BMC also generates a presence signal and, when it has started up and is operating normally, generates the first heartbeat signal and the I2C heartbeat signal.

In step (c), the control unit first determines that the logic value of the presence signal equals a preset logic value, and then, when it determines that the first heartbeat signal and the I2C heartbeat signal are both toggling between the first logic value and the second logic value, it determines that the corresponding BMC is operating normally.

In step (d), the control unit first determines that the logic value of the presence signal equals the preset logic value, and then, when it determines that the first heartbeat signal or the I2C heartbeat signal is not toggling between the first logic value and the second logic value, it determines that the corresponding BMC is operating abnormally.

In other embodiments, the control unit is a complex programmable logic device (CPLD) that includes a memory.

The effect of the present invention is as follows. The control unit first determines, from the first heartbeat signal and the I2C heartbeat signal, whether the corresponding BMC is operating normally, and sets the logic value of the corresponding health flag accordingly. The control unit then generates the reset signal for that BMC only after receiving a reset command generated by a separate computer terminal and confirming the logic value of the corresponding health flag, or only when the abnormal state indicated by the health flag has lasted longer than the predetermined time. The abnormally operating BMC is thereby reset either remotely or automatically, which solves the problem of the prior art.

Before the present invention is described in detail, it should be noted that in the following description similar elements are denoted by the same reference numerals.

Referring to FIG. 1, the control method for resetting a baseboard management controller of the present invention is applicable to a rack device (Rack), for example a server rack, and a computer terminal 9. The rack device includes a rack housing 7 and, disposed in the rack housing 7, a plurality of fan units 51~54, a plurality of boards 41~44, a plurality of identification pins (ID pins) 61~64, a plurality of buses 31~34, a plurality of baseboard management controllers (BMCs) 21~24, and a control unit 1. The rack device does not use a chassis management controller (CMC) architecture. The computer terminal 9 is, for example, a host computer.

Each board 41~44 is regarded as a node, and each fan unit 51~54 includes a plurality of fans. The ID pins 61~64 and the BMCs 21~24 are disposed one-to-one on the boards 41~44. For ease of description, FIG. 1 uses four BMCs 21~24, four buses 31~34, four fan units 51~54, and four boards 41~44 as an example. The buses 31~34 support the Inter-Integrated Circuit (I2C) protocol. In other embodiments, the number of fan units 51~54 and the number of fans may each be one or any other plural number, and the fan units 51~54 may be disposed on the boards 41~44 or outside them; the invention is not limited in this respect.

Each BMC 21~24 is electrically connected to the control unit 1 via the corresponding bus 31~34. The BMCs 21~24 are also electrically connected to the fan units 51~54, respectively, and store a plurality of operating data in the control unit 1 via the buses 31~34, respectively. Each BMC 21~24 monitors the fan unit 51~54 to which it is electrically connected in order to obtain the operating data of that fan unit. The operating data of each fan unit 51~54 is, for example, the rotation speed and temperature of its fans. Note that this embodiment uses operating data of the fan units 51~54 as an example. In other embodiments, the operating data may also include other information about the server rack, such as a node's temperature, power consumption, serial number (SN) information, ID, power-on status, boot status, hardware health status, configuration information (such as CPU, memory, hard disk, and BIOS/BMC version), and power on/off control, or a power supply unit's input power, output power, input voltage, output voltage, input current, output current, switch control, status, and operating temperature; the invention is not limited in this respect.

The control unit 1 includes a memory 11. The memory 11 contains a plurality of memory blocks, for example five, to receive and store the operating data from the BMCs 21~24. The control unit 1 is, for example, a complex programmable logic device (CPLD) that includes the memory 11 and is located at the local end. That is, the control unit 1 may be disposed, for example, on a main board, a backplane, or a fan board of the rack device. Four of the five memory blocks of the memory 11 store the operating data of the respective BMCs, and the remaining memory block stores information generated by the BMC operating in a master mode (e.g., 21), such as the ambient temperature inside the rack housing 7 and the wattage and temperature of the power supplies, so that the other BMCs 22~24 can read it.

Note also that each BMC 21~24 receives a corresponding identification signal and, according to that identification signal, stores the operating data it generates, via the corresponding bus 31~34, in the corresponding one of the memory blocks 112~115 of the memory 11 of the control unit 1. In this embodiment, when the four BMCs 21~24 are disposed on the four boards 41~44 of the computer system, the logic values of the four identification signals are determined by the four sets of ID pins 61~64. That is, each BMC 21~24 reads the logic value of the identification signal from the ID pins 61~64 on the same board 41~44, and the four BMCs 21~24 use these values to determine the addresses of the four memory blocks of the memory 11 of the control unit 1 in which they store their operating data.
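The mapping from ID pins to a memory block can be pictured as simple address arithmetic. The sketch below is purely illustrative: the two-bit ID width, the block size `BLOCK_SIZE`, the base address of the shared master-mode block, and the function name `block_base` are all hypothetical, since the text gives no addresses or pin encoding.

```python
BLOCK_SIZE = 0x40         # hypothetical size of one memory block, in bytes
SHARED_BLOCK_BASE = 0x00  # hypothetical base of the block for master-mode data


def block_base(id_pins):
    """Map a node's ID pins (bit0, bit1) to the base address of its own block.

    Node 0 gets the first block after the shared one, node 1 the next, and so on.
    """
    bit0, bit1 = id_pins
    node = (bit1 << 1) | bit0  # node index 0..3 decoded from the two pins
    return SHARED_BLOCK_BASE + BLOCK_SIZE * (node + 1)
```

Under these assumptions the four nodes write to `0x40`, `0x80`, `0xC0`, and `0x100`, while the block at `0x00` is left for the data shared by the master-mode BMC.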

Referring to FIG. 1 and FIG. 2, FIG. 2 shows an embodiment of the control method for resetting a baseboard management controller of the present invention, which includes steps S1 to S6.

In step S1, each BMC 21~24 generates a presence signal (Present) P1~P4 and transmits it to the control unit 1. When it has started up and is operating normally, each BMC also generates a first heartbeat signal (Heartbeat) HB1~HB4 and an Inter-Integrated Circuit (I2C) heartbeat signal, each of which toggles between a first logic value and a second logic value.

In more detail, the logic value of each presence signal P1~P4 changes, for example from logic 0 to logic 1, once the corresponding board 41~44 has been inserted into a slot or connector of the rack housing 7 and powered on. Each first heartbeat signal HB1~HB4 indicates whether the firmware of the corresponding BMC 21~24 is operating normally, and it is not transmitted to the control unit 1 via the buses 31~34. Each I2C heartbeat signal indicates whether the software of the corresponding BMC 21~24 is operating normally, and it is transmitted to the control unit 1 via the buses 31~34. One of the first logic value and the second logic value is logic 1 and the other is logic 0. The frequency of the heartbeat signals is, for example, 0.5 or 1 Hz, but is not limited thereto.
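Because a heartbeat at 0.5 or 1 Hz changes level at most a few times per second, one common way to decide whether it is still toggling is to time out on the most recent edge. The sketch below illustrates that idea; the class name `HeartbeatMonitor`, the polling interface, and the 3-second timeout used in the usage note are assumptions, not values from the patent.

```python
class HeartbeatMonitor:
    """Track edges on one heartbeat line and report whether it is still beating."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s   # assumed: longest allowed gap between edges
        self.last_level = None       # level seen at the previous sample
        self.last_edge_time = None   # time of the most recent 0<->1 transition

    def sample(self, now_s, level):
        """Record one sample of the line, taken at time now_s (seconds)."""
        if self.last_level is not None and level != self.last_level:
            self.last_edge_time = now_s
        self.last_level = level

    def is_beating(self, now_s):
        """True if an edge has been seen within the last timeout_s seconds."""
        return (self.last_edge_time is not None
                and now_s - self.last_edge_time <= self.timeout_s)
```

With a 1 Hz square wave there is an edge every 0.5 s, so a timeout of 3 s tolerates several missed edges before the line is reported as stuck. A line that has never toggled, such as that of a BMC that has not booted, is reported as not beating.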

In step S2, the control unit 1 stores a plurality of health flags respectively corresponding to the BMCs 21~24.

In step S3, when the control unit 1 determines that the logic value of a presence signal P1~P4 equals a preset logic value and then determines that the first heartbeat signal HB1~HB4 and the I2C heartbeat signal of the corresponding BMC 21~24 are both toggling between the first logic value and the second logic value, it determines that this BMC is operating normally and sets the corresponding health flag to the first logic value.

In step S4, when the control unit 1 determines that the logic value of a presence signal P1~P4 equals the preset logic value and then determines that the first heartbeat signal HB1~HB4 or the I2C heartbeat signal of the corresponding BMC 21~24 is not toggling between the first logic value and the second logic value, it determines that this BMC is operating abnormally and sets the corresponding health flag to the second logic value.

For example, before a BMC 21~24 boots, or after it boots but the software it runs malfunctions, the I2C heartbeat signal it generates stays at the first logic value or the second logic value, for example at logic 0. Likewise, when the firmware run by a BMC 21~24 malfunctions, the first heartbeat signal HB1~HB4 it generates stays at the first logic value or the second logic value, for example at logic 0.

As steps S2 to S4 show, the control unit 1 receives the presence signals P1~P4, the first heartbeat signals HB1~HB4, and the I2C heartbeat signals from the BMCs 21~24, and stores the health flags respectively corresponding to the BMCs 21~24. Based on the presence signal P1~P4, the first heartbeat signal HB1~HB4, and the I2C heartbeat signal of each BMC 21~24, the control unit 1 sets the health flag to the first logic value when it determines that the BMC is operating normally, and to the second logic value when it determines that the BMC is operating abnormally. Note in particular that in other embodiments the control unit 1 may ignore the I2C heartbeat signal and determine whether the corresponding BMC 21~24 is operating normally from the presence signal P1~P4 and the first heartbeat signal HB1~HB4 alone.

In step S5, one of the normally operating BMCs 21~24 establishes a connection with the computer terminal 9 to receive a reset command that is generated by a user's input at the computer terminal 9 and that corresponds to the abnormally operating BMC 21~24. More specifically, a user or administrator can use the computer terminal 9 to establish a connection, that is, a remote connection, with one of the normally operating BMCs 21~24 of the rack device. Through the host computer, the user can also find out which of the BMCs 21~24 of the rack device is operating abnormally, for example by reading the logic values of the health flags stored in the control unit 1, and can then use the host computer to generate the reset command for the abnormally operating BMC. In other words, the user generates the reset command at the computer terminal 9 after a first, manual confirmation that a BMC 21~24 is abnormal.

In step S6, the control unit 1 receives the reset command via that normally operating BMC and, when it determines that the health flag of the abnormally operating BMC 21~24 equals the second logic value, generates a corresponding reset signal R1~R4 to reset the BMC whose health flag equals the second logic value (that is, the one operating abnormally). In other words, the BMC 21~24 that receives the reset command makes a second confirmation based on the corresponding health flag, so that the reset signal R1~R4 is generated.

As steps S5 and S6 show, when some of the BMCs 21~24 operate abnormally in this embodiment, the user can connect remotely over the network, make a first confirmation manually, and then rely on a normally operating BMC 21~24 for a second confirmation in order to reset the abnormally operating BMC. This avoids the conventional procedure in which the user has to go to the machine room in person to restart it. In other embodiments, step S5 may be omitted and step S6 changed as follows: when the control unit 1 determines that the health flag of the abnormally operating BMC 21~24 has been equal to the second logic value for a predetermined time, for example several tens of milliseconds, the control unit 1 generates the reset signal R1~R4 for that BMC to reset it.

In summary, the control unit first makes a determination based on the logic value of the presence signal, and then determines whether the corresponding BMC is operating normally based on the first heartbeat signal alone, or on the first heartbeat signal together with the I2C heartbeat signal, in order to set the logic value of the corresponding health flag. A user can thus generate a reset command at a separate computer terminal after a first, manual confirmation that a BMC is abnormal. After receiving the reset command, the control unit makes a second confirmation based on the corresponding health flag before generating the reset signal. Alternatively, the control unit generates the reset signal automatically when the abnormal state indicated by the health flag has lasted longer than the predetermined time. The object of the present invention is therefore achieved.

The foregoing is merely an embodiment of the present invention and does not limit the scope in which the invention may be practiced. All simple equivalent changes and modifications made according to the claims and the specification of the present invention remain within the scope covered by this patent.

1: control unit
11: memory
21~24: baseboard management controllers
31~34: buses
41~44: boards
51~54: fan units
61~64: identification pins
7: cabinet housing
9: computer terminal
R1~R4: reset signals
P1~P4: presence signals
HB1~HB4: first heartbeat signals
S1~S6: steps

Other features and effects of the present invention will be clearly presented in the embodiments described with reference to the drawings, in which:
FIG. 1 is a block diagram illustrating a cabinet device to which the control method for resetting a baseboard management controller of the present invention is applicable; and
FIG. 2 is a flowchart illustrating an embodiment of the control method for resetting a baseboard management controller of the present invention.

S1~S6: steps

Claims (3)

1. A control method for resetting a baseboard management controller, applicable to a cabinet device that comprises a plurality of baseboard management controllers and a control unit electrically connected to the baseboard management controllers, the control method comprising the following steps:
(a) each baseboard management controller, when started and operating normally, generating a first heartbeat signal and an Inter-Integrated Circuit (I2C) heartbeat signal that toggle between a first logic value and a second logic value, each first heartbeat signal indicating whether the firmware of the corresponding baseboard management controller is operating normally, and each I2C heartbeat signal indicating whether the software of the corresponding baseboard management controller is operating normally;
(b) the control unit storing a plurality of health flags respectively corresponding to the baseboard management controllers;
(c) when the control unit determines that the first heartbeat signal and the I2C heartbeat signal corresponding to one of the baseboard management controllers are toggling between the first logic value and the second logic value, determining that the corresponding baseboard management controller is operating normally and setting the logic value of the corresponding health flag to the first logic value;
(d) when the control unit determines that the first heartbeat signal or the I2C heartbeat signal corresponding to one of the baseboard management controllers is not toggling between the first logic value and the second logic value, determining that the corresponding baseboard management controller is operating abnormally and setting the logic value of the corresponding health flag to the second logic value; and
(e) the control unit generating, according to the logic value of each health flag, a corresponding reset signal to reset the baseboard management controller whose health flag has a logic value equal to the second logic value, wherein, when the control unit determines that the logic value of the health flag corresponding to the abnormally operating baseboard management controller has been equal to the second logic value for a predetermined time, the control unit generates the reset signal corresponding to the abnormally operating baseboard management controller in order to reset it.

2. The control method for resetting a baseboard management controller according to claim 1, wherein, in step (a), each baseboard management controller further generates a presence signal and, when started and operating normally, generates the first heartbeat signal and the I2C heartbeat signal; in step (c), the control unit first determines that the logic value of the presence signal is equal to a preset logic value and then, upon determining that the first heartbeat signal and the I2C heartbeat signal are toggling between the first logic value and the second logic value, determines that the corresponding baseboard management controller is operating normally; and in step (d), the control unit first determines that the logic value of the presence signal is equal to the preset logic value and then, upon determining that the first heartbeat signal or the I2C heartbeat signal is not toggling between the first logic value and the second logic value, determines that the corresponding baseboard management controller is operating abnormally.

3. The control method for resetting a baseboard management controller according to claim 1, wherein the control unit is a complex programmable logic device (CPLD) that includes a memory.
TW108107557A 2019-03-07 2019-03-07 Reset bmc control method TWI697768B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW108107557A TWI697768B (en) 2019-03-07 2019-03-07 Reset bmc control method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
TW108107557A TWI697768B (en) 2019-03-07 2019-03-07 Reset bmc control method

Publications (2)

Publication Number Publication Date
TWI697768B true TWI697768B (en) 2020-07-01
TW202034124A TW202034124A (en) 2020-09-16

Family

ID=72601817

Family Applications (1)

Application Number Title Priority Date Filing Date
TW108107557A TWI697768B (en) 2019-03-07 2019-03-07 Reset bmc control method

Country Status (1)

Country Link
TW (1) TWI697768B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI827031B (en) * 2022-04-24 2023-12-21 新加坡商鴻運科股份有限公司 Detection system and method of substrate management controller

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101938368A (en) * 2009-06-30 2011-01-05 国际商业机器公司 Virtual machine manager in blade server system and virtual machine processing method
TW201704929A (en) * 2015-07-30 2017-02-01 神雲科技股份有限公司 Server and method for detecting power reset
TW201729097A (en) * 2016-02-05 2017-08-16 神雲科技股份有限公司 Rack
CN108038019A (en) * 2017-12-25 2018-05-15 曙光信息产业(北京)有限公司 A kind of automatically restoring fault method and system of baseboard management controller

Also Published As

Publication number Publication date
TW202034124A (en) 2020-09-16

Legal Events

Date Code Title Description
MM4A Annulment or lapse of patent due to non-payment of fees