TWI598728B - Server and method for detecting power reset - Google Patents

Publication number
TWI598728B
Authority
TW
Taiwan
Prior art keywords
processing unit
reset
server
power
event
Prior art date
Application number
TW104124763A
Other languages
Chinese (zh)
Other versions
TW201704929A (en)
Inventor
郭明義
Original Assignee
神雲科技股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 神雲科技股份有限公司
Priority to TW104124763A
Publication of TW201704929A
Application granted
Publication of TWI598728B

Landscapes

  • Debugging And Monitoring (AREA)

Description

Server, baseboard management controller thereof, and method for determining whether a server reset event is caused by a power interruption

The present invention relates to a system and method capable of detecting errors, and more particularly to a server and a method for determining whether a server reset event is caused by a power interruption.

Existing servers receive AC power from an external supply. If the AC power is interrupted for any of several reasons, for example because the server's power plug is accidentally pulled out, the operation of the server is affected, which may cause a system shutdown or a service interruption.

One current approach to detecting a server reset caused by a power interruption is to power the server from a high-end power supply that can detect AC power loss, and to have a baseboard management controller, which monitors the operating status of the server, obtain notification of the power reset event from that power supply. The drawback of this approach is that such a high-end power supply must be specially procured, and its use also affects system compatibility. Detecting a server reset caused by a power interruption without using a high-end power supply is therefore a current direction of research.

Accordingly, an object of the present invention is to provide a method for determining whether a server reset event is caused by a power interruption, in which the firmware of the server detects the power-interruption reset directly, without using a high-end power supply.

The method of the present invention for determining whether a server reset event is caused by a power interruption is performed by a server. The server includes a baseboard management controller, which includes a random access memory, a processing unit electrically connected to the random access memory, and an event recording unit electrically connected to the processing unit. The random access memory stores a power reset flag, and the power reset detection method includes a step (A), a step (B), a step (C), and a step (E).

In step (A), the processing unit determines, according to a received event signal, whether to initialize the random access memory.

In step (B), the processing unit determines, according to a parameter value of the power reset flag, whether a protocol exception has occurred.

In step (C), if the result of step (B) is negative, the processing unit determines whether a self-operation exception has occurred.

In step (E), if the result of step (C) is negative, the processing unit records a power reset event in the event recording unit.

A further object of the present invention is to provide a server.

The server includes a baseboard management controller, which includes a random access memory and a processing unit.

The random access memory stores a power reset flag.

The processing unit is electrically connected to the random access memory. It determines, according to a received event signal, whether to initialize the random access memory, and determines, according to a parameter value of the power reset flag, whether a protocol exception has occurred. If not, the processing unit determines whether a self-operation exception has occurred; if that result is also negative, the processing unit records a power reset event in the event recording unit.

The effect of the present invention is that the server can detect the power reset event directly, without procuring a high-end power supply, so system compatibility is not affected.

1‧‧‧Server

11‧‧‧Baseboard management controller

111‧‧‧Random access memory

112‧‧‧Processing unit

113‧‧‧Event recording unit

114‧‧‧Monitoring unit

115‧‧‧Protocol layer unit

116‧‧‧Electrically erasable programmable read-only memory (EEPROM)

12‧‧‧IPMI interface

13‧‧‧South bridge

2‧‧‧Remote server

21‧‧‧IPMI client interface

A~E‧‧‧Steps

Other features and effects of the present invention will be clearly presented in the embodiments described with reference to the drawings, in which: FIG. 1 is a schematic diagram illustrating a server of the present invention; FIGS. 2A and 2B are a flowchart illustrating a first embodiment of the power reset detection method performed by the server; FIG. 3 is a flowchart illustrating a second embodiment of the power reset detection method performed by the server; and FIG. 4 is a flowchart illustrating a third embodiment of the power reset detection method performed by the server.

Before the present invention is described in detail, it should be noted that in the following description, similar elements are denoted by the same reference numerals.

Referring to FIG. 1, a server 1 of the present invention includes a baseboard management controller (BMC) 11, a south bridge 13, and an Intelligent Platform Management Interface (IPMI) 12.

The IPMI 12 determines, according to a reset command, whether to output a reset request signal. The reset command may come from an IPMI client interface (IPMI client) 21 of a remote server 2, issued when the remote end requests a cold reset, a warm reset, or a firmware update. The reset command may also come from the south bridge 13 of the server 1, issued when the local end requests a cold reset, a warm reset, or a firmware update.

The baseboard management controller 11 includes a random access memory (RAM) 111 that stores a power reset flag; a processing unit 112 electrically connected to the random access memory 111; an event recording unit (event log) 113 that is electrically connected to the processing unit 112 and records a power reset event; a monitoring unit (watchdog) 114 that is electrically connected to the processing unit 112 and monitors its operation; a protocol layer unit (IPMI stack) 115 that is electrically connected to the processing unit 112 and is part of the software stack of the operating system of the processing unit 112; and an electrically erasable programmable read-only memory (EEPROM) 116 electrically connected to the processing unit 112. In this example, the random access memory 111 is a dynamic random access memory (DRAM).

Referring to FIGS. 2A and 2B, a first embodiment of the method of the present invention for determining whether a server reset event is caused by a power interruption is performed by the server 1 and includes the following steps.

Referring to FIG. 2A, first, in step (A), the processing unit 112 determines, according to a received event signal, whether to initialize the random access memory 111. The event signal is one of: an input power signal supplying the power required by the server 1, a processing exception signal from the monitoring unit 114 indicating that the processing unit 112 is operating abnormally, a reset request signal from the IPMI 12, and a protocol exception signal from the protocol layer unit 115. Step (A) includes the following sub-steps.

Step (A1): the processing unit 112 receives one of the input power signal, the processing exception signal, and the reset request signal, and then proceeds to step (A2). In detail, step (A1) includes the following sub-steps.

Step (A11): the processing unit 112 receives the processing exception signal and proceeds to step (A2).

Step (A12): the processing unit 112 receives the reset request signal.

Step (A13): the processing unit 112 writes a reset request flag value to the EEPROM 116 and proceeds to step (A2).

Step (A14): the processing unit 112 receives the input power signal and proceeds to step (A2).

Step (A2): the processing unit 112 restarts, initializes the random access memory 111, and proceeds to step (B).

Step (A3): the processing unit 112 receives the protocol exception signal, determines not to restart and not to initialize the random access memory 111, and proceeds directly to step (B).

It is added here that the event signals (the input power signal, the processing exception signal, the reset request signal, and the protocol exception signal) correspond respectively to the following trigger events: (1) the AC power supplying the server 1 is interrupted and then restored, so that the processing unit 112 is reset; (2) the processing unit 112 itself operates abnormally (hangs) and is reset by the monitoring unit 114; (3) a reset command from the IPMI client interface 21 of the remote server 2, or from the south bridge 13, causes the processing unit 112 to perform a reset request operation such as a cold reset, a warm reset, or a firmware-update reset; (4) a protocol exception occurs in the protocol layer unit 115, causing the protocol layer unit 115 to be reset. Depending on the trigger event, the processing unit 112 therefore operates in one of two modes according to the event signal it receives: proceeding to step (A2) and then to step (B), or proceeding directly to step (B).
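The two operating modes can be illustrated with a minimal Python sketch. It is not part of the patent text; the names `EventSignal` and `restarts_and_initializes_ram` are hypothetical and were chosen only to mirror steps (A1) to (A3).

```python
from enum import Enum, auto


class EventSignal(Enum):
    """The four event signals named in step (A); member names are illustrative."""
    INPUT_POWER = auto()           # AC power interrupted and then restored
    PROCESSING_EXCEPTION = auto()  # monitoring unit 114 reports a hung processing unit
    RESET_REQUEST = auto()         # cold, warm, or firmware-update reset via IPMI 12
    PROTOCOL_EXCEPTION = auto()    # fault in the protocol layer unit 115


def restarts_and_initializes_ram(signal: EventSignal) -> bool:
    """Return True for the (A1) -> (A2) path and False for the (A3) path."""
    # Only a protocol exception leaves the processing unit running with the
    # RAM contents intact; every other signal leads to a restart.
    return signal is not EventSignal.PROTOCOL_EXCEPTION
```

Under this model, the RAM contents survive only a protocol exception, which is the property the later steps appear to rely on.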

Referring to FIG. 2B, the processing unit 112 must next distinguish which of the event signals described above it received. In this example, step (D) is therefore performed before step (B): the processing unit 112 determines, according to the reset request flag value in the EEPROM 116, whether a reset request operation occurred. Step (D) includes the following sub-steps.

Step (D1): the processing unit 112 compares whether the reset request flag value matches a set value. If not, the processing unit 112 determines that the event is not a reset request operation and performs step (B).

Step (D2): if the result of the comparison in step (D1) is affirmative, the processing unit 112 determines that the event is a reset request operation.

Step (D3): the processing unit 112 copies the reset request flag value to the random access memory 111.

Step (D4): the processing unit 112 issues a clear command to clear the reset request flag value in the EEPROM 116.

It is noted here that, owing to the electrical characteristics of the EEPROM 116, the reset request flag value stored in the EEPROM 116 does not disappear when the processing unit 112 receives the reset request signal, writes the reset request flag value, and restarts. Therefore, when the reset request flag value matches the set value, the event can be determined to be a reset request operation, and the flag must then be cleared so that it can be written again the next time the processing unit 112 receives the reset request signal.
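As a rough illustration of steps (D1) to (D4), the sketch below models the EEPROM and the RAM as plain Python dictionaries. The constants `SET_VALUE` and `CLEARED`, the key `reset_request_flag`, and the function name `step_d` are assumptions made for the example; the patent does not specify concrete values or an API.

```python
SET_VALUE = 0xA5  # hypothetical "reset was requested" marker
CLEARED = 0x00    # hypothetical cleared state


def step_d(eeprom: dict, ram: dict) -> bool:
    """Return True if the last restart is attributed to a reset request operation."""
    flag = eeprom.get("reset_request_flag", CLEARED)
    if flag != SET_VALUE:                   # (D1) no match: continue with step (B)
        return False
    # (D2) the restart is attributed to a reset request operation
    ram["reset_request_flag"] = flag        # (D3) copy the flag value to RAM
    eeprom["reset_request_flag"] = CLEARED  # (D4) clear so it can be written again
    return True
```

Calling `step_d` twice on the same EEPROM contents returns `True` and then `False`, which reflects why the flag has to be cleared after it has been read.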

Next, in step (B), the processing unit 112 determines, according to a parameter value of the power reset flag, whether a protocol exception has occurred. Step (B) includes the following sub-steps.

Step (B1): the processing unit 112 compares whether the parameter value of the power reset flag matches a preset value.

Step (B2): if the result of the comparison in step (B1) is affirmative, the processing unit 112 determines that a protocol exception occurred in the protocol layer unit 115.

Step (B3): if the result of the comparison in step (B1) is negative, the processing unit 112 writes the preset value into the power reset flag in the random access memory 111 to update the parameter value, and performs step (C).

It is noted here that, owing to the electrical characteristics of the random access memory 111, the parameter value of the power reset flag stored in the random access memory 111 disappears when the processing unit 112 restarts and initializes the random access memory 111. However, because the processing unit 112 does not initialize the random access memory 111 when it receives the protocol exception signal, a parameter value of the power reset flag that matches the preset value indicates a protocol exception.
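The RAM-based check of steps (B1) to (B3) can be sketched as follows. This is a simplified model under stated assumptions: the RAM is a dictionary that is emptied on initialization, and `PRESET_VALUE`, the key `power_reset_flag`, and the function name `step_b` are hypothetical.

```python
PRESET_VALUE = 0x5A  # hypothetical preset value for the power reset flag


def step_b(ram: dict) -> bool:
    """Return True if a protocol exception is detected."""
    if ram.get("power_reset_flag") == PRESET_VALUE:  # (B1) match -> (B2)
        return True
    ram["power_reset_flag"] = PRESET_VALUE           # (B3) arm the flag, go to (C)
    return False
```

In this model, the first call after the RAM has been initialized returns `False` and arms the flag. A later call with the RAM left untouched, as after a protocol exception, returns `True`.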

Next, in step (C), the processing unit 112 determines whether a self-operation exception has occurred. Step (C) includes the following sub-steps.

Step (C1): the processing unit 112 determines whether a self-operation exception occurred according to the logic value of a timeout flag of the monitoring unit 114, which indicates abnormal operation of the processing unit. When the processing unit 112 determines that the timeout flag is at a second logic value, it determines that the event is not a self-operation exception, determines that it is a power reset event, and proceeds to step (E).

Step (C2): if the processing unit 112 determines that the timeout flag is at a first logic value, the processing unit 112 determines that the event is a self-operation exception.
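A toy model of the monitoring unit may help make step (C) concrete. The `Watchdog` class and its `kick` and `tick` methods are hypothetical. The sketch also assumes that the first logic value is `True` and the second is `False`, and that the flag remains readable after the restart triggered by the monitoring unit; the patent states neither.

```python
class Watchdog:
    """Toy model of monitoring unit 114; the interface is hypothetical."""

    def __init__(self, timeout_ticks: int) -> None:
        self.timeout_ticks = timeout_ticks
        self.elapsed = 0
        self.timeout_flag = False  # False models the "second logic value"

    def kick(self) -> None:
        """Called periodically by a healthy processing unit."""
        self.elapsed = 0

    def tick(self) -> None:
        """Advance time and latch the timeout flag if the unit stopped kicking."""
        self.elapsed += 1
        if self.elapsed >= self.timeout_ticks:
            self.timeout_flag = True  # True models the "first logic value"


def step_c(watchdog: Watchdog) -> bool:
    """Return True for a self-operation exception (C2), False otherwise (C1)."""
    return watchdog.timeout_flag
```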

Next, in step (E), the processing unit 112 records the power reset event in the event recording unit 113.

As the above description shows, in this example the processing unit 112 rules out, in order, a reset request operation in step (D), a protocol exception in step (B), and finally a self-operation exception in step (C). Only then does it determine that the server 1 has undergone a power reset and record the power reset event in the event recording unit 113.
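The elimination order of the first embodiment, (D), then (B), then (C), then (E), can be summarized in one self-contained sketch. The dictionaries standing in for the EEPROM and RAM, the list standing in for the event log, the constant values, and the function name `classify_restart` are all assumptions made for illustration, not the patent's implementation.

```python
SET_VALUE = 0xA5     # hypothetical reset request marker (EEPROM)
PRESET_VALUE = 0x5A  # hypothetical power reset flag preset (RAM)
CLEARED = 0x00       # hypothetical cleared state


def classify_restart(eeprom: dict, ram: dict, timeout_flag: bool,
                     event_log: list) -> str:
    """Classify the cause of the latest event by elimination."""
    # Step (D): a surviving EEPROM flag means a reset was requested.
    if eeprom.get("reset_request_flag", CLEARED) == SET_VALUE:
        ram["reset_request_flag"] = SET_VALUE   # (D3)
        eeprom["reset_request_flag"] = CLEARED  # (D4)
        return "reset request operation"
    # Step (B): a surviving RAM flag means the RAM was never initialized.
    if ram.get("power_reset_flag") == PRESET_VALUE:
        return "protocol exception"
    ram["power_reset_flag"] = PRESET_VALUE      # (B3)
    # Step (C): a latched watchdog timeout means the processing unit hung.
    if timeout_flag:
        return "self-operation exception"
    # Step (E): nothing else explains the restart, so log a power reset.
    event_log.append("power reset event")
    return "power reset event"
```

Only the final branch writes to the event log, matching the description that the power reset event is recorded after the other three causes have been excluded.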

Referring to FIG. 3, a second embodiment of the method of the present invention for determining whether a server reset event is caused by a power interruption is similar to the first embodiment, except that it further includes a step (D5) performed after step (C1).

In step (D5), if the processing unit 112 determined in step (C1), because the timeout flag is at the second logic value, that the event is not a self-operation exception, the processing unit 112 compares whether the reset request flag value in the random access memory 111 matches the set value. If so, step (D2) is repeated; if not, step (E) is performed.

This prevents the case in which a power reset event that occurs after the reset request operation of step (D2) goes undetected by the processing unit 112 because the reset request flag value stored in the EEPROM 116 is not erased by the power reset. Because the processing unit 112 copies the reset request flag value to the random access memory 111 in step (D3), it can determine after step (C1) whether the reset request flag value stored in the random access memory 111 has been lost owing to a power reset, and thereby confirm whether a power reset event occurred after step (D2).
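Step (D5) alone can be sketched as below, read literally from the description. The RAM is again a dictionary that loses its contents on a power reset, and `SET_VALUE`, the key `reset_request_flag`, and the function name `step_d5` are hypothetical.

```python
SET_VALUE = 0xA5  # hypothetical reset request marker copied to RAM in step (D3)


def step_d5(ram: dict, event_log: list) -> str:
    """Run after step (C1) has ruled out a self-operation exception."""
    if ram.get("reset_request_flag") == SET_VALUE:
        # The RAM copy survived, so treat the event as in step (D2).
        return "reset request operation"
    # The RAM copy is gone, which is consistent with a loss of power: step (E).
    event_log.append("power reset event")
    return "power reset event"
```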

Referring to FIG. 4, a third embodiment of the method of the present invention for determining whether a server reset event is caused by a power interruption is a variation of the first and second embodiments. It differs in that step (D) is performed after step (C), and step (D) includes a step (D1'), a step (D2'), and a step (D4').

Step (D1'): the processing unit 112 compares whether the reset request flag value matches a set value. If not, the processing unit 112 determines that the event is a power reset event and performs step (E).

Step (D2'): if the result of the comparison in step (D1') is affirmative, the processing unit 112 determines that the event is a reset request operation.

Step (D4'): the processing unit 112 issues a clear command to clear the reset request flag value in the EEPROM 116.

As the above description shows, in the third embodiment the processing unit 112 rules out, in order, a protocol exception in step (B), a self-operation exception in step (C), and finally a reset request operation in step (D). Only then does it determine that the server 1 has undergone a power reset and record the power reset event in the event recording unit 113.
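For comparison with the first embodiment, the reordered checks of the third embodiment, (B), then (C), then (D'), then (E), might look like the sketch below. As before, the dictionaries, the constant values, and the function name `classify_restart_v3` are illustrative assumptions only.

```python
SET_VALUE = 0xA5     # hypothetical reset request marker (EEPROM)
PRESET_VALUE = 0x5A  # hypothetical power reset flag preset (RAM)
CLEARED = 0x00       # hypothetical cleared state


def classify_restart_v3(eeprom: dict, ram: dict, timeout_flag: bool,
                        event_log: list) -> str:
    """Third-embodiment ordering: the EEPROM check comes last."""
    # Step (B): a surviving RAM flag indicates a protocol exception.
    if ram.get("power_reset_flag") == PRESET_VALUE:
        return "protocol exception"
    ram["power_reset_flag"] = PRESET_VALUE
    # Step (C): a latched watchdog timeout indicates a self-operation exception.
    if timeout_flag:
        return "self-operation exception"
    # Steps (D1') and (D2'): a set EEPROM flag indicates a requested reset.
    if eeprom.get("reset_request_flag", CLEARED) == SET_VALUE:
        eeprom["reset_request_flag"] = CLEARED  # (D4')
        return "reset request operation"
    # Step (E): record the power reset event.
    event_log.append("power reset event")
    return "power reset event"
```

Note that this ordering has no counterpart to step (D3): the flag value is not copied to RAM before the EEPROM is cleared.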

In summary, the method of the present invention for determining whether a server reset event is caused by a power interruption has the following advantages.

First, no high-end power supply is required. The determination is performed solely by the processing unit 112 of the baseboard management controller 11, which can thereby record whether a power reset event has occurred in the server 1.

Second, during detection the processing unit 112 can rule out each of the reset request operation, the self-operation exception, and the protocol exception, and thus correctly determine that the server 1 has undergone a power reset. The object of the present invention is therefore achieved.

The foregoing describes only preferred embodiments of the present invention and should not be taken to limit the scope of its implementation; simple equivalent changes and modifications made in accordance with the claims and the specification of the present invention all remain within the scope covered by this patent.

Claims (10)

1. A method for determining whether a server reset event is caused by a power interruption, performed by a server, the server including a baseboard management controller, the baseboard management controller including a random access memory and a processing unit electrically connected to the random access memory, the random access memory storing a power reset flag, the power reset detection method comprising the following steps:
(A) the processing unit determines, according to a received event signal, whether to initialize the random access memory;
(B) the processing unit determines, according to a parameter value of the power reset flag, whether a protocol exception has occurred;
(C) if the result of step (B) is negative, the processing unit determines whether a self-operation exception has occurred; and
(E) if the result of step (C) is negative, the processing unit determines that a power reset event of the server has occurred, the power reset event relating to a server reset caused by an interruption of the power supplied to the server.

2. The method according to claim 1, wherein the baseboard management controller further includes a monitoring unit that is electrically connected to the processing unit and monitors its operation, and a protocol layer unit electrically connected to the processing unit; the event signal is one of an input power signal, a processing exception signal from the monitoring unit indicating abnormal operation of the processing unit, a reset request signal, and a protocol exception signal from the protocol layer unit; and step (A) includes:
(A1) the processing unit receives one of the input power signal, the processing exception signal, and the reset request signal, and proceeds to step (A2);
(A2) the processing unit restarts, initializes the random access memory, and proceeds to step (B);
(A3) the processing unit receives the protocol exception signal, determines not to restart and not to initialize the random access memory, and proceeds directly to step (B).

3. The method according to claim 2, wherein step (B) includes:
(B1) the processing unit compares whether the parameter value of the power reset flag matches a preset value;
(B2) if the result of the comparison in step (B1) is affirmative, the processing unit determines that a protocol exception occurred in the protocol layer unit;
(B3) if the result of the comparison in step (B1) is negative, the processing unit writes the preset value into the power reset flag in the random access memory to update the parameter value.

4. The method according to claim 2, wherein step (C) includes:
(C1) the processing unit determines whether a self-operation exception occurred according to the logic value of a timeout flag of the monitoring unit that indicates abnormal operation of the processing unit, and proceeds to step (E) if it determines that the timeout flag is at a second logic value;
(C2) if the processing unit determines that the timeout flag is at a first logic value, the processing unit determines that a self-operation exception occurred.

5. The method according to claim 2, wherein the baseboard management controller further includes an electrically erasable programmable read-only memory (EEPROM) electrically connected to the processing unit, and step (A1) includes:
(A12) the processing unit receives the reset request signal;
(A13) the processing unit writes a reset request flag value to the EEPROM and proceeds to step (A2).

6. The method according to claim 5, further comprising steps (D1) to (D4) performed before step (B):
(D1) the processing unit compares whether the reset request flag value matches a set value and, if not, performs step (B);
(D2) if the result of the comparison in step (D1) is affirmative, the processing unit determines that a reset request operation occurred;
(D3) the processing unit copies the reset request flag value to the random access memory;
(D4) the processing unit issues a clear command to clear the reset request flag value in the EEPROM.

7. The method according to claim 6, further comprising a step (D5) performed after step (C):
(D5) if the processing unit determined in step (C) that no self-operation exception occurred, the processing unit compares whether the reset request flag value in the random access memory matches the set value; if so, step (D2) is repeated; if not, step (E) is performed.

8. The method according to claim 5, further comprising steps (D1'), (D2'), and (D4') performed after step (C):
(D1') the processing unit compares whether the reset request flag value matches a set value and, if not, determines that the power reset event occurred and performs step (E);
(D2') if the result of the comparison in step (D1') is affirmative, the processing unit determines that a reset request operation occurred;
(D4') the processing unit issues a clear command to clear the reset request flag value in the EEPROM.

9. A baseboard management controller, comprising:
a random access memory storing a power reset flag; and
a processing unit electrically connected to the random access memory, which determines, according to a received event signal, whether to initialize the random access memory;
wherein the processing unit further determines, according to a parameter value of the power reset flag, whether a protocol exception has occurred; if no protocol exception has occurred, the processing unit determines whether a self-operation exception has occurred; and if no self-operation exception has occurred, the processing unit determines that a power reset event of the server has occurred, the power reset event relating to an interruption of the power supplied to the server.

10. A server, comprising:
an Intelligent Platform Management Interface that determines, according to a reset command, whether to output a reset request signal; and
a baseboard management controller including
a random access memory storing a power reset flag,
a processing unit electrically connected to the random access memory, which determines, according to a received event signal, whether to initialize the random access memory, the event signal including the reset request signal from the Intelligent Platform Management Interface, and which further determines, according to a parameter value of the power reset flag, whether a protocol exception has occurred, and
an electrically erasable programmable read-only memory (EEPROM) electrically connected to the processing unit and storing a reset request flag value,
wherein the processing unit compares whether the reset request flag value matches a set value; if the reset request flag value does not equal the set value, the processing unit determines, according to the parameter value of the power reset flag, whether the protocol exception has occurred; if no protocol exception has occurred, the processing unit determines whether a self-operation exception has occurred; and if no self-operation exception has occurred, the processing unit determines that a power reset event of the server has occurred, the power reset event relating to an interruption of the power supplied to the server.
TW104124763A 2015-07-30 2015-07-30 Server and method for detecting power reset TWI598728B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW104124763A TWI598728B (en) 2015-07-30 2015-07-30 Server and method for detecting power reset

Publications (2)

Publication Number Publication Date
TW201704929A TW201704929A (en) 2017-02-01
TWI598728B true TWI598728B (en) 2017-09-11

Family

ID=58608979

Family Applications (1)

Application Number Title Priority Date Filing Date
TW104124763A TWI598728B (en) 2015-07-30 2015-07-30 Server and method for detecting power reset

Country Status (1)

Country Link
TW (1) TWI598728B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI704560B (en) * 2017-04-28 2020-09-11 慧榮科技股份有限公司 Storage device, access system and access method
CN109814697B (en) * 2017-11-21 2023-02-10 佛山市顺德区顺达电脑厂有限公司 Power supply method for computer system
TWI659314B (en) 2017-12-01 2019-05-11 神雲科技股份有限公司 Method for remotely resetting baseboard management controller and computer system thereof
TWI665561B (en) * 2018-03-30 2019-07-11 神雲科技股份有限公司 Server and remote control method thereof
CN110413320B (en) * 2018-04-25 2022-08-26 环达电脑(上海)有限公司 Server device and method for changing firmware setting in real time
TWI697768B (en) * 2019-03-07 2020-07-01 神雲科技股份有限公司 Reset bmc control method
CN111913551B (en) * 2019-05-08 2024-04-19 佛山市顺德区顺达电脑厂有限公司 Control method for resetting baseboard management controller
TWI739603B (en) * 2020-09-18 2021-09-11 英業達股份有限公司 Monitoring and problem analysis system during server test and method thereof
US11360839B1 (en) * 2021-02-26 2022-06-14 Quanta Computer Inc. Systems and methods for storing error data from a crash dump in a computer system

Also Published As

Publication number Publication date
TW201704929A (en) 2017-02-01

Similar Documents

Publication Publication Date Title
TWI598728B (en) Server and method for detecting power reset
US20170220419A1 (en) Method of detecting power reset of a server, a baseboard management controller, and a server
TWI470420B (en) Dubugging method and computer system using the smae
TWI605459B (en) Dynamic application of ecc based on error type
TWI709986B (en) Thermal monitoring of memory resources
TWI584196B (en) Bios recovery management system, computer program product and method for bios restoration
TWI339337B (en) Method for estimating and reporting the life expectancy of flash-disk memory
US9606889B1 (en) Systems and methods for detecting memory faults in real-time via SMI tests
CN105122262B (en) Redundant system guidance code in auxiliary non-volatile memories
US8341337B1 (en) Data storage device booting from system data loaded by host
US7945815B2 (en) System and method for managing memory errors in an information handling system
US9846616B2 (en) Boot recovery system
US20150169021A1 (en) Dynamic self-correcting power management for solid state drive
BR112015025614B1 (en) COMPUTER READable STORAGE MEDIA, COMPUTER IMPLEMENTED SYSTEM AND METHOD
KR102179829B1 (en) Storage system managing run-time bad cells
TW202030602A (en) The method and system of bios recovery and update
TWI512490B (en) System for retrieving console messages and method thereof and non-transitory computer-readable medium
US20200033928A1 (en) Method of periodically recording for events
TW201508472A (en) System and method of performing firmware update test
TW201506613A (en) System and method of detecting firmware
CN117707884A (en) Method, system, equipment and medium for monitoring power management chip
US8516310B2 (en) Information processing device equipped with write-back cache and diagnosis method for main memory of the same
CN106484599B (en) Judge whether server resetting event is method caused by power breakdown
TW201428470A (en) Automatic booting system and method
US9520162B2 (en) DIMM device controller supervisor

Legal Events

Date Code Title Description
MM4A Annulment or lapse of patent due to non-payment of fees