CN113704023A - Firmware self-recovery device and method and server system - Google Patents

Firmware self-recovery device and method and server system

Info

Publication number
CN113704023A
Authority
CN
China
Prior art keywords
rom
data
data information
firmware
roms
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110808136.5A
Other languages
Chinese (zh)
Other versions
CN113704023B (en)
Inventor
李倩倩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN202110808136.5A
Publication of CN113704023A
Application granted
Publication of CN113704023B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1471 Saving, restoring, recovering or retrying involving logging of persistent data for recovery
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Stored Programmes (AREA)
  • Techniques For Improving Reliability Of Storages (AREA)

Abstract

The invention discloses a firmware self-recovery device, a firmware self-recovery method and a server system. The data information of any one of three ROMs can be calculated from the data information of the other two ROMs and a preset verification algorithm, so the data information of any two ROMs can be regarded as complete data information. When the data information of any one of the three ROMs is damaged, data integrity is therefore unaffected, which improves data security and fault tolerance and effectively improves the stability and reliability of the server system. In addition, because the three ROMs jointly store the complete data information, the read-write speed of the data information can be improved.

Description

Firmware self-recovery device and method and server system
Technical Field
The present invention relates to the field of server systems, and in particular, to a firmware self-recovery apparatus and method, and a server system.
Background
At present, the firmware information of a mainstream server system is generally stored in a single ROM (Read-Only Memory). When the firmware information in that ROM is damaged, the server system may be unable to boot, which lowers the stability and reliability of the server system.
In the prior art, to improve stability and reliability, some server systems adopt a dual-ROM scheme: when the information in one ROM is damaged, the server system automatically switches to the other ROM. However, even the dual-ROM design has poor fault tolerance, which hinders any effective improvement in the stability and reliability of the server system.
Therefore, how to provide a solution to the above technical problem is a problem that needs to be solved by those skilled in the art.
Disclosure of Invention
The invention aims to provide a firmware self-recovery device, a firmware self-recovery method and a server system in which the data information of any one of three ROMs can be calculated from the data information of the other two ROMs and a preset verification algorithm, so that the data information of any two ROMs can be regarded as complete data information. When the data information of any one of the three ROMs is damaged, data integrity is therefore unaffected, which improves data security and fault tolerance and effectively improves the stability and reliability of the server system. In addition, because the three ROMs jointly store the complete data information, the read-write speed of the data information can be improved.
To solve the above technical problem, the present invention provides a firmware self-recovery apparatus, including:
a first ROM for storing first firmware data;
a second ROM for storing second firmware data, wherein the first firmware data and the second firmware data combine to form all valid firmware data of the server system;
a third ROM for storing verification information calculated from the first firmware data and the second firmware data by a preset verification algorithm;
and a controller for respectively acquiring the data information of the first ROM, the second ROM and the third ROM and, if the data information of one ROM is damaged, recovering the damaged data information based on the data information of the other two ROMs and the preset verification algorithm, so that a computing board of the server system can load and use it at startup.
Preferably, the controller is further configured to:
write the recovered data information into the ROM whose data information is damaged, so as to restore the original data information in that ROM.
Preferably, there are a plurality of first ROMs, second ROMs and third ROMs, equal in number; one first ROM, one second ROM and one third ROM constitute a ROM group;
the firmware self-recovery device further comprises:
a plurality of management boards, each provided with one of the ROM groups;
an adapter board provided with the controller and connected to the plurality of management boards;
the controller is specifically configured to determine a main management board from the in-place management boards and acquire the data information of the three ROMs on the main management board; if the data information of one ROM is damaged, recover the damaged data information based on the data information of the other two ROMs and the preset verification algorithm; and if the data information of two or all of the ROMs is damaged, re-determine the main management board from the remaining in-place management boards and repeat the step of acquiring the data information of the three ROMs on the main management board until the complete data information of the three ROMs is acquired, so that the computing board of the server system can load and use it at startup.
Preferably, the controller is further configured to:
erase the data information of the ROM group on the management board whose data information is damaged, and then write the data information of the ROM group on a management board whose data information is not damaged into the corresponding ROMs of the erased ROM group.
Preferably, there are two each of the first ROM, the second ROM and the third ROM; the two management boards comprise a master management board and a slave management board;
the controller is specifically configured to detect whether the two management boards are in place; if both management boards are in place, acquire the data information of the three ROMs on the master management board and, if the data information of one ROM is damaged, recover the damaged data information based on the data information of the other two ROMs and the preset verification algorithm; if the data information of two or all of the ROMs is damaged, switch to acquiring the data information of the three ROMs on the slave management board; and if only one management board is in place, acquire the data information of the three ROMs directly from that management board, so that the computing board of the server system can load and use it at startup.
Preferably, the first firmware data, the second firmware data and the verification information are binary data;
the preset verification algorithm comprises:
when the nth bit of the first firmware data is 0 and the nth bit of the second firmware data is 0, the nth bit of the verification information is 0, wherein n is a positive integer;
when the nth bit of the first firmware data is 0 and the nth bit of the second firmware data is 1, the nth bit of the verification information is 1;
when the nth bit of the first firmware data is 1 and the nth bit of the second firmware data is 0, the nth bit of the verification information is 1;
and when the nth bit of the first firmware data is 1 and the nth bit of the second firmware data is 1, the nth bit of the verification information is 0.
Preferably, the controller is further configured to:
accumulate the number of times the data information of each ROM has been damaged, and determine whether the number of times the data information of a target ROM has been damaged is greater than a preset count threshold; if so, issue a replacement reminder for the target ROM; wherein the target ROM is any one of the ROMs.
Preferably, there are a plurality of computing boards;
the firmware self-recovery device further comprises:
a switch circuit connected to the controller and to each of the plurality of computing boards;
the controller is further configured to control the switch circuit, according to the boot firmware loading requirements of the plurality of computing boards, to turn on the communication link between the switch circuit and a target computing board that has a boot firmware loading requirement, so as to transmit the complete data information of the three ROMs to the target computing board for loading and use at startup.
In order to solve the above technical problem, the present invention further provides a firmware self-recovery method, which is applied to any one of the above firmware self-recovery apparatuses, and includes:
respectively acquiring data information of the first ROM, the second ROM and the third ROM;
and if the data information of one ROM is damaged, recovering the damaged data information based on the data information of the other two ROMs and the preset verification algorithm, so that the computing board of the server system can load and use it at startup.
In order to solve the above technical problem, the present invention further provides a server system, including any one of the above firmware self-recovery devices.
The invention provides a firmware self-recovery device comprising a first ROM, a second ROM, a third ROM and a controller, wherein the first ROM is connected with the second ROM. The controller respectively acquires the data information of the first ROM, the second ROM and the third ROM and, if the data information of one ROM is damaged, recovers the damaged data information based on the data information of the other two ROMs and a preset verification algorithm, so that the computing board of the server system can load and use it at startup. The data information of any one ROM can thus be calculated from the data information of the other two ROMs and the preset verification algorithm, so the data information of any two ROMs can be regarded as complete data information. When the data information of any one of the three ROMs is damaged, data integrity is therefore unaffected, which improves data security and fault tolerance and effectively improves the stability and reliability of the server system. In addition, because the three ROMs jointly store the complete data information, the read-write speed of the data information can be improved.
The invention also provides a firmware self-recovery method and a server system, which have the same beneficial effects as the firmware self-recovery device.
Drawings
In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed for the prior art and the embodiments are briefly described below. Obviously, the drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic structural diagram of a firmware self-recovery apparatus according to an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of a firmware self-recovery apparatus according to an embodiment of the present invention;
Fig. 3 is a flowchart of a firmware self-recovery method according to an embodiment of the present invention.
Detailed Description
The core of the invention is to provide a firmware self-recovery device, a firmware self-recovery method and a server system in which the data information of any one of three ROMs can be calculated from the data information of the other two ROMs and a preset verification algorithm, so that the data information of any two ROMs can be regarded as complete data information. When the data information of any one of the three ROMs is damaged, data integrity is therefore unaffected, which improves data security and fault tolerance and effectively improves the stability and reliability of the server system. In addition, because the three ROMs jointly store the complete data information, the read-write speed of the data information can be improved.
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, fig. 1 is a schematic structural diagram of a firmware self-recovery apparatus according to an embodiment of the present invention.
The firmware self-recovery device comprises:
a first ROM 1 for storing first firmware data;
a second ROM 2 for storing second firmware data, wherein the first firmware data and the second firmware data combine to form all valid firmware data of the server system;
a third ROM 3 for storing verification information calculated from the first firmware data and the second firmware data by a preset verification algorithm;
and a controller 4 for respectively acquiring the data information of the first ROM 1, the second ROM 2 and the third ROM 3 and, if the data information of one of the ROMs is damaged, recovering the damaged data information based on the data information of the other two ROMs and the preset verification algorithm, so that a computing board of the server system can load and use it at startup.
Specifically, the firmware self-recovery device of the present application includes a first ROM 1, a second ROM 2, a third ROM 3 and a controller 4, and the working principle thereof is as follows:
In the present application, all valid firmware data of the server system is divided into first firmware data and second firmware data. The first firmware data is stored in the first ROM 1 and the second firmware data is stored in the second ROM 2, so that combining the firmware data read from the first ROM 1 and the second ROM 2 yields all valid firmware data of the server system.
In the present application, verification information is calculated from the first firmware data and the second firmware data by a preset verification algorithm and stored in the third ROM 3. It can be understood that, if the data information of the first ROM 1 is damaged while that of the second ROM 2 and the third ROM 3 is intact, the data information of the first ROM 1 can be calculated from the data information of the second ROM 2 and the third ROM 3 and the preset verification algorithm. Similarly, if the data information of the second ROM 2 is damaged while that of the first ROM 1 and the third ROM 3 is intact, the data information of the second ROM 2 can be calculated from the data information of the first ROM 1 and the third ROM 3 and the preset verification algorithm. Likewise, if the data information of the third ROM 3 is damaged while that of the first ROM 1 and the second ROM 2 is intact, the data information of the third ROM 3 can be calculated from the data information of the first ROM 1 and the second ROM 2 and the preset verification algorithm. In each case the complete data information of the three ROMs is obtained for the computing board of the server system to load and use at startup.
Based on this, the controller 4 is connected to the first ROM 1, the second ROM 2 and the third ROM 3 and acquires the data information of each. If none of the three ROMs is damaged, the complete data information of the three ROMs is obtained directly. If the data information of one ROM is damaged, the damaged data information is recovered based on the data information of the other two ROMs and the preset verification algorithm, so that the complete data information of the three ROMs is obtained for the computing board of the server system to load and use at startup.
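The controller flow described above can be sketched in Python as follows. This is a minimal sketch under several assumptions that the source does not state: the firmware image is cut into two equal halves, a damaged ROM is represented as `None` (the source does not say how damage is detected), and the preset verification algorithm is taken to be the bitwise rule given in the later binary-data embodiment, which is an exclusive OR. The names `xor_bytes`, `split_firmware` and `load_firmware` are hypothetical.

```python
from typing import Optional, Tuple


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def split_firmware(image: bytes) -> Tuple[bytes, bytes, bytes]:
    """Split an image into first data, second data and verification info.

    Assumption: the image has even length and is cut in the middle.
    """
    if len(image) % 2:
        raise ValueError("sketch assumes an even-length image")
    half = len(image) // 2
    first, second = image[:half], image[half:]
    return first, second, xor_bytes(first, second)


def load_firmware(rom1: Optional[bytes],
                  rom2: Optional[bytes],
                  rom3: Optional[bytes]) -> bytes:
    """Return the full firmware image, recovering one damaged ROM if needed.

    A damaged ROM is passed as None (hypothetical convention).
    """
    if [rom1, rom2, rom3].count(None) > 1:
        raise RuntimeError("two or more ROMs damaged; cannot recover")
    if rom1 is None:
        rom1 = xor_bytes(rom2, rom3)   # rebuild first data from ROM 2 and ROM 3
    elif rom2 is None:
        rom2 = xor_bytes(rom1, rom3)   # rebuild second data from ROM 1 and ROM 3
    # If only rom3 is damaged, the firmware data itself is still intact.
    return rom1 + rom2
```

For example, after `r1, r2, r3 = split_firmware(image)`, the call `load_firmware(None, r2, r3)` returns the original `image` even though the first ROM's content is missing.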
The invention provides a firmware self-recovery device comprising a first ROM, a second ROM, a third ROM and a controller, wherein the first ROM is connected with the second ROM. The controller respectively acquires the data information of the first ROM, the second ROM and the third ROM and, if the data information of one ROM is damaged, recovers the damaged data information based on the data information of the other two ROMs and a preset verification algorithm, so that the computing board of the server system can load and use it at startup. The data information of any one ROM can thus be calculated from the data information of the other two ROMs and the preset verification algorithm, so the data information of any two ROMs can be regarded as complete data information. When the data information of any one of the three ROMs is damaged, data integrity is therefore unaffected, which improves data security and fault tolerance and effectively improves the stability and reliability of the server system. In addition, because the three ROMs jointly store the complete data information, the read-write speed of the data information can be improved.
On the basis of the above-described embodiment:
as an alternative embodiment, the controller 4 is further configured to:
write the recovered data information into the ROM whose data information is damaged, so as to restore the original data information in that ROM.
Further, the controller 4 can write the recovered data information into the ROM whose data information is damaged, thereby restoring the original data information in that ROM. The stored content of the damaged ROM is thus recovered automatically, without manual re-burning, which makes the device easy to operate and to maintain afterwards.
Referring to fig. 2, fig. 2 is a schematic structural diagram of a firmware self-recovery apparatus according to an embodiment of the present invention.
As an alternative embodiment, there are a plurality of first ROMs 1, second ROMs 2 and third ROMs 3, equal in number; one first ROM 1, one second ROM 2 and one third ROM 3 constitute a ROM group;
the firmware self-recovery device further comprises:
a plurality of management boards, each provided with one of the ROM groups;
an adapter board 5 provided with the controller 4 and connected to the plurality of management boards;
the controller 4 is specifically configured to determine a main management board from the in-place management boards and acquire the data information of the three ROMs on the main management board; if the data information of one ROM is damaged, recover the damaged data information based on the data information of the other two ROMs and the preset verification algorithm; and if the data information of two or all of the ROMs is damaged, re-determine the main management board from the remaining in-place management boards and repeat the step of acquiring the data information of the three ROMs on the main management board until the complete data information of the three ROMs is acquired, so that the computing board of the server system can load and use it at startup.
Further, there are a plurality of first ROMs 1, second ROMs 2 and third ROMs 3, equal in number; one first ROM 1, one second ROM 2 and one third ROM 3 constitute one ROM group, so that a plurality of ROM groups is obtained.
Based on this, the firmware self-recovery apparatus of the present application further includes a plurality of management boards (CMC, Chassis Management Controller) and an adapter board 5, and the working principle thereof is as follows:
Each management board is provided with one ROM group, and the plurality of management boards are connected to the adapter board 5, which enables communication between the controller 4 on the adapter board 5 and the ROM group on each management board. Specifically, a plurality of first communication interfaces are arranged on the adapter board 5, and each management board is provided with a second communication interface that plugs into one of the first communication interfaces, so that the controller 4 on the adapter board 5 communicates with the ROM group on each management board through these interfaces.
The controller 4 determines a main management board from the in-place management boards and acquires the data information of the three ROMs on it. If none of the three ROMs is damaged, the complete data information of the three ROMs is obtained directly. If the data information of one ROM is damaged, the damaged data information is recovered based on the data information of the other two ROMs and the preset verification algorithm, yielding the complete data information of the three ROMs. If the data information of two or all of the ROMs is damaged, the complete data information can no longer be recovered from this board, so the controller 4 re-determines the main management board from the remaining in-place management boards, acquires the data information of the three ROMs on the newly determined main management board, and handles it in the same way. This repeats until the complete data information of the three ROMs is obtained, so that the computing board of the server system can load and use it at startup.
Therefore, the server system has double protection: the three ROMs jointly store the complete data information (in a striped storage mode), and a plurality of management boards each carry such a ROM group, which greatly improves the security and reliability of the server system.
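The board-by-board fallback described in this embodiment can be sketched as a simple loop. The sketch below makes assumptions that go beyond the source: boards are tried in list order (the source does not say how the main management board is chosen), a board that is not in place and a damaged ROM are both represented as `None`, and the verification algorithm is the XOR rule of the binary-data embodiment. `RomGroup`, `recover_group` and `select_board` are hypothetical names.

```python
from typing import Optional, Sequence, Tuple

# (first ROM, second ROM, third ROM); None marks a damaged ROM.
RomGroup = Tuple[Optional[bytes], Optional[bytes], Optional[bytes]]
Complete = Tuple[bytes, bytes, bytes]


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def recover_group(group: RomGroup) -> Optional[Complete]:
    """Return complete three-ROM data, or None if two or more ROMs are damaged."""
    r1, r2, r3 = group
    if [r1, r2, r3].count(None) > 1:
        return None
    if r1 is None:
        r1 = xor_bytes(r2, r3)
    elif r2 is None:
        r2 = xor_bytes(r1, r3)
    elif r3 is None:
        r3 = xor_bytes(r1, r2)
    return r1, r2, r3


def select_board(boards: Sequence[Optional[RomGroup]]) -> Tuple[int, Complete]:
    """Try each in-place board in order and return (board index, complete data)."""
    for index, group in enumerate(boards):
        if group is None:            # board not in place: skip it
            continue
        data = recover_group(group)
        if data is not None:         # at most one ROM was damaged
            return index, data
    raise RuntimeError("no in-place management board holds recoverable data")
```

With three slots where the first board has two damaged ROMs and the second slot is empty, `select_board` falls through to the third board and rebuilds its single damaged ROM.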
As an alternative embodiment, the controller 4 is further configured to:
erase the data information of the ROM group on the management board whose data information is damaged, and then write the data information of the ROM group on a management board whose data information is not damaged into the corresponding ROMs of the erased ROM group.
Further, the controller 4 erases the data information of the ROM group on the management board whose data information is damaged (referred to as a first management board), that is, it erases all the data information in the three ROMs on the first management board. It then writes the data information of the ROM group on a management board whose data information is not damaged (referred to as a second management board) into the ROM group on the first management board: the data information in the first ROM on the second management board is written into the first ROM on the first management board, the data information in the second ROM on the second management board is written into the second ROM on the first management board, and the data information in the third ROM on the second management board is written into the third ROM on the first management board. The stored content of the ROMs on the damaged management board is thus recovered automatically, without manual re-burning, which makes the device easy to operate and to maintain afterwards.
It should be noted that, if the data information of one of the ROMs on a management board is damaged while the data information of the other two ROMs on that management board is intact, the management board is still considered a management board whose data information is not damaged (its data is recovered by writing the recovered data information into the damaged ROM, as proposed in the above embodiment). A management board is considered a management board whose data information is damaged only when the data information of two or all of its ROMs is damaged.
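The distinction drawn above, and the erase-then-copy repair, can be illustrated with a short sketch. It assumes a board is modeled as a three-element list in which `None` marks a damaged or erased ROM, and that the source board has already had any single damaged ROM restored before it is copied; both are conventions of this sketch, and `board_is_damaged` and `repair_from_peer` are hypothetical names.

```python
from typing import List, Optional

# [first ROM, second ROM, third ROM]; None marks a damaged or erased ROM.
Board = List[Optional[bytes]]


def board_is_damaged(board: Board) -> bool:
    """A board counts as damaged only if two or more of its ROMs are damaged."""
    return board.count(None) >= 2


def repair_from_peer(damaged: Board, healthy: Board) -> None:
    """Erase all three ROMs on the damaged board, then copy ROM-for-ROM."""
    if None in healthy:
        raise ValueError("restore the source board's damaged ROM first")
    for i in range(3):
        damaged[i] = None          # erase the whole ROM group
    for i in range(3):
        damaged[i] = healthy[i]    # first to first, second to second, third to third
```

A board with exactly one damaged ROM returns `False` from `board_is_damaged`, matching the note that such a board is repaired in place rather than overwritten from a peer.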
As an alternative embodiment, there are two each of the first ROM 1, the second ROM 2 and the third ROM 3; the two management boards comprise a master management board CMC0 and a slave management board CMC1;
the controller 4 is specifically configured to detect whether the two management boards are in place; if both management boards are in place, acquire the data information of the three ROMs on the master management board CMC0 and, if the data information of one ROM is damaged, recover the damaged data information based on the data information of the other two ROMs and the preset verification algorithm; if the data information of two or all of the ROMs is damaged, switch to acquiring the data information of the three ROMs on the slave management board CMC1; and if only one management board is in place, acquire the data information of the three ROMs directly from that management board, so that the computing board of the server system can load and use it at startup.
Specifically, in the present application there are two each of the first ROM 1, the second ROM 2 and the third ROM 3, the firmware self-recovery apparatus includes two management boards, and the controller 4 designates the master management board CMC0 and the slave management board CMC1 in advance.
Based on this, after power-on the controller 4 automatically detects (via the I2C bus) whether the master management board CMC0 and the slave management board CMC1 are in place. If both are in place, it acquires the data information of the three ROMs on the master management board CMC0. If none of the three ROMs is damaged, the complete data information of the three ROMs is obtained directly. If the data information of one ROM is damaged, the damaged data information is recovered based on the data information of the other two ROMs and the preset verification algorithm, yielding the complete data information of the three ROMs. If the data information of two or all of the ROMs is damaged, the controller 4 switches to acquiring the data information of the three ROMs on the slave management board CMC1 and handles it in the same way: with no damage the complete data information is obtained directly, with one damaged ROM it is recovered from the other two ROMs and the preset verification algorithm, and with two or all of the ROMs damaged the complete data information of the three ROMs cannot be obtained.
If only one management board is in place, the data information of the three ROMs is acquired directly from that management board and handled in the same way: with no damage the complete data information of the three ROMs is obtained directly, with one damaged ROM the damaged data information is recovered based on the data information of the other two ROMs and the preset verification algorithm, and with two or all of the ROMs damaged the complete data information of the three ROMs cannot be obtained.
As an alternative embodiment, the first firmware data, the second firmware data and the verification information are binary data;
the preset process of the checking algorithm comprises the following steps:
when the nth data of the first firmware data is 0 and the nth data of the second firmware data is 0, the nth data of the verification information is 0; wherein n is a positive integer;
when the nth data of the first firmware data is 0 and the nth data of the second firmware data is 1, the nth data of the verification information is 1;
when the nth data of the first firmware data is 1 and the nth data of the second firmware data is 0, the nth data of the verification information is 1;
when the nth data of the first firmware data is 1 and the nth data of the second firmware data is 1, the nth data of the verification information is 0.
Specifically, the first firmware data, the second firmware data and the verification information of the present application are all binary data (consisting only of 0 and 1). For each bit position n, the verification algorithm sets the nth bit of the verification information to 0 when the nth bits of the first firmware data and the second firmware data are the same (both 0 or both 1), and to 1 when they differ. This is a bitwise exclusive-OR (XOR) of the first firmware data and the second firmware data. Because XOR is its own inverse, the data information of any one of the three ROMs can be calculated from the data information of the other two ROMs and the preset verification algorithm.
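The truth table above matches a bitwise XOR, which the following minimal sketch illustrates. The function name compute_check and the sample byte values are hypothetical and chosen only for illustration; the sketch assumes that the first firmware data and the second firmware data have the same length.

```python
def compute_check(first: bytes, second: bytes) -> bytes:
    """Verification information: bitwise XOR of two equally long data blocks."""
    if len(first) != len(second):
        raise ValueError("both data blocks must have the same length")
    return bytes(a ^ b for a, b in zip(first, second))


# Sample data (hypothetical values).
first = bytes([0b00110101])
second = bytes([0b01010011])
check = compute_check(first, second)
assert check == bytes([0b01100110])

# XOR is its own inverse, so the same function rebuilds any one block
# from the other two.
assert compute_check(second, check) == first
assert compute_check(first, check) == second
```

The same function therefore serves both to generate the verification information stored in the third ROM and to recover a damaged first or second ROM.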
As an alternative embodiment, the controller 4 is further configured to:
accumulating the number of times the data information of each ROM has been damaged, and judging whether the number of times the data information of a target ROM has been damaged is greater than a preset count threshold; if yes, issuing a replacement reminder for the target ROM; wherein the target ROM is any one of the ROMs.
Further, the controller 4 may also accumulate the number of times the data information of each ROM on each management board has been damaged, and judge whether the number of times the data information of any ROM (referred to as a target ROM) has been damaged is greater than a preset count threshold. If the count is greater than the preset count threshold, the target ROM is considered to need replacement, and a replacement reminder is issued to inform the user that the target ROM needs to be replaced.
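The counting logic described above can be sketched as follows. The class name DamageTracker, the ROM identifiers and the threshold value are hypothetical; the present application does not specify how the counts are stored or how the reminder is delivered, so the sketch simply returns a flag.

```python
from collections import Counter


class DamageTracker:
    """Counts damage events per ROM and flags ROMs that exceed a preset threshold."""

    def __init__(self, threshold: int) -> None:
        self.threshold = threshold
        self.counts: Counter = Counter()

    def record_damage(self, rom_id: str) -> bool:
        """Record one damage event; return True if a replacement reminder is due."""
        self.counts[rom_id] += 1
        # The reminder is due only when the count is strictly greater than the threshold.
        return self.counts[rom_id] > self.threshold


tracker = DamageTracker(threshold=2)
reminders = [tracker.record_damage("CMC0/ROM1") for _ in range(3)]
assert reminders == [False, False, True]
```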
As an alternative embodiment, the number of the computing boards is multiple;
the firmware self-recovery device further comprises:
a switch circuit Switch connected to the controller 4 and to each of the plurality of computing boards;
the controller 4 is further configured to control the switch circuit Switch, according to the boot firmware loading requirements of the plurality of computing boards, to turn on a communication link between the controller 4 and a target computing board having a boot firmware loading requirement, so as to transmit the complete data information of the three ROMs to the target computing board for the target computing board to load and use when it is started.
Further, the server system includes a plurality of computing boards, and the firmware self-recovery device further includes a switch circuit Switch (which may be disposed on the adapter board 5). The switch circuit Switch is controlled by the controller 4 as follows: according to the boot firmware loading requirements of the plurality of computing boards, the switch circuit Switch is controlled to turn on the communication link between the controller 4 and the computing board having a boot firmware loading requirement (referred to as a target computing board), so that the controller 4 transmits the complete data information of the three ROMs to the target computing board (through the I2C bus) for the target computing board to load and use when it is started. In this way, the data information of the ROMs on the management board is shared by the plurality of computing boards, which helps to save cost.
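The control principle of the switch circuit can be sketched as follows. This is a hypothetical model only: the class SwitchCircuit, the function serve_boot_requests and the send callback stand in for the hardware switch and the I2C transfer, neither of which is specified in detail here.

```python
from typing import Callable, Iterable, List, Optional


class SwitchCircuit:
    """Hypothetical model of a switch that connects the controller to one computing board."""

    def __init__(self) -> None:
        self.selected: Optional[str] = None

    def select(self, board_id: str) -> None:
        # Turn on the communication link to the given computing board.
        self.selected = board_id


def serve_boot_requests(
    switch: SwitchCircuit,
    requests: Iterable[str],
    firmware: bytes,
    send: Callable[[str, bytes], None],
) -> List[str]:
    """Route the complete ROM data to each board that has a boot firmware loading requirement."""
    served = []
    for board_id in requests:
        switch.select(board_id)   # connect the controller to the target computing board
        send(board_id, firmware)  # transmit the data over the selected link
        served.append(board_id)
    return served
```

Serving the requests one at a time reflects the fact that a single set of ROM data is shared by all computing boards through one switch.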
In addition, the controller 4 of the present application may be implemented with an FPGA (Field Programmable Gate Array) that is already present in the server system, thereby saving cost. It should be noted that, when the server system is powered on, the FPGA is powered on before the computing board, so that the complete data information of the ROMs on the management board can be transmitted to the computing board for loading and use.
Referring to fig. 3, fig. 3 is a flowchart illustrating a firmware self-recovery method according to an embodiment of the present invention.
The firmware self-recovery method is applied to any one of the firmware self-recovery devices, and comprises the following steps:
Step S1: data information of a first ROM, a second ROM and a third ROM is acquired respectively.
Step S2: if the data information of one ROM is damaged, the damaged data information is recovered based on the data information of the other two ROMs and a preset verification algorithm so as to be loaded and used when a computing board of the server system is started.
For introduction of the firmware self-recovery method provided in the present application, reference is made to the above-mentioned embodiment of the firmware self-recovery apparatus, and details of the method are not repeated herein.
The present application also provides a server system, which comprises any one of the firmware self-recovery devices described above.
For the introduction of the server system provided in the present application, please refer to the above-mentioned embodiment of the firmware self-recovery apparatus, which is not described herein again.
It is further noted that, in the present specification, relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A firmware self-recovery apparatus, comprising:
a first ROM for storing first firmware data;
a second ROM for storing second firmware data; the first firmware data and the second firmware data are combined to obtain all effective firmware data of the server system;
a third ROM for storing verification information obtained by calculating the first firmware data and the second firmware data through a preset verification algorithm;
and a controller for respectively acquiring the data information of the first ROM, the second ROM and the third ROM, and, if the data information of one ROM is damaged, recovering the damaged data information based on the data information of the other two ROMs and the preset verification algorithm, so that the data information can be loaded and used by a computing board of the server system when the computing board is started.
2. The firmware self-recovery apparatus of claim 1, wherein the controller is further to:
and writing the recovered data information into a ROM with the damaged data information to recover the original data information in the ROM with the damaged data information.
3. The firmware self-recovery device according to claim 1, wherein the first ROM, the second ROM and the third ROM are equal in number and are plural; one of the first ROM, one of the second ROM and one of the third ROM constitute a ROM group;
the firmware self-recovery device further comprises:
a plurality of management boards on which the plurality of ROM groups are provided in one-to-one correspondence;
an adapter board provided with the controller and connected to the plurality of management boards;
the controller is specifically used for determining a main management board from the in-place management boards, acquiring data information of three ROMs on the main management board, and if the data information of one ROM is damaged, recovering the damaged data information based on the data information of the other two ROMs and the preset verification algorithm; if the data information of two or all the ROMs is damaged, re-determining the main management board from the rest in-place management boards, and re-executing the step of acquiring the data information of the three ROMs on the main management board until the complete data information of the three ROMs is acquired, so that the complete data information of the three ROMs can be loaded and used when the computing board of the server system is started.
4. The firmware self-recovery apparatus of claim 3, wherein the controller is further to:
the data information of the ROM group on the management board with the damaged data information is erased, and then the data information of the ROM group on the management board without the damaged data information is correspondingly written into the ROM group on the management board with the damaged data information.
5. The firmware self-recovery apparatus of claim 3, wherein the number of each of the first ROM, the second ROM and the third ROM is two; the number of the management boards is two, the two management boards comprising a master management board and a slave management board;
the controller is specifically configured to detect whether the two management boards are in place; if both management boards are in place, acquire the data information of the three ROMs on the master management board, and if the data information of one ROM is damaged, recover the damaged data information based on the data information of the other two ROMs and the preset verification algorithm; if the data information of two or all of the ROMs is damaged, switch to acquiring the data information of the three ROMs from the slave management board; and if only one management board is in place, acquire the data information of the three ROMs directly from that management board, so that the data information can be loaded and used by the computing board of the server system when the computing board is started.
6. The firmware self-recovery apparatus of claim 1, wherein the first firmware data, the second firmware data and the verification information are binary data;
the process of the preset verification algorithm comprises:
when the nth data of the first firmware data is 0 and the nth data of the second firmware data is 0, the nth data of the verification information is 0; wherein n is a positive integer;
when the nth data of the first firmware data is 0 and the nth data of the second firmware data is 1, the nth data of the verification information is 1;
when the nth data of the first firmware data is 1 and the nth data of the second firmware data is 0, the nth data of the verification information is 1;
and when the nth data of the first firmware data is 1 and the nth data of the second firmware data is 1, the nth data of the verification information is 0.
7. The firmware self-recovery apparatus of claim 2, wherein the controller is further to:
accumulating the number of times the data information of each ROM has been damaged, and judging whether the number of times the data information of a target ROM has been damaged is greater than a preset count threshold; if yes, issuing a replacement reminder for the target ROM; wherein the target ROM is any one of the ROMs.
8. The firmware self-recovery device of any one of claims 1 to 7, wherein the number of the computing boards is plural;
the firmware self-recovery device further comprises:
a switching circuit connected to the controller and the plurality of computing boards, respectively;
the controller is further configured to control the switch circuit, according to the boot firmware loading requirements of the plurality of computing boards, to turn on a communication link between the controller and a target computing board having a boot firmware loading requirement, so as to transmit the complete data information of the three ROMs to the target computing board for the target computing board to load and use when it is started.
9. A firmware self-recovery method applied to the firmware self-recovery device according to any one of claims 1 to 8, comprising:
respectively acquiring data information of the first ROM, the second ROM and the third ROM;
if the data information of one ROM is damaged, the damaged data information is recovered based on the data information of the other two ROMs and the preset verification algorithm so as to be loaded and used when the computing board of the server system is started.
10. A server system comprising a firmware self-recovery apparatus according to any one of claims 1 to 8.
CN202110808136.5A 2021-07-16 2021-07-16 Firmware self-recovery device, method and server system Active CN113704023B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110808136.5A CN113704023B (en) 2021-07-16 2021-07-16 Firmware self-recovery device, method and server system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110808136.5A CN113704023B (en) 2021-07-16 2021-07-16 Firmware self-recovery device, method and server system

Publications (2)

Publication Number Publication Date
CN113704023A (en) 2021-11-26
CN113704023B (en) 2023-08-11

Family

ID=78648834

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110808136.5A Active CN113704023B (en) 2021-07-16 2021-07-16 Firmware self-recovery device, method and server system

Country Status (1)

Country Link
CN (1) CN113704023B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023230539A1 (en) * 2022-05-25 2023-11-30 Advanced Micro Devices, Inc. Automatic mirrored rom

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070169088A1 (en) * 2006-01-13 2007-07-19 Dell Products, L.P. Automatic firmware corruption recovery and update
CN110109716A (en) * 2019-05-13 2019-08-09 深圳忆联信息系统有限公司 Guarantee that SSD firmware stablizes method, apparatus, computer equipment and the storage medium of load

Also Published As

Publication number Publication date
CN113704023B (en) 2023-08-11

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant