CN115048244A - Hardware repair method and system for server, computer equipment and medium
- Publication number
- CN115048244A CN202210655271.5A
- Authority
- CN
- China
- Prior art keywords
- hardware
- expander
- firmware
- large system
- underlying hardware
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/14—Error detection or correction of the data by redundancy in operation
- G06F11/1402—Saving, restoring, recovering or retrying
- G06F11/1415—Saving, restoring, recovering or retrying at system level
- G06F11/1441—Resetting or repowering
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/22—Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing
- G06F11/2273—Test methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Quality & Reliability (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computer Hardware Design (AREA)
- Stored Programmes (AREA)
- Hardware Redundancy (AREA)
Abstract
The invention discloses a hardware repair method and system for a server, a computer device, and a medium. The method includes: dividing a firmware partition in the storage space of the expander's non-volatile storage device, and writing the firmware files corresponding to all underlying hardware of the server into the firmware partition; detecting the running state of each underlying hardware device through the expander; in response to detecting that the running state of an underlying hardware device is abnormal, determining whether the large system has loaded successfully; in response to the large system not having loaded successfully, reading, through the expander, the firmware file corresponding to the abnormal underlying hardware from the firmware partition; and writing the corresponding firmware file into the abnormal underlying hardware through the expander, then restarting the large system after the write completes to finish the repair. With this solution, abnormally running hardware can be repaired quickly and correctly, ensuring normal operation of the entire server.
Description
Technical Field
The present invention relates to the technical field of servers, and in particular to a hardware repair method and system for a server, a computer device, and a medium.
Background
In the era of cloud computing and big data, mass data storage requires storage products with better performance and higher transmission rates, and data integrity and reliability must be guaranteed alongside the higher rates. Server systems therefore become more complex, and a more complex system needs more underlying hardware devices to work together. Firmware is the soul of a hardware device, and the firmware of the various devices exchanges data, depends on one another, and is tightly coupled, so firmware safety is especially important. If the firmware of some hardware is not upgraded to a valid version, or the hardware ships with no initial version at all, or a fault of some kind occurs while the storage system is running, the underlying hardware will run abnormally. In mild cases the large system obtains wrong information; in severe cases the entire server may fail to operate normally, which is unacceptable.
Effective firmware upgrades for hardware devices in every scenario are therefore very important. If the large system is already loaded and running normally when a firmware abnormality occurs, then, because the large system runs on the CPU and has a file system, it can fetch the firmware file directly from a directory of the file system and perform a normal upgrade. Such an upgrade requires that the large system start normally and that the firmware file be packaged into the large system's upgrade file. After the large system starts, it checks the version number and running state of the target hardware's firmware. If the target hardware starts normally but the version of the running firmware differs from the version carried in the upgrade package, a firmware upgrade is triggered. If the firmware fails to start (the firmware image is damaged or empty), the large system can also write the firmware image from the upgrade package directly to the target hardware, thereby repairing the faulty firmware. In many cases, however, the large system is not loaded. If underlying hardware such as a CPLD, FPGA, or PSU then starts abnormally or has empty firmware, no large system is available to repair it, and the entire server may fail to run.
Summary of the Invention
In view of this, the present invention proposes a hardware repair method and system for a server, a computer device, and a medium. It solves the problem that, when the large system is not loaded and the server's underlying hardware becomes abnormal, the abnormal hardware cannot be repaired, so that the hardware cannot start normally or runs erratically, or the entire server cannot operate normally.
Based on the above purpose, one aspect of the embodiments of the present invention provides a hardware repair method for a server, which specifically includes the following steps:
dividing a firmware partition in the storage space of the expander's non-volatile storage device, and writing the firmware files corresponding to all underlying hardware of the server into the firmware partition;
detecting the running state of each underlying hardware device through the expander;
in response to detecting that the running state of an underlying hardware device is abnormal, determining whether the large system has loaded successfully;
in response to the large system not having loaded successfully, reading, through the expander, the firmware file corresponding to the abnormal underlying hardware from the firmware partition;
writing the corresponding firmware file into the abnormal underlying hardware through the expander, and restarting the large system after the write completes to finish the repair of the abnormal underlying hardware.
In some embodiments, dividing a firmware partition in the storage space of the expander's non-volatile storage device includes:
dividing the storage space of the expander's non-volatile storage device into a firmware partition and a temporary storage partition.
In some embodiments, after determining whether the large system has loaded successfully in response to detecting that the running state of an underlying hardware device is abnormal, the method further includes:
in response to the large system having loaded successfully, obtaining, through the large system, the firmware file corresponding to the abnormal underlying hardware, sending that firmware file to the expander, and performing the following steps on the expander:
storing the firmware file obtained through the large system in the temporary storage partition;
reading the firmware file obtained through the large system from the temporary storage partition;
driving the JTAG of the abnormal underlying hardware to write the firmware file obtained through the large system into the abnormal underlying hardware, and restarting the large system after the write completes.
In some embodiments, detecting the running state of each underlying hardware device through the expander includes:
periodically querying, through the expander, the heartbeat information of each underlying hardware device to detect its running state.
In some embodiments, detecting the running state of each underlying hardware device through the expander includes:
reading, through the expander, the registers of the underlying hardware to confirm its running state, and synchronizing that running state to the upper-layer large system.
In some embodiments, writing the corresponding firmware file into the abnormal underlying hardware through the expander includes:
driving, through the expander, the JTAG of the abnormal underlying hardware to write the corresponding firmware file into the abnormal underlying hardware.
In some embodiments, the non-volatile storage device is either of the following: FLASH or NVRAM.
Another aspect of the embodiments of the present invention provides a hardware repair system for a server, including:
a writing module configured to divide a firmware partition in the storage space of the expander's non-volatile storage device and to write the firmware files corresponding to all underlying hardware of the server into the firmware partition;
a detection module configured to detect the running state of each underlying hardware device through the expander;
a judging module configured to determine, in response to detecting that the running state of an underlying hardware device is abnormal, whether the large system has loaded successfully;
a reading module configured to read, through the expander and in response to the large system not having loaded successfully, the firmware file corresponding to the abnormal underlying hardware from the firmware partition;
a repair module configured to write the corresponding firmware file into the abnormal underlying hardware through the expander and to restart the large system after the write completes to finish the repair of the abnormal underlying hardware.
Yet another aspect of the embodiments of the present invention provides a computer device, including: at least one processor; and a memory storing a computer program that can run on the processor, the computer program implementing the steps of the above method when executed by the processor.
A further aspect of the embodiments of the present invention provides a computer-readable storage medium storing a computer program that implements the steps of the above method when executed by a processor.
The present invention has at least the following beneficial technical effects. A firmware partition is divided in the storage space of the expander's non-volatile storage device, and the firmware files corresponding to all underlying hardware of the server are written into it; the expander detects the running state of each underlying hardware device; when an abnormal running state is detected, it is determined whether the large system has loaded successfully; if it has not, the expander reads the firmware file corresponding to the abnormal hardware from the firmware partition, writes it into the abnormal hardware, and restarts the large system after the write completes. Abnormally running hardware can thus be repaired quickly and correctly, ensuring normal operation of the entire server.
Brief Description of the Drawings
To explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are only some embodiments of the present invention; a person of ordinary skill in the art can obtain other embodiments from these drawings without creative effort.
FIG. 1 is a block diagram of an embodiment of the hardware repair method for a server provided by the present invention;
FIG. 2 is a flowchart of another embodiment of the hardware repair method for a server provided by the present invention;
FIG. 3 is a schematic diagram of an embodiment of the hardware repair system for a server provided by the present invention;
FIG. 4 is a schematic structural diagram of an embodiment of the computer device provided by the present invention;
FIG. 5 is a schematic structural diagram of an embodiment of the computer-readable storage medium provided by the present invention.
Detailed Description
To make the objectives, technical solutions, and advantages of the present invention clearer, the embodiments of the present invention are described in further detail below with reference to specific embodiments and the accompanying drawings.
It should be noted that all uses of "first" and "second" in the embodiments of the present invention serve to distinguish two different entities or parameters that share the same name. "First" and "second" are used only for convenience of expression and should not be construed as limiting the embodiments of the present invention; subsequent embodiments do not repeat this note.
Based on the above purpose, a first aspect of the embodiments of the present invention provides an embodiment of a hardware repair method for a server. As shown in FIG. 1, it includes the following steps:
S10. Divide a firmware partition in the storage space of the expander's non-volatile storage device, and write the firmware files corresponding to all underlying hardware of the server into the firmware partition;
S20. Detect the running state of each underlying hardware device through the expander;
S30. In response to detecting that the running state of an underlying hardware device is abnormal, determine whether the large system has loaded successfully;
S40. In response to the large system not having loaded successfully, read, through the expander, the firmware file corresponding to the abnormal underlying hardware from the firmware partition;
S50. Write the corresponding firmware file into the abnormal underlying hardware through the expander, and restart the large system after the write completes to finish the repair of the abnormal underlying hardware.
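Steps S10 to S50 can be pictured with a minimal Python sketch. This is an illustration under stated assumptions, not the patented implementation: the `Expander` class, its method names, the `status` dictionary, and the `flash` callback are all hypothetical stand-ins for the expander firmware logic and the hardware write path.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Expander:
    """Hypothetical model of the expander; every name here is illustrative."""
    firmware_partition: Dict[str, bytes] = field(default_factory=dict)
    log: List[str] = field(default_factory=list)

    def provision(self, images: Dict[str, bytes]) -> None:
        # S10: pre-store one firmware image per underlying hardware device.
        self.firmware_partition.update(images)

    def find_abnormal(self, status: Dict[str, bool]) -> List[str]:
        # S20: report every device whose running state is not healthy.
        return [name for name, healthy in status.items() if not healthy]

    def repair(self, status: Dict[str, bool], large_system_loaded: bool,
               flash: Callable[[str, bytes], None]) -> List[str]:
        repaired = []
        for name in self.find_abnormal(status):
            # S30: the expander-only path applies when the large system is not loaded.
            if large_system_loaded:
                self.log.append(f"{name}: defer to large system")
                continue
            image = self.firmware_partition[name]  # S40: read the pre-stored image
            flash(name, image)                     # S50: write it into the device
            repaired.append(name)
        if repaired:
            self.log.append("restart large system")  # S50: restart after writing
        return repaired
```

Calling `repair({"CPLD": False, "FPGA": True}, False, flash)` on a provisioned instance would write only the CPLD image and then record a single restart request.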
The expander sits between the large system and the underlying hardware (for example, CPLD, FPGA, PSU) and acts as a collection and relay station for information. The expander has no file system, however, so a region of the expander's non-volatile storage device (for example, FLASH or NVRAM) is set aside to hold the firmware files of the underlying hardware. When the large system is not loaded and underlying hardware runs abnormally, the expander can read the corresponding firmware file from that region and write it directly into the hardware, repairing the hardware and restoring normal operation of the entire server.
Here, the large system refers to the adaptation software running on the Linux system on the server CPU, which presents the server's information to the customer.
FIG. 2 shows a flowchart for repairing the server's hardware. The specific flow is as follows:
the expander detects the running state of each underlying hardware device; assume that the CPLD is detected to be running abnormally;
determine whether the large system has loaded successfully;
if the large system has not loaded successfully, perform the following steps on the expander:
read the CPLD firmware file pre-stored in the firmware partition;
drive the CPLD's JTAG to write the firmware file into the CPLD, and restart the large system after the write completes.
The following three scenarios illustrate expander-based hardware repair when the large system has not loaded successfully.
1) Research and development debugging stage: during R&D debugging, the firmware of the various hardware devices is only beginning to work together and adaptation of the large system is not yet complete. Abnormal data collection, data formats, or data interaction can easily cause some firmware to run abnormally or even hang. Letting the expander upgrade and repair the abnormal firmware directly reduces the number of manual re-flashes, saves development time, and improves development efficiency.
2) Testing and production stage: no large system exists during testing and production debugging. Because test and production-line staff are not the product developers, skill levels vary and debugging methods differ, which triggers many low-probability abnormal scenarios that are very hard to locate. When firmware fails to start, the available remedies are limited and the problem is usually solved by replacing the chip, which is inefficient. Letting the expander upgrade and repair the abnormal firmware directly improves testing and production efficiency.
3) Customer site stage: products shipped to a customer site often do not carry the final firmware version, and the large system may not yet be adapted. If the underlying firmware starts abnormally at this point, crude measures such as replacing the chip or the firmware FLASH are not an option at the customer site. Letting the expander upgrade and repair the abnormal firmware directly reduces the customer's exposure to the fault, greatly improves the customer experience, and benefits the company's reputation.
In this embodiment of the present invention, a firmware partition is divided in the storage space of the expander's non-volatile storage device, and the firmware files corresponding to all underlying hardware of the server are written into it; the expander detects the running state of each underlying hardware device; when an abnormal running state is detected, it is determined whether the large system has loaded successfully; if it has not, the expander reads the firmware file corresponding to the abnormal hardware from the firmware partition, writes it into the abnormal hardware, and restarts the large system after the write completes. Abnormally running hardware can thus be repaired quickly and correctly, ensuring normal operation of the entire server.
In some embodiments, dividing a firmware partition in the storage space of the expander's non-volatile storage device includes:
dividing the storage space of the expander's non-volatile storage device into a firmware partition and a temporary storage partition.
Specifically, the firmware partition pre-stores the firmware files of the underlying hardware so that, when the large system is not loaded, the expander can read the firmware file of the abnormal hardware from the firmware partition and write it into that hardware. The temporary storage partition is used when the large system has loaded and started normally but underlying hardware becomes abnormal: it temporarily holds the firmware file that the large system reads from its file system directory, and the staged file is cleared after it has been written into the abnormal hardware.
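The two partitions can be sketched as fixed regions of one byte array. The layout below is an assumption for illustration only: the patent does not specify slot sizes, offsets, or how image lengths are recorded, so `ExpanderFlash`, its slot table, and the length bookkeeping are hypothetical.

```python
class ExpanderFlash:
    """Hypothetical model of the expander's non-volatile storage (names are illustrative)."""

    def __init__(self, devices, slot_size, temp_size):
        self.slot_size = slot_size
        self.temp_size = temp_size
        # No file system: each device maps to a fixed slot in the firmware partition.
        self.slot_of = {name: index for index, name in enumerate(devices)}
        # The temporary storage partition starts right after the last firmware slot.
        self.temp_start = slot_size * len(devices)
        self.mem = bytearray(self.temp_start + temp_size)
        self.image_len = {}  # assumed bookkeeping; real firmware might use a header
        self.temp_len = 0

    def store_firmware(self, name, image):
        if len(image) > self.slot_size:
            raise ValueError("image larger than its slot")
        start = self.slot_of[name] * self.slot_size
        self.mem[start:start + len(image)] = image
        self.image_len[name] = len(image)

    def load_firmware(self, name):
        start = self.slot_of[name] * self.slot_size
        return bytes(self.mem[start:start + self.image_len[name]])

    def stage(self, image):
        # Hold a file handed over by the large system in the temporary partition.
        if len(image) > self.temp_size:
            raise ValueError("image larger than the temporary partition")
        self.mem[self.temp_start:self.temp_start + len(image)] = image
        self.temp_len = len(image)

    def staged(self):
        return bytes(self.mem[self.temp_start:self.temp_start + self.temp_len])

    def clear_staged(self):
        # Clear the staged file once it has been written into the hardware.
        self.mem[self.temp_start:self.temp_start + self.temp_size] = bytes(self.temp_size)
        self.temp_len = 0
```

Keeping the pre-stored images and the staged file in separate regions means clearing the temporary partition never touches the recovery copies.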
In some embodiments, after determining whether the large system has loaded successfully in response to detecting that the running state of an underlying hardware device is abnormal, the method further includes:
in response to the large system having loaded successfully, obtaining, through the large system, the firmware file corresponding to the abnormal underlying hardware, sending that firmware file to the expander, and performing the following steps on the expander:
storing the firmware file obtained through the large system in the temporary storage partition;
reading the firmware file obtained through the large system from the temporary storage partition;
driving the JTAG of the abnormal underlying hardware to write the firmware file obtained through the large system into the abnormal underlying hardware, and restarting the large system after the write completes.
With reference to FIG. 2, the hardware repair process when the large system has loaded successfully is described below. The specific flow is as follows:
the expander detects the running state of each underlying hardware device; assume that the CPLD is detected to be running abnormally;
determine whether the large system has loaded successfully;
if the large system has loaded successfully, perform the following steps on the large system:
obtain the firmware file and decompress it;
send the decompressed firmware file to the expander, and then perform the following steps on the expander:
store the decompressed firmware file in the temporary storage partition;
drive the CPLD's JTAG to write the decompressed firmware file, read from the temporary storage partition, into the CPLD, and restart the large system after the write completes.
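The loaded-system path can be sketched as a single function. Several details are assumptions: the patent does not name a compression format, so `zlib` stands in for it, and `jtag_write` and `restart` are hypothetical callbacks for the JTAG driver and the restart request. Clearing the staged copy follows the description of the temporary storage partition.

```python
import zlib


def repair_with_large_system(package, temp_partition, jtag_write, restart):
    """Hypothetical sketch of the repair path when the large system is loaded."""
    # Large system: obtain the firmware file and decompress it (zlib is assumed).
    image = zlib.decompress(package)
    if len(image) > len(temp_partition):
        raise ValueError("image larger than the temporary storage partition")
    # Expander: store the received file in the temporary storage partition.
    temp_partition[:len(image)] = image
    # Expander: read the file back and write it into the device over JTAG.
    staged = bytes(temp_partition[:len(image)])
    jtag_write(staged)
    # Expander: clear the staged copy once the write has finished.
    temp_partition[:len(image)] = bytes(len(image))
    # Restart the large system after the write completes.
    restart()
    return len(image)
```

The function returns the number of bytes written, which a caller could compare against the expected image size.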
The embodiments of the present invention can be used in the storage system of a server. When the firmware of some underlying hardware in the storage system is abnormal and the large system has already loaded, the large system can read the corresponding firmware file directly from its file system and then upgrade the firmware of the abnormal hardware through the expander, thereby repairing the faulty hardware. When the large system has not loaded successfully, the expander serves as a temporary processor: the firmware files of the server hardware that the expander can monitor and reach over a communication link (for example, CPLD, FPGA, PSU) are pre-stored in the firmware partition of the expander's non-volatile storage device. When a hardware device of the storage system is abnormal and the large system has not loaded, logic in the expander firmware reads the pre-stored firmware file directly from the firmware partition and upgrades the faulty hardware, repairing the faulty firmware quickly and correctly and ensuring that the storage system runs normally and the large system loads normally.
In some embodiments, detecting the running status of each underlying hardware component through the expander includes:
periodically querying the heartbeat information of each underlying hardware component through the expander to detect its running status.
In some embodiments, detecting the running status of each underlying hardware component through the expander includes:
reading the registers of the underlying hardware through the expander to confirm its running status, and synchronizing that running status to the upper-layer large system.
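Periodic heartbeat querying can be sketched as below. The patent does not say how a heartbeat is encoded, so this example assumes each component exposes a counter that advances while its firmware is running, and treats a counter that did not change between two polls as abnormal. `poll_heartbeats` and its parameters are hypothetical.

```python
from typing import Callable, Dict, List


def poll_heartbeats(readers: Dict[str, Callable[[], int]],
                    last_seen: Dict[str, int]) -> List[str]:
    """Run one polling cycle and return the components whose heartbeat stalled.

    readers maps a component name to a function that returns its current
    heartbeat counter; last_seen holds the values from the previous cycle
    and is updated in place.
    """
    abnormal = []
    for name, read in readers.items():
        current = read()
        # Assumption: an unchanged counter means the firmware stopped running.
        if name in last_seen and current == last_seen[name]:
            abnormal.append(name)
        last_seen[name] = current
    return abnormal
```

A caller would invoke `poll_heartbeats` on a fixed interval; the first cycle only records baseline values and never reports a fault.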
In a specific embodiment of the present invention, the expander program provides firmware running-status query interfaces for as many underlying hardware components as possible. The expander either routinely polls (that is, periodically queries) the heartbeat information of underlying hardware such as the CPLD, FPGA, and PSU, or actually reads the registers of the underlying hardware over buses such as I2C and GPIO, to confirm the running condition of the hardware, and synchronizes this information to the upper-layer large system.
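The register-based check and the synchronization to the large system might look like the following sketch. The register address `STATUS_REG`, the meaning of `STATUS_OK_BIT`, and the `bus_read(device_addr, register)` callable are all assumptions made for illustration; real devices define their own register maps, and a real expander would read over I2C or GPIO rather than call a Python function.

```python
from typing import Callable, Dict

STATUS_REG = 0x00     # hypothetical status register address
STATUS_OK_BIT = 0x01  # hypothetical "firmware running normally" bit


def read_status(bus_read: Callable[[int, int], int], device_addr: int) -> bool:
    """Return True if the device reports a normal running status."""
    value = bus_read(device_addr, STATUS_REG)
    return bool(value & STATUS_OK_BIT)


def collect_and_sync(bus_read: Callable[[int, int], int],
                     devices: Dict[str, int],
                     notify_large_system: Callable[[Dict[str, str]], None]
                     ) -> Dict[str, str]:
    """Read every device's status register and push the result upward."""
    report = {
        name: "ok" if read_status(bus_read, addr) else "abnormal"
        for name, addr in devices.items()
    }
    notify_large_system(report)  # synchronize to the upper-layer large system
    return report
```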
The chassis management data required by the large system is obtained by accessing the mainboard expander. The mainboard expander obtains the data through low-level channels such as the CPLD and I2C, processes it, and then exchanges it with the large system in the form of packets. The expander therefore acts as both the collection and processing platform and the transfer station for this information, and its communication channels with the other firmware-bearing components are essentially all open. Consequently, when the underlying hardware runs abnormally and the large system has not loaded, the expander can take the firmware file pre-stored in its non-volatile storage device (for example FLASH) and write the upgrade file directly into hardware such as the CPLD, thereby repairing the underlying hardware.
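The packet-based exchange between the expander and the large system is not specified in the text. Purely as an illustration, the sketch below assumes a made-up layout: a 2-byte record count, fixed 4-byte records (sensor id, status, value), and a 1-byte additive checksum. None of these fields come from the patent.

```python
import struct
from typing import List, Tuple

RECORD = struct.Struct("<BBH")  # hypothetical record: sensor id, status, value
COUNT = struct.Struct("<H")     # hypothetical header: number of records

Reading = Tuple[int, int, int]


def pack_readings(readings: List[Reading]) -> bytes:
    """Assemble chassis readings into one packet with a trailing checksum."""
    body = b"".join(RECORD.pack(*reading) for reading in readings)
    checksum = sum(body) & 0xFF
    return COUNT.pack(len(readings)) + body + bytes([checksum])


def unpack_readings(packet: bytes) -> List[Reading]:
    """Parse a packet produced by pack_readings, verifying the checksum."""
    (count,) = COUNT.unpack_from(packet, 0)
    body = packet[COUNT.size:-1]
    if sum(body) & 0xFF != packet[-1]:
        raise ValueError("checksum mismatch")
    return [RECORD.unpack_from(body, i * RECORD.size) for i in range(count)]
```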
In some embodiments, writing the corresponding firmware file into the abnormal underlying hardware through the expander includes:
driving the JTAG of the abnormal underlying hardware through the expander to write the corresponding firmware file into the abnormal underlying hardware.
In some embodiments, the non-volatile storage device is any one of the following: FLASH and NVRAM.
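Based on the same inventive concept, according to another aspect of the present invention, as shown in FIG. 3, an embodiment of the present invention further provides a hardware repair system for a server, including:
a writing module 110, configured to divide a firmware partition in the storage space of the expander's non-volatile storage device and to write the firmware files corresponding to all underlying hardware of the server into the firmware partition;
a detection module 120, configured to detect the running status of each underlying hardware component through the expander;
a judging module 130, configured to determine, in response to detecting that the running status of an underlying hardware component is abnormal, whether the large system has loaded successfully;
a reading module 140, configured to read, in response to the large system not having loaded successfully, the firmware file corresponding to the abnormal underlying hardware from the firmware partition through the expander;
a repair module 150, configured to write the corresponding firmware file into the abnormal underlying hardware through the expander and, after the write completes, to restart the large system to complete the repair of the abnormal underlying hardware.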
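The five modules can be read as one pipeline. The sketch below models them as methods of a single hypothetical class, `HardwareRepairSystem`, operating on in-memory dictionaries; the hardware model (a `status` and a `firmware` field per component) and the assumption that a successful write returns the component to the "ok" state are illustrative simplifications, not details from the patent. Only the path in which the large system has not loaded is covered.

```python
from typing import Callable, Dict, List


class HardwareRepairSystem:
    """Hypothetical in-memory model of modules 110 to 150."""

    def __init__(self, hardware: Dict[str, dict],
                 large_system_loaded: Callable[[], bool],
                 reboot: Callable[[], None]) -> None:
        self.firmware_partition: Dict[str, bytes] = {}
        self.hardware = hardware
        self.large_system_loaded = large_system_loaded
        self.reboot = reboot

    def write_module(self, images: Dict[str, bytes]) -> None:  # module 110
        self.firmware_partition.update(images)

    def detection_module(self) -> List[str]:                   # module 120
        return [n for n, hw in self.hardware.items() if hw["status"] != "ok"]

    def judging_module(self) -> bool:                          # module 130
        return self.large_system_loaded()

    def reading_module(self, name: str) -> bytes:              # module 140
        return self.firmware_partition[name]

    def repair_module(self, name: str, image: bytes) -> None:  # module 150
        self.hardware[name]["firmware"] = image
        self.hardware[name]["status"] = "ok"  # assumed outcome of a good write
        self.reboot()

    def run_once(self) -> List[str]:
        """Repair every abnormal component when the large system is not loaded."""
        repaired = []
        for name in self.detection_module():
            if not self.judging_module():
                self.repair_module(name, self.reading_module(name))
                repaired.append(name)
        return repaired
```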
Based on the same inventive concept, according to another aspect of the present invention, as shown in FIG. 4, an embodiment of the present invention further provides a computer device 30, which includes a processor 310 and a memory 320. The memory 320 stores a computer program 321 that can run on the processor, and the processor 310 performs the following method steps when executing the program:
dividing a firmware partition in the storage space of the expander's non-volatile storage device, and writing the firmware files corresponding to all underlying hardware of the server into the firmware partition;
detecting the running status of each underlying hardware component through the expander;
in response to detecting that the running status of an underlying hardware component is abnormal, determining whether the large system has loaded successfully;
in response to the large system not having loaded successfully, reading the firmware file corresponding to the abnormal underlying hardware from the firmware partition through the expander;
writing the corresponding firmware file into the abnormal underlying hardware through the expander, and restarting the large system after the write completes to complete the repair of the abnormal underlying hardware.
The memory, as a non-volatile computer-readable storage medium, can be used to store non-volatile software programs, non-volatile computer-executable programs, and modules, such as the program instructions/modules corresponding to the server hardware repair method in the embodiments of the present application. By running the non-volatile software programs, instructions, and modules stored in the memory, the processor executes the various functional applications and data processing of the apparatus, that is, it implements the server hardware repair method of the above method embodiments.
The memory may include a program storage area and a data storage area. The program storage area may store an operating system and the application programs required for at least one function; the data storage area may store data created through use of the apparatus, and the like. In addition, the memory may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some embodiments, the memory optionally includes memory located remotely from the processor, and such remote memory can be connected to the local module via a network. Examples of such networks include, but are not limited to, the Internet, an intranet, a local area network, a mobile communication network, and combinations thereof.
In some embodiments, dividing a firmware partition in the storage space of the expander's non-volatile storage device includes:
dividing a firmware partition and a temporary storage partition in the storage space of the expander's non-volatile storage device.
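One way to picture the division into a firmware partition and a temporary storage partition is as two consecutive address ranges on the device. The sketch below is an assumption-laden illustration: the patent gives no sizes or offsets, and the `reserved` region at the start of the device (for anything else stored there) is a hypothetical parameter.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Partition:
    name: str
    offset: int  # byte offset from the start of the device
    size: int    # length in bytes

    @property
    def end(self) -> int:
        return self.offset + self.size


def layout_partitions(device_size: int, reserved: int,
                      firmware_size: int) -> Tuple[Partition, Partition]:
    """Place the firmware partition after the reserved region and give the
    temporary storage partition whatever space remains."""
    firmware = Partition("firmware", reserved, firmware_size)
    temp_size = device_size - firmware.end
    if temp_size <= 0:
        raise ValueError("no space left for the temporary storage partition")
    temporary = Partition("temporary", firmware.end, temp_size)
    return firmware, temporary
```

Keeping the two ranges disjoint means an image staged by the large system in the temporary partition cannot overwrite the pre-stored images that the expander relies on when the large system fails to load.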
In some embodiments, after determining whether the large system has loaded successfully in response to detecting that the running status of an underlying hardware component is abnormal, the method further includes:
in response to the large system having loaded successfully, obtaining the firmware file corresponding to the abnormal underlying hardware through the large system, sending the firmware file obtained through the large system to the expander, and performing the following steps on the expander:
storing the firmware file obtained through the large system in the temporary storage partition;
reading the firmware file obtained through the large system from the temporary storage partition;
driving the JTAG of the abnormal underlying hardware to write the firmware file obtained through the large system into the abnormal underlying hardware, and restarting the large system after the write completes.
In some embodiments, detecting the running status of each underlying hardware component through the expander includes:
periodically querying the heartbeat information of each underlying hardware component through the expander to detect its running status.
In some embodiments, detecting the running status of each underlying hardware component through the expander includes:
reading the registers of the underlying hardware through the expander to confirm its running status, and synchronizing that running status to the upper-layer large system.
In some embodiments, writing the corresponding firmware file into the abnormal underlying hardware through the expander includes:
driving the JTAG of the abnormal underlying hardware through the expander to write the corresponding firmware file into the abnormal underlying hardware.
In some embodiments, the non-volatile storage device is any one of the following: FLASH and NVRAM.
Based on the same inventive concept, according to another aspect of the present invention, as shown in FIG. 5, an embodiment of the present invention further provides a computer-readable storage medium 40, which stores a computer program 410 that performs the above method when executed by a processor.
Finally, it should be noted that those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be carried out by a computer program instructing the relevant hardware. The program can be stored in a computer-readable storage medium and, when executed, may include the processes of the above method embodiments. The storage medium of the program may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like. The above computer program embodiments can achieve effects that are the same as or similar to those of any corresponding method embodiment described above.
Those skilled in the art will also appreciate that the various exemplary logical blocks, modules, circuits, and algorithm steps described in connection with the disclosure herein may be implemented as electronic hardware, computer software, or a combination of both. To illustrate this interchangeability of hardware and software clearly, the various illustrative components, blocks, modules, circuits, and steps have been described generally in terms of their functionality. Whether such functionality is implemented as software or as hardware depends on the specific application and on the design constraints imposed on the overall system. Those skilled in the art may implement the described functionality in various ways for each specific application, but such implementation decisions should not be interpreted as departing from the scope disclosed by the embodiments of the present invention.
The above are exemplary embodiments disclosed by the present invention, but it should be noted that various changes and modifications may be made without departing from the scope disclosed by the embodiments of the present invention as defined in the claims. The functions, steps, and/or actions of the method claims according to the disclosed embodiments described herein need not be performed in any particular order. The serial numbers of the embodiments disclosed above are for description only and do not indicate the relative merits of the embodiments. Furthermore, although elements disclosed in the embodiments of the present invention may be described or claimed in the singular, they may also be understood as plural unless expressly limited to the singular.
It should be understood that, as used herein, the singular form "a" is intended to include the plural form as well, unless the context clearly supports an exception. It should also be understood that "and/or" as used herein refers to any and all possible combinations of one or more of the associated listed items.
Those of ordinary skill in the art should understand that the discussion of any of the above embodiments is exemplary only and is not intended to imply that the scope disclosed by the embodiments of the present invention (including the claims) is limited to these examples. Within the idea of the embodiments of the present invention, the technical features of the above embodiments or of different embodiments may also be combined, and many other variations of the different aspects of the embodiments described above exist that are not provided in detail for the sake of brevity. Therefore, any omission, modification, equivalent replacement, improvement, and the like made within the spirit and principle of the embodiments of the present invention shall fall within the protection scope of the embodiments of the present invention.
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210655271.5A CN115048244B (en) | 2022-06-10 | 2022-06-10 | A server hardware repair method, system, computer equipment and medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115048244A (en) | 2022-09-13 |
CN115048244B (en) | 2024-06-07 |
Family
ID=83160479
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210655271.5A Active CN115048244B (en) | 2022-06-10 | 2022-06-10 | A server hardware repair method, system, computer equipment and medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115048244B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20140115386A1 (en) * | 2012-10-24 | 2014-04-24 | Hon Hai Precision Industry Co., Ltd. | Server and method for managing server |
US20200201568A1 (en) * | 2018-12-20 | 2020-06-25 | Micron Technology, Inc. | Exception handling based on responses to memory requests in a memory subsystem |
CN112230939A (en) * | 2020-09-01 | 2021-01-15 | 西安广和通无线软件有限公司 | Hardware module repairing method and device, computer equipment and storage medium |
CN113448760A (en) * | 2021-06-05 | 2021-09-28 | 山东英信计算机技术有限公司 | Method, system, equipment and medium for recovering abnormal state of hard disk |
Also Published As
Publication number | Publication date |
---|---|
CN115048244B (en) | 2024-06-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106445577A (en) | Update method, server system, and non-transitory computer-readable medium | |
WO2019056475A1 (en) | Automated test task management method and apparatus, device, and storage medium | |
TW202030602A (en) | The method and system of bios recovery and update | |
WO2016206514A1 (en) | Startup processing method and device | |
CN111722954A (en) | Server abnormality locating method, device, storage medium and server | |
CN113805925A (en) | Online upgrade method, device, device and medium for distributed cluster management software | |
CN108897646A (en) | A kind of switching method and baseboard management controller of BIOS chip | |
CN116382968B (en) | Fault detection method and device for external equipment | |
CN119356716A (en) | Firmware upgrade method and device, storage medium, electronic device and program product | |
US8689048B1 (en) | Non-logging resumable distributed cluster | |
CN109857583B (en) | Processing method and device | |
CN114116330B (en) | Server performance testing method, system, terminal and storage medium | |
CN111124724A (en) | A node fault testing method and device for a distributed block storage system | |
CN113268206A (en) | Network target range resource hot plug implementation method and system | |
CN107992420A (en) | Put forward the management method and system of survey project | |
JP7389877B2 (en) | Network optimal boot path method and system | |
CN115048244B (en) | A server hardware repair method, system, computer equipment and medium | |
JP2006065440A (en) | Process management system | |
CN105159810A (en) | Method and device for testing BIOS of computer system | |
CN116841585A (en) | Firmware upgrading method and device | |
US11354109B1 (en) | Firmware updates using updated firmware files in a dedicated firmware volume | |
CN115291925A (en) | A BMC upgrade method, system, device and storage medium | |
CN103186403A (en) | Node replacement processing method and server system using the method | |
CN114201393A (en) | Processing methods, apparatus, equipment, media and program products for software testing | |
US12333293B2 (en) | Online update compatibility verification prior to update implementation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant |