CN115373936A - Method, apparatus, device and medium for monitoring the power-on state of storage hard disks - Google Patents
- Publication number
- CN115373936A (application number CN202210908162.XA)
- Authority
- CN
- China
- Prior art keywords
- power
- hard disk
- chassis
- information
- storage hard
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/3003—Monitoring arrangements specially adapted to the computing system or computing system component being monitored
- G06F11/3037—Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a memory, e.g. virtual memory, cache
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/3055—Monitoring arrangements for monitoring the status of the computing system or of the computing system component, e.g. monitoring if the computing system is on, off, available, not available
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/32—Monitoring with visual or acoustical indication of the functioning of the machine
- G06F11/324—Display of status information
- G06F11/327—Alarm or error message display
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computing Systems (AREA)
- Quality & Reliability (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Debugging And Monitoring (AREA)
- Power Sources (AREA)
Abstract
The invention relates to a method, apparatus, device and medium for monitoring the power-on state of storage hard disks, where a number of storage hard disks are installed in a number of chassis. The monitoring method includes: a SAS expander obtains the power-on status information of all storage hard disks and sends power-on abnormality notification information to a chassis manager; the chassis manager obtains the component attribute information of each chassis through the SAS expander, where the component attribute information includes the power-on status information of every storage hard disk in every chassis; the chassis manager obtains powered-off hard disk information from the power-on status information of each storage hard disk and sends a power-restoration instruction to the corresponding powered-off hard disk; the chassis manager obtains the power-on status information of the powered-off hard disk after power restoration and, when that information is abnormal, sends a hard disk power-loss alarm to the client. This technical solution addresses the problem that storage hard disks are prone to losing power and that such power loss cannot currently be monitored in real time.
Description
Technical Field
The present invention relates to the technical field of storage hard disks, and in particular to a method, apparatus, device and medium for monitoring the power-on state of storage hard disks.
Background Art
In the era of big data, massive amounts of data must be kept on dedicated storage devices. Because data is valuable, effectively guaranteeing its integrity within the system has become one of the most basic requirements for a storage device.
Hard disks are a critical part of a storage device, and their attributes need to be monitored in real time while data is being stored. Specifically, once a hard disk is connected to a storage device, its power-on state should remain stable; over long-term use, however, poor contact between the hard disk and the storage device, or other faults, will inevitably occur and cause the hard disk to lose power, at which point it becomes a faulty hard disk.
It follows that, while a server is running, faulty hard disks must be monitored for in real time and repaired or replaced promptly once detected, so as to protect data integrity and keep services running normally.
Summary of the Invention
To solve the above technical problem, the present invention provides a method, apparatus, device and medium for monitoring the power-on state of storage hard disks. The monitoring method addresses the problem that storage hard disks are prone to losing power and cannot currently be monitored in real time.
To achieve the above object, the present invention provides a method for monitoring the power-on state of storage hard disks, where a number of storage hard disks are installed in a number of chassis. The monitoring method includes the steps of:
a SAS expander obtains the power-on status information of all storage hard disks and, when the power-on state of at least one storage hard disk is abnormal, sends power-on abnormality notification information to a chassis manager;
the chassis manager obtains the component attribute information of each chassis through the SAS expander, where the component attribute information includes the power-on status information of every storage hard disk in every chassis;
the chassis manager obtains powered-off hard disk information from the power-on status information of each storage hard disk and sends a power-restoration instruction to the corresponding powered-off hard disk;
the chassis manager obtains, at a predetermined time, the power-on status information of the powered-off hard disk after power restoration and, when that information is abnormal, sends a hard disk power-loss alarm to the client.
Further, the step in which the chassis manager obtains powered-off hard disk information from the power-on status information of each storage hard disk and sends a power-restoration instruction to the corresponding powered-off hard disk specifically includes:
the chassis manager obtains the power-on status flag and the presence status flag from the power-on status information of the current storage hard disk;
when the presence status flag is correct and the power-on status flag is normal, the chassis manager determines that the current storage hard disk is in a powered-off state;
the chassis manager obtains the powered-off hard disk information of the current storage hard disk and sends a power-restoration instruction to the current storage hard disk.
Further, the step in which the chassis manager obtains the powered-off hard disk information of the current storage hard disk and sends a power-restoration instruction to the current storage hard disk specifically includes:
the chassis manager obtains the chassis information and disk slot information corresponding to the current storage hard disk, and sends the power-on command contained in the power-restoration instruction to the chassis and disk slot identified by that chassis information and disk slot information.
Further, the step in which the chassis manager obtains the component attribute information of each chassis through the SAS expander specifically includes:
the chassis manager sends a discovery request to the SAS expander;
the SAS expander obtains the component attribute information of each chassis through a discovery process.
Further, sending power-on abnormality notification information to the chassis manager when the power-on state of at least one storage hard disk is abnormal specifically includes:
when the power-on state of at least one storage hard disk is abnormal, the SAS expander issues a broadcast event, where the broadcast event includes notification information requesting the chassis manager to receive data;
after the broadcast event has passed through the driver layer and the protocol layer in turn and reached the service layer, the chassis manager in the service layer receives the broadcast event.
Further, before the chassis manager obtains powered-off hard disk information from the power-on status information of each storage hard disk and sends a power-restoration instruction to the corresponding powered-off hard disk, the monitoring method further includes:
the chassis manager obtains the chassis presence status information of the current chassis from the component attribute information;
when the chassis presence status information of the current chassis is normal, the chassis manager obtains the power-on status information of each storage hard disk in the current chassis.
Further, before the chassis manager obtains powered-off hard disk information from the power-on status information of each storage hard disk and sends a power-restoration instruction to the corresponding powered-off hard disk, the monitoring method further includes:
the chassis manager obtains the status information of each power supply unit of the current chassis from the component attribute information;
when the status information of every power supply unit is normal, the chassis manager obtains the power-on status information of each storage hard disk in the current chassis.
The present invention also provides an apparatus for monitoring the power-on state of storage hard disks, used to implement the monitoring method described above. The monitoring apparatus includes:
a SAS expander, configured to obtain the power-on status information of all storage hard disks and, when the power-on state of at least one storage hard disk is abnormal, to send power-on abnormality notification information to a chassis manager;
the chassis manager, configured to obtain the component attribute information of each chassis through the SAS expander, where the component attribute information includes the power-on status information of every storage hard disk in every chassis;
the chassis manager is further configured to traverse every storage hard disk in every chassis, obtain powered-off hard disk information from the power-on status information of every storage hard disk in every chassis, and send a power-restoration instruction to the corresponding powered-off hard disk;
the chassis manager is further configured to obtain, at a predetermined time, the power-on status information of the powered-off hard disk after power restoration and, when that information is abnormal, to send a hard disk power-loss alarm to the client.
The present invention further provides a computer device including a memory, a processor and a computer program, where the computer program is stored in the memory and can run on the processor, and the processor implements the following steps when executing the computer program:
a SAS expander obtains the power-on status information of all storage hard disks and, when the power-on state of at least one storage hard disk is abnormal, sends power-on abnormality notification information to a chassis manager;
the chassis manager obtains the component attribute information of each chassis through the SAS expander, where the component attribute information includes the power-on status information of every storage hard disk in every chassis;
the chassis manager obtains powered-off hard disk information from the power-on status information of each storage hard disk and sends a power-restoration instruction to the corresponding powered-off hard disk;
the chassis manager obtains, at a predetermined time, the power-on status information of the powered-off hard disk after power restoration and, when that information is abnormal, sends a hard disk power-loss alarm to the client.
The present invention further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the following steps:
a SAS expander obtains the power-on status information of all storage hard disks and, when the power-on state of at least one storage hard disk is abnormal, sends power-on abnormality notification information to a chassis manager;
the chassis manager obtains the component attribute information of each chassis through the SAS expander, where the component attribute information includes the power-on status information of every storage hard disk in every chassis;
the chassis manager obtains powered-off hard disk information from the power-on status information of each storage hard disk and sends a power-restoration instruction to the corresponding powered-off hard disk;
the chassis manager obtains, at a predetermined time, the power-on status information of the powered-off hard disk after power restoration and, when that information is abnormal, sends a hard disk power-loss alarm to the client.
Compared with the prior art, the above technical solution of the present invention has the following technical effects:
In the present invention, a SAS expander monitors the power-on state of each hard disk; the SAS expander can read and record the power-on state of the hard disks, and when it finds that the power-on state of a hard disk is abnormal, it sends power-on abnormality notification information to the chassis manager;
after receiving the power-on abnormality notification information, the chassis manager obtains the current attributes of the chassis from the SAS expander, including the power-on state of each hard disk;
once the chassis manager has the power-on state of the hard disks, it determines whether an abnormal power loss has occurred; if so, it powers the faulty hard disk on again;
the chassis manager then decides, based on the actual power-on result of the faulty hard disk, whether an alarm needs to be reported to notify the user to replace that hard disk;
in summary, by monitoring the power-on state in real time through the SAS expander and performing power-loss diagnosis and power restoration through the chassis manager, powered-off hard disks can be found promptly, which makes it easier for the user to deal with problem disks in a timely and effective way and effectively safeguards the security of the storage system and the efficiency of data transfer.
Brief Description of the Drawings
To explain the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic flowchart of the method for monitoring the power-on state of storage hard disks in Embodiment 1 of the present invention;
FIG. 2 is an architecture diagram of the storage system in a practical embodiment of the present invention;
FIG. 3 is a schematic diagram of the chassis manager parsing power supply status data in a practical embodiment of the present invention;
FIG. 4 is a schematic diagram of the chassis manager parsing the power-on status flag and the presence status flag in a practical embodiment of the present invention;
FIG. 5 is a detailed flowchart of the monitoring method in a practical embodiment of the present invention;
FIG. 6 is a structural block diagram of the apparatus for monitoring the power-on state of storage hard disks in Embodiment 2 of the present invention;
FIG. 7 is an internal structure diagram of the computer device in Embodiment 2 of the present invention.
Detailed Description
To make the objects, technical solutions and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in those embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.
Embodiment 1:
As shown in FIG. 1, this embodiment of the present invention provides a method for monitoring the power-on state of storage hard disks, where a number of storage hard disks are installed in a number of chassis. The monitoring method includes the steps of:
S1. A SAS expander obtains the power-on status information of all storage hard disks and, when the power-on state of at least one storage hard disk is abnormal, sends power-on abnormality notification information to a chassis manager;
S2. The chassis manager obtains the component attribute information of each chassis through the SAS expander, where the component attribute information includes the power-on status information of every storage hard disk in every chassis;
S5. The chassis manager obtains powered-off hard disk information from the power-on status information of each storage hard disk and sends a power-restoration instruction to the corresponding powered-off hard disk;
S6. The chassis manager obtains, at a predetermined time, the power-on status information of the powered-off hard disk after power restoration and, when that information is abnormal, sends a hard disk power-loss alarm to the client.
In a specific embodiment, a SAS expander monitors the power-on state of each hard disk; the SAS expander can read and record the power-on state of the hard disks, and when it finds that the power-on state of a hard disk is abnormal, it sends power-on abnormality notification information to the chassis manager;
after receiving the power-on abnormality notification information, the chassis manager obtains the current attributes of the chassis from the SAS expander, including the power-on state of each hard disk;
once the chassis manager has the power-on state of the hard disks, it determines whether an abnormal power loss has occurred; if so, it powers the faulty hard disk on again;
the chassis manager then decides, based on the actual power-on result of the faulty hard disk, whether an alarm needs to be reported to notify the user to replace that hard disk;
in summary, by monitoring the power-on state in real time through the SAS expander and performing power-loss diagnosis and power restoration through the chassis manager, powered-off hard disks can be found promptly, which makes it easier for the user to deal with problem disks in a timely and effective way and effectively safeguards the security of the storage system and the efficiency of data transfer.
In practical applications, SAS (Serial Attached SCSI) is a computer interconnect technology whose main function is data transfer for peripheral components; for example, it serves as the interface for devices such as hard disks and CD-ROM drives.
A SAS expander is an expander that conforms to the SAS protocol. It can be used for chassis management, and its downstream ports can be connected to hard disks.
SES (SCSI Enclosure Services) is a standard for chassis management defined by the T10 technical committee.
EN (Enclosure Management) stands for chassis management; slot denotes the slot into which a hard disk is inserted.
In a preferred implementation, S5 specifically includes:
S51. The chassis manager obtains the power-on status flag and the presence status flag from the power-on status information of the current storage hard disk;
S52. When the presence status flag is correct and the power-on status flag is normal, the chassis manager determines that the current storage hard disk is in a powered-off state;
S53. The chassis manager obtains the powered-off hard disk information of the current storage hard disk and sends a power-restoration instruction to the current storage hard disk.
In a specific embodiment, the power-on status information of the current storage hard disk includes its power-on status flag (DEVICE OFF) and its presence status flag (COMMON STATUS). When the hard disk is in place and the DEVICE OFF bit is set to 1, the hard disk is considered to have lost power; that is, the powered-off state must satisfy the condition ((COMMON STATUS != 0x05) && (DEVICE OFF == 1)).
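The power-off condition above can be sketched as a small predicate. This is only an illustration: the 4-byte element, the position of the status code (low nibble of byte 0) and the position of the DEVICE OFF bit (bit 4 of byte 3) are assumptions made for the example and should be checked against the SES specification and FIG. 4; only the comparison with 0x05 and the DEVICE OFF test come from the text.

```python
# Illustrative sketch only. The byte and bit positions are assumptions,
# not taken from the patent text.
NOT_INSTALLED = 0x05      # COMMON STATUS value the text treats as "disk absent"
STATUS_CODE_MASK = 0x0F   # assumed: status code sits in the low nibble of byte 0
DEVICE_OFF_MASK = 0x10    # assumed: DEVICE OFF is bit 4 of byte 3


def is_powered_off(element: bytes) -> bool:
    """Return True when the slot holds a disk and its DEVICE OFF bit is 1."""
    if len(element) != 4:
        raise ValueError("expected a 4-byte status element")
    present = (element[0] & STATUS_CODE_MASK) != NOT_INSTALLED
    device_off = bool(element[3] & DEVICE_OFF_MASK)
    # Mirrors ((COMMON STATUS != 0x05) && (DEVICE OFF == 1)).
    return present and device_off
```

An empty slot is never reported as powered off, which keeps the chassis manager from trying to re-power a slot that has no disk in it.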
In a preferred implementation, S53 specifically includes:
The chassis manager obtains the chassis information and disk slot information corresponding to the current storage hard disk, and sends the power-on command contained in the power-restoration instruction to the chassis and disk slot identified by that chassis information and disk slot information.
In a specific embodiment, when the current storage hard disk satisfies the above power-off condition and is therefore powered off, the chassis that holds the hard disk and the disk slot it occupies are recorded.
The chassis manager then issues a power-on-again command to that disk slot in that chassis to restore power. If the hard disk powers on normally, power restoration has succeeded; if it does not, a hard disk power-loss alarm must be reported to notify the user to replace the hard disk promptly.
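The restore-then-verify behaviour described above might look like the following sketch. The function name `recover_slot` and the injected callables `power_on`, `is_powered_off`, `raise_alarm` and `sleep` are hypothetical stand-ins for whatever interfaces the chassis manager actually uses, and the wait time stands in for the "predetermined time", whose value the text does not give.

```python
import time
from typing import Callable


def recover_slot(
    chassis_id: int,
    slot: int,
    power_on: Callable[[int, int], None],
    is_powered_off: Callable[[int, int], bool],
    raise_alarm: Callable[[str], None],
    wait_seconds: float = 0.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Re-power one slot, re-check it after a delay, alarm if it is still off."""
    power_on(chassis_id, slot)           # power-on command to (chassis, slot)
    sleep(wait_seconds)                  # hypothetical "predetermined time"
    if is_powered_off(chassis_id, slot):
        raise_alarm(f"disk power loss: chassis {chassis_id}, slot {slot}")
        return False                     # restoration failed, user must act
    return True                          # restoration succeeded, no alarm
```

Passing the hardware access and the alarm channel in as callables keeps the decision logic testable without a real enclosure.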
In a preferred implementation, S2 specifically includes:
S21. The chassis manager sends a discovery request to the SAS expander;
S22. The SAS expander obtains the component attribute information of each chassis through a discovery process.
In a specific embodiment, as soon as the chassis manager receives the power-on abnormality notification information, it initiates a discovery request to the SAS expander; the discovery request process can follow the requirements of the SES protocol.
Through the discovery request process, the chassis manager can obtain the information of all components in all chassis, including the power-on state of each hard disk. The chassis manager then works chassis by chassis, judging the power-on state of every hard disk in every chassis, and so performs the power-loss diagnosis.
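As a rough sketch of the chassis-by-chassis judgment, suppose discovery returns a flat list of component records. The record fields used here (`kind`, `chassis`, `slot`, `present`, `device_off`) are invented for the example; the text does not describe how discovery results are represented.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def powered_off_by_chassis(records: Iterable[dict]) -> Dict[int, List[int]]:
    """Group discovery records by chassis and list the powered-off disk slots."""
    result: Dict[int, List[int]] = defaultdict(list)
    for rec in records:
        if rec["kind"] != "disk":
            continue                     # discovery also returns other components
        if rec["present"] and rec["device_off"]:
            result[rec["chassis"]].append(rec["slot"])
    return dict(result)
```

Chassis with no powered-off disks simply do not appear in the result, so an empty dictionary means there is nothing to restore.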
In a preferred implementation, in S1, sending power-on abnormality notification information to the chassis manager when the power-on state of at least one storage hard disk is abnormal specifically includes:
S12. When the power-on state of at least one storage hard disk is abnormal, the SAS expander issues a broadcast event, where the broadcast event includes notification information requesting the chassis manager to receive data;
S13. After the broadcast event has passed through the driver layer and the protocol layer in turn and reached the service layer, the chassis manager in the service layer receives the broadcast event.
In a specific embodiment, when a hard disk loses power abnormally, the SAS expander immediately issues a broadcast event requesting the chassis manager to receive the monitoring data.
The broadcast event passes through the driver layer and the protocol layer in turn and reaches the service layer, where the chassis manager processes it and obtains the component attribute information of each chassis by initiating a discovery request to the SAS expander.
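The layered delivery described above can be modelled as a chain of handlers. This is a toy model, not the actual driver or protocol stack: the dictionary-shaped event, the `path` field and the class and function names are all assumptions introduced for illustration.

```python
from typing import Callable, List

Handler = Callable[[dict], dict]


def deliver(event: dict, layers: List[Handler]) -> dict:
    """Pass an event through each layer in order and return the final event."""
    for layer in layers:
        event = layer(event)
    return event


def driver_layer(event: dict) -> dict:
    return {**event, "path": event.get("path", []) + ["driver"]}


def protocol_layer(event: dict) -> dict:
    return {**event, "path": event["path"] + ["protocol"]}


class ChassisManager:
    """Service-layer consumer that starts discovery when a broadcast arrives."""

    def __init__(self, discover: Callable[[], list]):
        self._discover = discover        # hypothetical discovery request
        self.components: list = []

    def on_broadcast(self, event: dict) -> dict:
        # On arrival in the service layer, start discovery right away.
        self.components = self._discover()
        return {**event, "path": event["path"] + ["service"]}
```

For example, `deliver({"type": "broadcast"}, [driver_layer, protocol_layer, manager.on_broadcast])` walks the event through the three layers in the order the text gives and leaves the discovery result in `manager.components`.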
在一个优选的实施方式中,在S5之前,监控方法还包括:In a preferred embodiment, before S5, the monitoring method also includes:
S31机箱管理器根据各项组件属性信息获取当前机箱的机箱在位状态信息;The S31 chassis manager obtains the chassis in-position status information of the current chassis according to the attribute information of each component;
S32在当前机箱的机箱在位状态信息正常时,机箱管理器获取当前机箱中每个存储硬盘的上电状态信息。S32 When the chassis presence status information of the current chassis is normal, the chassis manager acquires the power-on status information of each storage hard disk in the current chassis.
在具体实施例中,在对各个存储硬盘进行掉电诊断之前,还可先检测机箱的状态;若机箱为offline状态,则不必对硬盘进行上电情况检测;当前机箱的机箱在位状态信息正常时,才对硬盘进行上电情况检查。In a specific embodiment, before carrying out power-off diagnosis to each storage hard disk, the state of the chassis can also be detected earlier; if the chassis is in an offline state, it is not necessary to detect the power-on situation of the hard disk; the chassis presence status information of the current chassis is normal Then, check the power-on status of the hard disk.
In a preferred embodiment, before S5, the monitoring method further includes:
S41. The chassis manager obtains the status information of each power supply unit of the current chassis from the component attribute information;
S42. When the status information of every power supply unit is normal, the chassis manager obtains the power-on status information of each storage hard disk in the current chassis.
In a specific embodiment, after the chassis state has been checked, the power supply status of the power supply units (PSUs) in the chassis may be checked as well; the power supply status of a PSU can likewise be collected through the chassis manager's discovery process.
After the power supply status of all PSUs in the chassis has been checked, if a PSU is found to be supplying power abnormally, the power-on status of the hard disks need not be checked; the hard disks are checked only when the status of every PSU is normal.
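The two preconditions (S31/S32 and S41/S42) amount to a gate in front of the per-disk check. A minimal sketch follows, assuming hypothetical `Chassis` and `PowerSupply` records and the reading of S42 in which every PSU must be normal.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PowerSupply:
    ok: bool  # True when the PSU reports a normal supply status


@dataclass
class Chassis:
    online: bool                     # False models the "offline" chassis state
    power_supplies: List[PowerSupply]


def should_check_disks(chassis: Chassis) -> bool:
    """Return True only when the disk power-on check is worth running."""
    if not chassis.online:
        # S31/S32: an offline chassis is skipped entirely.
        return False
    # S41/S42: every PSU must be normal. Note that all() is True for an
    # empty list, so a chassis reporting no PSUs passes this gate.
    return all(psu.ok for psu in chassis.power_supplies)
```

Keeping the gate as a single predicate means the disk-level logic never needs to know why a chassis was skipped.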
In a practical embodiment, as shown in FIG. 5, the monitoring method includes the following steps:
1) Connect each hard disk to the storage device. After the device is powered on, the storage system starts and runs normally.
The storage system architecture is shown in FIG. 2. The SAS expander chip is programmable and can be configured to run a routine poll every 5 s; each poll collects the power-on status of the hard disks in the chassis. If a hard disk loses power for any reason, the SAS expander updates that disk's power-on status in the chip's own database.
2) When a hard disk loses power abnormally, the SAS expander immediately issues a broadcast event requesting the chassis manager to receive data. The broadcast event reaches the chassis manager via the SAS driver and the interface protocol layer.
3) On receiving the broadcast event, the chassis manager immediately sends a discovery request to the SAS expander; the discovery process follows the SES protocol.
Once discovery completes, the chassis manager holds information about all components in all chassis, including the power-on status of the hard disks.
4) The chassis manager then evaluates the power-on status of every hard disk, chassis by chassis.
5) First, check the state of the chassis. If the chassis is offline, the power-on status of its hard disks need not be checked; otherwise, proceed to the next check.
6) Next, check the power supply status of all PSUs in the chassis; this status can likewise be collected through the chassis manager's discovery process. As shown in FIG. 3, the chassis manager can parse the data according to the SES protocol; see the Power Supply status element in the SES protocol for details.
When all PSUs in the chassis are supplying power abnormally, the power-on status of the hard disks need not be checked; otherwise, proceed to the next check.
7) Likewise, once discovery completes, the chassis manager already holds the hard disk data.
As shown in FIG. 4, after SES parsing, the fields mainly used are the hard disk's power-on status flag (DEVICE OFF) and presence status flag (COMMON STATUS).
When the hard disk is present and the DEVICE OFF bit is set to 1, the disk is considered to have lost power. That is, the power-loss state must satisfy ((COMMON STATUS != 0x05) && (DEVICE OFF == 1)), where 0x05 is the SES element status code for Not Installed. In that case, record the chassis and the disk slot in which the hard disk is located.
8) Issue a power-on command (that is, a power-restore command) to that disk slot in that chassis; the power-on command can be issued via the SES protocol. At the same time, record the number of power restorations. In principle this count must not exceed 5 within 24 h; if it does, an alarm must be reported and displayed in the user interface, indicating that this hard disk has lost power too many times.
9) Start a timer whose duration can be set as required (for example, 2 min).
When the timer expires, the chassis manager proactively sends a discovery request to obtain the latest power-on status of the hard disk after power restoration.
10) If the hard disk powers on normally, the power restoration has succeeded. If it does not, a hard disk power-loss alarm must be reported to notify the user to replace the hard disk promptly.
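Steps 7) to 10) can be sketched as a predicate, a sliding-window counter, and a re-check. This is a sketch under stated assumptions rather than the patented implementation: `DiskStatus`, `RestoreTracker`, and the `send_power_on` and `raise_alarm` callbacks are hypothetical names, and the counter is kept per (chassis, slot) pair, which the text does not specify.

```python
import time
from collections import defaultdict, deque
from dataclasses import dataclass
from typing import Callable, Deque, Dict, Optional, Tuple

NOT_INSTALLED = 0x05             # COMMON STATUS value excluded in step 7
MAX_RESTORES = 5                 # limit from step 8
WINDOW_SECONDS = 24 * 60 * 60    # 24 h window from step 8
RECHECK_DELAY_SECONDS = 2 * 60   # example timer length from step 9


@dataclass(frozen=True)
class DiskStatus:
    chassis_id: int
    slot: int
    common_status: int
    device_off: int


def is_powered_off(disk: DiskStatus) -> bool:
    """Step 7: the disk is present and its DEVICE OFF bit is 1."""
    return disk.common_status != NOT_INSTALLED and disk.device_off == 1


class RestoreTracker:
    """Counts power-restore attempts per (chassis, slot) in a sliding window."""

    def __init__(self) -> None:
        self._attempts: Dict[Tuple[int, int], Deque[float]] = defaultdict(deque)

    def record(self, chassis_id: int, slot: int, now: Optional[float] = None) -> bool:
        """Record one attempt; return True when the count exceeds MAX_RESTORES."""
        now = time.time() if now is None else now
        attempts = self._attempts[(chassis_id, slot)]
        attempts.append(now)
        # Drop attempts older than the window; the entry just added always stays.
        while now - attempts[0] > WINDOW_SECONDS:
            attempts.popleft()
        return len(attempts) > MAX_RESTORES


def restore_if_needed(disk: DiskStatus, tracker: RestoreTracker,
                      send_power_on: Callable[[int, int], None],
                      raise_alarm: Callable[[str], None],
                      now: Optional[float] = None) -> bool:
    """Step 8: power the slot back on; alarm if it was restored too often."""
    if not is_powered_off(disk):
        return False
    send_power_on(disk.chassis_id, disk.slot)
    if tracker.record(disk.chassis_id, disk.slot, now):
        raise_alarm(f"chassis {disk.chassis_id} slot {disk.slot}: too many power losses")
    return True


def recheck(latest: DiskStatus, raise_alarm: Callable[[str], None]) -> bool:
    """Step 10: return True if the disk recovered, otherwise raise an alarm."""
    if is_powered_off(latest):
        raise_alarm(f"chassis {latest.chassis_id} slot {latest.slot}: power loss persists")
        return False
    return True
```

In this sketch the power-on command is still sent on the attempt that trips the alarm, since the text asks for an alarm but does not say to stop retrying. The delayed re-check of step 9) could be scheduled with something like `threading.Timer(RECHECK_DELAY_SECONDS, ...)`, with a fresh discovery supplying `latest`.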
In summary, in the above monitoring method the SAS expander obtains the power-on status of the hard disks and submits it to the upper layer for analysis. The upper layer then decides how to handle each disk so that a faulty hard disk can be swapped out promptly, which can greatly improve the security and reliability of the storage system.
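The expander-side half of this division of labour (poll, update a local database, notify the upper layer) can be modelled in a few lines. The real logic runs in expander firmware; `ExpanderModel`, `read_device_off`, and `broadcast` are hypothetical names, and broadcasting only on a 0 to 1 transition of the power-off flag is an assumption of this sketch.

```python
from typing import Callable, Dict, Tuple

SlotKey = Tuple[int, int]       # (chassis id, slot number), hypothetical keying
POLL_INTERVAL_SECONDS = 5       # poll period mentioned in the description


class ExpanderModel:
    """Toy model of the expander: polls disk power state and notifies on change."""

    def __init__(self, read_device_off: Callable[[], Dict[SlotKey, int]],
                 broadcast: Callable[[], None]) -> None:
        self.read_device_off = read_device_off   # returns {slot: DEVICE OFF bit}
        self.broadcast = broadcast               # notifies the chassis manager
        self.database: Dict[SlotKey, int] = {}   # last known state per slot

    def poll_once(self) -> bool:
        """Run one poll; return True if a broadcast was issued."""
        current = self.read_device_off()
        newly_off = [slot for slot, off in current.items()
                     if off == 1 and self.database.get(slot, 0) == 0]
        self.database = dict(current)            # update the local database
        if newly_off:
            self.broadcast()
        return bool(newly_off)
```

A caller would invoke `poll_once()` every `POLL_INTERVAL_SECONDS`; the broadcast carries no payload here because the chassis manager fetches the details through its own discovery request.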
It should be noted that although the steps in the flowchart are shown in the sequence indicated by the arrows, they are not necessarily executed in that sequence. Unless explicitly stated herein, there is no strict restriction on the order of execution, and the steps may be executed in other orders. Moreover, at least some of the steps in the flowchart may include multiple sub-steps or stages. These sub-steps or stages are not necessarily completed at the same moment and may be executed at different times; nor are they necessarily executed one after another, and they may be executed in turn or alternately with other steps, or with at least some of the sub-steps or stages of other steps.
Embodiment two:
As shown in FIG. 6, an embodiment of the present invention further provides a device for monitoring the power-on status of storage hard disks, used to implement the foregoing monitoring method. The monitoring device includes:
a SAS expander, configured to obtain the power-on status information of all storage hard disks and to send power-on abnormality notification information to the chassis manager when at least one storage hard disk has an abnormal power-on status;
a chassis manager, configured to obtain the component attribute information of each chassis through the SAS expander, where the component attribute information includes the power-on status information of each storage hard disk in each chassis;
the chassis manager is further configured to traverse every storage hard disk in every chassis, obtain powered-off disk information from the power-on status information of each storage hard disk in each chassis, and send a power-restore command to the corresponding powered-off hard disk;
the chassis manager is further configured to obtain, at a predetermined time, the power-on status information of the powered-off hard disk after power restoration, and to send a hard disk power-loss alarm to the user terminal when that status information is abnormal.
In a preferred embodiment, the chassis manager is further configured to obtain the power-on status flag and the presence status flag from the power-on status information of the current storage hard disk;
when the presence status flag is correct and the power-on status flag is normal, the chassis manager is further configured to determine that the current storage hard disk is in the power-off state;
the chassis manager is further configured to obtain the powered-off disk information of the current storage hard disk and to send the power-restore command to the current storage hard disk.
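Extracting the two flags from raw status data might look like the following sketch. The byte and bit positions are assumptions based on one reading of the SES Device Slot status element (status code in the low nibble of byte 0, DEVICE OFF in bit 4 of byte 3) and should be verified against the standard; `SlotFlags` and `parse_slot_element` are hypothetical names.

```python
from dataclasses import dataclass

ELEMENT_STATUS_MASK = 0x0F   # assumption: status code is the low nibble of byte 0
NOT_INSTALLED = 0x05         # SES element status code for "Not Installed"
DEVICE_OFF_BYTE = 3          # assumption: DEVICE OFF lives in byte 3 ...
DEVICE_OFF_BIT = 4           # ... at bit 4


@dataclass(frozen=True)
class SlotFlags:
    status_code: int   # presence status flag (from COMMON STATUS)
    device_off: int    # power-on status flag (DEVICE OFF bit)

    @property
    def present(self) -> bool:
        return self.status_code != NOT_INSTALLED


def parse_slot_element(element: bytes) -> SlotFlags:
    """Pull the presence and power flags out of a 4-byte status element."""
    if len(element) != 4:
        raise ValueError("expected a 4-byte status element")
    return SlotFlags(
        status_code=element[0] & ELEMENT_STATUS_MASK,
        device_off=(element[DEVICE_OFF_BYTE] >> DEVICE_OFF_BIT) & 1,
    )
```

For example, `parse_slot_element(bytes([0x01, 0x00, 0x00, 0x10]))` yields a present slot with `device_off == 1` under these assumed offsets.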
In a preferred embodiment, the chassis manager is further configured to obtain the chassis information and disk slot information corresponding to the current storage hard disk, and to send the power-on command within the power-restore command to the chassis and disk slot identified by that information.
In a preferred embodiment, the chassis manager is further configured to send a discovery request to the SAS expander;
the SAS expander is further configured to obtain the component attribute information of each chassis through the discovery process, where the component attribute information includes the power-on status information of each storage hard disk in each chassis.
In a preferred embodiment, when at least one storage hard disk has an abnormal power-on status, the SAS expander is further configured to issue a broadcast event, where the broadcast event includes notification information requesting the chassis manager to receive data;
after the broadcast event has passed through the driver layer and the protocol layer in turn and reached the business layer, the chassis manager in the business layer is further configured to receive the broadcast event.
In a preferred embodiment, the chassis manager is further configured to obtain the chassis presence status information of the current chassis from the component attribute information;
when the chassis presence status information of the current chassis is normal, the chassis manager is further configured to obtain the power-on status information of each storage hard disk in the current chassis.
In a preferred embodiment, the chassis manager is further configured to obtain the status information of each power supply unit of the current chassis from the component attribute information;
when the status information of every power supply unit is normal, the chassis manager is further configured to obtain the power-on status information of each storage hard disk in the current chassis.
For the specific limitations of the above device, refer to the limitations of the method above; they are not repeated here.
Each module in the above device may be implemented wholly or partly in software, hardware, or a combination of the two. The modules may be embedded in, or independent of, a processor in a computer device in hardware form, or stored in a memory of the computer device in software form, so that the processor can invoke them and perform the operations corresponding to each module.
As shown in FIG. 7, the computer device may be a terminal that includes a processor, a memory, a network interface, a display screen, and an input device connected through a system bus. The processor provides computing and control capabilities. The memory includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program; the internal memory provides the environment in which they run. The network interface is used to communicate with external terminals over a network connection. The display screen may be a liquid crystal display or an electronic ink display. The input device may be a touch layer covering the display screen; a button, trackball, or touchpad on the housing of the computer device; or an external keyboard, touchpad, or mouse.
It should be understood that the structures shown in the above figures are only block diagrams of the parts relevant to the solution of the present invention and do not limit the computer device to which the solution is applied. A specific computer device may include more or fewer components than shown, combine certain components, or arrange the components differently.
Embodiment three:
An embodiment of the present invention further provides a computer device including a memory, a processor, and a computer program stored in the memory and executable on the processor. When executing the computer program, the processor implements the following steps:
S1. The SAS expander obtains the power-on status information of all storage hard disks and sends power-on abnormality notification information to the chassis manager when at least one storage hard disk has an abnormal power-on status;
S2. The chassis manager obtains the component attribute information of each chassis through the SAS expander, where the component attribute information includes the power-on status information of each storage hard disk in each chassis;
S5. The chassis manager obtains powered-off disk information from the power-on status information of each storage hard disk and sends a power-restore command to the corresponding powered-off hard disk;
S6. The chassis manager obtains, at a predetermined time, the power-on status information of the powered-off hard disk after power restoration, and sends a hard disk power-loss alarm to the user terminal when that status information is abnormal.
In a preferred embodiment, the processor further implements the following steps when executing the computer program:
S5 specifically includes: S51. the chassis manager obtains the power-on status flag and the presence status flag from the power-on status information of the current storage hard disk; S52. when the presence status flag is correct and the power-on status flag is normal, the chassis manager determines that the current storage hard disk is in the power-off state; S53. the chassis manager obtains the powered-off disk information of the current storage hard disk and sends the power-restore command to the current storage hard disk.
In a preferred embodiment, the processor further implements the following steps when executing the computer program:
S53 specifically includes: the chassis manager obtains the chassis information and disk slot information corresponding to the current storage hard disk, and sends the power-on command within the power-restore command to the chassis and disk slot identified by that information.
In a preferred embodiment, the processor further implements the following steps when executing the computer program:
S2 specifically includes: S21. the chassis manager sends a discovery request to the SAS expander; S22. the SAS expander obtains the component attribute information of each chassis through the discovery process.
In a preferred embodiment, the processor further implements the following steps when executing the computer program:
In S1, sending power-on abnormality notification information to the chassis manager when at least one storage hard disk has an abnormal power-on status specifically includes: S12. when at least one storage hard disk has an abnormal power-on status, the SAS expander issues a broadcast event, where the broadcast event includes notification information requesting the chassis manager to receive data; S13. after the broadcast event has passed through the driver layer and the protocol layer in turn and reached the business layer, the chassis manager in the business layer receives the broadcast event.
In a preferred embodiment, the processor further implements the following steps when executing the computer program:
Before S5, the method further includes: S31. obtaining the chassis presence status information of the current chassis from the component attribute information; S32. when the chassis presence status information of the current chassis is normal, obtaining the power-on status information of each storage hard disk in the current chassis.
In a preferred embodiment, the processor further implements the following steps when executing the computer program:
Before S5, the method further includes: S41. obtaining the status information of each power supply unit of the current chassis from the component attribute information; S42. when the status information of every power supply unit is normal, obtaining the power-on status information of each storage hard disk in the current chassis.
Embodiment four:
An embodiment of the present invention further provides a computer-readable storage medium storing a computer program. When executed by a processor, the computer program implements the following steps:
S1. The SAS expander obtains the power-on status information of all storage hard disks and sends power-on abnormality notification information to the chassis manager when at least one storage hard disk has an abnormal power-on status;
S2. The chassis manager obtains the component attribute information of each chassis through the SAS expander, where the component attribute information includes the power-on status information of each storage hard disk in each chassis;
S5. The chassis manager obtains powered-off disk information from the power-on status information of each storage hard disk and sends a power-restore command to the corresponding powered-off hard disk;
S6. The chassis manager obtains, at a predetermined time, the power-on status information of the powered-off hard disk after power restoration, and sends a hard disk power-loss alarm to the user terminal when that status information is abnormal.
In a preferred embodiment, the computer program further implements the following steps when executed by the processor:
S5 specifically includes: S51. the chassis manager obtains the power-on status flag and the presence status flag from the power-on status information of the current storage hard disk; S52. when the presence status flag is correct and the power-on status flag is normal, the chassis manager determines that the current storage hard disk is in the power-off state; S53. the chassis manager obtains the powered-off disk information of the current storage hard disk and sends the power-restore command to the current storage hard disk.
In a preferred embodiment, the computer program further implements the following steps when executed by the processor:
S53 specifically includes: the chassis manager obtains the chassis information and disk slot information corresponding to the current storage hard disk, and sends the power-on command within the power-restore command to the chassis and disk slot identified by that information.
In a preferred embodiment, the computer program further implements the following steps when executed by the processor:
S2 specifically includes: S21. the chassis manager sends a discovery request to the SAS expander; S22. the SAS expander obtains the component attribute information of each chassis through the discovery process.
In a preferred embodiment, the computer program further implements the following steps when executed by the processor:
In S1, sending power-on abnormality notification information to the chassis manager when at least one storage hard disk has an abnormal power-on status specifically includes: S12. when at least one storage hard disk has an abnormal power-on status, the SAS expander issues a broadcast event, where the broadcast event includes notification information requesting the chassis manager to receive data; S13. after the broadcast event has passed through the driver layer and the protocol layer in turn and reached the business layer, the chassis manager in the business layer receives the broadcast event.
In a preferred embodiment, the computer program further implements the following steps when executed by the processor:
Before S5, the method further includes: S31. obtaining the chassis presence status information of the current chassis from the component attribute information; S32. when the chassis presence status information of the current chassis is normal, obtaining the power-on status information of each storage hard disk in the current chassis.
In a preferred embodiment, the computer program further implements the following steps when executed by the processor:
Before S5, the method further includes: S41. obtaining the status information of each power supply unit of the current chassis from the component attribute information; S42. when the status information of every power supply unit is normal, obtaining the power-on status information of each storage hard disk in the current chassis.
It should be understood that all or part of the processes in the methods of the above embodiments may be carried out by a computer program instructing the relevant hardware. The computer program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the method embodiments described above.
Any reference to memory, storage, a database, or another medium used in the embodiments provided by the present invention may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
It should be noted that the above are only preferred embodiments of the present invention and the technical principles applied. Those skilled in the art will understand that the present invention is not limited to the specific embodiments described here, and that various obvious changes, readjustments, and substitutions can be made without departing from the protection scope of the present invention. Therefore, although the present invention has been described in some detail through the above embodiments, it is not limited to them and may include further equivalent embodiments without departing from the inventive concept; the scope of the present invention is determined by the scope of the appended claims.
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210908162.XA CN115373936A (en) | 2022-07-29 | 2022-07-29 | A monitoring method, device, equipment and medium for storing the power-on state of a hard disk |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210908162.XA CN115373936A (en) | 2022-07-29 | 2022-07-29 | A monitoring method, device, equipment and medium for storing the power-on state of a hard disk |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115373936A true CN115373936A (en) | 2022-11-22 |
Family
ID=84064486
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210908162.XA Pending CN115373936A (en) | 2022-07-29 | 2022-07-29 | A monitoring method, device, equipment and medium for storing the power-on state of a hard disk |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115373936A (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103455395A (en) * | 2013-08-08 | 2013-12-18 | 华为技术有限公司 | Method and device for detecting hard disk failures |
CN111124785A (en) * | 2019-12-22 | 2020-05-08 | 广东浪潮大数据研究有限公司 | Hard disk fault checking method, device, equipment and storage medium |
CN113868085A (en) * | 2021-09-27 | 2021-12-31 | 中国长城科技集团股份有限公司 | Hard disk monitoring method, device and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||