CN106325761B - System and method for managing storage resources - Google Patents

Publication number
CN106325761B
Authority
CN
China
Prior art keywords
hard disk
rmc
information
sas
node board
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510369563.2A
Other languages
Chinese (zh)
Other versions
CN106325761A (en)
Inventor
吴筱苏
徐东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ZTE Corp
Original Assignee
ZTE Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ZTE Corp
Priority to CN201510369563.2A (CN106325761B)
Priority to PCT/CN2016/080051 (WO2017000639A1)
Publication of CN106325761A
Publication of CN106325761B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring

Abstract

The invention provides a system and a method for managing storage resources. The system comprises: a cabinet management board (RMC) for managing storage resources according to one or more of received Self-Monitoring, Analysis and Reporting Technology (SMART) information, Serial Attached SCSI (SAS) topology information, and hard disk information; a server node board connected with the RMC, for acquiring the SMART information and reporting it to the RMC; a switching node board connected with the RMC, for acquiring the SAS topology information and reporting it to the RMC; and a storage node board connected with the RMC, for acquiring the hard disk information and reporting it to the RMC. The storage node board comprises a first baseboard management controller (BMC) connected to the RMC and a JBOD (just a bunch of disks) connected to the first BMC, wherein the first BMC is used for managing the hard disk information acquired from the JBOD.

Description

System and method for managing storage resources
Technical Field
The present invention relates to storage technologies, and in particular, to a system and a method for managing storage resources.
Background
Data center storage capacity requirements are growing geometrically, and traditional blade servers that rely on local storage can no longer meet market demand. Rack (cabinet) servers were developed for this purpose: each server keeps only a small amount of local storage for software versions and logs, while the storage resources are separated out and managed uniformly as a resource pool that all servers can share.
SAS (Serial Attached SCSI, where SCSI stands for Small Computer System Interface) is the most widely used hard disk connection technology in storage networks. It uses serial communication as its protocol infrastructure and adopts the SCSI-3 extended instruction set, thereby combining the advantages of existing parallel SCSI and serial connection technologies while remaining compatible with SATA devices. SAS has the highest interface rate among current hard disk channel technologies: the SAS 3.0 standard reaches 12 Gb/s, and the wide port technology specific to SAS multiplies the transmission bandwidth. Taking a 12 Gb/s SAS channel as an example, a wide port made up of four SAS links can reach 48 Gb/s.
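The wide-port figure above is simple multiplication of the per-link rate by the number of links. A minimal sketch of that arithmetic follows; the function name is illustrative and not part of the patent, and the result is the raw line rate, not usable payload throughput.

```python
def wide_port_bandwidth_gbps(link_rate_gbps, links):
    """Aggregate raw line rate of a SAS wide port built from `links` links.

    Illustrative helper only: it multiplies the per-link rate by the
    number of links and ignores encoding and protocol overhead.
    """
    if links < 1:
        raise ValueError("a port needs at least one link")
    return link_rate_gbps * links


# Four 12 Gb/s links, as in the example in the text.
four_wide = wide_port_bandwidth_gbps(12, 4)
```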
When multiple SAS devices need to communicate with each other, they must be connected through a SAS expander. A SAS expander has a plurality of SAS ports and contains an embedded MIPS processor. By function, SAS expanders are divided into SAS Switches and JBODs (Just a Bunch Of Disks, a simple hard disk cluster). A JBOD is used only for hard disk expansion, connecting a plurality of hard disks together to form one large logical hard disk. A SAS Switch connects several JBODs to form a SAS switch domain and can manage the whole domain.
FIG. 1 shows a typical rack server system, in which a JBOD is built into the rack and several servers are connected to the JBOD via a SAS Switch.
The rack server system has the following characteristics:
1. Transmission isolation: servers and storage resources are isolated from each other to prevent illegal access.
2. Access control: the storage resources that each server can access are limited; for example, in FIG. 1 each server can access only part of the JBOD hard disks.
3. Flexible resource allocation: if a server needs more resources, new hard disks can be allocated to it.
To this end, the ANSI (American National Standards Institute) T10 technical committee proposed SAS zoning, which is similar to VLAN (Virtual Local Area Network) technology for Ethernet: a SAS switch domain composed of SAS Switches and JBODs is divided into different ZONEs that are not visible to each other. In FIG. 1, server1 and a plurality of hard disks are placed in ZONE1; server1 can access only the hard disks in ZONE1, and if it needs more hard disks, new hard disks simply need to be added to ZONE1. SAS zoning controls whether different ZONEs can communicate through a ZPT (Zone Permission Table) in the SAS expander.
There are two communication planes in a rack server system: a data plane and a management plane. The data plane is interconnected through an in-band channel (SAS). Current networking usually adopts a multi-path redundancy design: at least two switching nodes are arranged in the network, each JBOD contains two SAS expanders connected to the two ports of each hard disk, each switching node is cross-connected with the servers and the JBODs, and host multipath software runs on the server to provide link redundancy and load balancing across the multiple IO paths. The management plane is generally divided into in-band management and out-of-band management. In-band management uses an HBA (Host Bus Adapter) to manage the SAS expander and implements enclosure management through the SES (SCSI Enclosure Services) protocol. Out-of-band management channels include Ethernet, UART (Universal Asynchronous Receiver/Transmitter), I2C (Inter-Integrated Circuit, a two-wire serial bus), and the like. Because HBA hardware is expensive, the switching node typically adopts out-of-band management. In FIG. 1, a management client 5 (a PC) is connected to one SAS Switch through an out-of-band channel; a web GUI (graphical user interface) is provided on the management client 5, and the storage resources used by the servers are managed through this graphical interface.
FIG. 2 shows the management plane of a conventional rack server system, in which the HBA 24 and the JBOD 42 are connected to the SAS Switch 32, and the BMCs 21, 31 and 41 (Baseboard Management Controllers) provide out-of-band supervision, firmware management, and management of the sensors 23, 33 and 43. The BMC 21 on the server node board 2 is responsible for the sensor 23 functions and exchanges data with the BIOS 22 (Basic Input Output System) to obtain the system information of the board, receive system events from the BIOS 22, and report them to the RMC 1 (Rack Management Controller). The BMC 31 of the switching node board 3 manages only the sensors 33, and the SAS Switch 32 communicates directly with the RMC 1 over an out-of-band channel. The BMC 41 of the storage node board 4 is responsible only for the sensor 43 functions, and the JBOD 42 has only an in-band channel connecting it to the switching node board 3.
Disclosure of Invention
The embodiment of the invention aims to provide a system and a method for managing storage resources, which can directly obtain JBOD storage resources and improve the transfer efficiency.
In order to achieve the above object, an embodiment of the present invention provides a system for storage resource management, including:
a cabinet management board (RMC), used for managing storage resources according to one or more of received Self-Monitoring, Analysis and Reporting Technology (SMART) information, Serial Attached SCSI (SAS) topology information, and hard disk information;
a server node board, connected with the RMC and used for acquiring the SMART information and reporting it to the RMC;
a switching node board, connected with the RMC and used for acquiring the SAS topology information and reporting it to the RMC;
and a storage node board, connected with the RMC and used for acquiring the hard disk information and reporting it to the RMC, wherein the storage node board comprises: a first baseboard management controller (BMC) connected to the RMC, and a JBOD connected to the first BMC, wherein the first BMC is used for managing the hard disk information acquired from the JBOD.
In the system for storage resource management described herein,
the server node board includes: a second baseboard management controller (BMC) connected to the RMC, and a host bus adapter (HBA), a basic input output system (BIOS) and a temperature sensor, each connected to the second BMC through out-of-band management, wherein the second BMC is used for managing temperature information in the SMART information recorded by the HBA.
In the system for storage resource management described herein,
the switching node board includes: a third baseboard management controller (BMC) connected to the RMC, and a SAS Switch and a temperature sensor, each connected to the third BMC through out-of-band management, wherein the third BMC is configured to manage the SAS topology information acquired from the SAS Switch.
In the system for storage resource management described herein,
the SAS topology information comprises the port number (Port ID) of the SAS Switch port to which the HBA is connected, and the SAS address.
In the system for storage resource management described herein,
the hard disk information comprises port information, capacity, equipment type and interface type of the hard disk.
In the system for storage resource management described herein,
the RMC is also used for parsing the port and capacity information of a first hard disk from an allocation request when the allocation request for the first hard disk is obtained, and for adding the port of the first hard disk, according to that port and capacity information, into a partner hard disk partition matched with the capacity of the first hard disk, wherein the partner hard disk partition is a partition for which a one-to-one correspondence with a server node board partition has been established in advance.
In the system for storage resource management described herein,
the RMC is further configured to obtain a de-allocation request after the first hard disk has been allocated, and to remove the first hard disk from the partner hard disk partition according to the de-allocation request.
In the system for storage resource management described herein,
the first BMC is further used for acquiring a pull-out event of a second hard disk and reporting the event to the RMC, whereupon the RMC deletes the information of the second hard disk.
In the system for storage resource management described herein,
the first BMC is further configured to acquire a second hard disk insertion event, allocate the second hard disk to a preset default partition, and report the insertion event to the RMC.
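The hot-plug handling described in the two preceding paragraphs can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the class and method names are hypothetical, the default partition is assumed to be zone group 0, and the RMC's bookkeeping is reduced to a dictionary keyed by disk name.

```python
from dataclasses import dataclass, field

DEFAULT_ZONE = 0  # assumption: the preset default partition is zone group 0


@dataclass
class Rmc:
    """Hypothetical RMC model: tracks known disks and their zone group."""
    disks: dict = field(default_factory=dict)

    def on_insert(self, disk, zone):
        self.disks[disk] = zone

    def on_remove(self, disk):
        # The RMC deletes the information of the pulled-out disk.
        self.disks.pop(disk, None)


@dataclass
class StorageBmc:
    """Hypothetical first BMC on the storage node board."""
    rmc: Rmc
    phy_zone: dict = field(default_factory=dict)

    def handle_insert(self, disk):
        # Place the new disk in the default partition, then report to the RMC.
        self.phy_zone[disk] = DEFAULT_ZONE
        self.rmc.on_insert(disk, DEFAULT_ZONE)

    def handle_remove(self, disk):
        # Report the pull-out event to the RMC.
        self.phy_zone.pop(disk, None)
        self.rmc.on_remove(disk)
```

A newly inserted disk therefore stays invisible to every server until an explicit allocation moves it out of the default partition.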
The embodiment of the invention also provides a method for managing the storage resources, which comprises the following steps:
the server node board acquires Self-Monitoring, Analysis and Reporting Technology (SMART) information and reports it to the cabinet management board (RMC);
the switching node board acquires Serial Attached SCSI (SAS) topology information and reports it to the RMC;
the storage node board acquires hard disk information and reports it to the RMC, wherein the storage node board comprises: a first baseboard management controller (BMC) connected to the RMC, and a JBOD connected to the first BMC, wherein the first BMC is used for managing the hard disk information acquired from the JBOD;
and the RMC manages storage resources according to one or more of the received SMART information, SAS topology information, and hard disk information.
In the method for storage resource management described herein,
the server node board includes: a second baseboard management controller (BMC) connected to the RMC, and a host bus adapter (HBA), a basic input output system (BIOS) and a temperature sensor, each connected to the second BMC through out-of-band management, wherein the second BMC manages temperature information in the SMART information recorded by the HBA.
In the method for storage resource management described herein,
the switching node board includes: a third baseboard management controller (BMC) connected to the RMC, and a SAS Switch and a temperature sensor, each connected to the third BMC through out-of-band management, wherein the third BMC is configured to manage the SAS topology information acquired from the SAS Switch.
In the method for storage resource management described herein,
the SAS topology information comprises the port number (Port ID) of the SAS Switch port to which the HBA is connected, and the SAS address.
In the method for storage resource management described herein,
the hard disk information comprises port information, capacity, equipment type and interface type of the hard disk.
In the method for storage resource management described herein,
the RMC obtains an allocation request for a first hard disk and parses the port and capacity information of the first hard disk from the allocation request; and, according to that port and capacity information, adds the port of the first hard disk into a partner hard disk partition matched with the capacity of the first hard disk, wherein the partner hard disk partition is a partition for which a one-to-one correspondence with a server node board partition has been established in advance.
In the method for storage resource management described herein,
the RMC obtains a de-allocation request after the first hard disk has been allocated, and removes the first hard disk from the partner hard disk partition according to the de-allocation request.
In the method for storage resource management described herein,
the first BMC acquires a pull-out event of a second hard disk and reports the event to the RMC, and the RMC deletes the information of the second hard disk.
In the method for storage resource management described herein,
the first BMC acquires an insertion event of a second hard disk, allocates the second hard disk to a preset default partition, and reports the insertion event to the RMC.
In the method for storage resource management described herein,
the RMC sends a command for querying the running state of a third hard disk to the second BMC;
the RMC receives the SMART information from the HBA fed back by the second BMC, and regulates the overall heat dissipation system according to the actual temperature of the third hard disk in the SMART information;
the RMC receives the SMART information from the HBA fed back by the second BMC, and judges whether the third hard disk has failed according to the running state of the third hard disk in the SMART information;
and the RMC raises an alarm and isolates the third hard disk after it fails.
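The monitoring steps above amount to a small decision routine on the RMC side. The sketch below is illustrative only: the `SmartReport` fields, the linear fan curve, and its thresholds (35 °C, 55 °C, 30% minimum duty) are assumptions made for the example, since the patent does not specify how the heat dissipation system is regulated.

```python
from dataclasses import dataclass


@dataclass
class SmartReport:
    """Hypothetical subset of the SMART data fed back by the second BMC."""
    disk: str
    temperature_c: float
    healthy: bool


def fan_duty_percent(temperature_c, low=35.0, high=55.0):
    """Assumed linear fan curve: 30% at or below `low`, 100% at or above `high`."""
    if temperature_c <= low:
        return 30
    if temperature_c >= high:
        return 100
    return round(30 + 70 * (temperature_c - low) / (high - low))


def process_report(report, isolated):
    """Return the action the RMC would take for one report.

    A failed disk is added to the `isolated` set and an alarm is returned;
    otherwise the fan duty is derived from the disk's actual temperature.
    """
    if not report.healthy:
        isolated.add(report.disk)
        return ("alarm", report.disk)
    return ("fan", fan_duty_percent(report.temperature_c))
```

Driving the fans from the disk's own reported temperature, rather than from a board-level sensor alone, is the point the text makes about SMART-based thermal control.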
The technical scheme of the embodiment of the invention has the following beneficial effects:
in the scheme of the embodiment of the invention, SMART information, SAS topology information and hard disk information are acquired directly from the server node board, the switching node board and the storage node board, respectively, for management. The embodiment of the invention optimizes the management plane of the existing rack server system and realizes out-of-band resource management and allocation by using the first BMC; because the JBOD is managed by the first BMC of the storage node board, JBOD storage resources can be acquired directly, which reduces the software complexity of the SAS Switch and improves transmission efficiency.
Drawings
FIG. 1 is a schematic diagram of a prior SAS Zoning;
FIG. 2 is a schematic diagram of a management plane of a prior art rack server system;
FIG. 3 is a schematic diagram of a rack server system management plane according to an embodiment of the present invention;
FIG. 4 is a diagram illustrating allocation of storage resources according to an embodiment of the present invention;
FIG. 5 is a storage resource management state machine according to an embodiment of the present invention;
FIG. 6 is a flowchart illustrating a method for storage resource management according to an embodiment of the present invention.
Detailed Description
In order to make the technical problems, technical solutions and advantages of the present invention more apparent, the following detailed description is given with reference to the accompanying drawings and specific embodiments.
The embodiment of the invention addresses the following problems in the prior art: the SAS Switch 32 is directly connected with the RMC 1, while the BMC 31 in the switching node board 3, which is also connected with the RMC 1, can supply only the information of the sensor 43 and cannot supply resource management information; and the existing method of forwarding JBOD 42 information to the RMC 1 through the SAS Switch 32 is inefficient and imposes demanding requirements.
The embodiment of the invention provides a system and a method for managing storage resources in which BMCs are connected to the JBOD and the HBA, respectively, to obtain resource management information and sensor information. Resources are thus acquired and managed directly, the software complexity of the RMC is reduced, and forwarding of JBOD information by the BMC is safe, reliable and efficient.
As shown in fig. 3 and fig. 4, a system for storage resource management according to an embodiment of the present invention includes:
a cabinet management board RMC 31, used for managing storage resources according to one or more of received Self-Monitoring, Analysis and Reporting Technology (SMART) information, Serial Attached SCSI (SAS) topology information, and hard disk information;
a server node board 32, connected with the RMC 31 and used for acquiring the SMART information and reporting it to the RMC 31;
a switching node board 33, connected with the RMC 31 and configured to acquire the SAS topology information and report it to the RMC 31;
a storage node board 34, connected with the RMC 31 and configured to acquire the hard disk information and report it to the RMC 31, where the storage node board 34 includes: a first baseboard management controller BMC 341 connected to the RMC 31, and a simple hard disk cluster JBOD 342 connected to the first BMC 341, wherein the first BMC 341 is configured to manage the hard disk information obtained from the JBOD 342.
In the embodiment of the invention, SMART information, SAS topology information and hard disk information are acquired directly from the server node board 32, the switching node board 33 and the storage node board 34, respectively, for management. By contrast, when the conventional RMC 1 in FIG. 2 acquires hard disk information of the JBOD 42, its management of the JBOD 42 requires the SAS Switch 32 to convert from out-of-band to in-band: to obtain storage resource information, the SAS Switch 32 must first parse the out-of-band management command and convert it into SCSI commands sent to the JBOD 42, which places high demands on the SAS Switch 32 software and requires special customization of many functions. The embodiment of the invention optimizes the management plane of the existing rack server system and realizes out-of-band resource management and allocation by using the first BMC 341; because the JBOD 342 is managed by the first BMC 341 of the storage node board, the storage resources of the JBOD 342 can be acquired directly, which reduces the software complexity of the SAS Switch and improves transmission efficiency.
It should be noted that: as shown in fig. 3, the storage node board 34 further includes: a temperature sensor 343 coupled to the first BMC341, wherein one or more of the storage node boards 34 may be configured.
The SAS expander of the JBOD on the storage node board automatically performs topology discovery to obtain information on all hard disks in the SAS switch domain, including the Port ID of the SAS expander port to which each hard disk is connected, the SAS address, device type, interface type, capacity, hard disk serial number, and so on.
The simple hard disk cluster JBOD described above includes a SAS expander and hard disks.
The JBOD and the temperature sensor are connected to the first BMC through out-of-band management, where the out-of-band channels include but are not limited to Ethernet, UART (Universal Asynchronous Receiver/Transmitter) and I2C (Inter-Integrated Circuit); the specific connection depends on the capabilities of the HBA, SAS Switch and SAS expander chips. The server node board, the switching node board and the storage node board are all connected to the RMC through Ethernet.
The SAS topology information includes the port number (Port ID) of the SAS Switch port to which the HBA is connected, and the SAS address. The hard disk information includes the port information, capacity, device type and interface type of the hard disk.
In addition, the SMART (Self-Monitoring, Analysis and Reporting Technology) information records the state of the hard disk, including its operating time, operating parameters and operating temperature. Because the temperature is obtained directly by querying the SMART information of the hard disk, the heat dissipation of the system is not affected, and the hardware temperature can be monitored conveniently and reliably.
The RMC serves as a cabinet management node in the system, and the BMCs of all the node boards are exchanged and aggregated through the ethernet to form a management plane of the cabinet server system.
The management client 35 serves as a network management background in the system, is connected with the RMC through the ethernet, and is responsible for version management, fault management, resource management and the like of the node board of the whole server cabinet. A web GUI is provided and a user can graphically manage storage resources in the rack server system.
In addition, if the cabinet management board RMC receives only one of the SMART information, the SAS topology information and the hard disk information, this indicates that an abnormality has occurred on the IO paths from which no information was reported. Whereas the conventional RMC cannot detect hard disks or cover the entire IO path of a server node (for example, the RMC in FIG. 2 cannot detect a broken link between the HBA and the SAS Switch), the embodiment of the invention can isolate and replace a failed hard disk.
As shown in fig. 3, in order to implement reporting of the temperature information in the SMART information recorded by the HBA324 to the RMC, in the system for storage resource management according to the embodiment of the present invention, the server node board includes: a second BMC321 connected to the server node board of the RMC, a host bus adapter HBA324 connected to the second BMC321, a BIOS322, and a temperature sensor 323, wherein the second BMC321 is configured to manage temperature information in the SMART information recorded by the HBA 324.
In the embodiment of the invention, the second BMC of the server node board manages HBA, so that the defect that an SAS expander out-of-band management channel cannot directly acquire SMART information of a hard disk can be overcome.
It should be noted that: one or more server node boards can be configured, the state of the hard disk is monitored through the second BMC, the IO path of the server node is detected, the heat dissipation system is adjusted according to the actual temperature of the hard disk, and the failed hard disk is isolated.
As shown in FIG. 3, in order to report the SAS topology information obtained from the SAS Switch 332 to the RMC, in the system for storage resource management according to an embodiment of the present invention, the switching node board includes: a third BMC 331 connected to the RMC, and a SAS Switch 332 and a temperature sensor 333 connected to the third BMC 331, wherein the third BMC 331 is configured to manage the SAS topology information acquired from the SAS Switch 332.
In the embodiment of the invention, the third BMC of the switching node board manages the SAS Switch, which shields the RMC from the differences between SAS Switches of different manufacturers.
It should be noted that at least two switching node boards are configured to form a multi-path redundancy design, and the SAS Switch connected to the third BMC automatically discovers information on all SAS devices and SAS expanders in the SAS switch domain, including the Port ID of the SAS Switch port to which each HBA is connected and the SAS address.
The JBOD on the storage node board and the HBA are both connected to the SAS Switch, and the first BMC, the second BMC and the third BMC are all connected to the RMC, so that the RMC can manage all resources from the SAS Switch side. This avoids the situation shown in FIG. 2, in which management is impossible because the RMC is not connected to the HBA and the JBOD. The specific detection method in the embodiment of the invention is as follows:
Step 101: The RMC issues a command to the second BMC in the server node board to detect the IO path.
Step 102: The second BMC commands the HBA to acquire the SAS address of the SAS Switch on the switching node board through a SCSI command. If the acquisition fails, an exception is reported.
Step 103: The second BMC commands the HBA to acquire the SAS address of the SAS expander on the storage node board through a SCSI command. If the acquisition fails, an exception is reported.
Step 104: The second BMC commands the HBA to acquire the serial number of the hard disk on the storage node board through a SCSI command. If the acquisition fails, an exception is reported.
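Steps 101 to 104 probe the IO path hop by hop, so the first failing stage localizes the broken link. A minimal sketch of that control flow follows. The probe callables stand in for the SCSI queries issued through the HBA and are purely hypothetical, as are the stage names; stopping at the first failure is a choice made in this example, since the patent only states that an exception is reported.

```python
from typing import Callable, List, Optional, Tuple

# A probe returns the value it read, or None if the query failed.
Probe = Callable[[], Optional[str]]


def check_io_path(probes: List[Tuple[str, Probe]]):
    """Run the probes in order and stop at the first failure.

    Returns (ok, failed_stage, results): `failed_stage` names the first
    hop that could not be read, which is where an exception is reported.
    """
    results = {}
    for stage, probe in probes:
        value = probe()
        if value is None:
            return False, stage, results
        results[stage] = value
    return True, None, results
```

For example, if the SAS Switch address can be read but the SAS expander address cannot, the fault lies between the switching node board and the storage node board.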
The system of the embodiment of the invention is illustrated below with a concrete example.
For ease of description, FIG. 4 shows only two server node boards 32, one switching node board 33, and two storage node boards 34. For example, the two storage node boards 34 hold three hard disks in total, and the user wishes that in the initial state no hard disk is visible to any server node board; hard disks are then assigned by the RMC to specific server node boards 32 based on user configuration, for example hard disk HDD1 (Hard Disk Drive) to HBA1 and hard disk HDD2 to HBA2. An unallocated hard disk is referred to as a masterless disk; in FIG. 4, hard disk HDD3 is a masterless disk. Because the slot position of each server node board is fixed, the SAS Switch port to which its HBA connects is also determined. The SAS Switch does not need to care whether a server node board is present in a given slot: the ZONEs of all server node board slots are allocated during initialization, and a server node board can see its allocated hard disks as soon as it is powered on.
In order to implement resource allocation, in the system for managing storage resources according to the embodiment of the present invention, the RMC obtains an allocation request for a first hard disk and parses the port and capacity information of the first hard disk from the allocation request; then, according to that port and capacity information, it adds the port of the first hard disk into a partner hard disk partition matched with the capacity of the first hard disk, wherein the partner hard disk partition is a partition for which a one-to-one correspondence with a server node board partition has been established in advance.
In the embodiment of the invention, dynamic resource allocation is realized before the server node is powered on. The storage resource allocation method only needs to synchronize the ZPT table once, during initialization; when the SAS topology subsequently changes, SAS zoning can be configured simply by changing the group ID (group identification code) of the SAS PHY corresponding to the hard disk.
It should be noted that the server node ZONE and the partner hard disk ZONE are set through the SAS Switch. The server node ZONE refers to the group ID attribute of the SAS PHY (physical layer), on the SAS Switch of the switching node board, that corresponds to a server node board slot. For example, the SAS Switch changes the group ID of port P1, which corresponds to HBA1, to group 8 (SAS zoning specifies that groups 0-7 are reserved groups), and the group ID of port P2, which corresponds to HBA2, to group 9. The partner hard disk ZONE refers to the ZONE with which the server node can communicate. For example, the partner hard disk ZONE of HBA1 is group 64, and the partner hard disk ZONE of HBA2 is group 65.
The SAS expander sets all hard disks to belong to the same default ZONE, and a reserved group is usually selected as the default ZONE. For example, SAS expander1 sets the group ID of port P3, corresponding to hard disk HDD1, to group 0; SAS expander2 sets the group ID of port P4, corresponding to hard disk HDD2, to group 0 and the group ID of port P5, corresponding to hard disk HDD3, to group 0.
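With the zone groups fixed at initialization, allocating a disk reduces to rewriting the group ID of the one PHY that the disk hangs off. The sketch below uses the port and group numbers from the FIG. 4 example; the function names and the dictionary representation are hypothetical, and returning a released disk to the default group 0 is an assumption, since the text only says the disk is removed from the partner partition.

```python
DEFAULT_GROUP = 0  # reserved group used as the default ZONE in the example

# Partner hard disk ZONE of each HBA, as given in the FIG. 4 example.
PARTNER_ZONE = {"HBA1": 64, "HBA2": 65}

# Expander port (PHY) behind which each disk sits, per the example.
DISK_PORT = {"HDD1": "P3", "HDD2": "P4", "HDD3": "P5"}


def initial_phy_groups():
    """All disk-facing PHYs start in the default ZONE."""
    return {port: DEFAULT_GROUP for port in DISK_PORT.values()}


def allocate(phy_group, disk, hba):
    """Move the disk's PHY into the partner ZONE of the given HBA."""
    phy_group[DISK_PORT[disk]] = PARTNER_ZONE[hba]


def release(phy_group, disk):
    """Assumed de-allocation: return the disk's PHY to the default ZONE."""
    phy_group[DISK_PORT[disk]] = DEFAULT_GROUP


def masterless_disks(phy_group):
    """Disks still in the default ZONE, i.e. not assigned to any server."""
    return sorted(d for d, p in DISK_PORT.items()
                  if phy_group[p] == DEFAULT_GROUP)
```

Because only a PHY attribute changes, the ZPT itself never has to be resynchronized after initialization.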
The pre-establishment means that the switching node board SAS Switch sets a server node board ZONE and a partner hard disk ZONE at initialization, so that the server node board ZONE and the partner hard disk ZONE are in a one-to-one correspondence relationship.
The storage node board SAS expander sets all hard disks to belong to the same default ZONE.
The SAS Switch sets a ZPT table, allowing only the server node board ZONE to communicate with a partner hard disk ZONE.
The SAS Switch synchronizes ZPT to all SAS expanders upon topology discovery.
The SAS Switch creates a ZPT table (the SAS protocol uses this table to partition communication), so that the server node ZONE and the partner hard disk ZONE can establish communication. For example, the ZPT table created by the SAS Switch is shown in table 1 below.
[Table 1: ZPT table created by the SAS Switch]
The X-axis of the ZPT table represents the source ZONE and the Y-axis represents the destination ZONE. ZP(X, Y) = 0 indicates that group X and group Y cannot communicate, and ZP(X, Y) = 1 indicates that they can. M represents the maximum group number, which depends on the capability of the SAS expander chip; generally 128 groups are supported. The example of fig. 4 sets ZP(8, 64) and ZP(64, 8) to 1 so that group 8 and group 64 can communicate with each other, and sets ZP(9, 65) and ZP(65, 9) to 1 so that group 9 and group 65 can communicate with each other.
The SAS Switch synchronizes the ZPT to the SAS expanders; for example, SAS Switch1 and SAS Switch2 synchronize the ZPT to SAS expander1 and SAS expander2.
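The zone permission table described above can be modelled as a symmetric 0/1 matrix. The following is a minimal Python sketch, not the SAS Switch firmware: the class name `ZonePermissionTable`, its methods, and the use of a plain copy to stand in for synchronization are illustrative assumptions; the group numbers and the size of 128 groups come from the example above.

```python
class ZonePermissionTable:
    """Illustrative ZPT model: zp[x][y] == 1 means group x may talk to group y."""

    def __init__(self, max_groups=128):
        self.max_groups = max_groups
        self.zp = [[0] * max_groups for _ in range(max_groups)]

    def allow(self, group_a, group_b):
        # Permissions are set in both directions, as with ZP(8, 64) and ZP(64, 8).
        self.zp[group_a][group_b] = 1
        self.zp[group_b][group_a] = 1

    def can_communicate(self, src_group, dst_group):
        return self.zp[src_group][dst_group] == 1

    def copy(self):
        # Stand-in for the one-time synchronization of the ZPT to an expander.
        clone = ZonePermissionTable(self.max_groups)
        clone.zp = [row[:] for row in self.zp]
        return clone


def build_initial_zpt():
    zpt = ZonePermissionTable()
    zpt.allow(8, 64)   # HBA1 zone <-> its partner hard disk zone
    zpt.allow(9, 65)   # HBA2 zone <-> its partner hard disk zone
    return zpt


switch_zpt = build_initial_zpt()
expander_zpts = {name: switch_zpt.copy() for name in ("SAS expander1", "SAS expander2")}
print(switch_zpt.can_communicate(8, 64))  # True
print(switch_zpt.can_communicate(8, 0))   # False: in this model group 0 has no permissions
```

Because the table itself never changes after initialization, later allocation only needs to move a hard disk's PHY between groups.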
The embodiment of the invention is realized as follows.
Step 201: at the management client 35, the user assigns hard disks to the server node boards, for example hard disk HDD1 to HBA1 and hard disk HDD2 to HBA2, and issues the command to the RMC.
Step 202: the RMC issues a command to the first BMC.
Step 203: the first BMC instructs the SAS expander to modify the partner hard disk ZONE. Here the partner hard disk ZONE refers to the group ID attribute of the SAS PHY, on the storage node board's SAS expander, that corresponds to the hard disk. For example, SAS expander1 modifies the group ID of port P3 corresponding to hard disk HDD1 to group 64, and SAS expander2 modifies the group ID of port P4 corresponding to hard disk HDD2 to group 65.
Step 204: the resource allocation is successful and the RMC saves the user data.
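Steps 201 to 204 amount to changing one PHY group ID per assigned hard disk. A rough Python sketch follows; the names `PARTNER_ZONE`, `phy_group`, `rmc_allocations` and `allocate_disk` are hypothetical, and the port and group numbers are the example values used above.

```python
DEFAULT_GROUP = 0

# Partner hard disk zone pre-established for each server node (example values).
PARTNER_ZONE = {"HBA1": 64, "HBA2": 65}

# Group ID of the expander PHY behind each hard disk port; all start in the default zone.
phy_group = {"P3": DEFAULT_GROUP, "P4": DEFAULT_GROUP, "P5": DEFAULT_GROUP}

# Allocation records the RMC keeps after a successful request (step 204).
rmc_allocations = {}


def allocate_disk(hba, disk_port):
    """Move the disk's PHY from the default zone into the HBA's partner zone."""
    if phy_group[disk_port] != DEFAULT_GROUP:
        raise ValueError(f"port {disk_port} is already allocated")
    phy_group[disk_port] = PARTNER_ZONE[hba]   # step 203: expander rewrites the group ID
    rmc_allocations[disk_port] = hba           # step 204: RMC saves the user data
    return phy_group[disk_port]


allocate_disk("HBA1", "P3")  # HDD1 -> group 64
allocate_disk("HBA2", "P4")  # HDD2 -> group 65
print(phy_group)  # {'P3': 64, 'P4': 65, 'P5': 0}
```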
In the embodiment of the invention, hardware resources are reported first, and hardware is then allocated to the server node according to the user configuration. A newly loaded hard disk belongs to group 0, the default group. The hard disk capacity and interface information are reported to the RMC, and the RMC, according to the reported information, assigns the hard disk the group ID of the server's partner ZONE (that is, the original group 0 is modified to group 64). In this way, the server obtains hard disks of the size and capacity required by the user's client instruction forwarded by the RMC.
In order to implement deletion of allocated resources, in the system for storage resource management according to the embodiment of the present invention, the RMC is further configured to obtain a deletion allocation request after the first hard disk is allocated, and to remove the first hard disk from the partner hard disk partition according to the deletion allocation request.
The embodiment of the invention is realized as follows.
Step 301: at the management client 35, the user issues a command to the RMC to delete a hard disk assigned to a server node board, such as the hard disk HDD1 assigned to HBA1.
Step 302: the RMC issues a command to the first BMC.
Step 303: the first BMC instructs the SAS expander to exit the hard disk from the partner hard disk ZONE. For example, SAS expander1 modifies the group ID of port P3 corresponding to hard disk HDD1 to group 0 (0 to 9), so that the hard disk exits group 64.
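Deletion is the inverse operation: the PHY returns to the default group. The Python sketch below is illustrative only; `release_disk` is a hypothetical helper, and dropping the RMC's allocation record at the same time is an assumption, since the steps above do not state what the RMC stores after deletion.

```python
DEFAULT_GROUP = 0


def release_disk(phy_group, allocations, disk_port):
    """Return the disk's PHY to the default zone and drop the allocation record."""
    if disk_port not in allocations:
        raise KeyError(f"port {disk_port} is not allocated")
    previous_group = phy_group[disk_port]
    phy_group[disk_port] = DEFAULT_GROUP   # step 303: group ID goes back to group 0
    del allocations[disk_port]             # assumed bookkeeping on the RMC side
    return previous_group


phy_group = {"P3": 64, "P4": 65, "P5": 0}
allocations = {"P3": "HBA1", "P4": "HBA2"}
left = release_disk(phy_group, allocations, "P3")
print(left, phy_group["P3"])  # 64 0
```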
To support a user pulling out a hard disk for maintenance or replacement, in the system for managing storage resources according to the embodiment of the present invention, the first BMC is further configured to acquire a second hard disk pull-out event and report the second hard disk pull-out event to the RMC, and the RMC deletes the second hard disk information.
To support a user inserting a new hard disk when adding or replacing a hard disk, in the system for managing storage resources according to the embodiment of the present invention, the first BMC is further configured to acquire a second hard disk insertion event, allocate the second hard disk to a preset default partition, and report the insertion event to the RMC.
It should be noted that: the default partition is group 0.
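The insertion and pull-out handling described above can be pictured as two event handlers on the first BMC. The sketch below is illustrative Python; `RmcInventory`, `StorageBmc` and the `capacity_gb` field are hypothetical names, not interfaces defined by the embodiment.

```python
DEFAULT_GROUP = 0


class RmcInventory:
    """Hypothetical RMC-side record of known hard disks, keyed by expander port."""

    def __init__(self):
        self.disks = {}

    def on_inserted(self, port, info):
        self.disks[port] = info

    def on_removed(self, port):
        # The RMC deletes the information of the pulled-out hard disk.
        self.disks.pop(port, None)


class StorageBmc:
    """Hypothetical first BMC: handles hot-plug events and reports them to the RMC."""

    def __init__(self, rmc):
        self.rmc = rmc
        self.phy_group = {}

    def handle_insert(self, port, info):
        self.phy_group[port] = DEFAULT_GROUP   # new disk joins the default partition
        self.rmc.on_inserted(port, info)

    def handle_remove(self, port):
        self.rmc.on_removed(port)


rmc = RmcInventory()
bmc = StorageBmc(rmc)
bmc.handle_insert("P5", {"capacity_gb": 4000})
print(bmc.phy_group["P5"], sorted(rmc.disks))  # 0 ['P5']
bmc.handle_remove("P5")
print(sorted(rmc.disks))  # []
```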
As shown in fig. 5, the RMC in the embodiment of the present invention may implement the above-mentioned multiple functions, such as hard disk management, resource deletion, IO path detection, and hard disk insertion.
As shown in fig. 6, an embodiment of the present invention provides a method for storage resource management, including:
step 61, the server node board obtains self-monitoring analysis and reporting technology SMART information and reports the information to a cabinet management board RMC;
step 62, the switching node board obtains serial connection small computer system interface SAS topology information and reports the information to the RMC;
step 63, the storage node board obtains the hard disk information and reports the hard disk information to the RMC, wherein the storage node board comprises: a first baseboard management controller BMC, in the storage node board, connected to the RMC, and a JBOD connected to the first BMC, wherein the first BMC is used for managing hard disk information acquired from the JBOD;
and step 64, the RMC manages storage resources according to one or more of the received SMART information, the received SAS topology information and the received hard disk information.
In the embodiment of the invention, the self-monitoring analysis and reporting technology SMART information, the serial attached small computer system interface SAS topology information and the hard disk information are acquired directly from the server node board, the switching node board and the storage node board respectively for management. The embodiment thereby optimizes the management plane of the existing cabinet server system and realizes out-of-band resource management and allocation by using the first BMC. Because the first BMC of the storage node board manages the JBOD, JBOD storage resources are acquired directly, which reduces the software complexity of the SAS Switch and improves transmission efficiency.
In order to better obtain the topology information, the specific process of obtaining the topology information of the present invention is as follows:
Step 401: the third BMC obtains SAS topology information from the SAS Switch; for example, the SAS Switch obtains the port information of HBA1 and HBA2 and the port information of its connections to SAS expander1 and SAS expander2.
Step 402: the third BMC reports the SAS topology information to the RMC.
Step 403: the RMC collects the SAS topologies reported by all switching node boards, for example comparing the SAS topologies reported by the respective SAS Switches, raises an alarm if the SAS topologies are inconsistent, and selects the optimal topology.
Step 404: the first BMC acquires SAS topology information from the SAS expander, such as SAS expander1 and SAS expander2, wherein the SAS topology information comprises the port information, capacity, device type and interface type of each hard disk.
Step 405: the first BMC reports the SAS topology to the RMC.
Step 406: the RMC collects the SAS topologies reported by all storage node boards, for example comparing the SAS topologies reported by SAS expander1 and SAS expander2 respectively, raises an alarm if the two SAS topologies are inconsistent, and selects the optimal topology. The optimal topology is the one with the highest hard disk count: for example, if 3 hard disks are detected through SAS expander1 and 2 hard disks are detected through SAS expander2, the number of hard disks is taken as 3 to avoid missing any hard disk.
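The selection rule in steps 403 and 406 (alarm on mismatch, keep the report with the most hard disks) can be sketched in a few lines of Python. The function `select_topology` and the representation of a report as a set of hard disk names are assumptions made for illustration.

```python
def select_topology(reports):
    """Pick the report listing the most hard disks; flag an alarm if reports differ.

    reports maps a reporter name to the set of hard disks it detected.
    """
    if not reports:
        raise ValueError("no topology reports received")
    distinct = {frozenset(disks) for disks in reports.values()}
    alarm = len(distinct) > 1
    best_reporter = max(reports, key=lambda name: len(reports[name]))
    return best_reporter, reports[best_reporter], alarm


reports = {
    "SAS expander1": {"HDD1", "HDD2", "HDD3"},
    "SAS expander2": {"HDD1", "HDD2"},
}
reporter, disks, alarm = select_topology(reports)
print(reporter, len(disks), alarm)  # SAS expander1 3 True
```

When two reports list the same number of hard disks, `max` keeps the first one in dictionary order; the source does not specify a tie-breaking rule.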
In a method of storage resource management according to still another embodiment of the present invention, the server node board includes: a second baseboard management controller BMC, in the server node board, connected to the RMC, and a host bus adapter HBA, a basic input output system BIOS and a temperature sensor, each connected to the second BMC in an out-of-band management mode, wherein the second BMC manages the temperature information in the SMART information recorded by the HBA.
In the embodiment of the invention, the second BMC in the server node board manages the HBA, which overcomes the defect that the SMART information of a hard disk cannot be acquired directly through the out-of-band management channel of the SAS expander.
In a method of storage resource management according to still another embodiment of the present invention, the switching node board includes: a third baseboard management controller BMC, in the switching node board, connected to the RMC, and a serial attached small computer system interface switch SAS Switch and a temperature sensor, each connected to the third BMC in an out-of-band management mode, wherein the third BMC is configured to manage SAS topology information acquired from the SAS Switch.
In the embodiment of the invention, the third BMC in the switching node board manages the SAS Switch, which shields the RMC from the differences among SAS Switches of different manufacturers.
In the method for storage resource management according to still another embodiment of the present invention, the SAS topology information includes the Port ID of the port through which the HBA connects to the SAS Switch, and the SAS address.
In the embodiment of the invention, the first BMC and the second BMC are respectively connected with the JBOD and the HBA, the first BMC manages the JBOD, and can also directly acquire JBOD storage resources, thereby simplifying the software complexity of the SAS Switch.
In the method for storage resource management according to still another embodiment of the present invention, the hard disk information includes port information, capacity, device type, and interface type of the hard disk.
In the method for managing storage resources according to another embodiment of the present invention, the RMC obtains an allocation request of a first hard disk, and analyzes port and capacity information of the first hard disk in the allocation request; and adding the port of the first hard disk into a partner hard disk partition matched with the capacity of the first hard disk according to the port of the first hard disk and the capacity information, wherein the partner hard disk partition is a partition in which a one-to-one correspondence relation with a server node board partition is established in advance.
The storage resource allocation method adopted by the invention can conveniently realize SAS zoning configuration by only synchronizing the ZPT table once during initialization and changing the group ID of the SAS PHY (Physical layer) corresponding to the hard disk when the subsequent SAS topology changes.
In the method for storage resource management according to still another embodiment of the present invention, after the first hard disk is allocated, the RMC obtains a deletion allocation request, and exits the first hard disk from a partner hard disk partition according to the deletion allocation request.
In the method for managing storage resources according to still another embodiment of the present invention, the first BMC acquires a second hard disk pull-out event, reports the second hard disk pull-out event to the RMC, and the RMC deletes second hard disk information.
In the method for managing storage resources according to still another embodiment of the present invention, the first BMC acquires an insertion event of a second hard disk, allocates the second hard disk to a preset default partition, and reports the insertion event to the RMC.
In a method of storage resource management of yet another embodiment of the present invention:
the RMC sends a command for querying the running state of a third hard disk to the second BMC;
the RMC receives the SMART information fed back by the second BMC from the HBA, and regulates the overall heat dissipation system according to the actual temperature of the third hard disk in the SMART information;
the RMC receives the SMART information fed back by the second BMC from the HBA, and judges whether the third hard disk has failed according to the running state of the third hard disk in the SMART information;
and the RMC raises an alarm and isolates the third hard disk after the third hard disk fails.
In the embodiment of the invention, the system can be dynamically regulated to dissipate heat according to the temperature of the hard disk and the running state of the hard disk can be monitored.
It should be noted that: the SMART information is information including the operation time, operation parameters, operation temperature, etc. of the hard disk, and is used for recording the state of the hard disk.
The flow of hard disk management for an RMC is exemplified as follows:
Step 501: the RMC issues a command to the second BMC to query the running state of the hard disk.
Step 502: the second BMC in the server node board instructs the HBA to query the SMART information of the hard disk.
Step 503: the RMC acquires the running state of the hard disk.
Step 504: the RMC regulates the heat dissipation system according to the actual temperature of the hard disk.
Step 505: the RMC discovers a failed disk (by comparison against normal parameters), raises an alarm, and isolates the disk (i.e., exits the hard disk).
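Steps 503 to 505 can be illustrated with a small Python sketch. Everything specific here is an assumption: the `SmartReport` fields, the 35 to 55 degree thresholds, the 30 to 100 percent duty range, and the choice to drive the fans from the hottest disk are invented for the example, since the source gives no control law.

```python
from dataclasses import dataclass


@dataclass
class SmartReport:
    disk: str
    temperature_c: float
    healthy: bool


def fan_duty_percent(temperature_c, low_c=35.0, high_c=55.0):
    """Linear fan curve between two assumed thresholds, clamped to 30..100 percent."""
    if temperature_c <= low_c:
        return 30
    if temperature_c >= high_c:
        return 100
    span = (temperature_c - low_c) / (high_c - low_c)
    return round(30 + span * 70)


def manage_disks(reports):
    """Return (fan duty driven by the hottest disk, disks to alarm on and isolate)."""
    hottest = max(r.temperature_c for r in reports)
    failed = [r.disk for r in reports if not r.healthy]
    return fan_duty_percent(hottest), failed


reports = [
    SmartReport("HDD1", 41.0, True),
    SmartReport("HDD2", 45.0, True),
    SmartReport("HDD3", 38.0, False),
]
duty, isolate = manage_disks(reports)
print(duty, isolate)  # 65 ['HDD3']
```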
The embodiment of the present invention addresses the error in temperature data reported by conventional single-sensor monitoring. Because there are many storage boards, not every storage board can have its own sensor, and the distance between a sensor and a storage board also affects the measured data; the hard disk temperature monitored by a temperature sensor may therefore differ from the actual temperature inside the hard disk, which degrades the heat dissipation effect of the system and shortens the life of the hard disk. As shown in fig. 2, only the HBA in the server node can act as an initiator device and obtain the SMART information through the SCSI (Small Computer System Interface) protocol. Using the temperature in the SMART information avoids degrading the heat dissipation effect of the system, and the hardware temperature is monitored conveniently and reliably.
It should be noted that, the apparatus provided by the present invention is an apparatus applying the method for managing storage resources, and all embodiments of the method for managing storage resources are applicable to the apparatus and can achieve the same or similar beneficial effects.
While the foregoing is directed to the preferred embodiment of the present invention, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the invention as defined in the appended claims.

Claims (17)

1. A system for storage resource management, comprising:
the equipment cabinet management board RMC is used for managing storage resources according to one or more of received self-monitoring analysis and reporting technology SMART information, serial connection small computer system interface SAS topological information and hard disk information; the RMC is also used for analyzing the port and capacity information of the first hard disk in the allocation request when the allocation request of the first hard disk is obtained; adding the port of the first hard disk into a partner hard disk partition matched with the capacity of the first hard disk according to the port of the first hard disk and the capacity information, wherein the partner hard disk partition is a partition in which a one-to-one correspondence relation with a server node board partition is established in advance;
the server node board is connected with the RMC and used for acquiring the SMART information and reporting the SMART information to the RMC;
the switching node board is connected with the RMC and used for acquiring the SAS topology information and reporting the SAS topology information to the RMC;
and the storage node board is connected with the RMC and used for acquiring the hard disk information and reporting the hard disk information to the RMC, wherein the storage node board comprises: a first baseboard management controller BMC, in the storage node board, connected to the RMC, and a JBOD connected to the first baseboard management controller BMC, wherein the first baseboard management controller BMC is used for managing hard disk information acquired from the JBOD.
2. The storage resource management system of claim 1, wherein the server node board comprises: the system comprises a second baseboard management controller BMC connected to the server node board of the RMC, and a host bus adapter HBA, a basic input output system BIOS and a temperature sensor which are respectively connected to the second baseboard management controller BMC in an out-of-band management mode, wherein the second baseboard management controller BMC is used for managing temperature information recorded by the HBA in the SMART information.
3. The system for storage resource management according to claim 1, wherein said switching node board comprises: a third baseboard management controller BMC, in the switching node board, connected to the RMC, and a serial attached small computer system interface switch SAS Switch and a temperature sensor, each connected to the third baseboard management controller BMC in an out-of-band management mode, wherein the third baseboard management controller BMC is configured to manage SAS topology information acquired from the SAS Switch.
4. The system for storage resource management according to claim 1 or 3, wherein the SAS topology information comprises the Port ID of the port through which the HBA connects to the SAS Switch, and the SAS address.
5. The system for storage resource management according to claim 1, wherein the hard disk information comprises port information, capacity, device type, and interface type of the hard disk.
6. The storage resource management system of claim 1,
the RMC is further configured to obtain a deletion allocation request after the first hard disk is allocated, and quit the first hard disk from the partner hard disk partition according to the deletion allocation request.
7. The storage resource management system of claim 1,
the first baseboard management controller BMC is further used for acquiring a second hard disk pull-out event, reporting the second hard disk pull-out event to the RMC, and deleting second hard disk information by the RMC.
8. The storage resource management system of claim 1,
the first baseboard management controller BMC is further configured to acquire a second hard disk insertion event, allocate the second hard disk to a preset default partition, and report the insertion event to the RMC.
9. A method of storage resource management, comprising:
the server node board acquires self-monitoring analysis and reporting technology SMART information and reports the information to the cabinet management board RMC;
the switching node board acquires SAS topology information of a serial connection small computer system interface and reports the SAS topology information to the RMC;
the storage node board acquires hard disk information and reports the hard disk information to the RMC, wherein the storage node board comprises: a first baseboard management controller BMC, in the storage node board, connected to the RMC, and a JBOD connected to the first baseboard management controller BMC, wherein the first baseboard management controller BMC is used for managing hard disk information acquired from the JBOD;
the RMC manages storage resources according to one or more of the received SMART information, the received SAS topology information and the received hard disk information;
the RMC obtains an allocation request of a first hard disk and analyzes port and capacity information of the first hard disk in the allocation request; and adding the port of the first hard disk into a partner hard disk partition matched with the capacity of the first hard disk according to the port of the first hard disk and the capacity information, wherein the partner hard disk partition is a partition in which a one-to-one correspondence relation with a server node board partition is established in advance.
10. The method of storage resource management according to claim 9, wherein said server node board comprises: the system comprises a second baseboard management controller BMC connected to a server node board of the RMC, a host bus adapter HBA respectively connected to the second baseboard management controller BMC in an out-of-band management mode, a basic input output system BIOS and a temperature sensor, wherein the second baseboard management controller BMC manages temperature information in the SMART information recorded by the HBA.
11. The method of storage resource management according to claim 9, wherein said switching node board comprises: a third baseboard management controller BMC, in the switching node board, connected to the RMC, and a serial attached small computer system interface switch SAS Switch and a temperature sensor, each connected to the third baseboard management controller BMC in an out-of-band management mode, wherein the third baseboard management controller BMC is configured to manage SAS topology information acquired from the SAS Switch.
12. The method for storage resource management according to claim 9 or 11, wherein the SAS topology information includes the Port ID of the port through which the HBA connects to the SAS Switch, and the SAS address.
13. The method of storage resource management according to claim 9, wherein the hard disk information comprises port information, capacity, device type, and interface type of the hard disk.
14. The method of storage resource management according to claim 9,
after the first hard disk is allocated, the RMC acquires a deletion allocation request, and removes the first hard disk from the partner hard disk partition according to the deletion allocation request.
15. The method of storage resource management according to claim 9,
the first baseboard management controller BMC acquires a second hard disk pull-out event and reports the second hard disk pull-out event to the RMC, and the RMC deletes the second hard disk information.
16. The method of storage resource management according to claim 9,
the first baseboard management controller BMC acquires a second hard disk insertion event, allocates the second hard disk to a preset default partition, and reports the insertion event to the RMC.
17. The method of storage resource management according to claim 10,
the RMC sends a command for inquiring the running state of a third hard disk to the second baseboard management controller BMC;
the RMC receives the SMART information fed back by the second baseboard management controller BMC from the HBA, and regulates the overall heat dissipation system according to the actual temperature of the third hard disk in the SMART information;
the RMC receives the SMART information fed back by the second baseboard management controller BMC from the HBA, and judges whether the third hard disk has failed according to the running state of the third hard disk in the SMART information;
and the RMC raises an alarm and isolates the third hard disk after the third hard disk fails.
CN201510369563.2A 2015-06-29 2015-06-29 System and method for managing storage resources Active CN106325761B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201510369563.2A CN106325761B (en) 2015-06-29 2015-06-29 System and method for managing storage resources
PCT/CN2016/080051 WO2017000639A1 (en) 2015-06-29 2016-04-22 Storage resource management system and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510369563.2A CN106325761B (en) 2015-06-29 2015-06-29 System and method for managing storage resources

Publications (2)

Publication Number Publication Date
CN106325761A CN106325761A (en) 2017-01-11
CN106325761B true CN106325761B (en) 2020-04-28

Family

ID=57607686

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510369563.2A Active CN106325761B (en) 2015-06-29 2015-06-29 System and method for managing storage resources

Country Status (2)

Country Link
CN (1) CN106325761B (en)
WO (1) WO2017000639A1 (en)

Families Citing this family (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106817413A (en) * 2017-01-11 2017-06-09 郑州云海信息技术有限公司 A kind of control node method for start-up and shutdown, system and RMC
US10310745B2 (en) 2017-05-19 2019-06-04 Samsung Electronics Co., Ltd. Method and apparatus for fine tuning and optimizing NVMe-oF SSDs
CN107315542B (en) * 2017-06-29 2020-11-20 苏州浪潮智能科技有限公司 JBOD cascade system
CN107577580A (en) * 2017-09-18 2018-01-12 郑州云海信息技术有限公司 A kind of cabinet management system and method
CN107783841B (en) * 2017-11-10 2020-07-21 苏州浪潮智能科技有限公司 RACK server pooling product management optimization method
CN108197009A (en) * 2018-02-02 2018-06-22 郑州云海信息技术有限公司 A kind of Jbod system log acquisition methods, system, medium and equipment
CN108460126A (en) * 2018-02-28 2018-08-28 郑州云海信息技术有限公司 A kind of acquisition methods and device of the daily record of equipment cabinet server unit of memory allocation
CN108616390B (en) * 2018-04-11 2019-09-13 新华三信息技术有限公司 The realization device of girff management method, device and girff management
CN108762667A (en) * 2018-04-20 2018-11-06 烽火通信科技股份有限公司 The method that the multi node server of disk can be dynamically distributed and dynamically distribute disk
CN109189326B (en) * 2018-07-25 2020-09-08 华为技术有限公司 Management method and device of distributed cluster
CN109376052A (en) * 2018-09-10 2019-02-22 联想(北京)有限公司 It is a kind of to monitor the method for disk state, electronic equipment
CN109189644B (en) * 2018-09-17 2021-10-22 郑州云海信息技术有限公司 Whole cabinet RMC, and method and system for automatically configuring number of newly added nodes of whole cabinet
CN109586994B (en) * 2018-11-01 2022-02-18 郑州云海信息技术有限公司 Aging test monitoring method and system for server of whole cabinet
CN110515540B (en) * 2019-07-26 2022-07-22 苏州浪潮智能科技有限公司 Method and device for topology of hard disk
CN110601887B (en) * 2019-09-06 2021-11-26 苏州浪潮智能科技有限公司 Out-of-band management method, server, computer storage medium and terminal
CN110941535A (en) * 2019-11-22 2020-03-31 山东超越数控电子股份有限公司 Hard disk load balancing method
CN111045602B (en) * 2019-11-25 2024-01-26 浙江大华技术股份有限公司 Cluster system control method and cluster system
CN111092759B (en) * 2019-12-13 2022-12-16 苏州浪潮智能科技有限公司 Log management method, device and medium in JBOD (just in Bunch) out-of-band management system
TWI796211B (en) * 2022-04-29 2023-03-11 神雲科技股份有限公司 Port setting method
CN117311634B (en) * 2023-10-08 2024-04-12 无锡众星微系统技术有限公司 Method and device for processing disk exception of SAS (serial attached small computer system interface) storage system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2011227841A (en) * 2010-04-23 2011-11-10 Hitachi Ltd Computer system and identifier management method
CN102510344A (en) * 2011-11-23 2012-06-20 华为技术有限公司 Rack server system
CN103154927A (en) * 2010-10-16 2013-06-12 惠普发展公司,有限责任合伙企业 Device hardware agent
CN103793307A (en) * 2012-10-31 2014-05-14 英业达科技有限公司 Electronic device, management method thereof and cabinet servo system

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI403884B (en) * 2010-11-30 2013-08-01 Inventec Corp Rack server system
TWI437426B (en) * 2011-07-08 2014-05-11 Quanta Comp Inc Rack server system
CN104683147B (en) * 2015-01-27 2018-10-30 加弘科技咨询(上海)有限公司 It is a kind of to large-scale data central hardware management method and system

Also Published As

Publication number Publication date
WO2017000639A1 (en) 2017-01-05
CN106325761A (en) 2017-01-11

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant