WO2016203565A1 - Computer system and control method

Publication number
WO2016203565A1
Authority
WO
WIPO (PCT)
Prior art keywords
failure
pci express
wiring
server
storage
Application number
PCT/JP2015/067434
Other languages
French (fr)
Japanese (ja)
Inventor
勝美 大内
功 大原
Original Assignee
株式会社日立製作所
Application filed by 株式会社日立製作所
Priority to PCT/JP2015/067434
Publication of WO2016203565A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/38 Information transfer, e.g. on bus

Definitions

  • the present invention relates to a computer system.
  • FC Fiber Channel
  • according to Non-patent Document 1, in the PCI Express (PCIe) standard, a Root Complex (RC) and an Endpoint (EP) connected to the RC form a PCIe tree.
  • the EP detects an error by the Advanced Error Reporting (AER) function defined by PCIe
  • the EP transmits an error message to the RC upstream of the PCIe tree.
  • the RC issues an interrupt to the CPU, and the CPU confirms the register value indicating the failure, thereby identifying the failure site and the failure type.
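  • the reporting flow described above (the EP detects an error, sends an error message upstream, the RC interrupts the CPU, and the CPU reads the failure registers) can be illustrated with a toy model. This is only a sketch under stated assumptions: the class and function names are hypothetical, and the `error_status` set merely stands in for the failure-indicating register; it does not reproduce the AER register layout of the PCIe specification.

```python
from dataclasses import dataclass, field


@dataclass
class Endpoint:
    name: str
    # Stand-in for the failure-indicating register; not the real AER layout.
    error_status: set = field(default_factory=set)


@dataclass
class RootComplex:
    endpoints: dict = field(default_factory=dict)
    pending_interrupt: bool = False

    def attach(self, ep: Endpoint) -> None:
        self.endpoints[ep.name] = ep

    def receive_error_message(self) -> None:
        # An error message from downstream makes the RC interrupt the CPU.
        self.pending_interrupt = True


def detect_error(rc: RootComplex, ep: Endpoint, error_type: str) -> None:
    """EP side: record the error, then send an error message upstream."""
    ep.error_status.add(error_type)
    rc.receive_error_message()


def handle_interrupt(rc: RootComplex) -> list:
    """CPU side: read the registers to find each failure site and type."""
    if not rc.pending_interrupt:
        return []
    rc.pending_interrupt = False
    return sorted(
        (name, err)
        for name, ep in rc.endpoints.items()
        for err in ep.error_status
    )
```

  • in this sketch, calling `detect_error` for one endpoint and then `handle_interrupt` returns a list of (endpoint name, error type) pairs, which corresponds to the CPU identifying the failure site and the failure type.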
  • the computer system is managed by dividing it into a server side management range and a storage side management range.
  • a failure detected in the server-side management range is notified to the server, and a failure detected in the storage-side management range is notified to the storage.
  • PCI Express Base Specification Revision 3.1
  • PCI-SIG, September 8, 2014 (pages 50-55 and 506-520)
  • when a failure of a PCIe device is detected within the storage-side management range, the failure occurrence site may be not that PCIe device but a PCIe device within the server-side management range. In this case, the storage recognizes the failure, but the server does not, because no failure is detected within the server-side management range. Thus, in a computer system including a server-side management range and a storage-side management range, the administrator may be unable to identify the failed part when a PCIe path failure occurs.
  • a computer system includes a storage device, a storage controller connected to the storage device, and a switch connected to the storage controller via a first PCI-Express wiring.
  • the storage controller includes a Root Complex of a first PCI Express tree
  • the server processor includes a Root Complex of a second PCI Express tree
  • the switch module belongs to the first PCI Express tree
  • the interface device includes a first Endpoint connected to the second PCI Express wiring and belonging to the first PCI Express tree, and a second Endpoint connected to the third PCI Express wiring and belonging to the second PCI Express tree.
  • when the interface device detects a first failure of the second PCI Express wiring, the interface device executes a first change process for changing the state of the second PCI Express wiring and transmits a first server-side message indicating the first failure to the server processor via the third PCI Express wiring. The storage controller detects the first failure based on the change in the state of the second PCI Express wiring, and the server processor detects the first failure based on the first server-side message.
  • the configuration of the computer system is shown.
  • the path configuration inside and between the server chassis 100 and the storage chassis 200 is shown.
  • the structure of the server side PCIe tree and the storage side PCIe tree is shown.
  • the configuration of the connection between the I / F-LSI 105 and the PCIe-SW 112 is shown.
  • the program of each part in a computer system is shown.
  • the first failure process is shown.
  • the second failure processing is shown.
  • the third failure processing is shown.
  • the configuration for controlling the power supply of the PCIe-SW 112 is shown.
  • the first part of the first power-on process is shown.
  • the second part, following the first part, of the first power-on process is shown.
  • the first part of the second power-on process is shown.
  • information may be described using the expression “xxx table”, but the information may be expressed in any data structure. That is, “xxx table” can be referred to as “xxx information” to indicate that the information does not depend on the data structure.
  • the configuration of each table is an example; one table may be divided into two or more tables, or all or part of two or more tables may be combined into a single table.
  • an ID is used as element identification information, but other types of identification information may be used instead of or in addition thereto.
  • when elements of the same type are described without distinction, a reference number or the common number in the reference numbers is used; when elements of the same type are described with distinction, the reference number of each element is used.
  • an ID assigned to the element may be used instead of the reference code.
  • an I / O (Input / Output) request is a write request or a read request, and may be referred to as an access request.
  • the process may be described using “program” as a subject.
  • the program is executed by a processor (for example, a CPU (Central Processing Unit)), so that a predetermined processing is appropriately performed. Since processing is performed using a storage resource (for example, a memory) and / or an interface device (for example, a communication port), the subject of processing may be a processor.
  • the process described with the program as the subject may be a process or system performed by a processor or an apparatus having the processor.
  • the processor may include a hardware circuit that performs a part or all of the processing.
  • the program may be installed in a computer-like device from a program source.
  • the program source may be, for example, a storage medium that can be read by a program distribution server or a computer.
  • the program distribution server may include a processor (for example, a CPU) and a storage resource, and the storage resource may further store a distribution program and a program to be distributed. Then, the processor of the program distribution server executes the distribution program, so that the processor of the program distribution server may distribute the distribution target program to other computers.
  • two or more programs may be realized as one program, or one program may be realized as two or more programs.
  • the management system may be composed of one or more computers.
  • when the management computer displays display information (specifically, for example, the management computer displays the information on its own display device, or transmits the display information to a remote display computer), the management computer is the management system.
  • the plurality of computers may include a display computer when the display computer performs display
  • the management computer (eg, management system) may include an interface device connected to the I / O system including the display system, a storage resource (eg, memory), and a processor connected to the interface device and the storage resource.
  • the display system may be a display device included in the management computer or a display computer connected to the management computer.
  • the I / O system may be an I / O device (for example, a keyboard and a pointing device or a touch panel) included in the management computer, a display computer connected to the management computer, or another computer.
  • “Displaying display information” by the management computer means displaying the display information on the display system, which may be displaying the display information on a display device included in the management computer.
  • the management computer may transmit display information to the display computer (in the latter case, the display information is displayed by the display computer).
  • input / output of information by the management computer may be input / output of information to / from an I / O device of the management computer, or to / from a remote computer connected to the management computer (for example, a display computer).
  • the information output may be a display of information.
  • FIG. 1 shows the configuration of the computer system.
  • the computer system includes one or more server chassis 100, one or more storage chassis 200, one or more drive chassis 250, and a management client 50 (management computer).
  • the server chassis 100 is connected to the storage chassis 200 via a PCIe wiring.
  • the PCIe wiring may be a cable or a substrate.
  • the storage chassis 200 is connected to the drive chassis 250 via an interface such as PCIe, SAS, or SATA.
  • the management client 50 is connected to the server chassis 100 and the storage chassis 200 via a communication network such as a Local Area Network (LAN).
  • LAN Local Area Network
  • the server chassis 100 includes one or more server units 101, a plurality of switch modules (SWM) 111, a plurality of Service Processors (SVP) 121, and a plurality of Power Supply Units (PSU) 122.
  • the server unit 101 is connected to the SWM 111 via the PCIe wiring, and is connected to the SVP 121 via the LAN wiring.
  • the SWM 111 is connected to the storage chassis 200.
  • the server unit 101 includes a central processing unit (CPU) 102, a memory (MEM) 103, an interface (I / F) board 104, and a baseboard management controller (BMC) 106.
  • the server unit 101 is, for example, a server blade.
  • the CPU 102 may be referred to as a server CPU 102 (server processor)
  • the SVP 121 may be referred to as a server SVP 121 (server management processor)
  • the PSU 122 may be referred to as a server PSU 122 (server power supply circuit).
  • the server CPU 102 is connected to the memory 103 and the BMC 106.
  • the memory 103 stores programs and data such as Operating System (OS) and applications.
  • the server CPU 102 executes processing such as I / O for the storage chassis 200 in accordance with a program stored in the memory 103.
  • the server CPU 102 manages configuration information in the server unit 101 and a model name for each component in the server unit 101. When there is a discrepancy between the configuration information detected in the server unit 101 and the configuration information set from the server SVP 121, the server CPU 102 notifies the server SVP 121 via the BMC 106 of an error.
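  • the discrepancy check between detected and configured information can be sketched as a simple dictionary comparison. This is a minimal illustration only: the function name, the component keys, and the model-name strings are hypothetical, and the source does not describe how the comparison is actually implemented.

```python
def find_config_mismatches(detected: dict, configured: dict) -> list:
    """Return (component, configured model, detected model) for each mismatch.

    A component that is missing on one side is reported with None.
    """
    components = sorted(set(detected) | set(configured))
    return [
        (name, configured.get(name), detected.get(name))
        for name in components
        if detected.get(name) != configured.get(name)
    ]
```

  • a non-empty result would correspond to the case in which the server CPU 102 notifies the server SVP 121 of an error via the BMC 106.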
  • the BMC 106 is connected to the server SVP 121.
  • the BMC 106 monitors the operating status of the server unit 101, such as the state of the server CPU 102 and the temperature, voltage, and FAN rotation speed in the server unit 101, detects a fault in the server unit 101, and transmits information on the detected fault to the server SVP 121.
  • the server CPU 102 is connected to the I / F board 104 via the PCIe wiring.
  • the server CPU 102, the memory 103, and the BMC 106 are mounted on the motherboard (server blade) of the server unit 101.
  • the I / F board 104 is, for example, a mezzanine board connected to a motherboard and can be replaced.
  • the I / F board 104 includes an I / F Large-Scale Integrated circuit (I / F-LSI) 105.
  • the I / F-LSI 105 is connected to the SWM 111 via a PCIe wiring.
  • the I / F-LSI 105 includes a memory that stores a program and data, and a processor that executes processing according to the program.
  • the I / F-LSI 105 includes a DMA transfer control circuit, and executes DMA transfer between the memory 103 in the server unit 101 and the memory 203 in the storage controller 201 based on an instruction from the storage CPU 202. Further, when the upper-layer protocol between the CPU 102 in the server unit 101 and the I / F-LSI 105 differs from the upper-layer protocol between the CPU 202 in the storage controller 201 and the I / F-LSI 105, the I / F-LSI 105 performs protocol conversion.
  • the SWM 111 includes at least one PCIe switch (PCIe-SW) 112.
  • the PCIe-SW 112 is connected to the I / F-LSI 105 of the server chassis 100 via a PCIe wiring.
  • the PCIe-SW 112 is further connected to the storage chassis 200.
  • another PCIe device such as a PCIe bridge may be used.
  • the server SVP 121 sets configuration information of modules in the server chassis 100 such as the server unit 101 based on information from the management client 50. Further, the server SVP 121 executes power-on and shutdown of each server unit 101 based on information from the management client 50. The server SVP 121 also monitors the operating status of the server chassis 100, such as the state of the server unit 101 and the temperature, voltage, and FAN rotation speed in the server chassis 100, detects a failure in the server chassis 100, and transmits information on the detected failure to the management client 50.
  • the server PSU 122 supplies power to each part of the server chassis 100 from an external power source.
  • the storage chassis 200 includes a plurality of storage controllers (CTL) 201, a plurality of back-end (BE) interfaces 211, a plurality of SVPs 221 and a plurality of PSUs 222.
  • the storage controller (CTL) 201 is connected to the PCIe-SW 112 via the PCIe wiring, and is connected to the BE interface 211 via the PCIe wiring.
  • One or more drive chassis 250 are connected to the BE interface 211.
  • the PSU 222 is connected to an external power source and supplies power to each part of the storage chassis 200.
  • the storage controller 201 includes a CPU 202 and a memory (MEM) 203.
  • the CPU 202 may be referred to as a storage CPU 202 (storage processor)
  • the SVP 221 may be referred to as a storage SVP 221 (storage management processor)
  • the PSU 222 may be referred to as a storage PSU 222 (storage power supply circuit).
  • the memory 203, SWM 111, and BE 211 are connected to the storage CPU 202.
  • the memory 203 stores the program and data of the storage controller 201.
  • the storage CPU 202 executes the processing of the storage controller 201 according to the program stored in the memory 203.
  • the storage CPU 202 manages the configuration information in the storage chassis 200 and the drive chassis 250 and the model name for each component in the storage chassis 200 and the drive chassis 250.
  • the storage CPU 202 notifies the storage SVP 221 of an error when there is a contradiction between the configuration information detected in the storage chassis 200 and the drive chassis 250 and the configuration information set from the storage SVP 221.
  • the drive chassis 250 includes one or more drives 251.
  • the drive 251 is connected to the BE interface 211 directly or via a device such as a switch.
  • the storage SVP 221 sets configuration information of modules in the storage chassis 200 and the drive chassis 250 such as the storage controller 201 and the drive 251 based on information from the management client 50.
  • the storage SVP 221 also monitors, based on information from the management client 50, the operating status of the storage chassis 200 and the drive chassis 250, such as the status of the storage controller 201 and the temperature, voltage, and FAN rotation speed in the storage chassis 200, detects a fault in the storage chassis 200 and the drive chassis 250, and transmits information on the detected fault to the management client 50.
  • the storage PSU 222 supplies power to each part of the storage chassis 200 from an external power source.
  • the management client 50 includes a memory that stores programs and data, a processor that executes processing according to the programs, an input device that receives input from the administrator, and a display device that displays information from the server SVP 121 and the storage SVP 221.
  • the management client 50 performs a remote access to the server SVP 121 to call a server management screen for managing the server chassis 100 and display the server management screen on a display device.
  • the management client 50 further performs remote access to the storage SVP 221 to call a storage management screen for managing the storage chassis 200 and display the storage management screen on the display device. Further, the management client 50 receives failure information from the server SVP 121 and the storage SVP 221 and displays the received information on the display device.
  • FIG. 2 shows a path configuration inside and between the server chassis 100 and the storage chassis 200 in a computer system including one server chassis 100 and one storage chassis 200.
  • the server chassis 100 includes eight server units 101 and two SWMs 111.
  • Each SWM 111 includes one PCIe-SW 112.
  • the PCIe-SW 112 includes four virtual switches VS0 to VS3.
  • the storage chassis 200 includes two storage controllers 201 and eight BEs 211.
  • Each storage controller 201 includes two storage CPUs 202.
  • Each storage CPU 202 is connected to two BEs 211.
  • the two storage CPUs 202 in each storage controller 201 are connected to each other.
  • Two storage CPUs 202 in one storage controller 201 are connected to two storage CPUs 202 in the other storage controller 201, respectively.
  • Each I / F-LSI 105 is connected to two virtual switches in different SWMs 111 via PCIe wiring. These two virtual switches are respectively connected to the CPUs 202 in the two storage controllers 201 via PCIe wiring.
  • the two storage CPUs 202 in each storage controller 201 include a total of four PCIe root ports, and the four root ports are connected to the four virtual switches in one SWM 111, respectively. Each virtual switch is connected to two I / F-LSIs 105.
  • the computer system of this embodiment includes eight server-side PCIe trees and eight storage-side PCIe trees. Further, the path between the I / F-LSI 105 and the storage controller 201 is made redundant.
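  • the path counts of this configuration can be checked with a small model. The sketch below is illustrative only. Two wiring rules are assumptions that the text does not state: that server units 2k and 2k+1 share virtual switch VSk in each SWM, and that storage controller k is wired to SWM k with one root port per virtual switch. The function names are hypothetical.

```python
from collections import Counter

NUM_SERVER_UNITS = 8   # one I/F-LSI 105 per server unit 101
NUM_SWM = 2            # one PCIe-SW 112 per SWM 111
VS_PER_SWM = 4         # virtual switches VS0 to VS3


def build_links():
    """Return (server_links, storage_links) for the FIG. 2 style topology."""
    server_links = []  # (server unit, (swm, virtual switch))
    for unit in range(NUM_SERVER_UNITS):
        for swm in range(NUM_SWM):
            vs = unit // 2  # ASSUMPTION: neighbouring units share one VS
            server_links.append((unit, (swm, vs)))

    storage_links = []  # ((controller, root port), (swm, virtual switch))
    for swm in range(NUM_SWM):
        controller = swm  # ASSUMPTION: controller k is wired to SWM k
        for vs in range(VS_PER_SWM):
            root_port = vs  # ASSUMPTION: one root port per virtual switch
            storage_links.append(((controller, root_port), (swm, vs)))
    return server_links, storage_links


def count_trees(server_links, storage_links):
    """One server-side tree per server unit, one storage-side tree per root port."""
    server_trees = len({unit for unit, _ in server_links})
    storage_trees = len({root for root, _ in storage_links})
    return server_trees, storage_trees
```

  • under these assumptions, every I / F-LSI reaches one virtual switch in each of the two SWMs, every virtual switch serves two I / F-LSIs, and `count_trees` yields eight server-side and eight storage-side trees, matching the counts given for this embodiment.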
  • the server CPU 102 can select one of a plurality of paths. Further, when the server CPU 102 receives a failure notification from a PCIe device on the path being used, the server CPU 102 can continue the access to the storage controller 201 by switching from the path including the failure to another path.
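  • the path selection and switching behaviour can be sketched as follows. This is a minimal illustration, not the I / F-LSI driver itself: the class name, method name, and path labels are hypothetical, and the selection policy (use the first healthy path) is an assumption.

```python
class PathSelector:
    """Tracks healthy paths to the storage controller and the one in use."""

    def __init__(self, paths):
        if not paths:
            raise ValueError("at least one path is required")
        self.healthy = list(paths)
        self.active = self.healthy[0]  # ASSUMPTION: first path is preferred

    def on_failure_notification(self, failed_path):
        """Drop the failed path; switch only if it was the path in use."""
        if failed_path in self.healthy:
            self.healthy.remove(failed_path)
        if failed_path == self.active:
            self.active = self.healthy[0] if self.healthy else None
        return self.active
```

  • a return value of `None` in this sketch means that no redundant path remains, a case the description above does not cover.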
  • the number of each part, the number of each wiring, and the connection destination of each wiring in the computer system are not limited to the configuration of this embodiment.
  • the number of PCIe wirings between the plurality of server units 101 and the plurality of SWMs 111 is larger than the number of PCIe wirings between the plurality of SWMs 111 and the plurality of storage controllers 201.
  • since the server chassis 100 includes a plurality of SWMs 111, the number of PCIe wirings between the server chassis 100 and the storage chassis 200 can be reduced compared with a configuration in which a plurality of SWMs are provided in the storage chassis 200. This facilitates connection and management between the chassis and reduces failures.
  • modules such as the I / F board 104 and the SWM 111 can be recognized as a faulty part, and can be replaced in units of modules.
  • since the computer system is divided into the server chassis 100 and the storage chassis 200, the number of chassis can be changed for scale-out or the like.
  • since the server CPU 102 and the I / F-LSI 105 are connected by PCIe, and the I / F-LSI 105 and the storage CPU 202 are connected by PCIe, the I / O performance can be improved.
  • the BE 211 and the drive 251 may be connected by PCIe.
  • FIG. 3 shows the configuration of the server-side PCIe tree and the storage-side PCIe tree.
  • the server CPU 102 includes a Root Complex (RC) 610, and the PCIe port of the RC 610 is a Root port 611.
  • the PCIe port connected to the root port 611 is an endpoint port 612.
  • the storage CPU 202 includes the RC 620, and the PCIe port of the RC 620 is the root port 621.
  • the PCIe port connected to the root port 621 is the upstream port 622.
  • in the PCIe-SW 112, the PCIe port on the I / F-LSI 105 side is the Downstream port 311, and in the I / F-LSI 105, the PCIe port connected to the Downstream port 311 of the PCIe-SW 112 is the Endpoint port 301. That is, the Root port 611 and the Endpoint port 612 belong to the server-side PCIe tree, and the Root port 621, the Upstream port 622, the Downstream port 311, and the Endpoint port 301 belong to the storage-side PCIe tree.
  • the I / F-LSI 105 converts the server PCIe tree packet received at the endpoint port 612 into a storage-side PCIe tree packet, and transmits the packet from the endpoint port 301 to the PCIe-SW 112. Further, the I / F-LSI 105 converts the storage-side PCIe tree packet received at the Endpoint port 301 into a server PCIe tree packet, and transmits the packet from the Endpoint port 612 to the server CPU 102.
  • PCIe devices such as the I / F-LSI 105 and the PCIe-SW 112 include a register indicating the presence / absence of a failure for each port and each failure type.
  • a hardware failure of the PCIe path due to the occurrence of an uncorrectable error such as Unsupported Request is referred to as a PCIe failure.
  • the PCIe device writes failure information indicating the PCIe failure to a corresponding register.
  • the server CPU 102 and the storage CPU 202 can identify the parts and types of all the faults that have occurred at the same time by reading these registers after receiving the PCIe fault notification.
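  • reading per-port, per-type failure registers can be sketched with bit flags. This is an illustration only: the failure type names are taken from PCIe uncorrectable error names, but the bit positions, the register dictionary, and the function name are hypothetical and do not reflect any real register layout.

```python
from enum import IntFlag


class FailureType(IntFlag):
    # Bit positions are hypothetical, chosen only for this sketch.
    UNSUPPORTED_REQUEST = 1
    COMPLETION_TIMEOUT = 2
    MALFORMED_TLP = 4


def collect_failures(registers: dict) -> list:
    """Scan every (device, port) register and list each failure that is set.

    `registers` maps (device name, port name) to an integer bitmask.
    """
    found = []
    for (device, port), bits in sorted(registers.items()):
        for failure_type in FailureType:
            if bits & failure_type:
                found.append((device, port, failure_type.name))
    return found
```

  • because every register is scanned after a single notification, failures that occurred at the same time on different ports or of different types all appear in the returned list.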
  • when a PCIe device in the server-side PCIe tree detects a PCIe failure, the PCIe device notifies the RC 610 of the server-side PCIe tree of the PCIe failure.
  • the server CPU 102 detects a PCIe failure by an interrupt from the RC 610.
  • when a PCIe device in the storage-side PCIe tree detects a PCIe failure, the PCIe device notifies the RC 620 of the storage-side PCIe tree of the PCIe failure.
  • the storage CPU 202 detects a PCIe failure by an interrupt from the RC 620.
  • the range of the part managed together with the server unit 101 in the computer system is set as the server-side management range, and the range of the part managed together with the storage controller 201 is set as the storage-side management range.
  • the server side management range includes the server unit 101.
  • the storage side management range includes the storage chassis 200, the drive chassis 250, and the SWM 111.
  • the server CPU 102 transmits a maintenance report indicating PCIe fault information, maintenance action, and maintenance component model name within the server-side management range to the management client 50 via the server SVP 121.
  • the storage CPU 202 transmits a maintenance report indicating PCIe fault information, maintenance action, and maintenance component model name within the storage-side management range to the management client 50 via the storage SVP 221.
  • the part from the endpoint port 301 of the I / F-LSI 105 to the downstream port 311 of the PCIe-SW 112 is a boundary between the server side management range and the storage side management range. This part is called a boundary range.
  • when the PCIe device on the packet receiving side detects a PCIe failure, the cause may be a hardware failure of either the PCIe device on the packet receiving side or the PCIe device on the packet transmitting side. Therefore, when a PCIe failure is detected in the boundary range, both the I / F-LSI 105 and the PCIe-SW 112 become replacement candidates.
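  • the reasoning about replacement candidates can be sketched with lookup tables built from the port assignments described for FIG. 3. This is an illustration under assumptions: the table and function names are hypothetical, and applying the receiver-or-transmitter rule to links outside the boundary range is an extrapolation that the text states explicitly only for the boundary range.

```python
# Which device owns each PCIe port (reference numbers from the description).
PORT_OWNER = {
    611: "server CPU 102",
    612: "I/F-LSI 105",
    301: "I/F-LSI 105",
    311: "PCIe-SW 112",
    622: "PCIe-SW 112",
    621: "storage CPU 202",
}
# The port at the other end of each PCIe link.
LINK_PARTNER = {611: 612, 612: 611, 301: 311, 311: 301, 622: 621, 621: 622}
# Management range of each device (the SWM is in the storage-side range).
DEVICE_RANGE = {
    "server CPU 102": "server",
    "I/F-LSI 105": "server",
    "PCIe-SW 112": "storage",
    "storage CPU 202": "storage",
}


def replacement_candidates(detecting_port: int) -> list:
    """The detecting (receiving) device and the device that transmitted to it."""
    partner = LINK_PARTNER[detecting_port]
    return [PORT_OWNER[detecting_port], PORT_OWNER[partner]]


def ranges_to_notify(detecting_port: int) -> list:
    """Management ranges that contain at least one replacement candidate."""
    return sorted({DEVICE_RANGE[d] for d in replacement_candidates(detecting_port)})
```

  • for a failure detected at the Downstream port 311 or the Endpoint port 301, `ranges_to_notify` returns both ranges, which illustrates why a failure in the boundary range concerns the server side as well as the storage side.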
  • since the server chassis 100 and the storage chassis 200 notify one management client 50 of the detected failure, the administrator can accurately recognize the location of the failure in the computer system and can manage the entire computer system in an integrated manner.
  • management clients corresponding to the server-side management range and the storage-side management range may be provided.
  • the maintenance notification for the server-side management range may be transmitted to the management client corresponding to the server-side management range
  • the maintenance notification for the storage-side management range may be transmitted to the management client corresponding to the storage-side management range.
  • the server side management range and the storage side management range may be managed by different management software.
  • the management client 50 may display the maintenance report using management software corresponding to the maintenance report.
  • the computer system can notify the administrator of the faulty part in each of the server-side management range and the storage-side management range.
  • FIG. 4 shows a connection configuration between the I / F-LSI 105 and the PCIe-SW 112.
  • the I / F-LSI 105 includes the above-described Endpoint port 301, a block request notification register 304, and a block request reception register 305.
  • the Endpoint port 301 includes a PCIe link control register 302 and a PCIe link status register 303.
  • the PCIe-SW 112 includes the above-described downstream port 311, a block request reception register 314, and a block request notification register 315.
  • the downstream port 311 includes a PCIe link control register 312 and a PCIe link status register 313.
  • the PCIe link control register 302 indicates PCIe link control information written by the I / F-LSI 105.
  • the PCIe link status register 303 indicates the PCIe link status received by the I / F-LSI 105.
  • the PCIe link control register 312 indicates PCIe link control information written by the storage CPU 202.
  • the PCIe link status register 313 indicates the PCIe link status received by the storage CPU 202.
  • the block request notification register 304 indicates a block request written by the I / F-LSI 105.
  • the block request reception register 305 indicates a block request received by the I / F-LSI 105.
  • the block request notification register 315 indicates a block request written by the storage CPU 202.
  • the block request reception register 314 indicates a block request received by the storage CPU 202.
  • the PCIe wiring 320 between the I / F-LSI 105 and the PCIe-SW 112 includes a PCIe lane 321 and sideband signals 322 and 323.
  • Lane 321 is a pair of a differential signal for transmission and a differential signal for reception.
  • the sideband signal 322 transmits the block request written in the block request notification register 304 of the I / F-LSI 105 to the block request reception register 314 of the PCIe-SW 112.
  • the sideband signal 323 transmits the block request written in the block request notification register 315 of the PCIe-SW 112 to the block request reception register 305 of the I / F-LSI 105.
  • the PCIe wiring 320 may be a cable or a substrate such as a backplane.
  • FIG. 5 shows programs of each part in the computer system.
  • the memory 103 in the server unit 101 stores an I / F-LSI driver 501 for controlling the I / F-LSI 105.
  • the server CPU 102 executes processing for the I / F-LSI 105 according to the I / F-LSI driver 501.
  • the I / F-LSI driver 501 includes a path switching program 502 and a maintenance notification program 503.
  • the path switching program 502 switches the PCIe path to another path when the PCIe path including the PCIe failure is blocked.
  • the maintenance notification program 503 transmits, to the management client 50 via the server SVP 121, a maintenance notification indicating PCIe failure information, maintenance action, and maintenance component model name notified from the I / F-LSI 105.
  • the memory 107 in the I / F-LSI 105 stores an I / F-LSI control program 504 for controlling the I / F-LSI 105.
  • the processor in the I / F-LSI 105 executes processing according to the I / F-LSI control program 504.
  • the I / F-LSI control program 504 includes a periodic monitoring program 505 and a failure processing program 506.
  • the periodic monitoring program 505 executes periodic monitoring that periodically reads the register in the I / F-LSI 105.
  • the failure processing program 506 executes detection of another PCIe failure, blockage of the failed part, notification of the PCIe failure, and the like.
  • the memory 203 in the storage controller 201 stores a storage control program 507 for controlling the storage chassis 200.
  • the storage CPU 202 executes processing according to the storage control program 507.
  • the storage control program 507 includes an initial setting program 508, a periodic monitoring program 509, a failure processing program 510, and a maintenance notification program 511.
  • the initial setting program 508 executes initial setting of the storage chassis 200.
  • the periodic monitoring program 509 performs periodic monitoring that periodically reads the registers in the storage controller 201.
  • the failure processing program 510 executes blockage of the failed part, notification of the PCIe failure, or the like.
  • the maintenance notification program 511 transmits, to the management client 50 via the storage SVP 221, a maintenance notification indicating PCIe failure information, maintenance action, and maintenance component model name notified from the PCIe device in the storage-side PCIe tree.
  • FIG. 6 shows the first failure processing.
  • as server-side processing, that is, processing in the server-side management range, this sequence shows the operations of the I / F-LSI driver 501 executed by the server CPU 102, the I / F-LSI control program 504 executed by the processor in the I / F-LSI 105, and the Endpoint port 301 in the I / F-LSI 105.
  • as storage-side processing, that is, processing in the storage-side management range, this sequence further shows the operations of the downstream port 311 of the PCIe-SW 112, the block request reception register 314 of the PCIe-SW 112, and the storage control program 507 of the storage CPU 202.
  • the failure processing program 506 executes failure information collection processing, which reads each register in the I / F-LSI 105 and acquires failure information. As a result, the failure processing program 506 can acquire failure information indicating the type of the detected PCIe failure and the location and type of other PCIe failures.
  • the failure processing program 506 closes (Link Disable) the PCIe path including the PCIe failure. In this example, the failure processing program 506 closes the PCIe path of the lane 321 by writing to at least the PCIe link control register 302 in the Endpoint port 301. As a result, the PCIe path enters a link-down (communication impossible) state.
  • the failure processing program 506 writes a blocking request, which requests blocking of the PCIe path and issuance of the maintenance notification, to the blocking request notification register 304 in the I / F-LSI 105. As a result, the blocking request is transmitted via the sideband signal 322 to the blocking request reception register 314 in the PCIe-SW 112.
  • in S1210, the periodic monitoring program 509 of the storage control program 507 of the storage controller 201 executes periodic monitoring, which periodically reads the registers in the storage-side PCIe tree. Thereafter, in S1220, the periodic monitoring program 509 determines whether or not there is a blocking request in the blocking request reception register 314.
  • if it is determined that there is no blocking request (No), the periodic monitoring program 509 executes the next periodic monitoring after a preset time has elapsed. If it is determined that there is a blocking request (Yes), in S1230, the failure processing program 510 executes failure information collection processing, which reads each register in the storage-side PCIe tree and acquires failure information. Thereafter, in S1240, the failure processing program 510 closes the PCIe path including the PCIe failure indicated in the failure information. In this example, the failure processing program 510 closes the PCIe path of the lane 321 by writing to at least the PCIe link control register 312 in the downstream port 311. Thereafter, in S1260, the maintenance notification program 511 transmits a switch module (SWM) maintenance notification indicating the PCIe failure information, maintenance action, and maintenance component type name of the SWM 111 to the management client 50 via the storage SVP 221.
  • the failure processing program 506 transmits an error message indicating the PCIe failure to the I / F-LSI driver 501 of the server CPU 102 based on the acquired failure information. Thereafter, in S1170, the path switching program 502 of the I / F-LSI driver 501 switches the path including the PCIe failure indicated in the error message to another path. Thereafter, in S1180, the maintenance notification program 503 of the I / F-LSI driver 501 transmits an I / F board maintenance notification indicating the PCIe failure information, maintenance action, and maintenance component type name of the I / F board 104 to the management client 50 via the BMC 106 and the server SVP 121. The management client 50 displays a screen based on the I / F board maintenance notification.
  • when the I / F-LSI 105 detects a PCIe failure in the boundary range, the PCIe failure can be notified to the PCIe-SW 112 by using the sideband signal 322.
  • both the server CPU 102 and the storage CPU 202 can detect a PCIe failure in the boundary range and transmit a maintenance report indicating the failed part to the administrator.
  • the I / F-LSI 105 can notify the PCIe-SW 112 using the sideband signal 322.
  • because the I / F-LSI 105 and the PCIe-SW 112 block the PCIe path including the PCIe failure, other processes can be prevented from using that PCIe path.
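The first failure processing described above can be sketched as a small simulation. This is a minimal sketch under stated assumptions, not the implementation in the patent: the class names, the methods `handle_failure` and `storage_periodic_monitoring`, and the in-memory boolean "registers" are all hypothetical stand-ins for the I / F-LSI 105, the PCIe-SW 112, the link control registers 302 and 312, and the blocking request registers 304 and 314.

```python
from dataclasses import dataclass, field


@dataclass
class Port:
    """Hypothetical model of a PCIe port with a link control register."""
    link_disabled: bool = False  # stands in for the Link Disable setting


@dataclass
class PcieSwitch:
    """Storage-side PCIe-SW 112 (illustrative model only)."""
    downstream_port: Port = field(default_factory=Port)
    blocking_request_rx: bool = False  # blocking request reception register 314


@dataclass
class IfLsi:
    """Server-side I/F-LSI 105 (illustrative model only)."""
    endpoint_port: Port = field(default_factory=Port)
    notifications: list = field(default_factory=list)

    def handle_failure(self, failure: str, switch: PcieSwitch) -> None:
        # Block the failed PCIe path on the server side (Link Disable).
        self.endpoint_port.link_disabled = True
        # Writing the blocking request notification register is modeled as
        # setting the peer's reception register, as sideband 322 would.
        switch.blocking_request_rx = True
        # Record the server-side maintenance notification.
        self.notifications.append(("I/F board maintenance", failure))


def storage_periodic_monitoring(switch: PcieSwitch, notifications: list) -> bool:
    """Run one polling cycle of the storage-side periodic monitoring."""
    if not switch.blocking_request_rx:
        return False  # no request: poll again after the preset interval
    switch.downstream_port.link_disabled = True  # block the path
    notifications.append(("SWM maintenance", "blocking request received"))
    # Assumption: the request is cleared once it has been handled.
    switch.blocking_request_rx = False
    return True


if __name__ == "__main__":
    if_lsi, sw, storage_notes = IfLsi(), PcieSwitch(), []
    if_lsi.handle_failure("Malformed TLP", sw)
    storage_periodic_monitoring(sw, storage_notes)
    print(if_lsi.notifications, storage_notes)
```

In this sketch both sides end up with the path blocked and each side records its own maintenance notification, which mirrors the intent that the server-side and storage-side management ranges both learn of a failure in the boundary range.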
  • FIG. 7 shows the second failure processing.
  • As server-side processing, this sequence shows the operations of the I / F-LSI driver 501 executed by the server CPU 102, the I / F-LSI control program 504 executed by a processor in the I / F-LSI 105, the blocking request reception register 305 in the I / F-LSI 105, and the Endpoint port 301 in the I / F-LSI 105.
  • This sequence further shows the operations of the downstream port 311 of the PCIe-SW 112, the storage CPU 202 that is the RC of the storage-side PCIe tree, and the storage control program 507 of the storage CPU 202 as storage-side processing.
  • the failure processing program 510 executes failure information collection processing, which reads the registers in the storage-side PCIe tree and acquires failure information. After that, in S2140, the failure processing program 510 closes the PCIe path including the PCIe failure indicated in the failure information. In this example, the failure processing program 510 closes the PCIe path of the lane 321 by writing to at least the PCIe link control register 312 in the downstream port 311. As a result, the PCIe path enters a link-down (communication impossible) state.
  • the failure processing program 510 writes a blocking request, which requests blocking of the PCIe path and issuance of the maintenance notification, to the blocking request notification register 315 of the PCIe-SW 112, so that the blocking request is transmitted to the blocking request reception register 305 in the I / F-LSI 105.
  • the maintenance notification program 511 transmits the SWM maintenance notification indicating the PCIe fault information, maintenance action, and maintenance part type name of the SWM 111 to the management client 50 via the storage SVP 221.
  • the management client 50 displays a screen based on the SWM maintenance notification.
  • in S2210, the periodic monitoring program 505 of the I / F-LSI control program 504 of the I / F-LSI 105 executes periodic monitoring, which periodically reads the registers of the I / F-LSI 105. Thereafter, in S2220, the periodic monitoring program 505 determines whether or not there is a blocking request in the blocking request reception register 305.
  • if it is determined that there is no blocking request (No), the periodic monitoring program 505 executes the next periodic monitoring after a preset time has elapsed. If it is determined that there is a blocking request (Yes), in S2230, the failure processing program 506 executes failure information collection processing, which reads each register in the I / F-LSI 105 and acquires failure information. After that, in S2240, the failure processing program 506 closes the PCIe path including the PCIe failure. In this example, the failure processing program 506 closes the PCIe path of the lane 321 by writing to at least the PCIe link control register 302 in the Endpoint port 301.
  • the failure processing program 506 transmits an error message indicating the PCIe failure to the I / F-LSI driver 501 of the server CPU 102 based on the acquired failure information.
  • the path switching program 502 of the I / F-LSI driver 501 switches the path including the PCIe failure indicated in the error message to another path.
  • the maintenance notification program 503 of the I / F-LSI driver 501 transmits an I / F board maintenance notification indicating the PCIe failure information, maintenance action, and maintenance component type name of the I / F board 104 to the management client 50 via the BMC 106 and the server SVP 121.
  • the management client 50 displays a screen based on the I / F board maintenance notification.
  • when the PCIe-SW 112 detects a PCIe failure in the boundary range, the PCIe failure can be notified to the I / F-LSI 105 by using the sideband signal 323.
  • both the server CPU 102 and the storage CPU 202 can detect a PCIe failure in the boundary range and transmit a maintenance report indicating the failed part to the administrator.
  • the PCIe-SW 112 can notify the I / F-LSI 105 using the sideband signal 323.
  • because the I / F-LSI 105 and the PCIe-SW 112 block the PCIe path including the PCIe failure, other processes can be prevented from using that PCIe path.
  • the PCIe wiring 320 between the I / F-LSI 105 and the PCIe-SW 112 may not include the sideband signal 323.
  • the third failure processing is executed instead of the second failure processing.
  • FIG. 8 shows the third failure processing.
  • the storage side processing executes S2110 to S2140 and S2160 similar to the second failure processing.
  • as a result, the PCIe path enters the link-down state.
  • the endpoint port 301 of the I / F-LSI 105 issues an interrupt to the I / F-LSI control program 504 when detecting the link down.
  • the I / F-LSI control program 504 that received the interrupt executes the same S2230 to S2260 as in the second failure process.
  • the I / F-LSI driver 501 that has received the error message executes S2270 to S2280 similar to the second failure process.
  • the PCIe fault detected by the PCIe-SW 112 can be notified to the I / F-LSI 105 even if the PCIe wiring 320 does not include the sideband signal 323.
  • the PCIe device on both sides of the boundary range can detect the PCIe failure.
  • the computer system having this configuration executes the first failure process.
  • PCIe wiring 320 between the I / F-LSI 105 and the PCIe-SW 112 may not include the sideband signal 322.
  • the computer system having this configuration executes the fourth fault process instead of the first fault process.
  • the server side process executes S1110 to S1140 and S1160 to S1180 similar to the first fault process.
  • the PCIe fault detected by the I / F-LSI 105 can be notified to the PCIe-SW 112 even if the PCIe wiring 320 does not include the sideband signal 322.
  • the PCIe device on both sides of the boundary range can detect the PCIe failure.
  • the computer system having this configuration executes the second failure process.
  • the PCIe-SW 112 recognizes the PCIe failure notified from the I / F-LSI 105 by the sideband signal 322, and detects an abnormality other than the PCIe failure by link down. These abnormalities can be distinguished.
  • the abnormality other than the PCIe failure to be notified is, for example, when the server unit 101 (server blade) is removed from the server chassis 100 while the power is on, or when the power is forcibly cut off.
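The distinction drawn above, between a PCIe failure notified through the sideband signal 322 and another abnormality observed only as a link down, can be sketched as a small classifier. This is an illustrative sketch only; the function name, its parameters, and the returned labels are assumptions rather than anything defined in the source.

```python
def classify_event(blocking_request_received: bool, link_down: bool) -> str:
    """Classify what the PCIe-SW 112 side observes (hypothetical labels).

    A blocking request arriving over sideband 322 indicates a PCIe failure
    detected by the I/F-LSI 105. A link down without such a request suggests
    another abnormality, for example removal of the server unit while powered
    or a forced power-off.
    """
    if blocking_request_received:
        return "pcie_failure_notified_by_peer"
    if link_down:
        return "other_abnormality"
    return "normal"


if __name__ == "__main__":
    print(classify_event(blocking_request_received=False, link_down=True))
```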
  • PCIe wiring 320 between the I / F-LSI 105 and the PCIe-SW 112 may not include both the sideband signals 322 and 323.
  • the computer system having this configuration executes the fourth fault process instead of the first fault process, and executes the third fault process instead of the second fault process.
  • the computer system may combine the first failure processing and the fourth failure processing, or may combine the second failure processing and the third failure processing. For example, even when a failure occurs in the sideband signal, one PCIe device in the boundary range detects the PCIe failure and closes the PCIe path, so that the other PCIe device can detect the PCIe failure through the resulting link down.
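The choice of failure processing described above depends on which sideband signals the PCIe wiring 320 includes. The following is a minimal sketch of that mapping; the function name, parameter names, and dictionary keys are hypothetical and chosen only for illustration.

```python
def select_failure_processes(has_sideband_322: bool, has_sideband_323: bool) -> dict:
    """Map the available sideband signals to the failure processes used.

    Sideband 322 carries notifications from the I/F-LSI 105 to the PCIe-SW 112;
    sideband 323 carries notifications in the opposite direction. Without a
    sideband, the peer is informed through the link down instead.
    """
    return {
        # Failure detected by the I/F-LSI 105 (server side).
        "detected_by_if_lsi": "first" if has_sideband_322 else "fourth",
        # Failure detected by the PCIe-SW 112 (storage side).
        "detected_by_pcie_sw": "second" if has_sideband_323 else "third",
    }


if __name__ == "__main__":
    for has_322 in (True, False):
        for has_323 in (True, False):
            print(has_322, has_323, select_failure_processes(has_322, has_323))
```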
  • both the part in the server side management range and the part in the storage side management range become replacement candidates.
  • the PCIe failure may be detected only on one PCIe port.
  • Such a PCIe failure is, for example, a Malformed TLP error detected at the port receiving the PCIe packet.
  • the PCIe port on the opposite side to the PCIe wiring in the boundary range does not detect the PCIe failure.
  • the PCIe device that has detected the PCIe failure in the boundary range notifies the PCIe failure to the PCIe device on the opposite side of the boundary range by changing the state of the PCIe wiring having the PCIe failure.
  • the PCIe device that has detected the PCIe failure by this notification can execute blocking of the PCIe path including the PCIe failure and transmission of a maintenance notification indicating the PCIe failure.
  • both the server-side management range and the storage-side management range can detect the PCIe failure in the boundary range and notify the management client 50 of it. That is, in a computer system including a server and a storage connected by PCIe, it is possible to reliably identify a faulty part and perform maintenance notification.
  • the administrator can recognize the exact failed part based on the maintenance notification, and can judge whether parts replacement and recovery are possible. Further, the administrator can perform appropriate maintenance, such as replacing parts in both the server-side management range and the storage-side management range. For example, the administrator can replace the I / F board 104 indicated in the I / F board maintenance notification from the server chassis 100, and can replace the SWM 111 indicated in the SWM maintenance notification from the storage chassis 200. Note that the SWM maintenance notification may indicate the PCIe-SW 112, in which case the PCIe-SW 112 may be replaced.
  • the computer system of this embodiment controls the power-on of the server chassis 100 and the storage chassis 200.
  • elements denoted by the same reference numerals as those in the first embodiment are the same as those in the first embodiment.
  • the server chassis 100 operates with power from the server PSU 122
  • the storage chassis 200 operates with power from the storage PSU 222.
  • the PCIe-SW 112 belongs to the storage-side management range, but is included in the server chassis 100, so that power is supplied from the server PSU 122.
  • the storage controller 201 may be activated and the storage-side management range may be initialized while the PCIe-SW 112 is not activated.
  • although the SWM 111 belongs to the storage-side management range, it is arranged in the server housing 100, so it is not yet activated when the storage controller 201 executes the initial setting. Therefore, the storage controller 201 recognizes at the time of initial setting that the PCIe-SW 112 is not connected, and issues a configuration check error.
  • the computer system of this embodiment prevents a configuration check error in such a case.
  • FIG. 9 shows a configuration for controlling the power supply of the PCIe-SW 112.
  • the server PSU 122 in the server housing 100 includes a main power supply 123 and a sub power supply 124.
  • the SWM 111 in the server housing 100 includes a PCIe-SW 112 and an SWM power supply control IC (Integrated Circuit) 113.
  • the main power supply 123 supplies power from the external power supply to the PCIe-SW 112 under the control of the SWM power supply control IC 113.
  • the sub power supply 124 supplies power from the external power supply to the SWM power supply control IC 113.
  • the SWM power supply control IC 113 activates the PCIe-SW 112.
  • the storage controller 201 in the storage chassis 200 includes an SWM control unit 205 connected to the storage CPU 202 via the PCIe wiring in addition to the elements of the first embodiment.
  • the SWM control unit 205 includes an SWM control register 206 and an SWM status register 207.
  • the SWM control register 206 indicates an activation request for activating the PCIe-SW 112.
  • the SWM status register 207 stores state information including mounting state information indicating whether or not the SWM 111 is mounted on the server chassis 100, and power state information indicating whether or not the SWM power control IC 113 is operating with power from the sub power supply 124.
  • the PCIe wiring 401 between the SWM 111 and the storage controller 201 includes a PCIe lane 402 and sideband signals 403, 404, and 405.
  • Lane 402 is a pair of a differential signal for transmission and a differential signal for reception.
  • the sideband signal 403 transmits the activation request written in the SWM control register 206 of the SWM control unit 205 to the SWM power supply control IC 113.
  • the sideband signal 404 transmits power state information from the SWM power supply control IC 113 to the SWM state register 207 of the SWM control unit 205.
  • the sideband signal 405 transmits the mounting state information from the reference potential (ground) of the SWM control unit 205 and the SWM 111 to the SWM state register 207.
  • the PCIe wiring 401 may be a cable or a substrate.
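The power arrangement of FIG. 9 can be sketched as a small state model. This is a simplified illustration under stated assumptions: the class and method names are hypothetical, and the model assumes that activating the server PSU 122 makes both the main power supply 123 and the sub power supply 124 available at once.

```python
from dataclasses import dataclass


@dataclass
class SwmPowerModel:
    """Illustrative power model of the SWM 111 (names are hypothetical)."""
    sub_power_on: bool = False    # sub power supply 124 -> SWM power control IC 113
    main_power_on: bool = False   # main power supply 123 -> PCIe-SW 112
    pcie_sw_active: bool = False

    def server_psu_on(self) -> None:
        # Assumption: the server PSU 122 brings up both supplies together.
        self.sub_power_on = True
        self.main_power_on = True

    @property
    def power_state(self) -> bool:
        """Power state information reported over sideband 404."""
        return self.sub_power_on

    def activation_request(self) -> bool:
        """Activation request arriving over sideband 403.

        The request only takes effect when the control IC itself is powered,
        which is why the storage controller checks the power state first.
        """
        if self.sub_power_on and self.main_power_on:
            self.pcie_sw_active = True
        return self.pcie_sw_active


if __name__ == "__main__":
    swm = SwmPowerModel()
    print(swm.activation_request())  # request before the server PSU is on
    swm.server_psu_on()
    print(swm.activation_request())  # request after the server PSU is on
```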
  • FIG. 10 shows the first part of the first power-on process.
  • FIG. 11 shows a second part following the first part of the first power-on process.
  • the operating entities in this sequence are the management client 50, the server chassis 100, and the storage chassis 200.
  • the operation entities in the server chassis 100 are the server PSU 122, the server SVP 121, the server unit 101, the PCIe-SW 112, and the SWM power control IC 113.
  • the operation entities in the storage chassis 200 are the SWM control unit 205, the storage control program 507 of the storage CPU 202, the storage SVP 221, and the storage PSU 222.
  • the server SVP 121 transmits a completion notification indicating the completion of the activation of the server SVP 121 to the management client 50.
  • the management client 50 displays a screen based on the received completion notification.
  • the storage SVP 221 transmits a completion notification indicating the completion of the startup of the storage SVP 221 to the management client 50.
  • the management client 50 displays a screen based on the received completion notification.
  • the administrator inputs a CTL power-on instruction for powering on the storage controller 201 to the management client 50 in response to the display of the completion notification.
  • the management client 50 transmits a CTL power-on instruction to the storage SVP 221.
  • the storage SVP 221 turns on the storage controller 201 in response to the CTL power-on instruction.
  • the storage control program 507 is activated and starts the initial setting.
  • the storage control program 507 executes a SWM state check.
  • the storage control program 507 reads the mounting status information and the power status information from the SWM status register 207 and determines whether the SWM power control IC 113 can be used.
  • when the mounting state information indicates that the SWM 111 is mounted on the server housing 100 and the power state information indicates that the SWM power control IC 113 is operating, the storage control program 507 determines that the SWM power control IC 113 can be used.
  • if the SWM state check determines that the SWM 111 is not installed in the server chassis 100, or that the SWM 111 is installed in the server chassis 100 but the SWM power control IC 113 is not operating, the storage control program 507 repeats the SWM state check at predetermined time intervals. If the SWM state check determines that the SWM 111 is connected to the storage controller 201 and the SWM power supply control IC 113 is operating, the storage control program 507 executes the next step, S4210.
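The SWM state check and its retry loop can be sketched as follows. This is a minimal sketch, not the storage control program 507 itself: `SwmStatus`, `swm_usable`, `wait_for_swm`, and the `read_status` callback are hypothetical names, and the `max_checks` limit is an assumption added so the example terminates (the source describes repeating the check at predetermined intervals without stating a limit).

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class SwmStatus:
    """Hypothetical decoded view of the SWM status register 207."""
    mounted: bool       # mounting state information
    power_ic_on: bool   # power state information


def swm_usable(status: SwmStatus) -> bool:
    """The control IC is usable only if the SWM is mounted and the IC is powered."""
    return status.mounted and status.power_ic_on


def wait_for_swm(read_status: Callable[[], SwmStatus],
                 interval_s: float = 1.0,
                 max_checks: Optional[int] = None) -> bool:
    """Repeat the state check until the SWM is usable or the limit is reached."""
    checks = 0
    while max_checks is None or checks < max_checks:
        if swm_usable(read_status()):
            return True  # proceed to writing the activation request
        checks += 1
        time.sleep(interval_s)
    return False


if __name__ == "__main__":
    readings = iter([SwmStatus(True, False), SwmStatus(True, True)])
    print(wait_for_swm(lambda: next(readings), interval_s=0.0, max_checks=5))
```

Polling keeps the storage controller from treating a not-yet-powered SWM as a missing device, which is the situation that would otherwise lead to the configuration check error.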
  • the storage control program 507 writes an activation request for activating the PCIe-SW 112 to the SWM control register 206.
  • the SWM control unit 205 notifies the activation request written in the SWM control register 206 to the SWM power supply control IC 113 via the sideband signal 403.
  • the SWM power control IC 113 activates the PCIe-SW 112 in response to the activation request.
  • the PCIe-SW 112 is activated using the power of the main power supply 123.
  • the storage control program 507 executes the initial setting of the PCIe-SW 112 via the lane 402. Thereafter, when the storage control program 507 completes the initial setting of the PCIe-SW 112 in S4230, the storage SVP 221 is notified of the completion of the initial setting. Thereafter, in S4240, the storage SVP 221 transmits a completion notification indicating completion of activation of the storage controller 201 and the PCIe-SW 112 to the management client 50 in response to the notification of completion of the initial setting. The management client 50 displays a screen based on the received completion notification.
  • the administrator inputs a server unit power-on instruction for powering on the server unit 101 to the management client 50 in accordance with the display of the completion notification.
  • the management client 50 transmits a server unit power-on instruction to the server SVP 121.
  • the server SVP 121 turns on the server unit 101 in response to the server unit power-on instruction.
  • the server unit 101 executes initial setting of the I / F-LSI 105 by the I / F-LSI driver 501. Thereafter, in S4270, when the server unit 101 completes the initial setting, the server unit 101 transmits a completion notification indicating the completion of activation of the server unit 101 to the management client 50 via the server SVP 121. The management client 50 displays a screen based on the received completion notification.
  • the storage PSU 222 is activated after the server PSU 122 is activated.
  • the storage controller 201 can activate the PCIe-SW 112 via the SWM power supply control IC 113 after supplying power from the sub power supply 124 to the SWM power supply control IC 113.
  • the power-on sequence as in the first power-on process is not guaranteed. Accordingly, the second power-on process when power is turned on in the order of the storage chassis 200 and the server chassis 100 will be described.
  • FIG. 12 shows the first part of the second power-on process.
  • the second part following the first part of the second power-on process is the same as the second part of the first power-on process.
  • the operating subject in this sequence is the same as in the first power-on process.
  • when the storage PSU 222 is activated first in S4110 due to power recovery after a power failure, the storage SVP 221 is activated by the power from the storage PSU 222. Thereafter, in S4120, the storage SVP 221 transmits a completion notification indicating completion of activation of the storage SVP 221 to the management client 50. The management client 50 displays a screen based on the received completion notification.
  • the storage SVP 221 turns on the power of the storage controller 201.
  • the storage control program 507 is activated and starts the initial setting.
  • the storage control program 507 executes the same SWM status check as in S3180 described above.
  • if the storage control program 507 determines in S4150 and S4160 that the SWM 111 is connected to the storage controller 201 but power from the sub power supply 124 is not supplied to the SWM power control IC 113, it repeats the SWM status check at predetermined time intervals.
  • the server PSU 122 supplies power to the server SVP 121 and the SWM power supply control IC 113.
  • the server SVP 121 transmits a completion notification indicating the completion of the activation of the server SVP 121 to the management client 50.
  • the management client 50 displays a screen based on the received completion notification.
  • if the storage control program 507 determines in S4190 that the SWM 111 is mounted on the server chassis 100 and the SWM power supply control IC 113 is operating, then in S4210, as described above, the storage control program 507 transmits an instruction to turn on the PCIe-SW 112 to the SWM power supply control IC 113 via the SWM control unit 205.
  • the SWM power control IC 113 turns on the power to the PCIe-SW 112 according to the instruction.
  • the PCIe-SW 112 is activated by the power of the main power supply 123. Subsequent processing is the same as that after S4220 described above.
  • even when the server PSU 122 is activated after the storage PSU 222 is activated due to power recovery after a power failure, the storage controller 201 monitors the SWM power control IC 113, so that the PCIe-SW 112 can be controlled in response to the activation of the SWM power supply control IC 113. Further, since the power supply of the computer system is divided into the server PSU 122 and the storage PSU 222, it is possible to prevent the occurrence of a configuration check error due to a PCIe-SW recognition failure even when the storage chassis 200 is activated first.
  • the computer system according to the second embodiment may not include the configuration for each failure processing according to the first embodiment.
  • the storage device includes a drive 251 and the like.
  • the interface device includes the I / F board 104 or the I / F-LSI 105.
  • the first PCI Express wiring includes a PCIe wiring between the storage CPU 202 and the SWM 111.
  • the second PCI Express wiring includes a PCIe wiring between the SWM 111 and the I / F-LSI 105 and the like.
  • the third PCI Express wiring includes a PCIe wiring between the I / F-LSI 105 and the server CPU 102 and the like.
  • the change process includes a process of transmitting a blocking request via a sideband signal, a process of bringing down a PCIe path, and the like.
  • the server-side message includes an error message from the PCIe device to the server CPU 102 and the like.
  • the storage side message includes an error message from the PCIe device to the storage CPU 202 and the like.
  • the first sideband signal line includes a sideband signal 322 and the like.
  • the second sideband signal line includes a sideband signal 323 and the like.
  • the third sideband signal line includes sideband signals 404, 405 and the like.
  • the fourth sideband signal line includes a sideband signal 403 and the like.
  • SYMBOLS: 50 ... Management client, 100 ... Server housing, 101 ... Server unit, 102 ... CPU, 103 ... Memory, 104 ... I / F board, 105 ... I / F-LSI, 107 ... Memory, 111 ... SWM, 112 ... PCIe-SW, 121 ... SVP, 122 ... PSU, 200 ... Storage enclosure, 201 ... Storage controller, 202 ... CPU, 203 ... Memory, 221 ... SVP, 222 ... PSU, 211 ... BE interface, 250 ... Drive enclosure, 251 ... Drive

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Debugging And Monitoring (AREA)

Abstract

Even when a failure occurs on a boundary between a PCIe device for notifying a server of a failure and a PCIe device for notifying a storage of a failure, the failure is notified to both of the server and the storage. In the case where an interface device detects a first failure of a second PCI Express wiring, the interface device executes a first changing process for changing a state of the second PCI Express wiring, and transmits a first server side message that indicates the first failure to a server processor via a third PCI Express wiring. On the basis of the change in the state of the second PCI Express wiring, a storage controller detects the first failure, and on the basis of the first server side message, a server processor detects the first failure.

Description

Computer system and control method
 The present invention relates to a computer system.
 In a computer system called an integrated platform, a technique that uses Fibre Channel (FC) to connect a server and storage is known. When a failure occurs in an FC path, the server and the storage cannot identify the failed part. The administrator therefore isolates the failed part using the FC-Switch (SW) log and the LED lighting state, and identifies the part to be replaced.
 According to Non-Patent Document 1, in the PCI Express (PCIe) standard, a Root Complex (RC) and Endpoints (EPs) connected to the RC form a PCIe tree. With the Advanced Error Reporting (AER) function defined by PCIe, an EP that detects a failure transmits an error message to the RC upstream in the PCIe tree. The RC that receives the error message issues an interrupt to the CPU, and the CPU checks the register values indicating the failure, thereby identifying the failed part and the failure type.
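The error reporting flow described above can be sketched as a toy model. This is a deliberately simplified illustration, not the PCIe AER register layout: the classes, the `error_log` list standing in for the registers the CPU reads, and the handler function are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class RootComplex:
    """Toy Root Complex; error_log stands in for the failure registers."""
    error_log: list = field(default_factory=list)
    interrupt_pending: bool = False

    def receive_error_message(self, source: str, error_type: str) -> None:
        # Record which device reported the failure and what kind it was,
        # then raise an interrupt toward the CPU.
        self.error_log.append((source, error_type))
        self.interrupt_pending = True


@dataclass
class Endpoint:
    """Toy Endpoint that reports detected failures upstream to its RC."""
    name: str
    root: RootComplex

    def detect_error(self, error_type: str) -> None:
        self.root.receive_error_message(self.name, error_type)


def cpu_interrupt_handler(rc: RootComplex) -> list:
    """Read and clear the recorded failures, as the CPU would on interrupt."""
    if not rc.interrupt_pending:
        return []
    rc.interrupt_pending = False
    found, rc.error_log = rc.error_log, []
    return found  # each entry identifies the failed part and failure type


if __name__ == "__main__":
    rc = RootComplex()
    Endpoint("EP0", rc).detect_error("Malformed TLP")
    print(cpu_interrupt_handler(rc))
```

Because each Endpoint reports only to the Root Complex of its own tree, a failure seen in one tree is not visible to the CPU of another tree, which is the gap the rest of the description addresses.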
 When the server and the storage are connected via PCIe devices, the computer system is managed in two parts: a server-side management range and a storage-side management range. A failure detected within the server-side management range is notified to the server, and a failure detected within the storage-side management range is notified to the storage.
 For example, a failure may be detected at a PCIe device within the storage-side management range even though the failure originates not in that PCIe device but in a PCIe device within the server-side management range. In this case, the storage recognizes the failure, but the server does not, because no failure has been detected within the server-side management range. Thus, in a computer system including a server-side management range and a storage-side management range, the administrator may be unable to identify the failed part when a PCIe path failure occurs.
 In order to solve the above problem, a computer system according to one aspect of the present invention includes a storage device, a storage controller connected to the storage device, a switch module connected to the storage controller via a first PCI Express wiring, an interface device connected to the switch module via a second PCI Express wiring, and a server processor connected to the interface device via a third PCI Express wiring. The storage controller includes a Root Complex of a first PCI Express tree, the server processor includes a Root Complex of a second PCI Express tree, the switch module belongs to the first PCI Express tree, and the interface device includes a first Endpoint connected to the second PCI Express wiring and belonging to the first PCI Express tree, and a second Endpoint connected to the third PCI Express wiring and belonging to the second PCI Express tree. When the interface device detects a first failure of the second PCI Express wiring, the interface device executes a first change process that changes the state of the second PCI Express wiring, and transmits a first server-side message indicating the first failure to the server processor via the third PCI Express wiring. The storage controller detects the first failure based on the change in the state of the second PCI Express wiring, and the server processor detects the first failure based on the first server-side message.
 Even when a failure occurs at the boundary between a PCIe device that notifies the server of failures and a PCIe device that notifies the storage of failures, and only one of the two PCIe devices detects the failure, the failure can be notified to both the server and the storage.
The configuration of the computer system is shown.
The path configuration inside and between the server chassis 100 and the storage chassis 200 is shown.
The configuration of the server-side PCIe tree and the storage-side PCIe tree is shown.
The configuration of the connection between the I/F-LSI 105 and the PCIe-SW 112 is shown.
The programs of each part of the computer system are shown.
The first failure process is shown.
The second failure process is shown.
The third failure process is shown.
The configuration for controlling the power supply of the PCIe-SW 112 is shown.
The first part of the first power-on process is shown.
The second part of the first power-on process, following the first part, is shown.
The first part of the second power-on process is shown.
Hereinafter, embodiments of the present invention will be described with reference to the drawings.
In the following description, information may be described using the expression "xxx table", but the information may be expressed in any data structure. That is, to indicate that the information does not depend on the data structure, an "xxx table" can be referred to as "xxx information". The configuration of each table in the following description is an example: one table may be divided into two or more tables, and all or part of two or more tables may be combined into a single table.
In the following description, an ID is used as identification information for an element, but other kinds of identification information may be used instead of or in addition to it.
In the following description, when elements of the same kind are described without being distinguished, a reference sign or the common number in the reference signs is used. When elements of the same kind are distinguished, the reference sign of the element may be used, or the ID assigned to the element may be used instead of the reference sign.
In the following description, an I/O (Input/Output) request is a write request or a read request, and may be called an access request.
In the following description, processing may be described with a "program" as the subject. A program is executed by a processor (for example, a CPU (Central Processing Unit)) and performs the defined processing while using storage resources (for example, a memory) and/or an interface device (for example, a communication port) as appropriate, so the subject of the processing may be taken to be the processor. Processing described with a program as the subject may be regarded as processing performed by the processor, or by an apparatus or system that has the processor. The processor may include a hardware circuit that performs part or all of the processing. A program may be installed on an apparatus such as a computer from a program source. The program source may be, for example, a program distribution server or a computer-readable storage medium. When the program source is a program distribution server, the program distribution server includes a processor (for example, a CPU) and a storage resource, and the storage resource may store a distribution program and the programs to be distributed. By executing the distribution program, the processor of the program distribution server may distribute the target programs to other computers. In the following description, two or more programs may be realized as one program, and one program may be realized as two or more programs.
In the following description, the management system may consist of one or more computers. Specifically, for example, when a management computer displays information (more specifically, when the management computer displays the information on its own display device, or transmits display information to a remote display computer), the management computer is the management system. When functions equivalent to those of the management computer are realized by a plurality of computers, those computers (which may include the display computer when the display computer performs the display) constitute the management system. The management computer (for example, the management system) may have an interface device connected to an I/O system that includes a display system, a storage resource (for example, a memory), and a processor connected to the interface device and the storage resource. The display system may be a display device of the management computer or a display computer connected to the management computer. The I/O system may be an I/O device of the management computer (for example, a keyboard and pointing device, or a touch panel), or a display computer or another computer connected to the management computer. For the management computer to "display display information" means to display it on the display system; this may mean displaying it on a display device of the management computer, or transmitting it to the display computer (in the latter case, the display computer displays it). Likewise, for the management computer to input and output information may mean exchanging information with an I/O device of the management computer, or with a remote computer (for example, the display computer) connected to the management computer. Output of information may be display of information.
The configuration of the computer system of this embodiment is described below.
FIG. 1 shows the configuration of the computer system.
The computer system includes one or more server chassis 100, one or more storage chassis 200, one or more drive chassis 250, and a management client 50 (management computer). The server chassis 100 is connected to the storage chassis 200 via PCIe wiring. The PCIe wiring may be a cable or a circuit board. The storage chassis 200 is connected to the drive chassis 250 via an interface such as PCIe, SAS, or SATA. The management client 50 is connected to the server chassis 100 and the storage chassis 200 via a communication network such as a Local Area Network (LAN).
The server chassis 100 includes one or more server units 101, a plurality of switch modules (SWM) 111, a plurality of Service Processors (SVP) 121, and a plurality of Power Supply Units (PSU) 122. The server unit 101 is connected to the SWM 111 via PCIe wiring and to the SVP 121 via LAN wiring. The SWM 111 is connected to the storage chassis 200.
The server unit 101 includes a Central Processing Unit (CPU) 102, a memory (MEM) 103, an interface (I/F) board 104, and a Baseboard Management Controller (BMC) 106. The server unit 101 is, for example, a server blade.
Hereinafter, for distinction, the CPU 102 may be called the server CPU 102 (server processor), the SVP 121 may be called the server SVP 121 (server management processor), and the PSU 122 may be called the server PSU 122 (server power supply circuit).
The server CPU 102 is connected to the memory 103 and the BMC 106. The memory 103 stores programs, such as an Operating System (OS) and applications, and data. The server CPU 102 executes processing such as I/O to the storage chassis 200 according to the programs stored in the memory 103. The server CPU 102 manages the configuration information of the server unit 101 and the model name of each component in the server unit 101. When the configuration information detected in the server unit 101 is inconsistent with the configuration information set by the server SVP 121, the server CPU 102 reports an error to the server SVP 121 via the BMC 106.
The BMC 106 is connected to the server SVP 121. The BMC 106 monitors the operating status of the server unit 101, such as the state of the server CPU 102 and the temperature, voltage, and fan speed in the server unit 101, detects failures in the server unit 101, and transmits information on the detected failures to the server SVP 121.
The server CPU 102 is connected to the I/F board 104 via PCIe wiring. The server CPU 102, the memory 103, and the BMC 106 are mounted on the motherboard (server blade) of the server unit 101. The I/F board 104 is, for example, a mezzanine board connected to the motherboard, and is replaceable.
The I/F board 104 includes an I/F Large-Scale Integrated circuit (I/F-LSI) 105. The I/F-LSI 105 is connected to the SWM 111 via PCIe wiring. The I/F-LSI 105 includes a memory that stores a program and data, and a processor that executes processing according to the program.
The I/F-LSI 105 includes a DMA transfer control circuit and, based on instructions from the storage CPU 202, executes DMA transfers between the memory 103 in the server unit 101 and the memory 203 in the storage controller 201. When the upper-layer protocol between the CPU 102 in the server unit 101 and the I/F-LSI 105 differs from the upper-layer protocol between the CPU 202 in the storage controller 201 and the I/F-LSI 105, the I/F-LSI 105 performs protocol conversion.
The SWM 111 includes at least one PCIe switch (PCIe-SW) 112. The PCIe-SW 112 is connected to the I/F-LSI 105 in the server chassis 100 via PCIe wiring. The PCIe-SW 112 is also connected to the storage chassis 200. Another PCIe device, such as a PCIe bridge, may be used instead of the PCIe-SW 112.
Based on information from the management client 50, the server SVP 121 sets the configuration information of the modules in the server chassis 100, such as the server unit 101. Based on information from the management client 50, the server SVP 121 also powers on and shuts down each server unit 101. In addition, the server SVP 121 monitors the operating status of the server chassis 100, such as the state of the server unit 101 and the temperature, voltage, and fan speed in the server chassis 100, detects failures in the server chassis 100, and transmits information on the detected failures to the management client 50.
The server PSU 122 supplies power from an external power source to each part of the server chassis 100.
The storage chassis 200 includes a plurality of storage controllers (CTL) 201, a plurality of Back-End (BE) interfaces 211, a plurality of SVPs 221, and a plurality of PSUs 222. The storage controller (CTL) 201 is connected to the PCIe-SW 112 via PCIe wiring and to the BE interface 211 via PCIe wiring. One or more drive chassis 250 are connected to the BE interface 211. The PSU 222 is connected to an external power source and supplies power to each part of the storage chassis 200.
The storage controller 201 includes a CPU 202 and a memory (MEM) 203.
Hereinafter, for distinction, the CPU 202 may be called the storage CPU 202 (storage processor), the SVP 221 may be called the storage SVP 221 (storage management processor), and the PSU 222 may be called the storage PSU 222 (storage power supply circuit).
The memory 203, the SWM 111, and the BE 211 are connected to the storage CPU 202. The memory 203 stores the programs and data of the storage controller 201. The storage CPU 202 executes the processing of the storage controller 201 according to the programs stored in the memory 203. The storage CPU 202 manages the configuration information of the storage chassis 200 and the drive chassis 250 and the model name of each component in the storage chassis 200 and the drive chassis 250. When the configuration information detected in the storage chassis 200 and the drive chassis 250 is inconsistent with the configuration information set by the storage SVP 221, the storage CPU 202 reports an error to the storage SVP 221.
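The consistency check between detected and configured configuration information described above can be sketched as follows. This is only an illustration: the function name, the dictionary representation, and the model names are hypothetical and are not taken from the embodiment.

```python
def find_config_mismatches(detected: dict[str, str],
                           configured: dict[str, str]) -> list[str]:
    """Return the components whose detected model name differs from the
    model name set by the SVP (a missing entry also counts as a mismatch)."""
    mismatches = []
    for component in sorted(set(detected) | set(configured)):
        if detected.get(component) != configured.get(component):
            mismatches.append(component)
    return mismatches


# Hypothetical model names, for illustration only.
detected = {"CTL 201": "model-A", "drive 251": "model-X"}
configured = {"CTL 201": "model-A", "drive 251": "model-Y"}

# A non-empty result corresponds to the case where an error is reported to the SVP.
errors = find_config_mismatches(detected, configured)
```

In this sketch a non-empty `errors` list stands for the condition under which the CPU would report an error to its SVP.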
The drive chassis 250 includes one or more drives 251. The drive 251 is connected to the BE interface 211 directly or via a device such as a switch.
Based on information from the management client 50, the storage SVP 221 sets the configuration information of the modules in the storage chassis 200 and the drive chassis 250, such as the storage controller 201 and the drive 251. Based on information from the management client 50, the storage SVP 221 also monitors the operating status of the storage chassis 200 and the drive chassis 250, such as the state of the storage controller 201 and the temperature, voltage, and fan speed in the storage chassis 200, detects failures in the storage chassis 200 and the drive chassis 250, and transmits information on the detected failures to the management client 50.
The storage PSU 222 supplies power from an external power source to each part of the storage chassis 200.
The management client 50 includes a memory that stores programs and data, a processor that executes processing according to the programs, an input device that accepts input from the administrator, and a display device that displays information from the server SVP 121 and the storage SVP 221. By remotely accessing the server SVP 121, the management client 50 calls up a server management screen for managing the server chassis 100 and displays it on the display device. By remotely accessing the storage SVP 221, the management client 50 likewise calls up a storage management screen for managing the storage chassis 200 and displays it on the display device. The management client 50 also receives failure information from the server SVP 121 and the storage SVP 221 and displays the received information on the display device.
FIG. 2 shows the path configuration inside and between the server chassis 100 and the storage chassis 200 in a computer system that includes one server chassis 100 and one storage chassis 200.
In this embodiment, the server chassis 100 includes eight server units 101 and two SWMs 111. Each SWM 111 includes one PCIe-SW 112. The PCIe-SW 112 includes four virtual switches VS0 to VS3. The storage chassis 200 includes two storage controllers 201 and eight BEs 211. Each storage controller 201 includes two storage CPUs 202. Each storage CPU 202 is connected to two BEs 211. The two storage CPUs 202 in each storage controller 201 are connected to each other. The two storage CPUs 202 in one storage controller 201 are connected to the two storage CPUs 202 in the other storage controller 201, respectively.
Each I/F-LSI 105 is connected via PCIe wiring to two virtual switches located in different SWMs 111. Those two virtual switches are connected via PCIe wiring to CPUs 202 in the two storage controllers 201, respectively. The two storage CPUs 202 in each storage controller 201 include a total of four PCIe Root ports, and those Root ports are connected to the four virtual switches in one SWM 111, respectively. Each virtual switch is connected to two I/F-LSIs 105.
As a result, the computer system of this embodiment includes eight server-side PCIe trees and eight storage-side PCIe trees. The path between the I/F-LSI 105 and the storage controller 201 is also made redundant. The server CPU 102 can select one of the plurality of paths. When the server CPU 102 receives a failure notification from a PCIe device on the path in use, it can continue to access the storage controller 201 by switching from the path containing the failure to another path.
The number of each part, the number of each wiring, and the connection destination of each wiring in the computer system are not limited to the configuration of this embodiment.
In this embodiment, the number of PCIe wirings between the server units 101 and the SWMs 111 is larger than the number of PCIe wirings between the SWMs 111 and the storage controllers 201. In this case, because the server chassis 100 contains the SWMs 111, the number of PCIe wirings between the server chassis 100 and the storage chassis 200 can be kept lower than in a configuration in which the SWMs are provided in the storage chassis 200. This makes connection and management between the chassis easier and reduces failures.
With this configuration, modules such as the I/F board 104 and the SWM 111 can be recognized as failed parts and can be replaced module by module.
With this configuration, because the computer system is divided into the server chassis 100 and the storage chassis 200, the number of each chassis can be changed, for example when scaling out.
In addition, because the server CPU 102 and the I/F-LSI 105 are connected by PCIe and the I/F-LSI 105 and the storage CPU 202 are connected by PCIe, I/O performance can be improved. The BE 211 and the drive 251 may also be connected by PCIe.
FIG. 3 shows the configuration of the server-side PCIe tree and the storage-side PCIe tree.
In the server-side PCIe tree, the server CPU 102 includes a Root Complex (RC) 610, and the PCIe port of the RC 610 is the Root port 611. In the I/F-LSI 105, the PCIe port connected to the Root port 611 is the Endpoint port 612.
In the storage-side PCIe tree, the storage CPU 202 includes an RC 620, and the PCIe port of the RC 620 is the Root port 621. In the PCIe-SW 112, the PCIe port connected to the Root port 621 is the Upstream port 622. In the PCIe-SW 112, the PCIe port on the I/F-LSI 105 side is the Downstream port 311, and in the I/F-LSI 105, the PCIe port connected to the Downstream port 311 of the PCIe-SW 112 is the Endpoint port 301. That is, the Root port 611 and the Endpoint port 612 belong to the server-side PCIe tree, and the Root port 621, the Upstream port 622, the Downstream port 311, and the Endpoint port 301 belong to the storage-side PCIe tree.
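The port-to-tree membership above can be written down as a small table. The following sketch only restates the text as data (port numbers, port types, owning devices, and trees are from the description); the helper function names are hypothetical.

```python
from collections import defaultdict

SERVER_TREE = "server-side"
STORAGE_TREE = "storage-side"

# port number -> (port type, device containing the port, PCIe tree)
PORTS = {
    611: ("Root", "server CPU 102", SERVER_TREE),
    612: ("Endpoint", "I/F-LSI 105", SERVER_TREE),
    621: ("Root", "storage CPU 202", STORAGE_TREE),
    622: ("Upstream", "PCIe-SW 112", STORAGE_TREE),
    311: ("Downstream", "PCIe-SW 112", STORAGE_TREE),
    301: ("Endpoint", "I/F-LSI 105", STORAGE_TREE),
}

# PCIe wirings between ports, as described in the text.
LINKS = [(611, 612), (621, 622), (311, 301)]


def devices_spanning_trees(ports: dict) -> set[str]:
    """Devices that own ports in more than one PCIe tree."""
    trees_by_device = defaultdict(set)
    for _port_type, device, tree in ports.values():
        trees_by_device[device].add(tree)
    return {dev for dev, trees in trees_by_device.items() if len(trees) > 1}


def links_stay_within_one_tree(ports: dict, links: list) -> bool:
    """Every wiring joins two ports that belong to the same tree."""
    return all(ports[a][2] == ports[b][2] for a, b in links)


spanning = devices_spanning_trees(PORTS)
```

In this table the I/F-LSI 105 is the only device with ports in both trees, which is why it is the point where packets cross from one tree to the other.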
The I/F-LSI 105 converts a server-side PCIe tree packet received at the Endpoint port 612 into a storage-side PCIe tree packet and transmits that packet from the Endpoint port 301 to the PCIe-SW 112. The I/F-LSI 105 also converts a storage-side PCIe tree packet received at the Endpoint port 301 into a server-side PCIe tree packet and transmits that packet from the Endpoint port 612 to the server CPU 102.
PCIe devices such as the I/F-LSI 105 and the PCIe-SW 112 include registers that indicate, for each port and each failure type, whether a failure has occurred. When a PCIe device detects a hardware failure of a PCIe path (a PCIe failure) through the occurrence of an uncorrectable error such as an Unsupported Request, it writes failure information indicating that PCIe failure to the corresponding register. By reading these registers after receiving a PCIe failure notification, the server CPU 102 and the storage CPU 202 can identify the location and type of every failure that occurred at the same time.
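A minimal sketch of such per-port, per-failure-type registers follows. The class and method names are hypothetical, the register file is modeled as a plain set rather than as hardware bit fields, and "Completion Timeout" is used only as a second illustrative uncorrectable error type alongside the Unsupported Request named in the text.

```python
class PortErrorRegisters:
    """One flag per (port, failure type), set when a failure is detected."""

    def __init__(self) -> None:
        self._flags: set[tuple[int, str]] = set()

    def record(self, port: int, failure_type: str) -> None:
        # Called by the PCIe device when it detects a PCIe failure.
        self._flags.add((port, failure_type))

    def read_all(self) -> list[tuple[int, str]]:
        # Called by the CPU after it receives a PCIe failure notification.
        return sorted(self._flags)


registers = PortErrorRegisters()
registers.record(301, "Unsupported Request")
registers.record(311, "Completion Timeout")

# One read pass recovers the location and type of every recorded failure.
failures = registers.read_all()
```

Keeping one flag per port and per type is what lets a single notification be followed by a complete scan, instead of requiring one notification per failure.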
When a PCIe device in the server-side PCIe tree detects a PCIe failure, it notifies the RC 610 of the server-side PCIe tree of the PCIe failure. The server CPU 102 detects the PCIe failure through an interrupt from the RC 610.
When a PCIe device in the storage-side PCIe tree detects a PCIe failure, it notifies the RC 620 of the storage-side PCIe tree of the PCIe failure. The storage CPU 202 detects the PCIe failure through an interrupt from the RC 620.
In this embodiment, separately from the range of each chassis and each PCIe tree, the range of parts of the computer system that are managed together with the server unit 101 is defined as the server-side management range, and the range of parts that are managed together with the storage controller 201 is defined as the storage-side management range. The server-side management range includes the server unit 101. The storage-side management range includes the storage chassis 200, the drive chassis 250, and the SWM 111.
The server CPU 102 transmits a maintenance report indicating the PCIe failure information, maintenance action, and maintenance part model name within the server-side management range to the management client 50 via the server SVP 121. The storage CPU 202 transmits a maintenance report indicating the PCIe failure information, maintenance action, and maintenance part model name within the storage-side management range to the management client 50 via the storage SVP 221.
The part from the Endpoint port 301 of the I/F-LSI 105 to the Downstream port 311 of the PCIe-SW 112 is the boundary between the server-side management range and the storage-side management range. This part is called the boundary range. When the PCIe device on the packet-receiving side detects a PCIe failure, the cause may be a hardware failure of the receiving PCIe device or a hardware failure of the transmitting PCIe device. Therefore, when a PCIe failure is detected in the boundary range, both the I/F-LSI 105 and the PCIe-SW 112 become replacement candidates. If only one of the I/F-LSI 105 and the PCIe-SW 112 detected the PCIe failure and notified only the RC of the PCIe tree to which the detecting PCIe device belongs, the administrator would be told to replace only one of the two. For example, if only a PCIe failure of the SWM 111 is detected in the boundary range and the storage CPU 202 notifies the administrator, the administrator can replace the SWM 111 but cannot replace the I/F board 104, which may also be the cause of the PCIe failure. In this embodiment, therefore, when one PCIe device in the boundary range detects a PCIe failure, the PCIe failure is also notified to the other PCIe device.
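The effect of notifying both sides can be illustrated with a deliberately simplified model. The assumptions here are that the I/F-LSI 105 reports toward the server CPU, that the PCIe-SW 112 reports toward the storage CPU, and that each side only reports replacement candidates within its own management range; the function and table names are hypothetical.

```python
# Side that each boundary device reports to (simplifying assumption).
DETECTOR_SIDE = {"I/F-LSI 105": "server", "PCIe-SW 112": "storage"}

# Management range of each replaceable module at the boundary (from the text).
MODULE_SIDE = {"I/F board 104": "server", "SWM 111": "storage"}


def reported_replacements(detector: str, cross_notify: bool) -> set[str]:
    """Modules the administrator is told to replace for a boundary failure."""
    sides = {DETECTOR_SIDE[detector]}
    if cross_notify:
        # The failure is also notified to the device on the other side.
        sides = set(DETECTOR_SIDE.values())
    return {module for module, side in MODULE_SIDE.items() if side in sides}


# Only the PCIe-SW 112 detects the failure.
without_cross = reported_replacements("PCIe-SW 112", cross_notify=False)
with_cross = reported_replacements("PCIe-SW 112", cross_notify=True)
```

Under these assumptions, `without_cross` contains only the SWM 111, while `with_cross` contains both replacement candidates, which is the situation the embodiment aims for.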
When the server chassis 100 and the storage chassis 200 report detected failures to a single management client 50, the administrator can accurately identify the location of a failure in the computer system and can manage the entire computer system in an integrated manner.
Management clients corresponding to the server-side management range and the storage-side management range, respectively, may be provided, for example when the two ranges are managed by different administrators. In that case, maintenance reports for the server-side management range may be transmitted to the management client corresponding to the server-side management range, and maintenance reports for the storage-side management range may be transmitted to the management client corresponding to the storage-side management range. The server-side management range and the storage-side management range may also be managed by different management software. In that case, the management client 50 may display each maintenance report using the management software corresponding to that report. Even when the boundary between the server-side management range and the storage-side management range differs from the boundary between the server-side PCIe tree and the storage-side PCIe tree, the computer system can notify the administrator of the failed part in each of the two management ranges.
FIG. 4 shows the configuration of the connection between the I/F-LSI 105 and the PCIe-SW 112.
The I/F-LSI 105 includes the Endpoint port 301 described above, a block request notification register 304, and a block request reception register 305. The Endpoint port 301 includes a PCIe link control register 302 and a PCIe link status register 303.
The PCIe-SW 112 includes the Downstream port 311 described above, a block request reception register 314, and a block request notification register 315. The Downstream port 311 includes a PCIe link control register 312 and a PCIe link status register 313.
The PCIe link control register 302 indicates PCIe link control information written by the I/F-LSI 105. The PCIe link status register 303 indicates the PCIe link status received by the I/F-LSI 105. The PCIe link control register 312 indicates PCIe link control information written by the storage CPU 202. The PCIe link status register 313 indicates the PCIe link status received by the storage CPU 202. The block request notification register 304 indicates a block request written by the I/F-LSI 105. The block request reception register 305 indicates a block request received by the I/F-LSI 105. The block request notification register 315 indicates a block request written by the storage CPU 202. The block request reception register 314 indicates a block request received by the storage CPU 202.
The PCIe wiring 320 between the I/F-LSI 105 and the PCIe-SW 112 includes a PCIe lane 321 and sideband signals 322 and 323. The lane 321 is a pair consisting of a transmit differential signal and a receive differential signal. The sideband signal 322 carries a block request written to the block request notification register 304 of the I/F-LSI 105 to the block request reception register 314 of the PCIe-SW 112. The sideband signal 323 carries a block request written to the block request notification register 315 of the PCIe-SW 112 to the block request reception register 305 of the I/F-LSI 105. The PCIe wiring 320 may be a cable or a board such as a backplane.
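The register and sideband arrangement of FIG. 4 can be pictured with a small toy model. The `BoundaryWiring` class and its method names are hypothetical; the sketch only assumes that a write to a notification register appears in the peer's reception register whenever the corresponding sideband signal is present, independently of the state of the lane 321.

```python
from dataclasses import dataclass


@dataclass
class BoundaryWiring:
    """Toy model of PCIe wiring 320 and the four block-request registers."""
    lane_321_up: bool = True
    has_sideband_322: bool = True  # I/F-LSI 105 -> PCIe-SW 112
    has_sideband_323: bool = True  # PCIe-SW 112 -> I/F-LSI 105
    reg_304: bool = False  # block request notification register (I/F-LSI 105)
    reg_305: bool = False  # block request reception register (I/F-LSI 105)
    reg_314: bool = False  # block request reception register (PCIe-SW 112)
    reg_315: bool = False  # block request notification register (PCIe-SW 112)

    def write_304(self) -> None:
        """I/F-LSI 105 posts a block request; sideband 322 carries it to 314."""
        self.reg_304 = True
        if self.has_sideband_322:
            self.reg_314 = True

    def write_315(self) -> None:
        """Storage CPU 202 posts a block request; sideband 323 carries it to 305."""
        self.reg_315 = True
        if self.has_sideband_323:
            self.reg_305 = True


wiring = BoundaryWiring(lane_321_up=False)  # the lane itself has already failed
wiring.write_304()
print(wiring.reg_314, wiring.reg_305)
```

The point of the model is that delivery of the request does not depend on `lane_321_up`.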
FIG. 5 shows the programs of each part of the computer system.
The memory 103 in the server unit 101 stores an I/F-LSI driver 501 for controlling the I/F-LSI 105. The server CPU 102 executes processing on the I/F-LSI 105 in accordance with the I/F-LSI driver 501. The I/F-LSI driver 501 includes a path switching program 502 and a maintenance reporting program 503. When a PCIe path containing a PCIe failure is blocked, the path switching program 502 switches that PCIe path to another path. The maintenance reporting program 503 sends a maintenance report indicating the PCIe failure information notified by the I/F-LSI 105, the maintenance action, and the model name of the maintenance part to the management client 50 via the server SVP 121.
The memory 107 in the I/F-LSI 105 stores an I/F-LSI control program 504 for controlling the I/F-LSI 105. The processor in the I/F-LSI 105 executes processing in accordance with the I/F-LSI control program 504. The I/F-LSI control program 504 includes a periodic monitoring program 505 and a failure processing program 506. The periodic monitoring program 505 performs periodic monitoring, in which it periodically reads the registers in the I/F-LSI 105. When a PCIe failure is detected, the failure processing program 506 performs operations such as detecting other PCIe failures, blocking the failed part, and reporting the PCIe failure.
The memory 203 in the storage controller 201 stores a storage control program 507 for controlling the storage chassis 200. The storage CPU 202 executes processing in accordance with the storage control program 507. The storage control program 507 includes an initial setting program 508, a periodic monitoring program 509, a failure processing program 510, and a maintenance reporting program 511. The initial setting program 508 performs the initial setting of the storage chassis 200. The periodic monitoring program 509 performs periodic monitoring, in which it periodically reads the registers in the storage controller 201. When a PCIe failure is detected, the failure processing program 510 performs operations such as blocking the failed part and reporting the PCIe failure. The maintenance reporting program 511 sends a maintenance report indicating the PCIe failure information notified by a PCIe device in the storage-side PCIe tree, the maintenance action, and the model name of the maintenance part to the management client 50 via the storage SVP 221.
The operation of the computer system is described below.
First, the first failure processing is described, which is performed when the I/F-LSI 105 in the server-side management range detects a PCIe failure in the boundary range.
FIG. 6 shows the first failure processing.
As server-side processing, that is, processing in the server-side management range, this sequence shows the operation of the I/F-LSI driver 501 executed by the server CPU 102, the I/F-LSI control program 504 executed by the processor in the I/F-LSI 105, and the Endpoint port 301 in the I/F-LSI 105. As storage-side processing, that is, processing in the storage-side management range, the sequence further shows the operation of the Downstream port 311 of the PCIe-SW 112, the block request reception register 314 of the PCIe-SW 112, and the storage control program 507 of the storage CPU 202.
In S1110, when the Endpoint port 301 of the I/F-LSI 105 detects a PCIe failure, it issues an interrupt to the I/F-LSI control program 504. On receiving the interrupt, the I/F-LSI control program 504 starts the failure processing program 506 in S1120.
Then, in S1130, the failure processing program 506 performs failure information collection, in which it reads each register in the I/F-LSI 105 to obtain failure information. In this way, the failure processing program 506 can obtain failure information indicating the type of the detected PCIe failure as well as the location and type of any other PCIe failures. Then, in S1140, the failure processing program 506 blocks (Link Disable, that is, disconnects) the PCIe path containing the PCIe failure. In this example, the failure processing program 506 blocks the PCIe path of the lane 321 by writing at least to the PCIe link control register 302 in the Endpoint port 301. As a result, the PCIe path enters the link-down state, in which communication is not possible.
Then, in S1150, the failure processing program 506 writes a block request, which requests blocking of the PCIe path and issuance of a maintenance report, to the block request notification register 304 in the I/F-LSI 105. The block request is thereby sent through the sideband signal 322 to the block request reception register 314 in the PCIe-SW 112.
In S1210, the periodic monitoring program 509 of the storage control program 507 in the storage controller 201 performs periodic monitoring, in which it periodically checks the registers in the storage-side PCIe tree. Then, in S1220, the periodic monitoring program 509 determines whether the block request reception register 314 holds a block request.
If it is determined that there is no block request (No), the periodic monitoring program 509 performs the next periodic monitoring after a preset time has elapsed. If it is determined that there is a block request (Yes), the failure processing program 510, in S1230, performs failure information collection, in which it reads each register in the storage-side PCIe tree to obtain failure information. Then, in S1240, the failure processing program 510 blocks the PCIe path containing the PCIe failure indicated in the failure information. In this example, the failure processing program 510 blocks the PCIe path of the lane 321 by writing at least to the PCIe link control register 312 in the Downstream port 311. Then, in S1260, the maintenance reporting program 511 sends a switch module (SWM) maintenance report, which indicates the PCIe failure information, maintenance action, and maintenance part model name for the SWM 111, to the management client 50 via the storage SVP 221.
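The storage-side branch of the first failure processing (S1210 to S1260) can be summarized as one pass of a polling routine. The function `storage_poll` and its step strings are hypothetical; the sketch only lists the steps in the order the text gives them and performs no hardware access.

```python
def storage_poll(reg_314_has_request: bool) -> list[str]:
    """One pass of periodic monitoring on the storage side (illustrative only)."""
    steps = ["S1210 periodic monitoring", "S1220 check block request reception register 314"]
    if not reg_314_has_request:
        # No request: simply wait and poll again later.
        steps.append("wait for the preset interval")
        return steps
    steps += [
        "S1230 collect failure information from the storage-side PCIe tree",
        "S1240 block the PCIe path via PCIe link control register 312",
        "S1260 send SWM maintenance report via storage SVP 221",
    ]
    return steps


for step in storage_poll(reg_314_has_request=True):
    print(step)
```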
After S1150, in S1160, the failure processing program 506 sends an error message indicating the PCIe failure to the I/F-LSI driver 501 of the server CPU 102, based on the obtained failure information. Then, in S1170, the path switching program 502 of the I/F-LSI driver 501 switches the path containing the PCIe failure indicated in the error message to another path. Then, in S1180, the maintenance reporting program 503 of the I/F-LSI driver 501 sends an I/F board maintenance report, which indicates the PCIe failure information, maintenance action, and maintenance part model name for the I/F board 104, to the management client 50 via the BMC 106 and the server SVP 121. The management client 50 displays a screen based on the I/F board maintenance report.
With the first failure processing described above, when the I/F-LSI 105 detects a PCIe failure in the boundary range, it can notify the PCIe-SW 112 of that failure by using the sideband signal 322. As a result, both the server CPU 102 and the storage CPU 202 can detect the PCIe failure in the boundary range and send the administrator a maintenance report indicating the failed part. Even when the lane 321 has a PCIe failure, the I/F-LSI 105 can notify the PCIe-SW 112 by using the sideband signal 322. In addition, because the I/F-LSI 105 and the PCIe-SW 112 block the PCIe path containing the PCIe failure, other processing is prevented from using that PCIe path.
Next, the second failure processing is described, which is performed when the PCIe-SW 112 in the storage-side management range detects a PCIe failure in the boundary range.
FIG. 7 shows the second failure processing.
As server-side processing, this sequence shows the operation of the I/F-LSI driver 501 executed by the server CPU 102, the I/F-LSI control program 504 executed by the processor in the I/F-LSI 105, the block request reception register 305 in the I/F-LSI 105, and the Endpoint port 301 in the I/F-LSI 105. As storage-side processing, the sequence further shows the operation of the Downstream port 311 of the PCIe-SW 112, the storage CPU 202, which is the RC of the storage-side PCIe tree, and the storage control program 507 of the storage CPU 202.
In S2110, when the Downstream port 311 of the PCIe-SW 112 detects a PCIe failure, it sends an error message to the RC 620. On receiving the error message, the RC 620 issues an interrupt to the storage control program 507. On receiving the interrupt, the storage control program 507 starts the failure processing program 510 in S2120.
In S2130, the failure processing program 510 performs failure information collection, in which it reads the registers in the storage-side PCIe tree to obtain failure information. Then, in S2140, the failure processing program 510 blocks the PCIe path containing the PCIe failure indicated in the failure information. In this example, the failure processing program 510 blocks the PCIe path of the lane 321 by writing at least to the PCIe link control register 312 in the Downstream port 311. As a result, the PCIe path enters the link-down state, in which communication is not possible. Then, in S2150, the failure processing program 510 writes a block request, which requests blocking of the PCIe path and issuance of a maintenance report, to the block request notification register 315 of the PCIe-SW 112. The block request is thereby sent through the sideband signal 323 to the block request reception register 305 of the I/F-LSI 105.
Then, in S2160, the maintenance reporting program 511 sends an SWM maintenance report, which indicates the PCIe failure information, maintenance action, and maintenance part model name for the SWM 111, to the management client 50 via the storage SVP 221. The management client 50 displays a screen based on the SWM maintenance report.
In S2210, the periodic monitoring program 505 of the I/F-LSI control program 504 in the I/F-LSI 105 performs periodic monitoring, in which it periodically checks the registers of the I/F-LSI 105. Then, in S2220, the periodic monitoring program 505 determines whether the block request reception register 305 holds a block request.
If it is determined that there is no block request (No), the periodic monitoring program 505 performs the next periodic monitoring after a preset time has elapsed. If it is determined that there is a block request (Yes), the failure processing program 506, in S2230, performs failure information collection, in which it reads each register in the I/F-LSI 105 to obtain failure information. Then, in S2240, the failure processing program 506 blocks the PCIe path containing the PCIe failure. In this example, the failure processing program 506 blocks the PCIe path of the lane 321 by writing at least to the PCIe link control register 302 in the Endpoint port 301.
Then, in S2260, the failure processing program 506 sends an error message indicating the PCIe failure to the I/F-LSI driver 501 of the server CPU 102, based on the obtained failure information. Then, in S2270, the path switching program 502 of the I/F-LSI driver 501 switches the path containing the PCIe failure indicated in the error message to another path. Then, in S2280, the maintenance reporting program 503 of the I/F-LSI driver 501 sends an I/F board maintenance report, which indicates the PCIe failure information, maintenance action, and maintenance part model name for the I/F board 104, to the management client 50 via the BMC 106 and the server SVP 121. The management client 50 displays a screen based on the I/F board maintenance report.
With the second failure processing described above, when the PCIe-SW 112 detects a PCIe failure in the boundary range, it can notify the I/F-LSI 105 of that failure by using the sideband signal 323. As a result, both the server CPU 102 and the storage CPU 202 can detect the PCIe failure in the boundary range and send the administrator a maintenance report indicating the failed part. Even when the lane 321 has a PCIe failure, the PCIe-SW 112 can notify the I/F-LSI 105 by using the sideband signal 323. In addition, because the I/F-LSI 105 and the PCIe-SW 112 block the PCIe path containing the PCIe failure, other processing is prevented from using that PCIe path.
Note that the PCIe wiring 320 between the I/F-LSI 105 and the PCIe-SW 112 need not include the sideband signal 323. In a computer system with this configuration, when the PCIe-SW 112 in the storage-side management range detects a PCIe failure in the boundary range, the third failure processing is executed instead of the second failure processing.
FIG. 8 shows the third failure processing.
In the third failure processing, elements with the same reference numerals as in the second failure processing are the same as the corresponding elements of the second failure processing.
When the Downstream port 311 of the PCIe-SW 112 detects a PCIe failure, the storage-side processing executes S2110 to S2140 and S2160, as in the second failure processing.
S2140 brings the PCIe path into the link-down state. In S2170, when the Endpoint port 301 of the I/F-LSI 105 detects the link down, it issues an interrupt to the I/F-LSI control program 504.
On receiving the interrupt, the I/F-LSI control program 504 executes S2230 to S2260, as in the second failure processing. On receiving the error message, the I/F-LSI driver 501 executes S2270 to S2280, as in the second failure processing.
With the third failure processing described above, a PCIe failure detected by the PCIe-SW 112 can be notified to the I/F-LSI 105 even if the PCIe wiring 320 does not include the sideband signal 323. As a result, the PCIe devices on both sides of the boundary range can detect the PCIe failure. When the I/F-LSI 105 detects a PCIe failure in the boundary range, a computer system with this configuration executes the first failure processing.
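The idea behind the third failure processing, that blocking the link on one side serves as an implicit notification to the other side, can be sketched as follows. The `Port` class and the `block_link` function are hypothetical and model only the cause-and-effect relation between S2140 and S2170.

```python
from dataclasses import dataclass, field


@dataclass
class Port:
    name: str
    events: list[str] = field(default_factory=list)


def block_link(actor: Port, peer: Port) -> None:
    """The actor disables the link; the peer observes the resulting link down."""
    actor.events.append("link disabled")                   # corresponds to S2140
    peer.events.append("link down detected -> interrupt")  # corresponds to S2170


downstream_311 = Port("Downstream port 311")
endpoint_301 = Port("Endpoint port 301")
block_link(downstream_311, endpoint_301)
print(endpoint_301.events)
```

No sideband signal appears in this sketch; the state change of the lane itself carries the notification.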
Note that the PCIe wiring 320 between the I/F-LSI 105 and the PCIe-SW 112 need not include the sideband signal 322. A computer system with this configuration executes the fourth failure processing instead of the first failure processing.
In the fourth failure processing, when the Endpoint port 301 of the I/F-LSI 105 detects a PCIe failure, the server-side processing executes S1110 to S1140 and S1160 to S1180, as in the first failure processing.
S1140 brings the PCIe path into the link-down state. When the Downstream port 311 of the PCIe-SW 112 detects the link down, it sends an error message to the RC 620, as in S2110 of the second failure processing. On receiving the error message, the RC 620 issues an interrupt to the storage control program 507. On receiving the interrupt, the storage control program 507 executes S1230 to S1260, as in the first failure processing.
With the fourth failure processing described above, a PCIe failure detected by the I/F-LSI 105 can be notified to the PCIe-SW 112 even if the PCIe wiring 320 does not include the sideband signal 322. As a result, the PCIe devices on both sides of the boundary range can detect the PCIe failure. When the PCIe-SW 112 detects a PCIe failure in the boundary range, a computer system with this configuration executes the second failure processing.
Note that, with the first failure processing, the PCIe-SW 112 recognizes a PCIe failure notified by the I/F-LSI 105 through the sideband signal 322 and detects anomalies other than that PCIe failure through link down, so it can distinguish between the two. Anomalies other than a notified PCIe failure include, for example, the server unit 101 (server blade) being removed from the server chassis 100 while powered on, or its power being forcibly cut off.
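The distinction described above can be expressed as a small classification function on the PCIe-SW 112 side. The `Cause` enumeration and the `classify` function are hypothetical names; the sketch assumes that a pending block request in register 314 takes precedence over a bare link-down observation.

```python
from enum import Enum


class Cause(Enum):
    NONE = "no event"
    NOTIFIED_PCIE_FAILURE = "PCIe failure notified by I/F-LSI 105"
    OTHER_ANOMALY = "other anomaly (e.g. blade removed or power forcibly cut)"


def classify(block_request_in_314: bool, link_down: bool) -> Cause:
    """Tell a notified PCIe failure apart from any other cause of link down."""
    if block_request_in_314:
        return Cause.NOTIFIED_PCIE_FAILURE
    if link_down:
        return Cause.OTHER_ANOMALY
    return Cause.NONE


print(classify(block_request_in_314=True, link_down=True).value)
print(classify(block_request_in_314=False, link_down=True).value)
```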
Note that the PCIe wiring 320 between the I/F-LSI 105 and the PCIe-SW 112 may include neither of the sideband signals 322 and 323. A computer system with this configuration executes the fourth failure processing instead of the first failure processing and executes the third failure processing instead of the second failure processing.
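The choice among the four failure processings, depending on which device detects the failure and which sideband signals the wiring provides, can be tabulated in a short function. The function name `select_processing` and the string labels are hypothetical; the mapping itself follows the text.

```python
def select_processing(detector: str, has_322: bool, has_323: bool) -> str:
    """Return which failure processing applies for a given wiring configuration."""
    if detector == "I/F-LSI 105":
        # Sideband 322 carries the request toward the PCIe-SW 112.
        return "first" if has_322 else "fourth"
    if detector == "PCIe-SW 112":
        # Sideband 323 carries the request toward the I/F-LSI 105.
        return "second" if has_323 else "third"
    raise ValueError(f"unknown detector: {detector}")


for detector in ("I/F-LSI 105", "PCIe-SW 112"):
    print(detector, "->", select_processing(detector, has_322=False, has_323=False))
```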
Note that the computer system may combine the first failure processing with the fourth failure processing, and may combine the second failure processing with the third failure processing. For example, even if a failure occurs in a sideband signal, when one PCIe device in the boundary range detects a PCIe failure and blocks the PCIe path, the other PCIe device can detect the PCIe failure through the resulting link down.
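One way to picture such a combination is a handler that accepts either trigger. The `BoundaryDevice` class is hypothetical, and the rule that a second trigger for an already blocked path is ignored is an assumption of this sketch; the specification does not state how duplicate triggers are handled.

```python
class BoundaryDevice:
    """One side of the boundary range (hypothetical helper)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.blocked = False
        self.reports: list[str] = []

    def on_trigger(self, source: str) -> bool:
        """React to 'sideband' or 'link_down'; repeat triggers are ignored (assumed)."""
        if self.blocked:
            return False
        self.blocked = True
        self.reports.append(f"{self.name}: maintenance report (trigger: {source})")
        return True


pcie_sw = BoundaryDevice("PCIe-SW 112")
pcie_sw.on_trigger("link_down")  # sideband signal 322 has failed; link down arrives
pcie_sw.on_trigger("sideband")   # a late block request changes nothing
print(pcie_sw.reports)
```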
When a PCIe failure occurs in the boundary range, parts in both the server-side management range and the storage-side management range become candidates for replacement. However, the PCIe failure may be detected at only one of the PCIe ports. One example of such a PCIe failure is a Malformed TLP error, which is detected at the port receiving the PCIe packet. In this case, the PCIe port on the opposite side of the PCIe wiring in the boundary range does not detect the PCIe failure. According to this embodiment, a PCIe device that detects a PCIe failure in the boundary range notifies the PCIe device on the opposite side of the boundary range of the failure by changing the state of the PCIe wiring that has the failure. The PCIe device that detects the PCIe failure through this notification can block the PCIe path containing the failure and send a maintenance report indicating the failure. As a result, both the server-side management range and the storage-side management range can detect the PCIe failure in the boundary range and notify the management client 50. In other words, in a computer system that includes a server and storage connected by PCIe, the failed part can be reliably identified and reported for maintenance.
Based on the maintenance reports, the administrator can identify the failed part accurately and can judge whether part replacement and recovery are possible. The administrator can also perform appropriate maintenance, such as replacing parts in both the server-side management range and the storage-side management range. For example, the administrator can replace the I/F board 104 indicated in the I/F board maintenance report from the server chassis 100 and can replace the SWM 111 indicated in the SWM maintenance report from the storage chassis 200. The SWM maintenance report may instead indicate the PCIe-SW 112, in which case the PCIe-SW 112 may be replaced.
The computer system of this embodiment controls the power-on of the server chassis 100 and the storage chassis 200. In this embodiment, elements with the same reference numerals as in Embodiment 1 are the same as the corresponding elements of Embodiment 1.
As described above, the server chassis 100 operates on power from the server PSU 122, and the storage chassis 200 operates on power from the storage PSU 222. The PCIe-SW 112 belongs to the storage-side management range, but because it is contained in the server chassis 100, it is supplied with power from the server PSU 122.
For reasons such as power being restored after a power outage, the storage controller 201 may start up and perform the initial setting of the storage-side management range while the PCIe-SW 112 has not yet started. In this case, although the SWM 111 belongs to the storage-side management range, it is located in the server chassis 100 and therefore has not started when the storage controller 201 performs the initial setting. As a result, during the initial setting the storage controller 201 concludes that the PCIe-SW 112 is not connected and issues a configuration check error. The computer system of this embodiment prevents a configuration check error in such a case.
FIG. 9 shows a configuration for controlling the power supply of the PCIe-SW 112.
The server PSU 122 in the server chassis 100 includes a main power supply 123 and a sub power supply 124. The SWM 111 in the server chassis 100 includes the PCIe-SW 112 and an SWM power control IC (Integrated Circuit) 113. The main power supply 123 supplies power from an external power source to the PCIe-SW 112 under the control of the SWM power control IC 113. The sub power supply 124 supplies power from the external power source to the SWM power control IC 113. The SWM power control IC 113 starts the PCIe-SW 112.
 In addition to the elements of Embodiment 1, the storage controller 201 in the storage chassis 200 includes an SWM control unit 205 connected to the storage CPU 202 via PCIe wiring. The SWM control unit 205 includes an SWM control register 206 and an SWM status register 207. The SWM control register 206 holds an activation request for starting the PCIe-SW 112. The SWM status register 207 stores status information, which includes mounting state information indicating whether the SWM 111 is mounted in the server chassis 100 and power state information indicating whether the SWM power control IC 113 is operating on power from the sub power supply 124.
 The PCIe wiring 401 between the SWM 111 and the storage controller 201 includes a PCIe lane 402 and sideband signals 403, 404, and 405. The lane 402 is a pair of differential signals, one for transmission and one for reception. The sideband signal 403 carries the activation request written to the SWM control register 206 of the SWM control unit 205 to the SWM power control IC 113. The sideband signal 404 carries the power state information from the SWM power control IC 113 to the SWM status register 207 of the SWM control unit 205. The sideband signal 405 carries the mounting state information from the reference potential (ground) of the SWM 111 to the SWM status register 207 of the SWM control unit 205. The PCIe wiring 401 may be a cable or a circuit board.
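As a rough illustration of how the sideband signals relate to the SWM status register 207, the following Python sketch composes a register value from signal levels. The bit positions, the `SidebandLines` class, and the assumption that the presence line 405 reads as "mounted" when pulled to the SWM ground are all hypothetical; the source does not specify a register layout or signal polarity.

```python
from dataclasses import dataclass

# Hypothetical bit positions; the source does not define a register layout.
MOUNTED_BIT = 0x1  # mounting state information (sideband signal 405)
POWERED_BIT = 0x2  # power state information (sideband signal 404)


@dataclass
class SidebandLines:
    """Logic levels on the sideband signals of PCIe wiring 401."""

    start_request_403: bool = False  # storage controller -> SWM power control IC
    power_state_404: bool = False    # SWM power control IC -> storage controller
    presence_405_low: bool = False   # True when tied to SWM ground (assumed polarity)


def status_register(lines: SidebandLines) -> int:
    """Compose a value for SWM status register 207 from the sideband levels."""
    value = 0
    if lines.presence_405_low:
        value |= MOUNTED_BIT
    if lines.power_state_404:
        value |= POWERED_BIT
    return value
```

Keeping the two pieces of state in separate bits mirrors the text, which treats "mounted" and "control IC operating" as independent conditions.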
 The power-on processing for powering on the computer system is described below.
 First, the first power-on process is described, in which, during a normal power-on, the administrator uses the management client 50 to power on the server chassis 100 and then the storage chassis 200.
 FIG. 10 shows the first part of the first power-on process. FIG. 11 shows the second part, which follows the first part.
 The actors in this sequence are the management client 50, the server chassis 100, and the storage chassis 200. Within the server chassis 100, the actors are the server PSU 122, the server SVP 121, the server unit 101, the PCIe-SW 112, and the SWM power control IC 113. Within the storage chassis 200, the actors are the SWM control unit 205, the storage control program 507 of the storage CPU 202, the storage SVP 221, and the storage PSU 222.
 In S3110, when the administrator powers on the server PSU 122 in the server chassis 100, the server PSU 122 supplies power to the server SVP 121 and the SWM power control IC 113.
 When the SWM power control IC 113 starts on power from the sub power supply 124, mounting state information indicating that the SWM 111 is mounted in the server chassis 100 and power state information indicating that the SWM power control IC 113 is operating are written to the SWM status register 207 via the sideband signals 404 and 405.
 Then, in S3120, when the server SVP 121 completes its startup, it transmits a completion notification indicating that its startup is complete to the management client 50. The management client 50 displays a screen based on the received completion notification.
 Then, in S3130, in response to the displayed completion notification, the administrator powers on the storage PSU 222 in the storage chassis 200, and the storage PSU 222 supplies power to the storage SVP 221.
 Then, in S3140, when the storage SVP 221 completes its startup, it transmits a completion notification indicating that its startup is complete to the management client 50. The management client 50 displays a screen based on the received completion notification.
 Then, in S3150, in response to the displayed completion notification, the administrator inputs a CTL power-on instruction for powering on the storage controller 201 to the management client 50. In response to the input, the management client 50 transmits the CTL power-on instruction to the storage SVP 221. Then, in S3160, the storage SVP 221 powers on the storage controller 201 in response to the CTL power-on instruction. Then, in S3170, the storage control program 507 starts and begins the initial setting. Then, in S3180, the storage control program 507 executes an SWM status check.
 In the SWM status check, the storage control program 507 reads the mounting state information and the power state information from the SWM status register 207 and determines whether the SWM power control IC 113 is available. The storage control program 507 determines that the SWM power control IC 113 is available if the mounting state information indicates that the SWM 111 is mounted in the server chassis 100 and the power state information indicates that the SWM power control IC 113 is operating.
 If the SWM status check determines that the SWM 111 is not mounted in the server chassis 100, or that the SWM 111 is mounted in the server chassis 100 but the SWM power control IC 113 is not operating, the storage control program 507 repeats the SWM status check at a predetermined time interval. If the SWM status check determines that the SWM 111 is connected to the storage controller 201 and the SWM power control IC 113 is operating, the storage control program 507 proceeds to S4210.
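The SWM status check and its retry behavior can be sketched in Python as follows. This is a minimal sketch, not the storage control program itself: the bit positions, the `read_status` callable, the one-second default interval, and the optional `max_checks` bound are assumptions made for illustration (the source describes retrying at a predetermined interval and mentions no upper bound).

```python
import time
from typing import Callable, Optional

# Hypothetical bit positions in SWM status register 207.
MOUNTED_BIT = 0x1  # SWM 111 is mounted in server chassis 100
POWERED_BIT = 0x2  # SWM power control IC 113 is operating


def swm_power_ic_available(status: int) -> bool:
    """Return True only if the SWM is mounted and its control IC is operating."""
    mounted = bool(status & MOUNTED_BIT)
    powered = bool(status & POWERED_BIT)
    return mounted and powered


def wait_for_swm(
    read_status: Callable[[], int],
    interval_s: float = 1.0,
    max_checks: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Repeat the status check until it passes; return the number of checks."""
    checks = 0
    while True:
        checks += 1
        if swm_power_ic_available(read_status()):
            return checks
        # max_checks is an added safety bound, not part of the source text.
        if max_checks is not None and checks >= max_checks:
            raise TimeoutError("SWM power control IC did not become available")
        sleep(interval_s)
```

A caller would pass a function that reads the status register and, once `wait_for_swm` returns, continue with the activation request of S4210.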
 In S4210, the storage control program 507 writes an activation request for starting the PCIe-SW 112 to the SWM control register 206. The SWM control unit 205 passes the activation request written to the SWM control register 206 to the SWM power control IC 113 via the sideband signal 403. In response to the activation request, the SWM power control IC 113 starts the PCIe-SW 112. The PCIe-SW 112 starts up on power from the main power supply 123.
 Then, in S4220, the storage control program 507 executes the initial setting of the PCIe-SW 112 via the lane 402. Then, in S4230, upon completing the initial setting of the PCIe-SW 112, the storage control program 507 notifies the storage SVP 221 that the initial setting is complete. Then, in S4240, in response to that notification, the storage SVP 221 transmits a completion notification indicating that startup of the storage controller 201 and the PCIe-SW 112 is complete to the management client 50. The management client 50 displays a screen based on the received completion notification.
 Then, in S4250, in response to the displayed completion notification, the administrator inputs a server unit power-on instruction for powering on the server unit 101 to the management client 50. In response to the input, the management client 50 transmits the server unit power-on instruction to the server SVP 121. The server SVP 121 powers on the server unit 101 in response to the server unit power-on instruction.
 Then, in S4260, the server unit 101 executes the initial setting of the I/F-LSI 105 using the I/F-LSI driver 501. Then, in S4270, upon completing the initial setting, the server unit 101 transmits a completion notification indicating that startup of the server unit 101 is complete to the management client 50 via the server SVP 121. The management client 50 displays a screen based on the received completion notification.
 According to the first power-on process described above, the storage PSU 222 starts after the server PSU 122 starts. As a result, after power has been supplied from the sub power supply 124 to the SWM power control IC 113, the storage controller 201 can start the PCIe-SW 112 via the SWM power control IC 113.
 When power is restored after a power failure, for example, the power-on order of the first power-on process is not guaranteed. The second power-on process, in which the storage chassis 200 is powered on before the server chassis 100, is therefore described next.
 FIG. 12 shows the first part of the second power-on process. The second part, which follows the first part of the second power-on process, is the same as the second part of the first power-on process.
 The actors in this sequence are the same as in the first power-on process.
 For example, when power is restored after a power failure and the storage PSU 222 starts first in S4110, the storage SVP 221 starts on power from the storage PSU 222. Then, in S4120, the storage SVP 221 transmits a completion notification indicating that its startup is complete to the management client 50. The management client 50 displays a screen based on the received completion notification.
 Then, in S4130, the storage SVP 221 powers on the storage controller 201. Then, in S4140, the storage control program 507 starts and begins the initial setting. Then, in S4150, the storage control program 507 executes the same SWM status check as in S3180 described above.
 If the storage control program 507 determines in S4150 and S4160 that the SWM 111 is connected to the storage controller 201 and that power from the sub power supply 124 is not being supplied to the SWM power control IC 113, it repeats the SWM status check at a predetermined time interval.
 Then, in S4170, when the server PSU 122 starts, the server PSU 122 supplies power to the server SVP 121 and the SWM power control IC 113. Then, in S4180, when the server SVP 121 completes its startup, it transmits a completion notification indicating that its startup is complete to the management client 50. The management client 50 displays a screen based on the received completion notification.
 If the storage control program 507 determines in S4190 that the SWM 111 is mounted in the server chassis 100 and the SWM power control IC 113 is operating, it transmits an instruction to power on the PCIe-SW 112 to the SWM power control IC 113 via the SWM control unit 205, as in S4210 described above. In response to the instruction, the SWM power control IC 113 powers on the PCIe-SW 112. The PCIe-SW 112 starts up on power from the main power supply 123. The subsequent processing is the same as S4220 onward described above.
 According to the second power-on process described above, even when the server PSU 122 starts after the storage PSU 222, for example when power is restored after a power failure, the storage controller 201 can monitor the SWM power control IC 113 and control the PCIe-SW 112 once the SWM power control IC 113 has started. In addition, even when the power supply of the computer system is divided between the server PSU 122 and the storage PSU 222 and the storage chassis 200 starts first, a configuration check error caused by failure to recognize the PCIe-SW can be prevented.
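The effect of the second power-on process can be illustrated with a small Python simulation in which the storage side starts first and the server PSU comes up some polling ticks later. This is a sketch under stated assumptions: the function name, the tick-based timing, the `max_ticks` bound, and the log strings are hypothetical and only trace the order of steps described above.

```python
from typing import List


def simulate_second_power_on(server_psu_on_tick: int, max_ticks: int = 10) -> List[str]:
    """Return a log of what the storage controller does on each polling tick.

    The storage PSU is assumed to be on from tick 0. The server PSU, and with
    it sub power supply 124, comes up at `server_psu_on_tick`.
    """
    log: List[str] = []
    for tick in range(max_ticks):
        control_ic_running = tick >= server_psu_on_tick
        if not control_ic_running:
            # S4150/S4160: the control IC is not powered yet, so the check is
            # repeated instead of raising a configuration check error.
            log.append(f"tick {tick}: SWM power control IC not running, retry")
            continue
        # S4190: the status check passes; request PCIe-SW start via sideband 403.
        log.append(f"tick {tick}: activation request sent")
        # S4220 onward: initial setting of PCIe-SW 112 over lane 402.
        log.append(f"tick {tick}: PCIe-SW initial setting done")
        return log
    # max_ticks is an added bound for the simulation only.
    log.append("gave up: server PSU never came up within max_ticks")
    return log
```

Calling `simulate_second_power_on(2)` yields two retry entries followed by the activation request and the initial setting, which corresponds to the storage controller waiting for the server PSU rather than failing its configuration check.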
 Note that the computer system of Embodiment 2 need not include the configuration for the failure processing of Embodiment 1.
 The terms used to express one aspect of the present invention are explained below. The storage device includes the drive 251 and the like. The interface device includes the I/F board 104, the I/F-LSI 105, or the like. The first PCI Express wiring includes the PCIe wiring between the storage CPU 202 and the SWM 111 and the like. The second PCI Express wiring includes the PCIe wiring between the SWM 111 and the I/F-LSI 105 and the like. The third PCI Express wiring includes the PCIe wiring between the I/F-LSI 105 and the server CPU 102 and the like. The change process is, for example, a process of transmitting a blocking request via a sideband signal or a process of placing a PCIe path in the link-down state. The server-side message includes an error message from a PCIe device to the server CPU 102 and the like. The storage-side message includes an error message from a PCIe device to the storage CPU 202 and the like. The first sideband signal line includes the sideband signal 322 and the like. The second sideband signal line includes the sideband signal 323 and the like. The third sideband signal line includes the sideband signals 404 and 405 and the like. The fourth sideband signal line includes the sideband signal 403 and the like.
 Embodiments of the present invention have been described above, but they are examples for explaining the present invention and are not intended to limit the scope of the present invention to the configurations described. The present invention can also be implemented in various other forms.
 DESCRIPTION OF SYMBOLS 50 ... management client, 100 ... server chassis, 101 ... server unit, 102 ... CPU, 103 ... memory, 104 ... I/F board, 105 ... I/F-LSI, 107 ... memory, 111 ... SWM, 112 ... PCIe-SW, 121 ... SVP, 122 ... PSU, 200 ... storage chassis, 201 ... storage controller, 202 ... CPU, 203 ... memory, 221 ... SVP, 222 ... PSU, 211 ... BE interface, 250 ... drive chassis, 251 ... drive

Claims (14)

  1.  A computer system comprising:
     a storage device;
     a storage controller connected to the storage device;
     a switch module connected to the storage controller via first PCI Express wiring;
     an interface device connected to the switch module via second PCI Express wiring; and
     a server processor connected to the interface device via third PCI Express wiring,
     wherein the storage controller includes a Root Complex of a first PCI Express tree,
     the server processor includes a Root Complex of a second PCI Express tree,
     the switch module belongs to the first PCI Express tree,
     the interface device includes a first Endpoint that is connected to the second PCI Express wiring and belongs to the first PCI Express tree, and a second Endpoint that is connected to the third PCI Express wiring and belongs to the second PCI Express tree,
     when the interface device detects a first failure of the second PCI Express wiring, the interface device executes a first change process that changes the state of the second PCI Express wiring, and transmits a first server-side message indicating the first failure to the server processor via the third PCI Express wiring,
     the storage controller detects the first failure based on the change in the state of the second PCI Express wiring, and
     the server processor detects the first failure based on the first server-side message.
  2.  The computer system according to claim 1, wherein
     when the switch module detects a second failure of the second PCI Express wiring, the switch module transmits a storage-side message indicating the second failure to the storage controller via the first PCI Express wiring,
     the storage controller detects the second failure based on the storage-side message, and executes, on the switch module, a second change process that changes the state of the second PCI Express wiring,
     the interface device detects the second failure based on the change in the state of the second PCI Express wiring, and transmits a second server-side message indicating the second failure to the server processor via the third PCI Express wiring, and
     the server processor detects the second failure based on the second server-side message.
  3.  The computer system according to claim 2, wherein
     the second PCI Express wiring includes a PCI Express lane and a first sideband signal line,
     in the first change process, the interface device transmits a first failure notification indicating the first failure to the switch module via the first sideband signal line, and
     the storage controller detects the first failure based on the first failure notification in the switch module.
  4.  The computer system according to claim 3, wherein
     when the interface device detects the first failure of the second PCI Express wiring, the interface device changes the state of the PCI Express lane of the second PCI Express wiring to a link-down state, and
     when the storage controller detects the second failure of the second PCI Express wiring, the storage controller causes the switch module to change the state of the PCI Express lane of the second PCI Express wiring to the link-down state.
  5.  The computer system according to claim 4, wherein
     when the interface device detects a third failure of PCI Express wiring, the interface device stores third failure information indicating the third failure,
     when the interface device detects a failure of the second PCI Express wiring, the interface device acquires the third failure information and, based on the third failure information, changes the state of the PCI Express lane of the PCI Express wiring having the third failure to the link-down state,
     when a PCI Express device in the first PCI Express tree detects a fourth failure of PCI Express wiring, the PCI Express device stores fourth failure information indicating the fourth failure, and
     when the storage controller detects a failure of the second PCI Express wiring, the storage controller acquires the fourth failure information and, based on the fourth failure information, changes the state of the PCI Express lane of the PCI Express wiring having the fourth failure to the link-down state.
  6.  The computer system according to claim 5, wherein
     the second PCI Express wiring further includes a second sideband signal line,
     in the second change process, the switch module transmits a second failure notification indicating the second failure to the interface device via the second sideband signal line, and
     the interface device detects the second failure based on the second failure notification in the interface device.
  7.  The computer system according to claim 5, wherein
     in the second change process, the switch module changes the state of the second PCI Express wiring to the link-down state, and
     the interface device detects the second failure when it detects the link-down state of the second PCI Express wiring.
  8.  The computer system according to claim 1, further comprising:
     a storage management processor connected to the storage controller; and
     a server management processor connected to the server processor,
     wherein, when the storage controller detects a failure, the storage management processor receives information indicating the failure detected by the storage controller and causes a display device to display the information, and
     when the server processor detects a failure, the server-side processor receives information indicating the failure detected by the server processor and causes a display device to display the information.
  9.  The computer system according to claim 1, wherein
     the interface device is connected to the plurality of storage controllers via a plurality of paths,
     each of the plurality of paths includes a PCIe path in the first PCI Express tree and a PCIe path in the second PCI Express tree,
     the server processor accesses the storage controller via a specific path among the plurality of paths, and
     when the server processor detects a failure on the specific path, the server processor switches from the specific path to another path among the plurality of paths and accesses the storage controller via the path switched to.
  10.  The computer system according to claim 9, comprising:
     a plurality of storage controllers including the storage controller;
     a plurality of switch modules including the switch module;
     one or more interface devices including the interface device; and
     one or more server processors including the server processor,
     wherein the plurality of storage controllers are respectively connected to the plurality of switch modules,
     each of the plurality of switch modules is connected to the one or more interface devices,
     the one or more interface devices are respectively connected to the one or more server processors, and
     each of the one or more interface devices is connected to two or more switch modules among the plurality of switch modules.
  11.  The computer system according to claim 1, further comprising:
     a storage power supply circuit that supplies power to the storage controller; and
     a server power supply circuit that supplies power to the switch module, the interface device, and the server processor.
  12.  The computer system according to claim 11, wherein
     the switch module includes:
      a PCI Express switch connected to the storage controller and the interface device; and
      a power control circuit that controls power supply to the PCI Express switch,
     when the power control circuit starts on power from the server power supply circuit, the power control circuit transmits status information indicating that the power control circuit is available to the storage controller via the first PCI Express wiring,
     in response to the status information, the storage controller transmits an activation instruction instructing startup of the PCI Express switch to the power control circuit via the first PCI Express wiring, and
     the power control circuit starts the PCI Express switch in response to the activation instruction.
  13.  The computer system according to claim 12, wherein
     the first PCI Express wiring includes a PCI Express lane connecting the storage controller and the PCI Express switch, a third sideband signal line connecting the storage controller and the power control circuit, and a fourth sideband signal line connecting the storage controller and the power control circuit,
     the power control circuit transmits the operation information to the storage controller via the third sideband signal line, and
     the storage controller transmits the activation instruction to the power control circuit via the fourth sideband signal line.
  14.  A control method for a computer system,
     the computer system including a storage device, a storage controller connected to the storage device, a switch module connected to the storage controller via a first PCI Express wiring, an interface device connected to the switch module via a second PCI Express wiring, and a server processor connected to the interface device via a third PCI Express wiring, the control method comprising:
     when the interface device detects a first failure of the second PCI Express wiring, using the interface device to execute a first change process that changes the state of the second PCI Express wiring and to transmit a first server-side message indicating the first failure to the server processor via the third PCI Express wiring;
     using the storage controller to detect the first failure based on the change in the state of the second PCI Express wiring; and
     using the server processor to detect the first failure based on the first server-side message,
     wherein the storage controller includes a Root Complex of a first PCI Express tree,
     the server processor includes a Root Complex of a second PCI Express tree,
     the switch module belongs to the first PCI Express tree, and
     the interface device includes a first Endpoint connected to the second PCI Express wiring and belonging to the first PCI Express tree, and a second Endpoint connected to the third PCI Express wiring and belonging to the second PCI Express tree.
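
The failure-notification flow of claim 14 can be sketched as follows. This is a rough model under stated assumptions, not the claimed implementation: the names `Wiring`, `LinkState`, `InterfaceDevice`, `StorageController`, `ServerProcessor`, and `run_failure_scenario` are hypothetical, and the "first change process" is modelled as forcing the second PCI Express wiring into a link-down state, which the claim does not specify.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class LinkState(Enum):
    UP = "up"
    DOWN = "down"


@dataclass
class Wiring:
    """A PCI Express wiring with an observable state and a message queue (assumed model)."""
    name: str
    state: LinkState = LinkState.UP
    messages: List[str] = field(default_factory=list)


class InterfaceDevice:
    """Sits between the two trees: first Endpoint on the second wiring,
    second Endpoint on the third wiring."""

    def __init__(self, second_wiring: Wiring, third_wiring: Wiring) -> None:
        self.second_wiring = second_wiring
        self.third_wiring = third_wiring

    def on_failure_detected(self, failure: str) -> None:
        # First change process: change the state of the second wiring
        # (modelled as link-down; this choice is an assumption).
        self.second_wiring.state = LinkState.DOWN
        # First server-side message, sent over the third wiring.
        self.third_wiring.messages.append(failure)


class StorageController:
    """Root Complex of the first tree; sees only the state of the second wiring."""

    def __init__(self, second_wiring: Wiring) -> None:
        self.second_wiring = second_wiring
        self.failure_detected = False

    def poll(self) -> None:
        if self.second_wiring.state is not LinkState.UP:
            self.failure_detected = True


class ServerProcessor:
    """Root Complex of the second tree; learns of the failure from the message."""

    def __init__(self, third_wiring: Wiring) -> None:
        self.third_wiring = third_wiring
        self.detected_failures: List[str] = []

    def poll(self) -> None:
        while self.third_wiring.messages:
            self.detected_failures.append(self.third_wiring.messages.pop(0))


def run_failure_scenario() -> tuple:
    second = Wiring("second PCI Express wiring")
    third = Wiring("third PCI Express wiring")
    device = InterfaceDevice(second, third)
    storage = StorageController(second)
    server = ServerProcessor(third)

    device.on_failure_detected("first failure")
    storage.poll()
    server.poll()
    return storage.failure_detected, server.detected_failures
```

The point of the sketch is that a single detection at the interface device reaches both Root Complexes through different channels: the storage controller through the wiring state in its own tree, and the server processor through an explicit message in the other tree.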
PCT/JP2015/067434 2015-06-17 2015-06-17 Computer system and control method WO2016203565A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2015/067434 WO2016203565A1 (en) 2015-06-17 2015-06-17 Computer system and control method

Publications (1)

Publication Number Publication Date
WO2016203565A1 (en) 2016-12-22

Family

ID=57545571

Country Status (1)

Country Link
WO (1) WO2016203565A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2012103975A (en) * 2010-11-11 2012-05-31 Nec Computertechno Ltd Data transfer device, data transfer method, and computer system
JP2015064648A (en) * 2013-09-24 2015-04-09 株式会社日立製作所 Computer system, computer system control method, and connection module

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114731300A (en) * 2019-10-08 2022-07-08 日立安斯泰莫株式会社 Communication system, electronic control device, and communication method
CN114731300B (en) * 2019-10-08 2024-04-09 日立安斯泰莫株式会社 Communication system, electronic control device, and communication method

Similar Documents

Publication Publication Date Title
US8948000B2 (en) Switch fabric management
JP4477365B2 (en) Storage device having a plurality of interfaces and control method of the storage device
US8745438B2 (en) Reducing impact of a switch failure in a switch fabric via switch cards
US8171174B2 (en) Out-of-band characterization of server utilization via remote access card virtual media for auto-enterprise scaling
US6446141B1 (en) Storage server system including ranking of data source
US20100180161A1 (en) Forced management module failover by bmc impeachment concensus
US8677175B2 (en) Reducing impact of repair actions following a switch failure in a switch fabric
US9501372B2 (en) Cluster system including closing a bus using an uncorrectable fault upon a fault detection in an active server
US20200133759A1 (en) System and method for managing, resetting and diagnosing failures of a device management bus
JP2006107080A (en) Storage device system
JP2013073289A (en) Multiplex system, data communication card, state abnormality detection method and program
JP2015114873A (en) Information processor and monitoring method
US20080256370A1 (en) Intrusion Protection For A Client Blade
US10852792B2 (en) System and method for recovery of sideband interfaces for controllers
TW202026938A (en) System and method to recover fpga firmware over a sideband interface
US9780960B2 (en) Event notifications in a shared infrastructure environment
WO2016203565A1 (en) Computer system and control method
JP2007018049A (en) Storage control system
US9304842B2 (en) Computer system, control method for computer system and coupling module
US10664429B2 (en) Systems and methods for managing serial attached small computer system interface (SAS) traffic with storage monitoring
JPWO2017119116A1 (en) Integrated platform, server, and failover method
CN104461951A (en) Physical and virtual multipath I/O dynamic management method and system
JP4779948B2 (en) Server system
US10409940B1 (en) System and method to proxy networking statistics for FPGA cards
TW202318193A (en) Remote control system for workload consolidation and controlling method thereof

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 15895586; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 15895586; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: JP)