WO2017094139A1 - Computer system and method for sharing device information - Google Patents

Computer system and method for sharing device information

Info

Publication number
WO2017094139A1
Authority
WO
WIPO (PCT)
Prior art keywords
server
shared memory
storage
data transmission
information
Prior art date
Application number
PCT/JP2015/083865
Other languages
English (en)
Japanese (ja)
Inventor
安啓 柴田
翔 樽井
Original Assignee
株式会社日立製作所
Application filed by 株式会社日立製作所
Priority to PCT/JP2015/083865
Publication of WO2017094139A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/10 Program control for peripheral devices

Definitions

  • The present invention relates to a computer system and a device information sharing method, and is particularly suitable for application to a computer system in which a plurality of server devices and a storage device are connected by data transmission relay devices that can be mounted on the server devices, and to a device information sharing method for such a computer system.
  • Non-Patent Document 1 discloses a basic specification of PCI Express.
  • FC: Fibre Channel
  • The PCIe-SAN switch is placed under the management of the storage device side, but the management firmware (F / W) on the storage device side is not provided with an I / F for identifying the server management device and each PCIe-SAN switch. Consequently, when performing maintenance work to replace a PCIe-SAN switch, the management F / W on the storage device side could not determine in which of the plurality of server devices the PCIe-SAN switch to be replaced was installed.
  • The present invention has been made in consideration of the above points, and aims to make it possible, in a computer system configured to include a plurality of server devices and a storage device, to specify the mounting position of a data transmission relay device mounted on a server device.
  • The present invention provides a computer system including a plurality of server devices each having one or more server blades, a storage device having a disk array, and a plurality of data transmission relay devices that are mounted on the server devices in correspondence with the server blades and connect the server devices to the storage device. Each data transmission relay device has a shared memory that can be read and written by the server device and the storage device to which it is connected. By writing to and reading from the shared memory, the server device and the storage device share information that can identify each device of the computer system and information indicating where in the server device the data transmission relay device is mounted.
  • The present invention also provides a device information sharing method in a computer system comprising a plurality of server devices each having one or more server blades, a storage device having a disk array, and a plurality of data transmission relay devices that are mounted on the server devices in correspondence with the server blades and connect the server devices to the storage device. Each data transmission relay device has a shared memory that can be read and written by the server device and the storage device to which it is connected. In this method, the server device and the storage device write to and read from the shared memory and thereby share information that can identify each device of the computer system and information indicating where in the server device the data transmission relay device is mounted.
  • According to the present invention, in a computer system configured with a plurality of server devices and a storage device, the mounting position of a specific data transmission relay device mounted on a server device can be specified, which assists maintenance work.
  • FIG. 1 is a block diagram showing a hardware configuration of a computer system according to an embodiment of the present invention.
  • In FIG. 1, reference numeral 1 denotes a computer system according to an embodiment of the present invention as a whole.
  • a maintenance computer 3 can be connected to the computer system 1 via a network switch 2.
  • the maintenance computer 3 is a computer used by a worker who performs maintenance work, and may be a general computer such as a PC.
  • A maintenance console is preinstalled on the maintenance computer 3 as software that can be used for maintenance work in the computer system 1. By starting the maintenance console on the maintenance computer 3 connected to the computer system 1 via the network switch 2, the worker can proceed with the maintenance work.
  • The inside of the computer system 1 is connected by a bus (for example, Inter-Integrated Circuit (I2C) (registered trademark)), and the outside of the computer system 1 (that is, the maintenance computer 3 side) is connected by a network.
  • the computer system 1 includes a server device 10, a storage device 30, and a PCIe-SAN switch 20 that connects the server device 10 and the storage device 30.
  • The computer system 1 can be configured to include a plurality of server devices 10a, 10b,.... In such a case, the server devices 10a, 10b,... are each connected to the storage apparatus 30, and a plurality of PCIe-SAN switches 20a, 20b, 20c, 20d,... are used.
  • the server device 10a includes a management module 11a that performs overall management in the server device 10a, and a server blade 14 (individually, server blades 14a and 14b) in which the server functions of the server device 10a are integrated on a substrate.
  • the management module 11a includes a CPU (Central Processing Unit) 12a and a memory 13a.
  • a plurality of server blades 14 can be installed for one server device 10.
  • a PCIe-SAN switch 20 (20a, 20b) is connected to each server blade 14 (14a, 14b).
  • the configuration of the PCIe-SAN switch 20 will be described.
  • the PCIe-SAN switch 20 is a data transmission relay device that relays data transmission according to the I / O interface standard of PCI Express (PCIe).
  • the PCIe-SAN switch 20 connects between the server apparatus 10 and the storage apparatus 30 to form a SAN (Storage Area Network) capable of high-speed transmission.
  • the PCIe-SAN switch 20a includes a CPU 21a, a memory 22a, and an LED (Light Emitting Diode) 23a.
  • the PCIe-SAN switch 20 is physically mounted on the server device 10 side, but is controlled by the storage management F / W 303 (see FIG. 2) on the storage device 30 side.
  • the configuration of the storage device 30 will be described.
  • The storage device 30 comprises a disk controller 31 that performs overall control of the storage device 30, and disk arrays 34 (individually, disk arrays 34a, 34b, 34c,...) that are connected to the disk controller 31 and store data.
  • the number of disk arrays 34 is not particularly limited.
  • FIG. 2 is a block diagram for explaining a functional configuration in the computer system 1 shown in FIG. 1. Each functional configuration in the present embodiment will be described in detail with reference to FIG. 2.
  • the server apparatus 10 includes a management module 11 and a plurality of server blades 14, and the management module 11 includes a microcomputer 101 and a FRU storage area 102.
  • a server management F / W 103 is executed as a firmware program for performing control for managing the entire server device 10.
  • the server management F / W 103 has a function of monitoring and controlling the power state, voltage, temperature, and the like of the server blade 14.
  • the FRU storage area 102 is a storage area provided by a field-replaceable unit, and the other FRU storage areas 202 and 302 described later are similar storage areas.
  • In the FRU storage area 102, a manufacturing number for identifying the individual server device 10 is written (for example, “aaa” in the case of the server apparatus 10a).
  • the server management F / W 103 always has access rights to the FRU storage area 102. Further, the server management F / W 103 periodically checks the status of the connection destination capable of communicating with the microcomputer 101 by performing polling.
  • A generally used method may be employed as the polling method. For example, the polling target device or program (including a flag or the like) may be queried periodically, three times in three seconds, and a determination made on the inquiry result when the same result is obtained all three times.
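The "same result three times" polling rule described above can be sketched as follows. This is a minimal illustration only: the function name `poll_stable`, its parameters, and the choice to return `None` when the readings disagree are assumptions, not part of the source.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def poll_stable(
    read: Callable[[], T],
    samples: int = 3,
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[T]:
    """Query `read` `samples` times and return the value only if all readings agree.

    Returns None when the readings differ, so the caller can retry on the
    next polling cycle (hypothetical behaviour; the source does not say).
    """
    results = []
    for i in range(samples):
        results.append(read())
        if i < samples - 1:
            sleep(interval_s)  # wait between inquiries, not after the last one
    first = results[0]
    return first if all(r == first for r in results) else None
```

With the defaults, three inquiries are spread over roughly three seconds; the `sleep` parameter exists only so the wait can be replaced in tests.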
  • the PCIe-SAN switch 20 includes a shared memory storage area 201, a FRU storage area 202, and an LED 23.
  • The shared memory storage area 201 is a storage area in the shared memory (for example, corresponding to a part of the memory 22a shown in FIG. 1) provided in the PCIe-SAN switch 20. It is connected to the server management F / W 103 and the storage management F / W 303 by various data paths (server read data path 211, server write data path 212, storage read data path 213, storage write data path 214) so that both can use it for input and output.
  • The shared memory storage area 201 holds a server write flag 215 for controlling permission or non-permission of data writing from the server management F / W 103 to the shared memory storage area 201 via the server write data path 212, and a storage write flag 216 for controlling permission or non-permission of data writing from the storage management F / W 303 to the shared memory storage area 201 via the storage write data path 214.
  • Exclusive control of writing to the shared memory storage area 201 is realized by controlling the flag values (0: invalid, 1: valid) of the server write flag 215 and the storage write flag 216.
  • reading from the server management F / W 103 and the storage management F / W 303 to the shared memory storage area 201 is always permitted.
  • The shared memory storage area 201 also holds a server device manufacturing number write completion flag 217 indicating that the manufacturing number of the server device 10 has been written, and a storage device manufacturing number write completion flag 218 indicating that the manufacturing number of the storage device 30 has been written. Specific processing for the various flags in the shared memory storage area 201 will be described later along the processing flow.
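The flags held in the shared memory storage area 201 can be pictured with a small data structure. The sketch below is illustrative only: the class name `SharedMemoryArea`, the field names, the `"server"`/`"storage"` writer labels, and the use of `PermissionError` are all assumptions; the source specifies only the four flags, their 0/1 values, and that reads are always permitted.

```python
from dataclasses import dataclass, field


@dataclass
class SharedMemoryArea:
    """Hypothetical model of shared memory storage area 201."""
    server_write_flag: int = 0       # flag 215: 1 = server management F/W may write
    storage_write_flag: int = 0      # flag 216: 1 = storage management F/W may write
    server_serial_written: int = 0   # flag 217: 1 = server serial number written
    storage_serial_written: int = 0  # flag 218: 1 = storage serial number written
    data: dict = field(default_factory=dict)

    def read(self, key: str):
        """Reading is always permitted for both firmware programs."""
        return self.data.get(key)

    def write(self, writer: str, key: str, value) -> None:
        """Write only if the write flag for `writer` is valid (exclusive control)."""
        flags = {"server": self.server_write_flag, "storage": self.storage_write_flag}
        if flags.get(writer) != 1:
            raise PermissionError(f"{writer} is not permitted to write")
        self.data[key] = value
```

Because only one of the two write flags is set to 1 at a time in the procedures described later, at most one side can write at any moment.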
  • In the FRU storage area 202, a serial number for identifying the individual PCIe-SAN switch 20 (for example, “dddd” in the case of the PCIe-SAN switch 20a) is written. It is assumed that the server management F / W 103 always has access rights to the FRU storage area 202.
  • The LED 23 is a light emitting body that is disposed at a position where its lighting state can be visually recognized by an operator or a user, and its lighting state is controlled by the storage management F / W 303. Specifically, for example, when a failure that requires replacement of the switch occurs in the PCIe-SAN switch 20 on which the LED 23 is mounted, the LED 23 is controlled to enter a predetermined lighting state based on the selection of that switch in the maintenance console. The operator or the user can therefore confirm which individual PCIe-SAN switch 20 needs to be replaced by visually checking which LED 23 is lit.
  • the storage device 30 includes a disk controller 31 and a plurality of disk arrays 34.
  • the disk controller 31 includes a microcomputer 301 and a FRU storage area 302.
  • a storage management F / W 303 is executed as a firmware program for performing control for managing the entire storage apparatus 30.
  • The storage management F / W 303 not only controls the disk controller 31 to manage the writing and reading of data to and from the disk array 34, but can also control processing related to initialization, power supply, and failures of the PCIe-SAN switches 20 connected to the storage device 30. For example, when the replacement process of a PCIe-SAN switch 20 is performed, the storage management F / W 303 performs control to switch the connection bus to the plurality of PCIe-SAN switches 20 (20a, 20b, 20c, 20d) connected to the storage apparatus 30, based on the operation at the maintenance console started on the maintenance computer 3 connected via the network switch 2.
  • the storage management F / W 303 periodically checks the status of the connection destination that can communicate with the microcomputer 301 by performing polling.
  • In the FRU storage area 302, a manufacturing number (for example, “cccc”) for identifying the individual storage device 30 is written. It is assumed that the storage management F / W 303 has access rights to the FRU storage area 302.
  • A unique serial number is assigned to each of the plurality of server apparatuses 10 (10a, 10b,...), the plurality of PCIe-SAN switches 20 (20a, 20b, 20c, 20d,...), and the storage apparatus 30.
  • The shared memory storage area 201 (which may be read simply as the shared memory) is accessible from both the server management F / W 103 and the storage management F / W 303.
  • In the shared memory storage area 201, the serial numbers of these devices and information indicating the mounting position of the PCIe-SAN switch 20 (the PCIe-SAN switch ID, which will be described later) are written.
  • By reading and sharing this group of information held in the shared memory storage area 201, the server management F / W 103 and the storage management F / W 303 can identify each device and specify the mounting position of the PCIe-SAN switch 20.
  • the writing of the serial number of each device in such a shared memory will be described in detail.
  • FIG. 3 is a diagram for explaining a serial number in the computer system according to this embodiment.
  • FIG. 3A shows an example of the serial numbers assigned to the server devices 10, where the server device “# 1” corresponds to the server device 10a and the server device “# 2” corresponds to the server device 10b (see also FIG. 1).
  • FIG. 3B shows an example of a serial number assigned to the PCIe-SAN switch 20.
  • the PCIe-SAN switch “# 1” corresponds to the PCIe-SAN switch 20a
  • the PCIe-SAN switch “# 2” corresponds to the PCIe-SAN switch 20b
  • the PCIe-SAN switch “# 3” corresponds to the PCIe-SAN switch 20c.
  • the PCIe-SAN switch “# 4” corresponds to the PCIe-SAN switch 20d.
  • FIG. 3C shows an example of the serial number assigned to the storage apparatus 30.
  • The storage apparatus “# 1” corresponds to the storage apparatus 30 shown in FIG. 1.
  • In this way, serial numbers are assigned to the devices constituting the computer system 1. In the present embodiment, it suffices that the serial numbers can distinguish individual devices at least within a group of the same kind (among server devices, or among PCIe-SAN switches).
  • FIG. 4 is a diagram for explaining flags and the like used in the manufacturing number writing process.
  • the flags and IDs shown in FIGS. 4A to 4D are used as an example.
  • FIGS. 5 and 6 are time charts showing the procedure of the server device manufacturing number writing process. The server device serial number writing process will be described in order with reference to FIGS. 5 and 6.
  • The server device serial number writing process is started when the power of the server device 10 is turned on (AC ON) or when the PCIe-SAN switch 20 is connected (mounted) to the server device 10 (step S101).
  • When a predetermined start trigger is established in step S101, an ID (PCIe-SAN switch ID) corresponding to the mounting position of the PCIe-SAN switch 20 is set in the shared memory storage area 201 of the PCIe-SAN switch 20 by the hardware strap pin (H / W strap pin) (step S102).
  • the setting by the H / W strap pin is a process that is electrically performed upon energization, and the flag setting process in steps S103 and S104 following step S102 is also performed by the H / W strap pin.
  • In step S102, the mounting position of the PCIe-SAN switch 20 is determined by the slot of the server device 10 in which the PCIe-SAN switch 20 is connected (mounted).
  • An example of the correspondence between the slot number of the server device 10 and the PCIe-SAN switch ID is shown in FIG. 4. Specifically, for the PCIe-SAN switch 20 installed in the “# 1” slot, a predetermined “setting value 1” is set as the PCIe-SAN switch ID, and for the PCIe-SAN switch 20 installed in the “# 2” slot, another predetermined “setting value 2” is set as the PCIe-SAN switch ID. Note that it suffices to prepare in advance a different PCIe-SAN switch ID for each slot within one server device 10.
  • The same PCIe-SAN switch ID may therefore be prepared in both the server device 10a and the server device 10b.
  • In this example, “setting value 1” is set for the “# 1” slot and “setting value 2” is set for the “# 2” slot in each of the plurality of server apparatuses 10a and 10b.
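The slot-to-ID correspondence above can be sketched as a simple lookup. The names `SLOT_TO_SWITCH_ID` and `mounting_key` are hypothetical, and the serial number "bbb" used for a second server device is an illustrative placeholder that does not appear in the source. The point of the sketch is that the PCIe-SAN switch ID alone repeats across server devices, while the pair of server serial number and switch ID is unique.

```python
from typing import Tuple

# Per-slot IDs set by the H/W strap pins; the same table is reused in every server device.
SLOT_TO_SWITCH_ID = {1: "setting value 1", 2: "setting value 2"}


def mounting_key(server_serial: str, slot: int) -> Tuple[str, str]:
    """Combine the server serial number with the slot's switch ID.

    The switch ID is only unique within one server device, so the pair is
    what identifies a mounting position across the whole computer system.
    """
    return (server_serial, SLOT_TO_SWITCH_ID[slot])


key_a = mounting_key("aaa", 1)  # slot "#1" of server device 10a
key_b = mounting_key("bbb", 1)  # slot "#1" of another server device (placeholder serial)
```

Here `key_a` and `key_b` share the same switch ID but remain distinct keys because the server serial numbers differ.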
  • In step S103, flag values are set for the server write flag 215 and the storage write flag 216 held in the shared memory storage area 201.
  • The server write flag 215 and the storage write flag 216 are flags for realizing exclusive control of writing to the shared memory storage area 201 (which may be read as the shared memory).
  • The relationship between the combination of these flag values and the access right to the shared memory storage area 201 realized by that combination is as illustrated in FIG. 4.
  • In step S103, the server write flag 215 is set to “1 (valid)” and the storage write flag 216 is set to “0 (invalid)”.
  • As a result, the access right (write permission) to the shared memory storage area 201 is given to the server management F / W 103, and writing from the storage management F / W 303 is prohibited.
  • In step S104, flag values are set for the server device manufacturing number write completion flag 217 and the storage device manufacturing number write completion flag 218 held in the shared memory storage area 201. Specifically, “0” is set in both the server device manufacturing number write completion flag 217 and the storage device manufacturing number write completion flag 218. According to FIG. 4C, when the server device serial number write completion flag 217 is “0”, the server device serial number has not been written, and when it is “1”, writing of the server device serial number has been completed. The storage device manufacturing number write completion flag 218 is interpreted in the same way.
  • After step S104, the server management F / W 103, having detected by polling that the server write flag 215 is “1 (valid)”, acquires the serial number of the server device 10 (“aaa” in the case of the server device 10a) from the FRU storage area 102 (step S105).
  • Next, the server management F / W 103 acquires, from the FRU storage area 202 of each PCIe-SAN switch 20 connected (mounted) to its own server device 10 (for example, the PCIe-SAN switches 20a and 20b in the case of the server device 10a), the serial number of that PCIe-SAN switch 20 (“dddd” for the PCIe-SAN switch 20a, “eeee” for the PCIe-SAN switch 20b) (step S106).
  • The server management F / W 103 then acquires, from the shared memory storage area 201 of each PCIe-SAN switch 20 connected (mounted) to its own server device 10 (individually, for example, the PCIe-SAN switches 20a and 20b), the PCIe-SAN switch ID of that PCIe-SAN switch 20 (“setting value 1” for the PCIe-SAN switch 20a, “setting value 2” for the PCIe-SAN switch 20b) (step S107).
  • The serial numbers and ID acquired by the server management F / W 103 in steps S105 to S107 are stored in a memory (for example, the memory 13a) in the microcomputer 101.
  • the storage of such acquired data is the same in the case of other processing by the server management F / W 103 and the storage management F / W 303 described later.
  • the data acquired by the storage management F / W 303 in steps S201 to S204 described later is stored in a memory (for example, the memory 33) in the microcomputer 301.
  • In step S108, the server management F / W 103 writes the serial number of the server device 10 acquired in step S105 in the shared memory storage area 201. Then, in order to guarantee the value of the manufacturing number of the server device 10 written in step S108, the server management F / W 103 writes a checksum value based on that manufacturing number in the shared memory storage area 201 (step S109).
  • In step S110, the server management F / W 103 writes the PCIe-SAN switch ID acquired in step S106 into the shared memory storage area 201. Then, in order to guarantee the value of the PCIe-SAN switch ID written in step S110, the server management F / W 103 writes a checksum value based on the PCIe-SAN switch ID in the shared memory storage area 201 (step S111).
  • When the processing up to step S111 is completed, writing of the serial numbers and ID that can be acquired by the server management F / W 103 to the shared memory storage area 201 is complete, so the server management F / W 103 sets the server device serial number write completion flag 217 held in the shared memory storage area 201 to “1 (write complete)” (step S112).
  • Next, the server management F / W 103 updates the flag values of the server write flag 215 and the storage write flag 216 held in the shared memory storage area 201. Specifically, the server write flag 215 is set to “0 (invalid)” and the storage write flag 216 is set to “1 (valid)” (step S113).
  • the server device serial number writing process ends with the process of step S113.
  • When the process of step S112 is performed, the server device serial number write completion flag 217, which was set to “0 (write incomplete)” in step S104, is rewritten to “1 (write complete)”. The storage management F / W 303 therefore detects by polling that, for the PCIe-SAN switch 20, writing of the serial number from the server device 10 on which the switch is mounted has been completed. Then, through the processing of step S113, the write access right to the shared memory storage area 201 passes from the server management F / W 103 to the storage management F / W 303 while exclusive control is maintained. As a result, the storage management F / W 303 can recognize by polling that writing to the shared memory storage area 201 is permitted.
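The server-side write sequence (steps S108 to S113) can be sketched as follows, modelling the shared memory storage area 201 as a plain dictionary. Several points are assumptions rather than facts from the source: the dictionary keys, the function names, and the checksum algorithm (a byte sum modulo 256), since the source does not name one. The sketch also writes the value obtained in step S106 (the switch's serial number) in step S110; the source text calls the value written in S110 the PCIe-SAN switch ID, so this reading should be treated as an interpretation.

```python
def checksum(value: str) -> int:
    """Assumed checksum: byte sum modulo 256 (the source does not name an algorithm)."""
    return sum(value.encode("utf-8")) % 256


def server_write_sequence(shm: dict, server_serial: str, switch_serial: str) -> None:
    """Steps S108 to S113, run by the server management F/W while it holds write access."""
    if shm["server_write_flag"] != 1:
        raise PermissionError("server write flag 215 is not valid")
    shm["server_serial"] = server_serial                     # S108
    shm["server_serial_checksum"] = checksum(server_serial)  # S109
    shm["switch_serial"] = switch_serial                     # S110 (value from S106)
    shm["switch_serial_checksum"] = checksum(switch_serial)  # S111
    shm["server_serial_written"] = 1                         # S112: flag 217
    shm["server_write_flag"] = 0                             # S113: hand write access
    shm["storage_write_flag"] = 1                            #       to the storage F/W


# State after the strap-pin settings of steps S102 to S104.
shm = {
    "switch_id": "setting value 1",
    "server_write_flag": 1,
    "storage_write_flag": 0,
    "server_serial_written": 0,
    "storage_serial_written": 0,
}
server_write_sequence(shm, server_serial="aaa", switch_serial="dddd")
```

Setting flag 217 before flipping the write flags mirrors the order of steps S112 and S113, so a poller that sees the storage write flag at 1 can rely on the server-side data being complete.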
  • The above describes the process for one server device (for example, the server device 10a). When the computer system 1 includes another server device 10 (for example, the server device 10b), the same process is performed for that server device. As a result, in the shared memory of every PCIe-SAN switch 20 (for example, the PCIe-SAN switches 20a, 20b, 20c, 20d), the manufacturing numbers of that PCIe-SAN switch 20 and of the server apparatus 10 on which it is mounted are written in association with each other.
  • FIGS. 7 and 8 are time charts showing the procedure of the storage device manufacturing number writing process.
  • the storage device serial number writing process will be described in order with reference to FIGS.
  • In the storage device manufacturing number writing process, not only is the manufacturing number of the storage device 30 written in the shared memory storage area 201, but the manufacturing number of the storage device 30 written in the shared memory storage area 201 is also read by the server management F / W 103.
  • The storage device manufacturing number writing process is started when the storage management F / W 303 detects that the writing of the server device manufacturing number to the shared memory storage area 201 has been completed in the server device manufacturing number writing process (step S201).
  • The start trigger of step S201 may be, for example, the storage management F / W 303 detecting by polling that the server device manufacturing number write completion flag 217 was set to “1 (write complete)” in step S112, or the storage management F / W 303 detecting by polling that the storage write flag 216 was set to “1 (valid)” in step S113 of FIG. 6.
  • When the predetermined start trigger in step S201 is established, the storage management F / W 303 accesses the shared memory storage area 201 and acquires the serial number of the server device 10 written in the shared memory storage area 201 (step S202). Further, similarly to step S202, the storage management F / W 303 acquires the PCIe-SAN switch ID and serial number of the PCIe-SAN switch 20 from the shared memory storage area 201 (step S203, step S204).
  • By performing the processing of steps S202 to S204, the storage management F / W 303 can acquire the server-side (server device 10 and PCIe-SAN switch 20) serial numbers and ID written in the shared memory storage area 201 by the server device serial number writing process. It can therefore specify the individual server device 10 and the slot of the server device 10 in which the PCIe-SAN switch 20 is mounted.
  • The storage management F / W 303 holds the correspondence between the server apparatus 10 and the PCIe-SAN switch 20 in the memory.
  • the storage management F / W 303 acquires the serial number of the storage device 30 (“cccc” according to FIG. 3C) from the FRU storage area 302 (step S205).
  • In step S206, the storage management F / W 303 writes the serial number of the storage device 30 acquired in step S205 into the shared memory storage area 201. Then, in order to guarantee the value of the manufacturing number of the storage device 30 written in step S206, the storage management F / W 303 writes a checksum value based on that manufacturing number in the shared memory storage area 201 (step S207).
  • When the processing up to step S207 is completed, writing of the manufacturing number by the storage management F / W 303 is complete, so the storage management F / W 303 sets the storage device manufacturing number write completion flag 218 held in the shared memory storage area 201 to “1 (write complete)” (step S208).
  • Next, the storage management F / W 303 updates the flag values of the server write flag 215 and the storage write flag 216 held in the shared memory storage area 201. Specifically, the server write flag 215 is set to “1 (valid)” and the storage write flag 216 is set to “0 (invalid)” (step S209).
  • When the processing of step S208 is performed, the storage device manufacturing number write completion flag 218, which was set to “0 (write incomplete)” in step S104 of FIG. 5, is rewritten to “1 (write complete)”. The server management F / W 103 therefore detects by polling that, for the PCIe-SAN switch 20, writing of the serial number from the storage device 30 has been completed. Through the processing in step S209, the write access right to the shared memory storage area 201 passes from the storage management F / W 303 to the server management F / W 103 while exclusive control is maintained. As a result, the server management F / W 103 can recognize by polling that writing to the shared memory storage area 201 is permitted.
  • When the server management F / W 103 detects by polling that the storage device manufacturing number has been written to the shared memory storage area 201 (step S210), the server management F / W 103 accesses the shared memory storage area 201 and acquires the manufacturing number of the storage device 30 (step S211). The storage device serial number writing process ends with the process of step S211.
  • The completion of the writing of the storage device manufacturing number in step S210 may be detected, for example, by the server management F / W 103 detecting by polling that the storage device manufacturing number write completion flag 218 was set to “1 (writing completed)” in step S208, or by the server management F / W 103 detecting by polling that the server write flag 215 was set to “1 (valid)” in step S209.
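The storage-side sequence (steps S201 to S209) and the server's subsequent read (steps S210 and S211) can be sketched in the same style, again modelling the shared memory storage area 201 as a dictionary. The key names, the function names, the byte-sum checksum, and the choice of flag 217 as the start trigger (the source allows either flag 217 or flag 216) are assumptions made for illustration.

```python
from typing import Optional, Tuple


def checksum(value: str) -> int:
    """Assumed checksum: byte sum modulo 256 (the source does not name an algorithm)."""
    return sum(value.encode("utf-8")) % 256


def storage_write_sequence(shm: dict, storage_serial: str) -> Optional[Tuple[str, str, str]]:
    """Steps S201 to S209, run by the storage management F/W."""
    if shm.get("server_serial_written") != 1:   # S201: start trigger not established
        return None
    server_serial = shm["server_serial"]        # S202
    switch_id = shm["switch_id"]                # S203
    switch_serial = shm["switch_serial"]        # S204
    if shm.get("storage_write_flag") != 1:
        raise PermissionError("storage write flag 216 is not valid")
    shm["storage_serial"] = storage_serial      # S206 (serial read from FRU area 302 in S205)
    shm["storage_serial_checksum"] = checksum(storage_serial)  # S207
    shm["storage_serial_written"] = 1           # S208: flag 218
    shm["server_write_flag"] = 1                # S209: return write access
    shm["storage_write_flag"] = 0               #       to the server F/W
    return server_serial, switch_id, switch_serial


def server_read_storage_serial(shm: dict) -> Optional[str]:
    """Steps S210 and S211: read the storage serial once flag 218 is set."""
    if shm.get("storage_serial_written") != 1:
        return None
    return shm["storage_serial"]


# Shared memory state as left by the server device serial number writing process.
shm = {
    "switch_id": "setting value 1", "server_serial": "aaa", "switch_serial": "dddd",
    "server_write_flag": 0, "storage_write_flag": 1,
    "server_serial_written": 1, "storage_serial_written": 0,
}
before = server_read_storage_serial(shm)
association = storage_write_sequence(shm, storage_serial="cccc")
after = server_read_storage_serial(shm)
```

The returned tuple corresponds to the association between server device and PCIe-SAN switch that the storage management F / W 303 keeps in memory.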
  • By performing the server apparatus manufacturing number writing process (step S101 in FIG. 5 to step S113 in FIG. 6) and the storage apparatus manufacturing number writing process (step S201 in FIG. 7 to step S211 in FIG. 8), the server management F / W 103 and the storage management F / W 303 can both acquire the manufacturing numbers and the like (including the PCIe-SAN switch ID) of the connected server device 10, PCIe-SAN switch 20, and storage device 30.
  • The storage management F / W 303 can identify each of the plurality of connected server apparatuses 10 individually from its serial number, and can also specify the mounting slot of the PCIe-SAN switch 20 connected to (installed on) each server apparatus 10.
  • FIG. 9 is a diagram exemplifying a serial number that can be identified by the storage management F / W.
  • Through the processing described above, the storage management F / W 303 can identify the serial number of each device shown in FIG. 9.
  • The storage management F / W 303 holds the serial number of each device in the computer system 1 (specifically, the serial numbers of the server devices 10a and 10b, the PCIe-SAN switches 20a to 20d, and the storage device 30) and information indicating the mounting positions of all the data transmission relay devices (specifically, PCIe-SAN switch IDs indicating the mounting slots in the server device 10 of the PCIe-SAN switches 20a to 20d), and can execute lighting control for the indicators (specifically, the LEDs 23a to 23d) mounted on each data transmission relay device. The storage management F / W 303 can therefore light the indicator of an arbitrary data transmission relay device (the LED 23a in the case of the PCIe-SAN switch 20a). Note that such lighting control of the LED 23 by the storage management F / W 303 can be executed based on a predetermined operation on a maintenance console described later.
  • For example, suppose that a failure occurs in the PCIe-SAN switch with the PCIe-SAN switch ID “setting value 1” mounted on the server device with the production number “aaaa”.
  • In this case, the storage management F / W 303 can determine that the server apparatus with the serial number “aaaa” is the server apparatus 10a.
  • The storage management F / W 303 can further specify that the failed PCIe-SAN switch is the PCIe-SAN switch 20a connected (mounted) to the “# 1” slot of the server apparatus 10a, and can execute the control for lighting the LED 23a.
  • In this way, in the computer system 1, the mounting position of the PCIe-SAN switch 20 in which a failure has occurred can be specified as described above, and that mounting position can be clearly indicated to the worker by controlling the lighting of the LED 23. Clearly indicating the location where the failure has occurred (the mounting position of the PCIe-SAN switch 20) assists the operator in replacing the PCIe-SAN switch 20.
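The lookup in this failure example can be sketched as a search over the records the storage management F / W 303 holds. The `SwitchRecord` class, the `locate_and_light` function, and the `led_on` attribute are hypothetical names; only the values “aaaa”, “setting value 1”, “setting value 2”, “dddd”, and “eeee” come from the source.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SwitchRecord:
    """One entry of the association held by the storage management F/W (hypothetical layout)."""
    name: str           # reference numeral used in the description, e.g. "20a"
    server_serial: str  # serial number of the server device the switch is mounted on
    switch_id: str      # PCIe-SAN switch ID, i.e. the mounting slot
    switch_serial: str  # serial number of the switch itself
    led_on: bool = False


def locate_and_light(records: List[SwitchRecord], server_serial: str,
                     switch_id: str) -> Optional[SwitchRecord]:
    """Find the switch by (server serial, switch ID) and light its LED."""
    for rec in records:
        if rec.server_serial == server_serial and rec.switch_id == switch_id:
            rec.led_on = True  # stands in for the lighting control of LED 23
            return rec
    return None


records = [
    SwitchRecord("20a", "aaaa", "setting value 1", "dddd"),
    SwitchRecord("20b", "aaaa", "setting value 2", "eeee"),
]
failed = locate_and_light(records, server_serial="aaaa", switch_id="setting value 1")
```

In this sketch the failure report resolves to the record for the PCIe-SAN switch 20a, and only that record's LED state changes.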
  • FIG. 10 is a flowchart showing the procedure of PCIe-SAN switch replacement work.
  • the PCIe-SAN switch replacement work is a maintenance work in which when a failure occurs in the PCIe-SAN switch 20 in the computer system 1, the PCIe-SAN switch 20 in which the failure has occurred is replaced with a new PCIe-SAN switch.
  • a maintenance replacement operator connects the maintenance computer 3 to the network switch 2 (step S301), and starts a maintenance console on the maintenance computer 3 (step S302).
  • steps S301 and S302 may be performed in response to the fact that some kind of failure has occurred in the computer system 1, or may be performed in a periodic inspection or the like.
  • When the maintenance console is started, the storage management F / W 303 displays, on the screen of the maintenance computer 3, the storage device 30 in the maintenance domain and the server devices 10 connected to the storage device 30 (in the case of FIG. 9, the server devices 10a and 10b) (step S303).
  • the information displayed in step S303 is a serial number or the like that can be identified by the storage management F / W 303.
  • a display mode as shown in FIG. 9 is conceivable. By displaying the serial number and the like in this way, a list of connection statuses of the server device 10 (which may include the PCIe-SAN switch 20) and the storage device 30 is displayed.
  • In step S303, a list of connection statuses of the server device 10 and the storage device 30 is displayed, and if an abnormality is found in the connection status, a failure log related to the abnormality is output.
  • The failure log may be output automatically by a function of the maintenance console, or may be output when the operator performs a predetermined output operation. In this example, it is assumed that the output failure log indicates that a failure has occurred in the PCIe-SAN switch with the PCIe-SAN switch ID “setting value 1” mounted on the server device with the serial number “aaaa”.
  • the worker specifies a failure occurrence device and a failure site based on the failure log output in step S303 (step S304).
  • In this example, the server device 10a (serial number “aaaa”) is identified as the failure occurrence device, and the PCIe-SAN switch 20a mounted in the “# 1” slot of the server device 10a is identified as the failure site.
  • the maintenance console may have a function of identifying the failure generating device and the failure site based on the failure log.
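A console-side function of this kind could look roughly like the sketch below. The source does not specify the failure log format, so the line layout (`server_serial=... switch_id=...`) and the name `identify_failure` are purely hypothetical.

```python
import re

# Assumed (hypothetical) log line layout; the real failure log format is not given.
_LOG_PATTERN = re.compile(r"server_serial=(\w+)\s+switch_id=(\d+)")


def identify_failure(log_line: str):
    """Extract the failure occurrence device and failure site from one log line.

    Returns (server serial number, PCIe-SAN switch ID), or None when the line
    does not describe a switch failure in the assumed format.
    """
    match = _LOG_PATTERN.search(log_line)
    if match is None:
        return None
    return match.group(1), int(match.group(2))
```

For a line such as `FAILURE server_serial=aaaa switch_id=1`, the function yields the pair `("aaaa", 1)`, which the operator or the console can then map to the server device 10a and its “# 1” slot.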
  • In step S305, the operator selects the server device 10a having the serial number “aaaa” on the maintenance console.
  • In step S306, the operator selects the PCIe-SAN switch 20a having the serial number “dddd” on the maintenance console.
  • When the PCIe-SAN switch 20a to be replaced is selected on the maintenance console in step S306, the storage management F / W 303 of the storage device 30 with the serial number “cccc” switches the path of the I2C bus over which it communicates to the PCIe-SAN switch 20a having the serial number “dddd” (step S307), and then executes the control for turning on the LED 23a included in the PCIe-SAN switch 20a having the serial number “dddd” (step S308). When the LED 23a is turned on in step S308, the operator can visually confirm the PCIe-SAN switch 20a to be replaced.
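Steps S307 and S308 can be modelled with a small simulation. This is a sketch under stated assumptions: no real I2C access is performed, and the classes `SwitchStub` and `StorageManagementFW`, their methods, and the second serial number "eeee" are hypothetical names introduced only for illustration.

```python
class SwitchStub:
    """Stand-in for one PCIe-SAN switch reachable over the I2C bus."""

    def __init__(self, serial: str):
        self.serial = serial
        self.led_on = False


class StorageManagementFW:
    """Simulated storage-side firmware that can address one switch at a time."""

    def __init__(self, switches):
        self._switches = {s.serial: s for s in switches}
        self.active_path = None  # serial number of the currently selected switch

    def select_path(self, serial: str) -> None:
        # Step S307: switch the I2C path to the selected switch.
        if serial not in self._switches:
            raise KeyError(f"unknown switch serial: {serial}")
        self.active_path = serial

    def light_led(self) -> None:
        # Step S308: turn on the LED of the switch on the active path.
        if self.active_path is None:
            raise RuntimeError("no I2C path selected")
        self._switches[self.active_path].led_on = True

    def on_switch_selected(self, serial: str) -> None:
        # Triggered by the console selection in step S306.
        self.select_path(serial)
        self.light_led()
```

Selecting "dddd" on the console would call `on_switch_selected("dddd")`, after which only that switch's LED is lit while the other switches stay dark.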
  • The computer system 1 can continue to operate while the above-described maintenance console is active. However, from the time the PCIe-SAN switch 20a to be replaced is selected on the maintenance console in step S306, polling by the server management F / W 103 and the storage management F / W 303 may be temporarily suspended until the switch replacement is completed. Such control makes it possible to replace the switch while ensuring data safety and keeping the computer system 1 in operation as far as possible.
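One way to express "suspend polling until the replacement is completed" is a context manager that disables the pollers on entry and restores their previous state on exit. The sketch below is illustrative only; `Poller` and `polling_suspended` are hypothetical names, and the real firmware polling mechanism is not described in the text.

```python
from contextlib import contextmanager


class Poller:
    """Stand-in for the periodic polling done by a management firmware."""

    def __init__(self, name: str):
        self.name = name
        self.enabled = True
        self.count = 0  # number of polls actually performed

    def poll(self) -> bool:
        if not self.enabled:
            return False  # polling is suspended; skip the access
        self.count += 1
        return True


@contextmanager
def polling_suspended(*pollers: Poller):
    """Suspend the given pollers for the duration of the switch replacement."""
    previous = [p.enabled for p in pollers]
    for p in pollers:
        p.enabled = False
    try:
        yield
    finally:
        # Restore the earlier state even if the replacement is aborted.
        for p, was_enabled in zip(pollers, previous):
            p.enabled = was_enabled
```

The replacement steps would run inside `with polling_suspended(server_fw_poller, storage_fw_poller):`, so that polling resumes automatically once the block exits.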
  • In step S309, the worker pulls out the PCIe-SAN switch 20a from the “# 1” slot of the server device 10a, using the lit LED 23a as a guide. Then, in place of the removed PCIe-SAN switch 20a, a new PCIe-SAN switch is connected (mounted) to the “# 1” slot of the server device 10a (step S310).
  • maintenance and replacement for the PCIe-SAN switch 20 can be performed by maintenance work from the storage apparatus 30 side without performing maintenance work on the server apparatus 10 side. Therefore, the burden on the maintenance worker can be reduced.
  • Because the PCIe-SAN switch 20a to be replaced is made visible by turning on the LED 23a, the replacement target can be found immediately, which improves work efficiency. It also prevents the wrong switch from being removed by mistake, which improves safety.
  • The above switch replacement operation has been described for the case where the maintenance computer 3 is connected via the network switch 2 on the storage device 30 side (see, for example, FIG. 9). Alternatively, a separate network switch may be provided on the server device 10 side, and the maintenance console may be started by connecting the maintenance computer 3 via that network switch.
  • In that case as well, the server management F / W 103 of the server device 10a can obtain the serial numbers and the like of the server device 10a, the PCIe-SAN switch 20a, and the storage device 30.
  • Like the storage management F / W 303, the server management F / W 103 that manages the server device 10a can acquire the serial numbers and the like of all the devices constituting the computer system 1.
  • In this configuration, the server management F / W 103 needs to have the authority to execute the control for turning on the LED 23a.
  • As described above, the server management F / W 103 and the storage management F / W 303 can share, via the shared memory storage area 201, the serial numbers of the devices in the computer system 1 (the server device 10, the PCIe-SAN switch 20, and the storage device 30) and the PCIe-SAN switch ID of each PCIe-SAN switch 20. It is therefore possible to specify to which slot of which server device 10 each PCIe-SAN switch 20 is connected.
  • In addition, the storage management F / W 303 controls the lighting of the LED 23 of the specified PCIe-SAN switch 20, so that the mounting location of a PCIe-SAN switch 20 that needs to be replaced due to a failure or the like is clearly indicated and can be easily recognized by the operator.
  • In the embodiment described above, a PCIe-SAN switch is mounted in each server device for each server blade as the data transmission relay device that connects a plurality of server devices having a plurality of server blades to a storage device in the computer system.
  • However, the present invention may use a switch other than the PCIe-SAN switch, as long as it is a data transmission relay device capable of speeding up data transmission (so-called server speedup) in a computer system.
  • In that case, the same control and processing as in this embodiment may be performed by providing the switch with a shared memory storage area.
  • The computer system 1 described above as the computer system according to the present invention includes a plurality of server devices having one or more server blades, a storage device having a disk array, and a plurality of data transmission relay devices that are mounted on the server devices in correspondence with the individual server blades and that connect the server devices and the storage device.
  • the data transmission relay device corresponds to the PCIe-SAN switch 20.
  • The shared memory storage area 201 (or the memories 22a to 22d) is an example of a shared memory that can be read from and written to by the server device and the storage device to which each data transmission relay device is connected.
  • the server device that reads and writes to the shared memory is more specifically the microcomputer 101 of the management module 11 (as a program, the server management F / W 103), and the storage device that reads and writes to the shared memory is more specific.
  • the PCIe-SAN switch ID is an example of information indicating the mounting destination of the data transmission relay device in the server device.
  • The server serial number writing process and the storage serial number writing process described with reference to FIGS. 5 to 8 are an example of a process in which the server device and the storage device each write information that identifies their own device to the shared memory, and then each read and share the group of information held in the shared memory.
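The write-then-read sharing pattern can be sketched as below. This is a simplified model rather than the processes of FIGS. 5 to 8: the class `SharedMemoryArea`, its field names, and the function `share_device_info` are assumptions, and real accesses would go through the management interfaces rather than a Python dictionary.

```python
class SharedMemoryArea:
    """Simulated shared memory storage area on one data transmission relay device."""

    def __init__(self, switch_serial: str, switch_id: int):
        # The switch's own identity and mounting destination are already present.
        self._fields = {"switch_serial": switch_serial, "switch_id": switch_id}

    def write(self, key: str, value: str) -> None:
        self._fields[key] = value

    def read_all(self) -> dict:
        return dict(self._fields)  # return a copy so callers cannot alter the area


def share_device_info(area: SharedMemoryArea, server_serial: str, storage_serial: str):
    """Each side writes its own identifier, then each side reads the whole group."""
    area.write("server_serial", server_serial)    # written by the server side
    area.write("storage_serial", storage_serial)  # written by the storage side
    server_view = area.read_all()                 # read back by the server side
    storage_view = area.read_all()                # read back by the storage side
    return server_view, storage_view
```

After the exchange both sides hold the same group of information (server serial number, storage serial number, switch serial number, and switch ID), which is what lets either firmware relate a switch to its server device and slot.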
  • each of the above-described configurations, functions, processing units, processes may be realized by hardware by designing a part or all of them by, for example, an integrated circuit.
  • Each of the above-described configurations, functions, and the like may be realized by software by interpreting and executing a program that realizes each function by the processor.
  • Information such as programs, tables, and files for realizing each function can be stored in a storage device such as a memory, a hard disk, and SSD (Solid State Drive), or a storage medium such as an IC card, an SD card, and a DVD.
  • control lines and information lines indicate what is considered necessary for the explanation, and not all the control lines and information lines on the product are necessarily shown. In practice, it may be considered that almost all components are connected to each other.
  • the computer system 1 illustrated in FIG. 1 may be configured such that a plurality of server devices 10a and 10b are connected to each other.
  • The present invention is preferably applied to a computer system in which a plurality of server devices (so-called blade servers) having a plurality of server blades are connected to a storage device. In such a system, when a failure occurs in any of a plurality of switches (for example, PCIe-SAN switches) mounted on the server device, the present invention can solve the problem that it is difficult for a maintenance and replacement worker to determine the mounting position of the failed switch, which contributes to improving the safety and efficiency of the replacement work.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The problem addressed by the present invention is to make it possible to identify the mounting location of a data transmission relay device mounted in a server device, in a computer system configured with a plurality of server devices and a storage device. The solution according to the invention is a computer system (1) comprising a plurality of server devices (10) having a plurality of server blades (14), a storage device (30) having a disk array unit (34), and a plurality of PCIe-SAN switches (20) that connect the server devices (10) and the storage device (30), the PCIe-SAN switches (20) being mounted in the server devices (10) in correspondence with each of the individual server blades (14). A shared memory storage area (201) accessible from server management firmware (103) and storage management firmware (303) is provided in the PCIe-SAN switches (20), and information that identifies each of the server devices (10) and the like, together with information indicating the mounting destination of the PCIe-SAN switches (20), is shared in the shared memory storage area (201).
PCT/JP2015/083865 2015-12-02 2015-12-02 Système informatique, et procédé pour partager des informations de dispositif WO2017094139A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2015/083865 WO2017094139A1 (fr) 2015-12-02 2015-12-02 Système informatique, et procédé pour partager des informations de dispositif

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2015/083865 WO2017094139A1 (fr) 2015-12-02 2015-12-02 Système informatique, et procédé pour partager des informations de dispositif

Publications (1)

Publication Number Publication Date
WO2017094139A1 true WO2017094139A1 (fr) 2017-06-08

Family

ID=58796631

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2015/083865 WO2017094139A1 (fr) 2015-12-02 2015-12-02 Système informatique, et procédé pour partager des informations de dispositif

Country Status (1)

Country Link
WO (1) WO2017094139A1 (fr)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2005322069A (ja) * 2004-05-10 2005-11-17 Hitachi Ltd ディスクアレイ装置
JP2008134997A (ja) * 2006-10-27 2008-06-12 Fujitsu Ltd ネットワーク管理方法及びネットワーク管理プログラム

Similar Documents

Publication Publication Date Title
CN106681751B (zh) 统一固件管理系统和管理方法以及计算机可读取介质
EP3073381B1 (fr) Contrôleur satellite à interface intelligente de gestion de plate-forme virtuelle (ipmi) et procédé
US9665469B2 (en) System and method of runtime downloading of debug code and diagnostics tools in an already deployed baseboard management controller (BMC) devices
CN100538567C (zh) 可编程控制器
WO2015136959A1 (fr) Système de commande, procédé, programme et dispositif de traitement d'informations
CN107818021A (zh) 使用bmc作为代理nvmeof发现控制器向主机提供nvm子系统的方法
US9432244B2 (en) Management device, information processing device and control method that use updated flags to configure network switches
US20150149684A1 (en) Handling two ses sidebands using one smbus controller on a backplane controller
US8793414B2 (en) Status information saving among multiple computers
US9176923B2 (en) Electronic guidance for restoring a predetermined cabling configuration
US10120702B2 (en) Platform simulation for management controller development projects
US9690602B2 (en) Techniques for programming and verifying backplane controller chip firmware
ES2595439T3 (es) Sistema de gestión de la información en aeronaves
US20150100298A1 (en) Techniques for validating functionality of backplane controller chips
US20150161069A1 (en) Handling two sgpio channels using single sgpio decoder on a backplane controller
US9785599B2 (en) Information processing apparatus and log output method
JP2007310719A (ja) ユニット形プログラマブルコントローラ
WO2017094139A1 (fr) Système informatique, et procédé pour partager des informations de dispositif
WO2018164037A1 (fr) Appareil de relais, serveur, procédé de configuration d'appareil de relais et support d'enregistrement
JP4376892B2 (ja) プログラマブルコントローラ
JP2002027027A (ja) 計算機システム、計算機管理システム及びシステム管理方法
EP2487637A1 (fr) Système de gestion de membres, dispositif de gestion de membres et programme
JP5849925B2 (ja) バージョン管理装置、機器、情報処理システム、バージョン管理方法、および、コンピュータ・プログラム
CN105765472A (zh) 远程控制装置以及控制系统
US11669373B2 (en) System and method for finding and identifying computer nodes in a network

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15909769

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15909769

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: JP