CN106844142A - System and method for monitoring node SOL in a SAS Switch whole-rack server - Google Patents

System and method for monitoring node SOL in a SAS Switch whole-rack server

Info

Publication number
CN106844142A
Authority
CN
China
Prior art keywords
node
sol
testing
bmc
whole machine
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201611218827.5A
Other languages
Chinese (zh)
Inventor
赵盛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhengzhou Yunhai Information Technology Co Ltd
Original Assignee
Zhengzhou Yunhai Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhengzhou Yunhai Information Technology Co Ltd
Priority to CN201611218827.5A
Publication of CN106844142A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3006 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Quality & Reliability (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The present invention provides a method and system for monitoring node SOL (Serial over LAN) in a SAS Switch whole-rack server. The scheme monitors node SOL output in the background: if a node hits a serious bug, the log recorded by the background monitoring program provides reference information for analyzing and resolving it. The method is particularly suited to whole-rack servers with many compute nodes, and enables long-term, uninterrupted monitoring of the serial SOL state of the compute nodes in a SAS Switch whole-rack server.

Description

System and method for monitoring node SOL in a SAS Switch whole-rack server
Technical field
The present invention relates to server technology, and in particular to a system and method for monitoring node SOL (Serial over LAN) in a SAS Switch whole-rack server.
Background
Servers are now widely used in many fields, such as finance and insurance, defense, education and technology, manufacturing, consumer electronics and data centers. As their range of application keeps expanding, the number of servers in operation has grown exponentially, and the demand is especially striking in scenarios such as server cluster deployment.
Large-scale deployment of traditional general-purpose servers brings problems such as low density, high energy consumption, and heavy installation and maintenance workloads. The whole-rack server, a server designed for large-scale data center computing, emerged in response.
A SAS Switch whole-rack server contains many compute nodes, and these nodes are prone to various bugs during testing, so it is necessary to monitor the serial SOL output of the compute nodes.
Normally, once a compute node crashes, commands can no longer be entered on it to collect relevant information. Monitoring programs are therefore run in the background during testing to capture the runtime information of the compute nodes. Given the large number of compute node serial ports in a SAS Switch whole-rack server, providing more efficient, uninterrupted, long-term monitoring of their SOL state has become an urgent problem.
Summary of the invention
To solve the above technical problem, the technical solution of the present invention is as follows.
The present invention provides a method for monitoring node SOL in a SAS Switch whole-rack server, comprising the following steps:
SS1: prepare the test environment;
SS2: import the BMC IP addresses of the compute nodes of the whole-rack server into a shell script;
SS3: from the control end, remotely access each compute node BMC with IPMI commands and perform the relevant operations, which include deactivating the node's serial port redirection and activating serial port redirection.
Further, step SS3 comprises:
SS31: deactivate the serial port redirection of all compute nodes;
SS32: create a file, build a named pipe from it, and redirect and bind the relevant file descriptor with the exec function;
SS33: activate the serial port redirection of all compute nodes;
SS34: capture the printed serial port redirection output continuously in the background.
Further, preparing the test environment in step SS1 specifically means: use one compute node of the SAS Switch whole-rack server as the test control node, configure passwordless access between the nodes, and make sure the nodes can ping one another.
Further, a separate server node may instead be chosen as the test control node.
Further, the following command remotely logs in to a node BMC and deactivates serial port redirection: ipmitool -I lanplus -H $ip -U ADMIN -P ADMIN sol deactivate
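A minimal sketch of how this deactivation step could be wrapped so that one unreachable BMC does not stop the loop. The function name `sol_deactivate_all` and the `IPMITOOL`, `BMC_USER` and `BMC_PASS` variables are illustrative assumptions, not part of the original script; only the `ipmitool ... sol deactivate` invocation comes from the text.

```shell
#!/bin/sh
# Hypothetical helper: deactivate SOL on every BMC passed as an argument.
# IPMITOOL, BMC_USER and BMC_PASS are assumed names so that the binary and
# the credentials can be overridden without editing the loop.

sol_deactivate_all() {
    failed=0
    for ip in "$@"; do
        # The invocation from the text, written with standard option spacing.
        if ! "${IPMITOOL:-ipmitool}" -I lanplus -H "$ip" \
                -U "${BMC_USER:-ADMIN}" -P "${BMC_PASS:-ADMIN}" sol deactivate; then
            # A non-zero exit is only reported; the loop moves on to the next node.
            echo "warning: sol deactivate failed for $ip" >&2
            failed=$((failed + 1))
        fi
    done
    # Return non-zero if any node failed so the caller can decide what to do.
    [ "$failed" -eq 0 ]
}
```

It could be called as `sol_deactivate_all $bmciplist`, relying on word splitting of the address list in the same way as the loop in the patent's script.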
Further, the following commands create the file, build the named pipe, and redirect and bind the relevant file descriptor with the exec function:
tempfifo=$$.fifo
mkfifo $tempfifo
exec 1000<>$tempfifo
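The effect of these three commands can be checked without any BMC. The sketch below rests on two assumptions of its own: it uses descriptor 9 instead of 1000 so that it also runs under plain POSIX `sh`, and it creates the pipe in a temporary directory. It shows that a FIFO opened read-write stays usable after its directory entry is removed. A read on such a descriptor blocks rather than returning end-of-file, since the descriptor itself holds the pipe open for writing, which is presumably why the script uses it as standard input for the background `ipmitool` sessions.

```shell
#!/bin/sh
# Demonstration of the named-pipe idiom; fd 9 and the temp dir are assumptions.
workdir=$(mktemp -d)
tempfifo="$workdir/$$.fifo"

mkfifo "$tempfifo"        # create the named pipe
exec 9<>"$tempfifo"       # open it read-write and bind it to fd 9
rm -f "$tempfifo"         # the open descriptor keeps the pipe alive
rmdir "$workdir"

# Data written to the descriptor can be read back from it.
echo "hello" >&9
read -r line <&9
echo "read back: $line"

exec 9>&-                 # close the descriptor when done
```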
Further, the following command activates the serial port redirection of a node BMC in the background without interruption and writes the printed content to a log: nohup ipmitool -I lanplus -H $ip -U ADMIN -P ADMIN sol activate >>$ip.log 2>&1 <&1000 &
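One possible way to start these captures while keeping track of the background processes is sketched below. The function name `sol_capture_start` and the variables `LOG_DIR`, `PID_FILE`, `IPMITOOL`, `BMC_USER` and `BMC_PASS` are hypothetical, and the sketch assumes descriptor 9 (rather than 1000) has already been bound to the named pipe.

```shell
#!/bin/sh
# Hypothetical helper: start one background capture per BMC and remember its PID.
# LOG_DIR, PID_FILE, IPMITOOL, BMC_USER and BMC_PASS are assumed names.
# Assumes fd 9 was already bound to the named pipe (exec 9<>"$tempfifo").

sol_capture_start() {
    log_dir=${LOG_DIR:-.}
    pid_file=${PID_FILE:-sol_capture.pids}
    for ip in "$@"; do
        # Append SOL output (stdout and stderr) to one log per node; stdin
        # comes from the pipe on fd 9, mirroring "<&1000" in the text.
        nohup "${IPMITOOL:-ipmitool}" -I lanplus -H "$ip" \
            -U "${BMC_USER:-ADMIN}" -P "${BMC_PASS:-ADMIN}" sol activate \
            >>"$log_dir/$ip.log" 2>&1 <&9 &
        # $! is the PID of the job just started; keep it for later cleanup.
        echo "$! $ip" >>"$pid_file"
    done
}
```

The PIDs recorded in the file can later be passed to `kill` to end the capture sessions once testing is finished.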
The present invention further provides a system for monitoring node SOL in a SAS Switch whole-rack server, comprising the BMC of each SAS Switch compute node, a test control node, network card 1, a remote management server, network card 2 and a memory. The compute node BMCs are connected to the test control node through network card 1; the test control node is connected to the remote management server through network card 2; and the memory is connected to the test control node. Each compute node BMC obtains serial port information, and the test control node saves the serial port information sent by the compute node BMCs. When a server crashes, the user can retrieve the serial port information saved in the memory from the remote management server, using the ipmi commands of the Intelligent Platform Management Interface (IPMI) and the serial port redirection (SOL) interface function.
Further, the test control node is either one compute node of the SAS Switch whole-rack server or a separately chosen server node.
Further, the memory stores the serial port information as a stack, and the memory is the cache memory or the random access memory (RAM) built into the test control node.
The solution of the present invention is a more efficient method for uninterrupted, long-term monitoring of the serial SOL state of SAS Switch compute nodes. It makes monitoring the compute nodes of a SAS Switch whole-rack server convenient, and is simple and practical.
Brief description of the drawings
Fig. 1 is a flowchart of the method of the present invention for monitoring node SOL in a SAS Switch whole-rack server.
Fig. 2 is a system architecture diagram of the present invention for monitoring node SOL in a SAS Switch whole-rack server.
Detailed description of the embodiments
The technical solution is further described below with reference to the drawings and specific embodiments. It should be understood that the specific embodiments described here only explain the present invention and do not limit it.
The present invention provides a more efficient method for uninterrupted, long-term monitoring of the serial SOL state of SAS Switch compute nodes. The steps are as follows:
SS1: prepare the test environment;
SS2: import the BMC IP addresses of the compute nodes of the whole-rack server into a shell script;
SS3: from the control end, remotely access each compute node BMC with IPMI commands and perform the relevant operations, which include deactivating the node's serial port redirection and activating serial port redirection.
Step SS3 specifically comprises the following steps:
SS31: deactivate the serial port redirection of all compute nodes;
SS32: create a file, build a named pipe from it, and redirect and bind the relevant file descriptor with the exec function;
SS33: activate the serial port redirection of all compute nodes;
SS34: capture the printed serial port redirection output continuously in the background.
The specific implementation process is as follows:
Use one compute node of the SAS Switch whole-rack server, or a separately chosen server node, as the test control node. Configure passwordless access between the nodes and make sure they can ping one another. The BMC IP addresses of the SAS Switch compute nodes must be written into the bmciplist variable in the shell script.
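As an alternative to pasting the addresses directly into the script, the list could be loaded from a text file and checked before use. The sketch below is only an illustration: the function name `load_bmc_ips` and the one-address-per-line file convention are assumptions, and the check only verifies the dotted-quad shape, not that each octet is at most 255.

```shell
#!/bin/sh
# Hypothetical helper: read BMC addresses from a text file (one per line),
# skipping blank lines and "#" comments, and keep only dotted-quad IPv4 forms.
# The shape check does not verify that each octet is in the range 0-255.

load_bmc_ips() {
    # $1: path to the list file (an assumed convention, not from the patent)
    while IFS= read -r line || [ -n "$line" ]; do
        # Strip trailing comments and surrounding whitespace.
        line=$(printf '%s' "$line" | sed -e 's/#.*//' -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//')
        [ -z "$line" ] && continue
        if printf '%s\n' "$line" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'; then
            printf '%s\n' "$line"
        else
            echo "skipping invalid entry: $line" >&2
        fi
    done <"$1"
}
```

The result can then fill the variable used by the script, for example `bmciplist=$(load_bmc_ips bmc_ips.txt)`, where `bmc_ips.txt` is a hypothetical file name.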
First, deactivate the serial port redirection of each compute node with IPMI commands; then create the file, build the named pipe, and bind a custom file descriptor with the exec function.
Then activate the serial port redirection of each compute node with IPMI commands and print the serial port information without interruption.
The script sol_minitor.sh executed on the server side consists mainly of the following:
bmciplist="
192.168.1.120
192.168.1.121
192.168.1.122
192.168.1.123
192.168.1.124
192.168.1.125
192.168.1.126
192.168.1.127
192.168.1.128…
"    # write the BMC IP address of each node in the whole-rack server into the variable bmciplist
for ip in $bmciplist; do
    echo $ip    # print the IP address being processed
    # remotely log in to the node BMC and deactivate serial port redirection
    ipmitool -I lanplus -H $ip -U ADMIN -P ADMIN sol deactivate
done
tempfifo=$$.fifo    # assign a fifo file name based on the current process PID to tempfifo
mkfifo $tempfifo    # create the named pipe from that file name
exec 1000<>$tempfifo    # open the fifo for reading and writing and bind it to the custom file descriptor 1000
rm -rf $tempfifo    # delete the fifo file that was created
for ip in $bmciplist; do
    echo $ip    # print the IP address being processed
    # activate the node BMC's serial port redirection in the background without interruption and write the printed content to $ip.log
    nohup ipmitool -I lanplus -H $ip -U ADMIN -P ADMIN sol activate >>$ip.log 2>&1 <&1000 &
done
# end of program
The system for monitoring node SOL in a SAS Switch whole-rack server according to the present invention is described below with reference to Fig. 2.
Fig. 2 is a structural block diagram of the system for monitoring node SOL in a SAS Switch whole-rack server according to one embodiment of the invention.
As shown in Fig. 2, the system according to this embodiment comprises the BMC of each SAS Switch compute node, a test control node, network card 1, a remote management server, network card 2 and a memory.
The compute node BMCs are installed in the SAS Switch whole-rack server. Each BMC is an independent board on the server mainboard that does not depend on the server's processor, BIOS or operating system and can work on its own. The compute node BMCs are connected to the test control node through network card 1, and within the rack each compute node BMC IP and the test control node IP can ping each other; that is, the SAS Switch compute node BMC IPs must be in the same network segment as the test control node.
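The same-network-segment requirement can be checked with plain shell arithmetic before any monitoring starts. This is a sketch under stated assumptions: the helper names `ip_to_int` and `same_subnet` are hypothetical, and the default /24 prefix is a guess suggested by the 192.168.1.x example addresses, not something the patent specifies.

```shell
#!/bin/sh
# Hypothetical check: are two IPv4 addresses in the same network for a given
# prefix length? Pure shell arithmetic; no external tools are required.

ip_to_int() {
    # Convert a dotted-quad address to a 32-bit integer.
    old_ifs=$IFS
    IFS=.
    set -- $1
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

same_subnet() {
    # $1, $2: addresses; $3: prefix length (defaults to 24, an assumption)
    prefix=${3:-24}
    mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
    a=$(ip_to_int "$1")
    b=$(ip_to_int "$2")
    [ $(( a & mask )) -eq $(( b & mask )) ]
}
```

For example, `same_subnet 192.168.1.120 192.168.1.10` succeeds, while `same_subnet 192.168.1.120 192.168.2.10` fails unless a shorter prefix such as 16 is passed as the third argument.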
The test control node is either one compute node of the SAS Switch whole-rack server or a separately chosen server node.
Specifically, each compute node BMC obtains serial port information. Because the BMC is an independent board on the server mainboard, when a server fails the BMC can capture information at the moment of failure in time and send it to the test control node through network card 1 using the IPMI serial port redirection (SOL) interface function. The SOL interface function emulates serial communication over a standard network connection, and the captured information can also be saved through it.
The test control node is connected to the remote management server through network card 2.
The test control node saves the serial port information sent by the compute node BMCs, and when a server crashes, the serial port information stored in the memory is queried through the Intelligent Platform Management Interface (IPMI).
When a server crashes, the user can retrieve the serial port information saved in the memory from the remote management server, using the ipmi commands of IPMI and the serial port redirection (SOL) interface function.
In one embodiment of the invention, the memory stores the serial port information as a stack. Storing information as a stack is fast and efficient.
In one embodiment of the invention, the memory is the cache memory or the random access memory (RAM) built into the test control node.
The above scheme provides background monitoring of node SOL. If a node hits a serious bug, the log recorded by the background monitoring program provides reference information for analyzing and resolving it. The method is particularly suited to whole-rack servers with many compute nodes, and enables long-term, uninterrupted monitoring of the serial SOL state of the compute nodes in a SAS Switch whole-rack server.
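Once the per-node logs exist, a first pass over them can be automated. The sketch below lists the nodes whose log contains a crash signature. The function name `find_crashed_nodes`, the `CRASH_PATTERN` variable and the default patterns (typical Linux kernel messages) are assumptions; the patent does not say which messages to look for.

```shell
#!/bin/sh
# Hypothetical post-mortem helper: list the per-node SOL logs that contain a
# crash signature. The pattern list is an assumption; adjust it to the
# messages your operating system actually prints on its serial console.

find_crashed_nodes() {
    # $1: directory holding the <ip>.log files written by the monitor
    pattern=${CRASH_PATTERN:-'Kernel panic|Oops|Call Trace'}
    for log in "$1"/*.log; do
        [ -e "$log" ] || continue
        if grep -Eq "$pattern" "$log"; then
            # Print the node address (file name without ".log").
            basename "$log" .log
        fi
    done
}
```

Running `find_crashed_nodes .` in the directory where the monitoring script was started would print one BMC address per affected node.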
Although some aspects have been described in the context of an apparatus, these aspects clearly also represent a description of the corresponding method, where a block or device corresponds to a method step or a feature of a method step. Likewise, aspects described in the context of a method step also represent a description of a corresponding block, item or feature of a corresponding apparatus. Some or all of the method steps may be executed by (or using) a hardware apparatus such as a microprocessor, a programmable computer or an electronic circuit. One or more of the most important method steps may be executed by such an apparatus.
The implementation may be in hardware or in software, and may use a digital storage medium such as a floppy disk, DVD, Blu-ray disc, CD, ROM, PROM, EPROM, EEPROM or flash memory having electronically readable control signals stored on it, which cooperate (or are capable of cooperating) with a programmable computer system so that the respective method is performed. A data carrier having electronically readable control signals may be provided, which are capable of cooperating with a programmable computer system so that the method described herein is performed.
The implementation may also take the form of a computer program product with program code, the program code being operative to perform the method when the computer program product runs on a computer. The program code may be stored on a machine-readable carrier.
The above description is merely illustrative, and modifications and variations of the arrangements and details described herein will be apparent to those skilled in the art. The invention is therefore intended to be limited only by the scope of the following claims and not by the specific details presented by way of description and explanation above.

Claims (10)

1. A method for monitoring node SOL in a SAS Switch whole-rack server, characterized by the following steps:
SS1: prepare the test environment;
SS2: import the BMC IP addresses of the compute nodes of the whole-rack server into a shell script;
SS3: from the control end, remotely access each compute node BMC with IPMI commands and perform the relevant operations, which include deactivating the node's serial port redirection and activating serial port redirection.
2. The method according to claim 1, characterized in that step SS3 comprises:
SS31: deactivate the serial port redirection of all compute nodes;
SS32: create a file, build a named pipe from it, and redirect and bind the relevant file descriptor with the exec function;
SS33: activate the serial port redirection of all compute nodes;
SS34: capture the printed serial port redirection output continuously in the background.
3. The method according to claim 1, characterized in that preparing the test environment in step SS1 specifically means: use one compute node of the SAS Switch whole-rack server as the test control node, configure passwordless access between the nodes, and make sure the nodes can ping one another.
4. The method according to claim 3, characterized in that a separate server node is instead chosen as the test control node.
5. The method according to claim 2, characterized in that the following command remotely logs in to a node BMC and deactivates serial port redirection: ipmitool -I lanplus -H $ip -U ADMIN -P ADMIN sol deactivate
6. The method according to claim 2, characterized in that the following commands create the file, build the named pipe, and redirect and bind the relevant file descriptor with the exec function:
tempfifo=$$.fifo
mkfifo $tempfifo
exec 1000<>$tempfifo
7. The method according to claim 2, characterized in that the following command activates the serial port redirection of a node BMC in the background without interruption and writes the printed content to a log: nohup ipmitool -I lanplus -H $ip -U ADMIN -P ADMIN sol activate >>$ip.log 2>&1 <&1000 &
8. A system for monitoring node SOL in a SAS Switch whole-rack server, characterized in that it comprises the BMC of each SAS Switch compute node, a test control node, network card 1, a remote management server, network card 2 and a memory; the compute node BMCs are connected to the test control node through network card 1; the test control node is connected to the remote management server through network card 2; the memory is connected to the test control node; each compute node BMC obtains serial port information; the test control node saves the serial port information sent by the compute node BMCs; and when a server crashes, the user can retrieve the serial port information saved in the memory from the remote management server, using the ipmi commands of the Intelligent Platform Management Interface (IPMI) and the serial port redirection (SOL) interface function.
9. The system according to claim 8, characterized in that the test control node is either one compute node of the SAS Switch whole-rack server or a separately chosen server node.
10. The system according to claim 8, characterized in that the memory stores the serial port information as a stack, and the memory is the cache memory or the random access memory (RAM) built into the test control node.
CN201611218827.5A 2016-12-26 2016-12-26 System and method for monitoring node SOL in a SAS Switch whole-rack server Pending CN106844142A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611218827.5A CN106844142A (en) 2016-12-26 2016-12-26 System and method for monitoring node SOL in a SAS Switch whole-rack server

Publications (1)

Publication Number Publication Date
CN106844142A true CN106844142A (en) 2017-06-13

Family

ID=59135612

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611218827.5A Pending CN106844142A (en) System and method for monitoring node SOL in a SAS Switch whole-rack server

Country Status (1)

Country Link
CN (1) CN106844142A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170613