CN105912431A - Reboot testing method of server, server, control device and system - Google Patents

Publication number
CN105912431A
Authority
CN
China
Prior art keywords
server
count
file
ispci
controller
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610202489.XA
Other languages
Chinese (zh)
Inventor
肖欢
巩祥文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Electronic Information Industry Co Ltd
Original Assignee
Inspur Electronic Information Industry Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Electronic Information Industry Co Ltd
Priority to CN201610202489.XA
Publication of CN105912431A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/22 Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing
    • G06F11/2273 Test methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3006 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Quality & Reliability (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Computer Hardware Design (AREA)
  • Test And Diagnosis Of Digital Computers (AREA)

Abstract

The invention provides a server reboot testing method, a server, a controller, and a system. The method comprises the following steps: establishing communication between the server and the controller through a switch; booting the server when it receives a power-on request sent by the controller; determining whether an Ispci-tmp file exists on the server and, if so, reading the device information in the server, writing it to an Ispci-$count file, and comparing the Ispci-tmp file with the Ispci-$count file for consistency, or otherwise generating the Ispci-tmp file from the device information in the server; sending startup-complete information to the controller when the Ispci-tmp file is consistent with the Ispci-$count file; creating a gpu.txt file and a server.txt file; and receiving a shutdown request sent by the controller and shutting down. The method thereby automates server stability testing.

Description

Server reboot testing method, server, controller, and system
Technical Field
The present invention relates to the technical field of server applications, and in particular to a server reboot testing method, a server, a controller, and a system.
Background
With the development of cloud computing services, the stability requirements placed on servers keep rising. The reboot test is currently an important method of testing server stability.
In the existing reboot test, a reboot script is installed on each server node, each node is manually connected to power and manually switched on, and the reboot script runs and checks whether the boot process is normal. The node then shuts down, after which the power to each node must be manually switched off in turn. The existing reboot testing method can therefore only be completed with manual participation, and server stability testing cannot run automatically.
Summary of the Invention
Embodiments of the present invention provide a server reboot testing method, a server, a controller, and a system that automate server stability testing.
A server reboot testing method, in which communication between a server and a controller is established through a switch, further comprises:
booting the server when the server receives a power-on request sent by the controller;
determining, by the server, whether an Ispci-tmp file exists on it; if so, reading the device information in the server, writing the device information to an Ispci-$count file, and comparing whether the Ispci-tmp file and the Ispci-$count file are consistent; otherwise, generating the Ispci-tmp file from the device information in the server;
sending startup-complete information to the controller when the Ispci-tmp file is consistent with the Ispci-$count file;
creating a gpu.txt file and a server.txt file; and
receiving a shutdown request sent by the controller and shutting down.
Preferably, establishing communication between the server and the controller through the switch comprises:
connecting the server to the switch through an OS network and a BMC network; and
connecting the controller to the switch through the OS network.
Preferably, the method further comprises: providing a first counter count in the server;
after the boot, the method further comprises: determining, by the server, whether a count file exists on it; if so, incrementing the first counter count by 1 and storing it to the count file; otherwise, starting the first counter count, incrementing it by 1, generating the count file, and writing the first counter count into the server's startup items.
Preferably, the server is a Pcie-Switch server comprising a resource server and a server end, wherein a retimer card is inserted in the server end, and the server end is connected to the resource server through the retimer card and a MiniSAS HD cable;
the method further comprises: setting a startup sequence;
the boot comprises: starting the resource server and the server end in order, according to the configured startup sequence.
A server reboot testing method, applied to a controller, in which a second counter count is provided in the controller and a detection threshold is set, further comprises:
M1: initializing, by the controller, the second counter count;
M2: receiving the startup-complete information sent by the server and determining whether the value of the second counter count is less than the detection threshold; if so, detecting whether the gpu.txt file and the server.txt file exist on the server and, if they do, calling the server's shutdown function to shut the server down;
M3: sending a power-on request to the server, calling the server's power-on function to boot the server, incrementing the second counter count by 1, and executing M2.
Preferably, the method further comprises: clearing the operating system logs on the server.
Preferably, the server is a Pcie-Switch server comprising a resource server and a server end, wherein a retimer card is inserted in the server end, and the server end is connected to the resource server through the retimer card and a MiniSAS HD cable;
shutting the server down comprises: shutting down the server end and then the resource server, in order;
booting the server comprises: booting the resource server and then the server end, in order.
A server applied to any of the server reboot testing methods above communicates with an external controller through an external switch and comprises a switch unit, a first judging unit, a read-write unit, and a generation unit, wherein:
the switch unit is configured to boot and trigger the first judging unit when a power-on request sent by the external controller is received, and to shut down when a shutdown request sent by the external controller is received;
the first judging unit is configured, when triggered by the switch unit, to determine whether an Ispci-tmp file exists; if so, to trigger the read-write unit and compare whether the Ispci-tmp file and the Ispci-$count file are consistent; otherwise, to trigger the generation unit;
the read-write unit is configured to read the device information in the server, write it to the Ispci-$count file and, when the Ispci-tmp file is consistent with the Ispci-$count file, send startup-complete information to the external controller and create the gpu.txt file and the server.txt file;
the generation unit is configured to generate the Ispci-tmp file from the information of each device.
Preferably, the server is connected to the external switch through the OS network and the BMC network.
Preferably, the server further comprises a second judging unit and a first counter, wherein:
the second judging unit is configured to determine whether a count file exists; if so, to trigger the first counter; otherwise, to start the first counter, generate the count file, and write the first counter into the server's startup items;
the first counter is configured to count the number of boots of the switch unit: each time the switch unit boots, it increments count by 1 and stores the boot count to the count file.
Preferably, the server is a Pcie-Switch server comprising a resource server and a server end, wherein a retimer card is inserted in the server end, and the server end is connected to the resource server through the retimer card and a MiniSAS HD cable.
A controller applied to any of the server reboot testing methods above comprises a setting unit, a second counter, a detection unit, and a call control unit, wherein:
the setting unit is configured to set the detection threshold;
the detection unit is configured to determine whether the value of the second counter is less than the detection threshold set by the setting unit; if so, to detect whether the gpu.txt file and the server.txt file exist on the external server and, if they do, to trigger the call control unit;
the call control unit is configured, when triggered by the detection unit, to call the shutdown function of the external server to shut it down, send a power-on request to the external server, call the power-on function of the external server to boot it, and increment the value of the second counter by 1.
A server reboot testing system comprises at least one of any of the servers above, a switch, and any of the controllers above, wherein:
the at least one server and the controller are each connected to the switch.
Embodiments of the present invention provide a server reboot testing method, a server, a controller, and a system. In the method, communication between the server and the controller is established through a switch. When the server receives a power-on request sent by the controller, it boots. The server determines whether an Ispci-tmp file exists on it; if so, it reads the device information in the server, writes the device information to an Ispci-$count file, and compares whether the Ispci-tmp file and the Ispci-$count file are consistent; otherwise, it generates the Ispci-tmp file from the device information in the server. When the Ispci-tmp file is consistent with the Ispci-$count file, the server sends startup-complete information to the controller and creates a gpu.txt file and a server.txt file. The server then receives a shutdown request sent by the controller and shuts down. Because the server checks whether the files exist and whether they are consistent, it can determine whether the startup was normal. In addition, both startup and shutdown are carried out automatically under the control of the controller, without manual participation, so server stability testing is automated.
Brief Description of the Drawings
To explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for describing them are briefly introduced below. The drawings described below show some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of the server reboot testing method provided by one embodiment of the present invention;
Fig. 2 is a flowchart of the server reboot testing method provided by another embodiment of the present invention;
Fig. 3 is a flowchart of the server reboot testing method provided by a further embodiment of the present invention;
Fig. 4 is the startup/shutdown timing diagram of the Pcie-Switch server provided by an embodiment of the present invention;
Fig. 5 is a structural diagram of the server provided by one embodiment of the present invention;
Fig. 6 is a structural diagram of the controller provided by one embodiment of the present invention;
Fig. 7 is a structural diagram of the server reboot testing system provided by one embodiment of the present invention.
Detailed Description
To make the purpose, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the present invention.
As shown in Fig. 1, an embodiment of the present invention provides a server reboot testing method, which may comprise the following steps:
Step 101: establish communication between the server and the controller through a switch;
Step 102: when the server receives a power-on request sent by the controller, boot the server;
Step 103: the server determines whether an Ispci-tmp file exists on it; if so, perform step 104; otherwise, perform step 105;
Step 104: read the device information in the server, write the device information to an Ispci-$count file, and compare whether the Ispci-tmp file and the Ispci-$count file are consistent; if so, perform step 106; otherwise, perform step 107;
Step 105: generate the Ispci-tmp file from the device information in the server;
Step 106: send startup-complete information to the controller, create a gpu.txt file and a server.txt file, and perform step 108;
Step 107: report an error message and end the current process;
Step 108: receive the shutdown request sent by the controller and shut down.
In this method, the server checks whether the Ispci-tmp file exists and whether it is consistent with the Ispci-$count file written on the current boot, which lets it determine whether the startup was normal. In addition, both startup and shutdown are carried out automatically under the control of the controller, without manual participation, so server stability testing is automated.
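The server-side checks in steps 103 to 107 can be sketched as follows. This is a minimal Python sketch under stated assumptions, not the patent's implementation: the function name `boot_check`, the working directory, and passing the device information in as a string are hypothetical, and the text does not say what follows the first generation of the Ispci-tmp file (step 105), so the sketch only reports that case.

```python
from pathlib import Path


def boot_check(workdir: Path, count: int, device_info: str) -> str:
    """Run one server-side check after power-on (hypothetical helper).

    Returns "baseline" if Ispci-tmp was just created, "ok" if the current
    device information matches it, and "error" otherwise.
    """
    baseline = workdir / "Ispci-tmp"
    if not baseline.exists():
        # Step 105: first boot, record the device information as reference.
        baseline.write_text(device_info)
        return "baseline"

    # Step 104: store this boot's device information and compare.
    current = workdir / f"Ispci-{count}"
    current.write_text(device_info)
    if current.read_text() != baseline.read_text():
        # Step 107: the caller would report the error and stop.
        return "error"

    # Step 106: marker files that the controller looks for before shutdown.
    # Sending the startup-complete message is left to the caller.
    (workdir / "gpu.txt").touch()
    (workdir / "server.txt").touch()
    return "ok"
```

A caller would obtain `device_info` from whatever device listing the server uses and would pass the current boot count as `count`.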
In one embodiment of the present invention, to ensure communication between the server and the controller, step 101 is implemented as follows: the server is connected to the switch through an OS network and a BMC network, and the controller is connected to the switch through the OS network.
In one embodiment of the present invention, to count the number of server boots, the method further comprises: providing a first counter count in the server. After step 102, the method further comprises: the server determines whether a count file exists on it; if so, it increments the first counter count by 1 and stores it to the count file; otherwise, it starts the first counter count, increments it by 1, generates the count file, and writes the first counter count into the server's startup items. Writing the counter into the server's startup items ensures that the boot count is accurate.
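The boot counter described above can be sketched in Python as follows. The helper name `record_boot` and the one-integer file format are assumptions made for illustration; registering the counter in the server's startup items is platform-specific and is not modelled here.

```python
from pathlib import Path


def record_boot(count_file: Path) -> int:
    """Increment the persisted boot count and return the new value."""
    if count_file.exists():
        # Later boots: read the stored value and add 1.
        count = int(count_file.read_text().strip()) + 1
    else:
        # First boot: start the counter, so the first recorded value is 1.
        count = 1
    count_file.write_text(f"{count}\n")
    return count
```

Calling `record_boot` once per boot yields 1, 2, 3, and so on, and the returned value can serve as the `$count` suffix of the Ispci-$count file.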
In one embodiment of the present invention, the server is a Pcie-Switch server comprising a resource server and a server end, wherein a retimer card is inserted in the server end, and the server end is connected to the resource server through the retimer card and a MiniSAS HD cable. The method further comprises setting a startup sequence, and the boot comprises starting the resource server and the server end in order according to the configured startup sequence, which ensures that the Pcie-Switch server starts normally and automatically.
As shown in Fig. 2, an embodiment of the present invention provides a server reboot testing method, applied to a controller, which may comprise the following steps:
Step 201: provide a second counter count in the controller and set a detection threshold;
Step 202: the controller initializes the second counter count;
Step 203: receive the startup-complete information sent by the server and determine whether the value of the second counter count is less than the detection threshold; if so, perform step 204; otherwise, perform step 205;
Step 204: detect whether the gpu.txt file and the server.txt file exist on the server; if so, perform step 206; otherwise, perform step 207;
Step 205: shut down the server, exit control of the server, and end the current process;
Step 206: call the server's shutdown function to shut the server down, and perform step 208;
Step 207: wait for a certain time and return to step 203;
Step 208: send a power-on request to the server, call the server's power-on function to boot the server, increment the second counter count by 1, and perform step 203.
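The controller loop in steps 202 to 208 can be sketched as follows. This is a hedged Python sketch: `run_reboot_test` and its callable parameters are hypothetical stand-ins for however the controller actually receives messages, inspects the server, and switches power, and the sketch assumes the server has already been asked to power on before the loop begins.

```python
from typing import Callable


def run_reboot_test(
    threshold: int,
    boot_complete: Callable[[], None],
    markers_present: Callable[[], bool],
    power_off: Callable[[], None],
    power_on: Callable[[], None],
    wait: Callable[[], None],
) -> int:
    """Cycle the server until the counter reaches the detection threshold."""
    count = 0  # step 202: initialize the second counter
    while True:
        boot_complete()  # step 203: block until startup-complete arrives
        if count >= threshold:
            power_off()  # step 205: final shutdown, then stop
            return count
        # Steps 204 and 207: poll for gpu.txt and server.txt. The text
        # returns to step 203 after the delay; since the counter is
        # unchanged there, the sketch loops on the file check directly.
        while not markers_present():
            wait()
        power_off()  # step 206
        power_on()   # step 208
        count += 1
```

Injecting the actions as callables keeps the loop independent of the power-control mechanism, which the text leaves to the server's shutdown and power-on functions.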
In one embodiment of the present invention, to prevent operations already performed on the server from affecting server startup, the method further comprises: clearing the operating system logs on the server.
In one embodiment of the present invention, the server is a Pcie-Switch server comprising a resource server and a server end, wherein a retimer card is inserted in the server end, and the server end is connected to the resource server through the retimer card and a MiniSAS HD cable. Shutting the server down comprises shutting down the server end and then the resource server, in order. Booting the server comprises booting the resource server and then the server end, in order. This ensures that the Pcie-Switch server starts normally and automates the stability test of the Pcie-Switch server.
To make the purpose, technical solutions, and advantages of the present invention clearer, the interaction between the server and the controller is described in further detail below.
As shown in Fig. 3, a further embodiment of the present invention provides a server reboot testing method, which may comprise the following steps:
Step 301: establish communication between the server and the controller through a switch;
In this step, the server is connected to the switch through an OS network and a BMC network, and the controller is connected to the switch through the OS network. When the server is a Pcie-Switch server, it comprises a resource server and a server end. The resource server may contain multiple GPUs. A retimer card is inserted in the server end, and the server end is connected to the resource server through the retimer card and a MiniSAS HD cable. The resource server and the server end are connected to the switch through the OS network and the BMC network.
Step 302: provide a first counter count in the server, provide a second counter count in the controller, and set a detection threshold;
In this step, when the server is a Pcie-Switch server, the first counter count may be placed on the resource server.
Step 303: the controller initializes the second counter count and clears the operating system logs on the server;
In this step, the operating system logs on the server are cleared so that earlier operations on the server do not affect the stability test.
Step 304: the controller sends a power-on request to the server and calls the server's power-on function to boot the server;
In this step, when the server is a non-hot-pluggable server such as a Pcie-Switch server, a startup sequence can additionally be set and the server started according to it. Fig. 4 shows the startup/shutdown sequence that this embodiment sets for the Pcie-Switch server. Because the Pcie-Switch server is not hot-pluggable, the resource server containing the GPUs must be started first, and the server end is started only after the resource server has finished starting. During shutdown, the server end is shut down first and the resource server afterwards. This avoids server crashes caused by an incorrect startup order.
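The ordering rule above (resource server first on startup, server end first on shutdown) can be expressed as a small Python sketch. The names `START_ORDER` and `power_sequence` and the component labels are hypothetical; the actual power-control calls are not shown in the text and are omitted here.

```python
from typing import List

# Startup order for the Pcie-Switch server: the resource server, which
# holds the GPUs, must be up before the server end is started.
START_ORDER = ["resource_server", "server_end"]


def power_sequence(action: str) -> List[str]:
    """Return the components in the order they should be acted on."""
    if action == "start":
        return list(START_ORDER)
    if action == "shutdown":
        # Shutdown is the reverse of startup: server end first.
        return list(reversed(START_ORDER))
    raise ValueError(f"unknown action: {action!r}")
```

Deriving the shutdown order by reversing the startup order keeps the two sequences from drifting apart if more components are added.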
Step 305: the server determines whether a count file exists on it; if so, perform step 306; otherwise, perform step 307;
Step 306: increment the first counter count by 1, store it to the count file, and perform step 308;
Step 307: start the first counter count, increment it by 1, generate the count file, and write the first counter count into the server's startup items;
Steps 305 to 307 are the server's own count of its boots. The counter does this automatically, without manual participation.
Step 308: the server determines whether an Ispci-tmp file exists on it; if so, perform step 309; otherwise, perform step 310;
Step 309: read the device information in the server, write the device information to an Ispci-$count file, and compare whether the Ispci-tmp file and the Ispci-$count file are consistent; if so, perform step 311; otherwise, perform step 312;
Step 310: generate the Ispci-tmp file from the device information in the server;
Steps 308 to 310 collect and compare the information of each device in the server. Comparing the device information shows whether the server has started completely. For example, if a server has GPU1 and GPU2, the Ispci-tmp file contains the information of both GPU1 and GPU2. If the Ispci-$count file contains only the information of GPU1, the two files are inconsistent, which indicates that the server startup did not complete.
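The GPU1/GPU2 example can be reproduced with a short Python sketch. The text only requires a consistency check between the two files; listing which baseline entries are missing is an added diagnostic, and the helper name `missing_devices` and the one-device-per-line format are assumptions. Note that this helper would not flag an extra device that appears only in the current listing.

```python
from typing import List


def missing_devices(baseline: str, current: str) -> List[str]:
    """Return baseline device lines that are absent from the current boot."""
    seen = {line.strip() for line in current.splitlines() if line.strip()}
    return [
        line.strip()
        for line in baseline.splitlines()
        if line.strip() and line.strip() not in seen
    ]
```

With a baseline of GPU1 and GPU2 and a current listing of only GPU1, the helper returns the GPU2 entry, matching the incomplete-startup case described above.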
Step 311: send startup-complete information to the controller, create a gpu.txt file and a server.txt file, and perform step 313;
Step 312: report an error message and end the current process;
Step 313: the controller receives the startup-complete information sent by the server and determines whether the value of the second counter count is less than the detection threshold; if so, perform step 314; otherwise, perform step 315;
For example, with the detection threshold set to 1000 and the second counter count counting server boots, the value of the second counter count is less than 1000 as long as the server has booted fewer than 1000 times.
Step 314: detect whether the gpu.txt file and the server.txt file exist on the server; if so, perform step 316; otherwise, perform step 317;
In this step, it is first necessary to check whether the server is connected to the switch, that is, whether the controller can reach the server through the switch. As mentioned above, the server creates the gpu.txt file and the server.txt file once its startup completes, so the controller detects whether these two files exist on the server to further confirm that the server has finished starting.
Step 315: shut down the server, exit control of the server, and end the current process;
Step 316: call the server's shutdown function to shut the server down, and perform step 304;
In this step, the second counter count is incremented by 1. For a non-hot-pluggable server such as a Pcie-Switch server, the server end and then the resource server can be shut down in order, following the sequence set above.
Step 317: wait for a certain time and return to step 313.
If the gpu.txt file and the server.txt file are not detected, the server may not have finished starting. The controller can wait for a certain time, for example 5 s, and then again receive the startup-complete information sent by the server.
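The delay-and-retry check can be sketched in Python as follows. Several details are assumptions for illustration: the helper name `wait_for_markers`, treating the server's files as visible under a local directory, and the `max_polls` bound, which the text does not specify (it only gives 5 s as an example delay).

```python
import time
from pathlib import Path
from typing import Callable


def wait_for_markers(
    directory: Path,
    delay_s: float = 5.0,
    max_polls: int = 10,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until gpu.txt and server.txt both exist, or give up."""
    markers = [directory / "gpu.txt", directory / "server.txt"]
    for attempt in range(max_polls):
        if all(marker.exists() for marker in markers):
            return True
        # Wait before the next poll, but not after the final attempt.
        if attempt < max_polls - 1:
            sleep(delay_s)
    return False
```

Bounding the number of polls is a design choice of this sketch: it lets a test run report a server that never finishes starting instead of waiting indefinitely.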
As shown in Fig. 5, an embodiment of the present invention provides a server applied to any of the server reboot testing methods above. The server communicates with an external controller through an external switch and comprises a switch unit 501, a first judging unit 502, a read-write unit 503, and a generation unit 504, wherein:
the switch unit 501 is configured to boot and trigger the first judging unit 502 when a power-on request sent by the external controller is received, and to shut down when a shutdown request sent by the external controller is received;
the first judging unit 502 is configured, when triggered by the switch unit 501, to determine whether an Ispci-tmp file exists; if so, to trigger the read-write unit 503 and compare whether the Ispci-tmp file and the Ispci-$count file are consistent; otherwise, to trigger the generation unit 504;
the read-write unit 503 is configured to read the device information in the server, write it to the Ispci-$count file and, when the Ispci-tmp file is consistent with the Ispci-$count file, send startup-complete information to the external controller and create the gpu.txt file and the server.txt file;
the generation unit 504 is configured to generate the Ispci-tmp file from the information of each device.
In another embodiment of the present invention, the server is connected to the external switch through the OS network and the BMC network.
In yet another embodiment of the present invention, the server further comprises a second judging unit and a first counter (not shown), wherein:
the second judging unit is configured to determine whether a count file exists; if so, to trigger the first counter; otherwise, to start the first counter, generate the count file, and write the first counter into the server's startup items;
the first counter is configured to count the number of boots of the switch unit 501: each time the switch unit 501 boots, it increments count by 1 and stores the boot count to the count file.
In another embodiment of the present invention, the server is a Pcie-Switch server comprising a resource server and a server end, wherein a retimer card is inserted in the server end, and the server end is connected to the resource server through the retimer card and a MiniSAS HD cable.
As shown in Fig. 6, an embodiment of the present invention provides a controller applied to any of the server reboot testing methods above. The controller comprises a setting unit 601, a second counter 602, a detection unit 603, and a call control unit 604, wherein:
the setting unit 601 is configured to set the detection threshold;
the detection unit 603 is configured to determine whether the value of the second counter 602 is less than the detection threshold set by the setting unit 601; if so, to detect whether the gpu.txt file and the server.txt file exist on the external server and, if they do, to trigger the call control unit 604;
the call control unit 604 is configured, when triggered by the detection unit 603, to call the shutdown function of the external server to shut it down, send a power-on request to the external server, call the power-on function of the external server to boot it, and increment the value of the second counter 602 by 1.
The information exchange and execution processes among the units of the above devices are based on the same concept as the method embodiments of the present invention. For details, refer to the description of the method embodiments, which is not repeated here.
As shown in Fig. 7, an embodiment of the present invention provides a server reboot testing system comprising at least one of any of the servers 701 above, a switch 702, and a controller 703, wherein:
the at least one server 701 and the controller 703 are each connected to the switch 702.
According to such scheme, server reboot method of testing that various embodiments of the present invention are provided, clothes Business device, controller and system, at least have the advantages that
1., by switch, set up the intercommunication of server and controller;When server receives control During the power on request that device processed sends, boot up startup;Server judges whether itself exists Ispci-tmp File, if it is, the facility information read in server, writes Ispci-$ count by this facility information File, and it is the most consistent with Ispci-$ count file to compare Ispci-tmp file, otherwise, for server In facility information, generate Ispci-tmp file;As described Ispci-tmp file and described Ispci-$ count When file is consistent, sends startup and complete information to controller;Create gpu.txt file and server.txt literary composition Part;Receive the shutdown request that controller sends, carry out power-off operation, with it, pass through server Judge whether file exists, and the concordance between documents, startup of server can judged the most just Often, it addition, the startup of server and shutdown all can be carried out under the control of the controller automatically, and without Artificial participation is come in, it is achieved that the automatization of server stability test.
2. The server is connected to the switch through an OS network and a BMC network, and the controller is connected to the switch through the OS network. This enables the controller to power the server on and off automatically, ensuring that the server stability test is automated. In addition, a startup sequence is set, and the resource server and the server end of the Pcie-Switch server are started in order according to that sequence, so that a Pcie-Switch server that does not support hot plugging can also undergo automated stability testing.
3. After the server has finished starting up, it sends startup-complete information to the controller and creates a gpu.txt file and a server.txt file. After receiving the startup-complete information, the controller still checks whether the gpu.txt file and the server.txt file exist in the server, which ensures that server startup is judged accurately.
4. A detection threshold is set, and the controller calls the power-on and shutdown functions to power the server on and off only while the count of the second counter count in the controller is less than the detection threshold, which prevents the stability test from entering an endless loop. In addition, the controller clears the operating system log in the server, which prevents operations other than powering on and off from affecting the server's stability and further improves the accuracy of the server stability test.
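The server-side check summarised in effect 1 can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function names, the `workdir` layout, and the `read_device_info` callable are hypothetical, and the device information is passed in as text rather than collected by a specific tool, since the source does not say how it is read. The counter handling follows the count file described in the claims (start the counter, add 1, store it).

```python
from pathlib import Path
from typing import Callable


def next_boot_count(workdir: Path) -> int:
    """Add 1 to the boot counter kept in the `count` file and return it."""
    count_file = workdir / "count"
    # A missing count file means this is the first boot, so start from 0.
    count = int(count_file.read_text()) if count_file.exists() else 0
    count += 1
    count_file.write_text(str(count))
    return count


def check_devices(workdir: Path, count: int,
                  read_device_info: Callable[[], str]) -> str:
    """Compare the current device information with the Ispci-tmp baseline.

    Returns "baseline" when Ispci-tmp had to be generated, otherwise
    "match" or "mismatch".
    """
    baseline = workdir / "Ispci-tmp"
    info = read_device_info()
    if not baseline.exists():
        # No baseline yet: generate Ispci-tmp from the device information.
        baseline.write_text(info)
        return "baseline"
    # Baseline exists: write this boot's snapshot and compare the two files.
    snapshot = workdir / f"Ispci-{count}"
    snapshot.write_text(info)
    same = snapshot.read_text() == baseline.read_text()
    return "match" if same else "mismatch"


def on_boot(workdir: Path, read_device_info: Callable[[], str]) -> str:
    """Run the counter update and the device check for one boot."""
    count = next_boot_count(workdir)
    return check_devices(workdir, count, read_device_info)
```

On a "match" result the server would go on to send the startup-complete information and create gpu.txt and server.txt. What should happen after a "baseline" or "mismatch" result is not spelled out in the source, so the sketch only reports the status.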
It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise", or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to that process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.
A person of ordinary skill in the art will understand that all or some of the steps of the above method embodiments may be implemented by hardware under the control of program instructions. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes various media capable of storing program code, such as a ROM, a RAM, a magnetic disk, or an optical disc.
Finally, it should be noted that the above are merely preferred embodiments of the present invention, intended only to illustrate its technical solutions and not to limit its scope of protection. Any modification, equivalent substitution, improvement, and the like made within the spirit and principles of the present invention falls within the scope of protection of the present invention.

Claims (10)

1. A server reboot test method, characterized in that intercommunication between a server and a controller is established through a switch, the method further comprising:
when the server receives a power-on request sent by the controller, starting up;
the server determining whether an Ispci-tmp file exists on it; if so, reading the device information in the server, writing the device information to an Ispci-$count file, and comparing whether the Ispci-tmp file and the Ispci-$count file are consistent; otherwise, generating an Ispci-tmp file from the device information in the server;
when the Ispci-tmp file is consistent with the Ispci-$count file, sending startup-complete information to the controller;
creating a gpu.txt file and a server.txt file;
receiving a shutdown request sent by the controller, and shutting down.
2. The method according to claim 1, characterized in that
establishing intercommunication between the server and the controller through the switch comprises:
the server being connected to the switch through an OS network and a BMC network;
the controller being connected to the switch through the OS network;
and/or,
the method further comprises: setting a first counter count in the server;
after the startup, the method further comprises: the server determining whether a count file exists on it; if so, incrementing the first counter count by 1 and storing it to the count file; otherwise, starting the first counter count, incrementing the first counter count by 1, generating a count file, and writing the first counter count into the server's startup items.
3. The method according to claim 1 or 2, characterized in that
the server is a Pcie-Switch server comprising a resource server and a server end, wherein a retimer card is inserted in the server end, and the server end is connected to the resource server through the retimer card and a Mini-SAS HD cable;
the method further comprises: setting a startup sequence;
the startup comprises: starting the resource server and the server end in order according to the set startup sequence.
4. A server reboot test method, characterized in that it is applied to a controller, a second counter count and a detection threshold being set in the controller, the method further comprising:
M1: the controller initializing the second counter count;
M2: receiving startup-complete information sent by a server, and determining whether the count of the second counter count is less than the detection threshold; if so, detecting whether a gpu.txt file and a server.txt file exist in the server; if so, calling the shutdown function of the server to shut the server down;
M3: sending a power-on request to the server, calling the power-on function of the server to start the server up, incrementing the second counter count by 1, and performing M2.
5. The method according to claim 4, characterized by further comprising: clearing the operating system log in the server.
6. The method according to claim 4 or 5, characterized in that the server is a Pcie-Switch server comprising a resource server and a server end, wherein a retimer card is inserted in the server end, and the server end is connected to the resource server through the retimer card and a Mini-SAS HD cable;
shutting the server down comprises: shutting down the server end and then the resource server, in that order;
starting the server up comprises: starting up the resource server and then the server end, in that order.
7. A server applied to the server reboot test method of any one of claims 1 to 3, which intercommunicates with an external controller through an external switch, characterized by comprising: a power switch unit, a first judging unit, a read/write unit, and a generation unit, wherein
the power switch unit is configured to start up when a power-on request sent by the external controller is received and to trigger the first judging unit, and to shut down when a shutdown request sent by the external controller is received;
the first judging unit is configured to, when triggered by the power switch unit, determine whether an Ispci-tmp file exists; if so, to trigger the read/write unit and compare whether the Ispci-tmp file and the Ispci-$count file are consistent; otherwise, to trigger the generation unit;
the read/write unit is configured to read the device information in the server, write the device information to the Ispci-$count file, and, when the Ispci-tmp file is consistent with the Ispci-$count file, send startup-complete information to the external controller and create a gpu.txt file and a server.txt file;
the generation unit is configured to generate the Ispci-tmp file from each item of device information.
8. The server according to claim 7, characterized in that
the server is connected to the external switch through an OS network and a BMC network;
and/or,
the server further comprises: a second judging unit and a first counter, wherein
the second judging unit is configured to determine whether a count file exists; if so, to trigger the first counter; otherwise, to start the first counter, generate a count file, and write the first counter into the server's startup items;
the first counter is configured to count the number of startups of the power switch unit: when the power switch unit starts up, it performs count+1 and stores the number of startups to the count file;
and/or,
the server is a Pcie-Switch server comprising a resource server and a server end, wherein a retimer card is inserted in the server end, and the server end is connected to the resource server through the retimer card and a Mini-SAS HD cable.
9. A controller applied to the server reboot test method of any one of claims 4 to 6, characterized by comprising: a setting unit, a second counter, a detection unit, and a call control unit, wherein
the setting unit is configured to set a detection threshold;
the detection unit is configured to determine whether the count of the second counter is less than the detection threshold set by the setting unit; if so, to detect whether a gpu.txt file and a server.txt file exist in the external server; if so, to trigger the call control unit;
the call control unit is configured to, when triggered by the detection unit, call the shutdown function of the external server to shut the external server down, send a power-on request to the external server, call the power-on function of the external server to start it up, and increment the count of the second counter by 1.
10. A server reboot test system, characterized by comprising: at least one server according to claim 7 or 8, a switch, and the controller according to claim 9, wherein
the at least one server and the controller are each connected to the switch.
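The ordering constraint that the claims place on the Pcie-Switch server (resource server first and server end second on startup, the reverse on shutdown) can be illustrated with a short Python sketch. The node names in `BOOT_ORDER` and the `power_on`/`power_off` callables are hypothetical placeholders; the source does not specify how each node is actually powered.

```python
from typing import Callable, List, Sequence

# Startup order from the claims: resource server first, then the server end.
BOOT_ORDER = ("resource_server", "server_end")


def start_in_order(nodes: Sequence[str],
                   power_on: Callable[[str], None]) -> None:
    """Power nodes on one after another in the configured order."""
    for node in nodes:
        power_on(node)


def stop_in_order(nodes: Sequence[str],
                  power_off: Callable[[str], None]) -> None:
    """Power nodes off in the reverse of the startup order."""
    for node in reversed(nodes):
        power_off(node)


# Record the calls instead of touching real hardware.
events: List[str] = []
start_in_order(BOOT_ORDER, lambda n: events.append(f"on:{n}"))
stop_in_order(BOOT_ORDER, lambda n: events.append(f"off:{n}"))
```

Keeping the order in a single tuple means the shutdown order is always derived from the startup order, which is one simple way to keep the two consistent.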
CN201610202489.XA 2016-04-01 2016-04-01 Reboot testing method of server, server, control device and system Pending CN105912431A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610202489.XA CN105912431A (en) 2016-04-01 2016-04-01 Reboot testing method of server, server, control device and system

Publications (1)

Publication Number Publication Date
CN105912431A true CN105912431A (en) 2016-08-31

Family

ID=56745210

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610202489.XA Pending CN105912431A (en) 2016-04-01 2016-04-01 Reboot testing method of server, server, control device and system

Country Status (1)

Country Link
CN (1) CN105912431A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106649014A (en) * 2016-12-28 2017-05-10 郑州云海信息技术有限公司 Automatic testing method of calculating type server which supports multiple GPUs
CN108958995A (en) * 2018-05-21 2018-12-07 郑州云海信息技术有限公司 A kind of method and system of whole machine cabinet server stability test
CN116244113A (en) * 2023-02-22 2023-06-09 安芯网盾(北京)科技有限公司 System downtime obstacle avoidance and restoration method and device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7818621B2 (en) * 2007-01-11 2010-10-19 International Business Machines Corporation Data center boot order control
CN104375910A (en) * 2014-11-24 2015-02-25 浪潮电子信息产业股份有限公司 Automatic power-on and power-off test method
CN104536875A (en) * 2015-01-16 2015-04-22 浪潮电子信息产业股份有限公司 IPMI-based method for carrying out automatic restart test on server
CN104899120A (en) * 2015-05-27 2015-09-09 浪潮电子信息产业股份有限公司 Server stability testing method based on BMC (baseboard management controller) startup and shutdown functions

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106649014A (en) * 2016-12-28 2017-05-10 郑州云海信息技术有限公司 Automatic testing method of calculating type server which supports multiple GPUs
CN108958995A (en) * 2018-05-21 2018-12-07 郑州云海信息技术有限公司 A kind of method and system of whole machine cabinet server stability test
CN116244113A (en) * 2023-02-22 2023-06-09 安芯网盾(北京)科技有限公司 System downtime obstacle avoidance and restoration method and device
CN116244113B (en) * 2023-02-22 2023-12-19 安芯网盾(北京)科技有限公司 System downtime obstacle avoidance and restoration method and device

Similar Documents

Publication Publication Date Title
CN107193750B (en) Script recording method and device
WO2018120721A1 (en) Method and system for testing user interface, electronic device, and computer readable storage medium
CN109510742B (en) Server network card remote test method, device, terminal and storage medium
CN110162435B (en) Method, system, terminal and storage medium for starting and testing PXE of server
CN105787364B (en) Automatic testing method, device and system for tasks
CN104391765A (en) Method for automatically diagnosing starting fault of server
CN103186740A (en) Automatic detection method for Android malicious software
CN110704304A (en) Application program testing method and device, storage medium and server
CN111143150A (en) Method and system for testing PCBA (printed circuit board assembly), testing equipment and micro-control unit
CN110557299A (en) network transmission function batch test method, system, terminal and storage medium
CN111209151A (en) Linux-based NVME SSD hot plug test method, system, terminal and storage medium
CN105512562B (en) Vulnerability mining method and device and electronic equipment
CN111258913A (en) Automatic algorithm testing method and device, computer system and readable storage medium
CN114546738A (en) Server general test method, system, terminal and storage medium
CN105912431A (en) Reboot testing method of server, server, control device and system
CN108446224B (en) Performance analysis method of application program on mobile terminal and storage medium
CN110515755A (en) Interface function verification method, device, equipment and computer readable storage medium
CN107679423A (en) Partition integrity inspection method and device
CN108897646B (en) Switching method of BIOS (basic input output System) chips and substrate management controller
CN109656761A (en) Server HT automatic test approach and system based on Linux system
CN106201787A (en) Terminal control method and device
CN111176924B (en) GPU card dropping simulation method, system, terminal and storage medium
CN110083493A (en) A kind of embedded system failure self-recovery method, terminal device and storage medium
CN112115060A (en) Audio test method and system based on terminal
CN112596750B (en) Application testing method and device, electronic equipment and computer readable storage medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20160831