CN111752773A - Method and device for realizing power-on self-check verification of clustered system - Google Patents

Info

Publication number
CN111752773A
Authority
CN
China
Prior art keywords
bmc
power
tested
server
verification
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010429642.9A
Other languages
Chinese (zh)
Other versions
CN111752773B (en
Inventor
秦秀凡
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN202010429642.9A
Publication of CN111752773A
Application granted
Publication of CN111752773B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/22 Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing
    • G06F11/2247 Verification or detection of system hardware configuration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/22 Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing
    • G06F11/2284 Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing by power-on test, e.g. power-on self test [POST]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/16 Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs
    • G06F15/161 Computing infrastructure, e.g. computer clusters, blade chassis or hardware partitioning

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Test And Diagnosis Of Digital Computers (AREA)
  • Power Sources (AREA)

Abstract

The invention provides a method for implementing power-on self-check verification in a clustered system, comprising the following steps: the BMC of a control end server establishes communication connections with the BMCs of a plurality of servers to be tested in a cluster; the BMC of the control end server acquires the BMC information of each server to be tested and, according to that information, calls the corresponding installation template to install the server to be tested. The invention also provides a device for implementing power-on self-check verification in a clustered system. The invention addresses the low efficiency and incomplete scenario coverage caused by verifying one server at a time, is applicable to clustered server systems, and improves the efficiency and comprehensiveness of power-on self-check verification at server startup.

Description

Method and device for realizing power-on self-check verification of clustered system
Technical Field
The invention relates to the field of PCB system design, in particular to a method and a device for realizing power-on self-check verification of a clustered system.
Background
The power-on startup strategy is the basis on which the BMC decides whether to start the system after power is applied. Verifying the strategy under each of its different conditions yields a conclusion on whether the power-on startup strategy of the machine to be tested is sound and usable. The system power-on self-check is a protective check performed while the system runs and while hardware and software interact. It plays an important role in keeping the system healthy, resolving conflicts between software and the system, and protecting the normal operation of hardware, so it is a necessary system maintenance measure.
In the prior art, the power-on self-test of a computer system mainly tests devices such as the CPU, the system motherboard, base memory, extended memory, and the system ROM BIOS after the power supply is turned on. If the self-test finds an error, it is handled in one of two ways: for a serious (fatal) fault the system halts, and no prompt or signal can be given because the various initialization operations have not yet completed; for a non-serious fault a prompt or audible alarm is given and the system waits for the user to act. However, the existing verification approach is manual verification of one server at a time. It is slow, cannot cover a clustered system, covers only a limited set of scenarios, cannot be operated remotely, and does not help improve the efficiency of power-on self-check verification of a server system.
Disclosure of Invention
The invention aims to solve the above problems in the prior art and provides a method and a device for implementing power-on self-check verification in a clustered system. It addresses the low efficiency and incomplete scenario coverage caused by verifying one server at a time, can be applied to clustered server systems, and improves the efficiency and comprehensiveness of power-on self-check verification at server startup.
The first aspect of the present invention provides a method for implementing power-on self-check verification in a clustered system, including:
the BMC of a control end server establishes communication connections with the BMCs of a plurality of servers to be tested in a cluster;
the BMC of the control end server acquires the BMC information of each server to be tested, and calls the corresponding installation template according to that BMC information to install the server to be tested;
the control end server sets a power-on startup strategy and, according to the set strategy, controls the BMCs of the servers to be tested in batches to perform cyclic power-on startup strategy verification, wherein the power-on startup strategy verification includes the system power-on self-check.
Optionally, the BMC information of a server to be tested includes the IP information of the BMC, login password information, and user name information.
Optionally, the installation templates are preset in the control end server and correspond one-to-one to the models and functions of the servers to be tested.
Optionally, the step in which the control end server sets a power-on startup strategy and controls the BMCs of the servers to be tested in batches to perform cyclic power-on startup strategy verification according to the set strategy, the verification including the system power-on self-check, specifically comprises:
the control end server defines all states of the power-on startup strategy, sets the power-on startup strategy, and sends the set strategy to the servers to be tested in batches; the BMC (baseboard management controller) of each server to be tested receives the power-on startup strategy sent by the control end server;
according to the set power-on startup strategy, the control end server sends a power-off shutdown instruction and then a power-on startup instruction to the BMCs of the servers to be tested in batches;
after receiving the power-off shutdown instruction and the power-on startup instruction sent by the control end server, the BMCs of the servers to be tested control the servers to be tested, in batches, to power off and shut down and then power on and start up in sequence;
the BMC of the server to be tested verifies whether it has restarted successfully; if the restart is successful, the next step is carried out; if the restart is unsuccessful, then after a first time period it is determined whether the cumulative BMC verification time period is smaller than a second time period; if it is smaller, the next step is carried out; if it is not smaller, the BMC restart verification continues;
and the BMC of the server to be tested verifies whether the system IP can be pinged; if so, the next step is carried out; if not, then after a third time period it is determined whether the cumulative system IP verification time period is smaller than a fourth time period; if it is smaller, the next step is carried out; if it is not smaller, the system IP verification continues.
Further, the method also includes: the BMC of the server to be tested monitors the log of the cyclic power-on startup strategy verification process and sends the verification result to the BMC of the control end server.
Optionally, all states of the power-on startup strategy include: the system startup state, the system keep-shutdown state, and the last state.
The second aspect of the present invention provides an apparatus for implementing power-on self-check verification in a clustered system, including:
a communication connection module, used for establishing communication connections between the BMC of the control end server and the BMCs of a plurality of servers to be tested in the cluster;
an acquisition and installation module, used for the BMC of the control end server to acquire the BMC information of each server to be tested and to call the corresponding installation template according to that BMC information to install the server to be tested;
and a verification module, used for the control end server to set a power-on startup strategy and, according to the set strategy, to control the BMCs of the servers to be tested in batches to perform cyclic power-on startup strategy verification, wherein the power-on startup strategy verification includes the system power-on self-check.
Optionally, the verification module specifically includes:
a definition and setting submodule, used for the control end server to define all states of the power-on startup strategy, set the power-on startup strategy, and send the set strategy to the servers to be tested in batches, the BMC (baseboard management controller) of each server to be tested receiving the power-on startup strategy sent by the control end server;
an instruction sending submodule, used for the control end server to send a power-off shutdown instruction and then a power-on startup instruction to the BMCs of the servers to be tested in batches according to the set power-on startup strategy;
an instruction execution submodule, used for the BMCs of the servers to be tested, after receiving the power-off shutdown instruction and the power-on startup instruction sent by the control end server, to control the servers to be tested in batches to power off and shut down and then power on and start up in sequence;
a BMC restart verification submodule, used for the BMC of the server to be tested to verify whether it has restarted successfully; if the restart is successful, the next step is carried out; if the restart is unsuccessful, then after a first time period it is determined whether the cumulative BMC verification time period is smaller than a second time period; if it is smaller, the next step is carried out; if it is not smaller, the BMC restart verification continues;
and a system IP verification submodule, used for the BMC of the server to be tested to verify whether the system IP can be pinged; if so, the next step is carried out; if not, then after a third time period it is determined whether the cumulative system IP verification time period is smaller than a fourth time period; if it is smaller, the next step is carried out; if it is not smaller, the system IP verification continues.
Further, the verification module also includes: a log monitoring and result sending submodule, used for the BMC of the server to be tested to monitor the log of the cyclic power-on startup strategy verification process and send the verification result to the BMC of the control end server.
The technical scheme adopted by the invention comprises the following technical effects:
1. The method and the device effectively solve the problems of low efficiency and incomplete scenario coverage caused by verifying one server at a time, can be applied to a clustered server system, and effectively improve the efficiency and comprehensiveness of power-on self-check verification at server startup.
2. The method and the device can verify all the states of the power-on startup strategy at one time and verify all the strategy states of the power-on startup strategy for multiple times in a circulating manner, so that manual operation is reduced, and the coverage scene is comprehensive.
3. The installation template is arranged in the control end server in advance, and the installation template corresponds to the machine type and the function of the server to be tested one by one, so that the installation requirements of servers with different machine types and different functions are met.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in describing them are briefly introduced below. Those skilled in the art can obtain other drawings from these drawings without creative effort.
FIG. 1 is a schematic flow chart of the method according to the first embodiment of the present invention;
FIG. 2 is a schematic flow chart of step S3 in the method according to the first embodiment of the present invention;
FIG. 3 is another schematic flow chart of step S3 in the method according to the first embodiment of the present invention;
FIG. 4 is a schematic structural diagram of the apparatus according to the second embodiment of the present invention;
FIG. 5 is a schematic structural diagram of the verification module in the apparatus according to the second embodiment of the present invention;
FIG. 6 is another schematic structural diagram of the verification module in the apparatus according to the second embodiment of the present invention.
Detailed Description
In order to clearly explain the technical features of the present invention, the following detailed description of the present invention is provided with reference to the accompanying drawings. The following disclosure provides many different embodiments, or examples, for implementing different features of the invention. To simplify the disclosure of the present invention, the components and arrangements of specific examples are described below. Furthermore, the present invention may repeat reference numerals and/or letters in the various examples. This repetition is for the purpose of simplicity and clarity and does not in itself dictate a relationship between the various embodiments and/or configurations discussed. It should be noted that the components illustrated in the figures are not necessarily drawn to scale. Descriptions of well-known components and processing techniques and procedures are omitted so as to not unnecessarily limit the invention.
Example one
As shown in FIG. 1, the present invention provides a method for implementing power-on self-check verification in a clustered system, including:
S1, the BMC of the control end server establishes communication connections with the BMCs of a plurality of servers to be tested in the cluster;
S2, the BMC of the control end server acquires the BMC information of each server to be tested, and calls the corresponding installation template according to that BMC information to install the server to be tested;
S3, the control end server sets a power-on startup strategy and, according to the set strategy, controls the BMCs of the servers to be tested in batches to perform cyclic power-on startup strategy verification, wherein the power-on startup strategy verification includes the system power-on self-check.
In step S1, the BMC of the control end server and the BMCs of the plurality of servers to be tested in the cluster may establish communication connections over a network or in other ways, which is not limited here.
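By way of illustration only, the batch handling of several BMCs can be sketched as a concurrent fan-out. The source does not specify how connections are made, so the function name `run_in_batch`, the use of a thread pool, and the `action` callable (standing in for whatever connection or command is applied to one BMC address) are all assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def run_in_batch(hosts: List[str], action: Callable[[str], bool],
                 max_workers: int = 8) -> Dict[str, bool]:
    """Apply `action` to every BMC address concurrently and collect the outcomes.

    `action` is a hypothetical per-host operation (for example a connection
    attempt) that returns True on success.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the order of `hosts`, so zip pairs them correctly.
        outcomes = list(pool.map(action, hosts))
    return dict(zip(hosts, outcomes))
```

The same helper could be reused for each batch operation of step S3 by passing a different `action`.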
In step S2, the BMC information of a server to be tested includes the IP information of the BMC, login password information, and user name information. The installation templates are preset in the control end server and correspond one-to-one to the models and functions of the servers to be tested. Servers of different models and functions have different requirements for hard disks or applications. The BMC information of the servers to be tested is acquired through the BIOS of the control end server, and a different installation template is called for server equipment meeting different conditions (model and function), so that the installation requirements of servers of different models and functions are met.
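As a minimal sketch of the one-to-one correspondence between installation templates and (model, function) pairs, a lookup table may be used. The record fields, the model and function names, and the template paths below are hypothetical and serve only to show the lookup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerUnderTest:
    # BMC information named in step S2.
    bmc_ip: str
    username: str
    password: str
    # Model and function are assumed to be known for each server.
    model: str
    function: str


# Hypothetical preset templates, one per (model, function) pair.
INSTALL_TEMPLATES = {
    ("model-a", "storage"): "templates/model-a-storage.cfg",
    ("model-a", "compute"): "templates/model-a-compute.cfg",
    ("model-b", "compute"): "templates/model-b-compute.cfg",
}


def select_template(server: ServerUnderTest) -> str:
    """Return the preset template matching the server's model and function."""
    key = (server.model, server.function)
    if key not in INSTALL_TEMPLATES:
        raise LookupError(f"no installation template for {server.model}/{server.function}")
    return INSTALL_TEMPLATES[key]
```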
The BMC of the server to be tested monitors the installation log in real time and feeds back any error information in the log, so that installation errors do not affect later processes such as driver installation and power-on self-check verification.
As shown in FIG. 2, step S3 specifically includes:
S31, the control end server defines all states of the power-on startup strategy, sets the power-on startup strategy, and sends the set strategy to the servers to be tested in batches; the BMC of each server to be tested receives the power-on startup strategy sent by the control end server;
S32, according to the set power-on startup strategy, the control end server sends a power-off shutdown instruction and then a power-on startup instruction to the BMCs of the servers to be tested in batches;
S33, after receiving the power-off shutdown instruction and the power-on startup instruction sent by the control end server, the BMCs of the servers to be tested control the servers to be tested, in batches, to power off and shut down and then power on and start up in sequence;
S34, the BMC of the server to be tested verifies whether it has restarted successfully; if yes, step S35 is executed; if no, step S36 is executed;
S35, the BMC of the server to be tested verifies whether the system IP can be pinged; if yes, step S37 is executed; if no, step S38 is executed;
S36, after a first time period, it is determined whether the cumulative BMC verification time period is smaller than a second time period; if yes, step S35 is executed; if no, step S34 continues to be executed;
S37, the power-on startup strategy of the server to be tested is verified;
S38, after a third time period, it is determined whether the cumulative system IP verification time period is smaller than a fourth time period; if yes, step S37 is executed; if no, step S35 continues to be executed.
In step S31, all states of the power-on startup strategy include: the system startup state, the system shutdown state, and the last system state. All states of the power-on startup strategy can be defined as an array.
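For illustration, the array definition of the three states and a cyclic verification schedule over them might look as follows. The identifiers and the string values `always-on`, `always-off`, and `previous` are assumptions; they are chosen to match common IPMI power restore policy names and do not come from the source.

```python
from enum import Enum
from typing import List


class PowerPolicy(Enum):
    ALWAYS_ON = "always-on"    # system startup state
    ALWAYS_OFF = "always-off"  # system shutdown state
    PREVIOUS = "previous"      # last system state


# All states of the power-on startup strategy, defined as an array.
ALL_POLICIES: List[PowerPolicy] = [
    PowerPolicy.ALWAYS_ON,
    PowerPolicy.ALWAYS_OFF,
    PowerPolicy.PREVIOUS,
]


def policy_schedule(rounds: int) -> List[PowerPolicy]:
    """Repeat the full list of states `rounds` times for cyclic verification."""
    return [policy for _ in range(rounds) for policy in ALL_POLICIES]
```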
In steps S32-S33, the power-off shutdown instruction and the power-on startup instruction sent by the control end server may be implemented with IPMI commands.
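One possible way to issue such IPMI commands is the `ipmitool` command-line utility, sketched below. The source names only IPMI, so the choice of `ipmitool`, the `lanplus` interface, and setting the strategy through `chassis policy` are assumptions, as are the function names. The `runner` parameter lets the commands be inspected without contacting a real BMC.

```python
import subprocess
from typing import Callable, List


def ipmitool_argv(host: str, user: str, password: str, *command: str) -> List[str]:
    """Build an ipmitool command line for one remote BMC (IPMI over LAN)."""
    return ["ipmitool", "-I", "lanplus", "-H", host, "-U", user, "-P", password, *command]


def power_cycle_commands(host: str, user: str, password: str, policy: str) -> List[List[str]]:
    """Commands for one verification round: set policy, power off, power on."""
    return [
        ipmitool_argv(host, user, password, "chassis", "policy", policy),
        ipmitool_argv(host, user, password, "chassis", "power", "off"),
        ipmitool_argv(host, user, password, "chassis", "power", "on"),
    ]


def run_all(commands: List[List[str]], runner: Callable = subprocess.run) -> None:
    """Run each command in order, raising if any command exits with an error."""
    for argv in commands:
        runner(argv, check=True)
```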
In step S34, the BMC of the server to be tested may verify whether the restart succeeded by checking the BMC ping status: if the BMC can be pinged, the BMC restarted successfully; if the BMC cannot be pinged, the BMC restart was unsuccessful.
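A single ping check of this kind can be sketched as below. The sketch assumes a Linux `ping` binary, where `-c 1` sends one echo request and `-W 2` waits up to two seconds for a reply; the function name and the injectable `runner` are illustrative.

```python
import subprocess
from typing import Callable


def is_pingable(host: str, runner: Callable = subprocess.run) -> bool:
    """Return True if one ICMP echo request to `host` is answered."""
    result = runner(
        ["ping", "-c", "1", "-W", "2", host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    # ping exits with status 0 only when a reply was received.
    return result.returncode == 0
```

The same check applies to the system IP in step S35 by passing the system address instead of the BMC address.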
In step S35, the check of whether the system IP can be pinged corresponds to the system power-on self-check: if the system IP can be pinged, the system power-on self-check has passed; if the system IP cannot be pinged, the system power-on self-check has not passed and cyclic verification is required.
In step S36, after a first time period, it is determined whether the cumulative BMC verification time period is smaller than a second time period. The first time period may be 30 s, or may be adjusted flexibly according to the actual situation; that is, the cumulative BMC verification time period is checked once every 30 s. The second time period may be 10 min, or may be adjusted according to the actual situation; that is, within the second time period, if the BMC verification fails, cyclic verification is performed.
In step S37, the power-on startup strategy verification of the server to be tested succeeds only if the BMC restart is verified successfully within the second time period and the system IP is pinged successfully within the fourth time period. The verification fails in each of the other three cases: the BMC restart verification fails within the second time period and the system IP is not pinged successfully within the fourth time period; the BMC restart verification succeeds within the second time period but the system IP is not pinged successfully within the fourth time period; or the BMC restart verification fails within the second time period although the system IP is pinged successfully within the fourth time period.
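The four cases of step S37 reduce to a logical AND of the two checks, as the following sketch shows. The function names and the result strings are illustrative only.

```python
from itertools import product
from typing import Dict, Tuple


def strategy_verdict(bmc_restart_ok: bool, system_ip_ok: bool) -> str:
    """Verification succeeds only when both checks pass within their time limits."""
    return "success" if (bmc_restart_ok and system_ip_ok) else "failure"


def verdict_table() -> Dict[Tuple[bool, bool], str]:
    """Enumerate all four combinations of (BMC restart, system IP ping)."""
    return {(b, s): strategy_verdict(b, s) for b, s in product((True, False), repeat=2)}
```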
In step S38, after a third time period, it is determined whether the cumulative system IP verification time period is smaller than a fourth time period. The third time period may be 30 s, or may be adjusted flexibly according to the actual situation; that is, the cumulative system IP verification time period is checked once every 30 s. The fourth time period may be 10 min, or may be adjusted according to the actual situation; that is, within the fourth time period, if the system IP cannot be pinged, cyclic verification is performed.
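The retry behaviour of steps S36 and S38 can be sketched as one polling helper. The sketch follows the reading given in the explanatory paragraphs for those steps, namely that a failed check is repeated at a fixed interval (for example 30 s) until it succeeds or the overall limit (for example 10 min) is reached. This reading, the function name, and the injectable `sleep` and `clock` parameters are assumptions.

```python
import time
from typing import Callable


def wait_until(check: Callable[[], bool], interval_s: float, timeout_s: float,
               sleep: Callable[[float], None] = time.sleep,
               clock: Callable[[], float] = time.monotonic) -> bool:
    """Repeat `check` every `interval_s` seconds until it passes or time runs out.

    interval_s plays the role of the first/third time period and timeout_s
    the role of the second/fourth time period.
    """
    deadline = clock() + timeout_s
    while True:
        if check():
            return True
        # Stop when another wait would exceed the cumulative time limit.
        if clock() + interval_s > deadline:
            return False
        sleep(interval_s)
```

For example, `wait_until(bmc_check, 30, 600)` would cover the BMC restart verification and `wait_until(system_check, 30, 600)` the system IP verification, where both check functions are supplied by the caller.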
It should be noted that, if the control end server detects that the state of the server to be tested is the system shutdown state of the power-on startup strategy, the control end server resends the power-on startup instruction to make the server to be tested power on and start up.
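A minimal sketch of this resend rule follows, assuming the shutdown state is represented by the string `always-off` and that the caller supplies callables for reading the chassis power state and sending the power-on instruction. All names are hypothetical.

```python
from typing import Callable


def ensure_powered_on(policy: str, chassis_is_on: Callable[[], bool],
                      send_power_on: Callable[[], None]) -> bool:
    """Resend power-on when the shutdown-state policy left the server off.

    Returns True if an extra power-on instruction was sent.
    """
    if policy == "always-off" and not chassis_is_on():
        send_power_on()
        return True
    return False
```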
As shown in FIG. 3, step S3 further includes:
S39, the BMC of the server to be tested monitors the log of the cyclic power-on startup strategy verification process and sends the verification result to the BMC of the control end server.
The BMC of the control end server receives the verification results sent by the servers to be tested, can screen the information, and displays it on the control end server interface.
The BMC of the server to be tested monitors the verification log in real time and feeds back error information in the log, so that the control end server can monitor the power-on self-check verification of the servers to be tested in real time, make timely adjustments, and ensure that the power-on self-check verification completes successfully and efficiently.
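As one hedged example of how the control end server might screen the received results before display, the sketch below keeps only failed verifications and formats them as report lines. The source does not describe the result format, so the record fields and the report layout are assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VerificationResult:
    host: str        # BMC address of the server to be tested
    policy: str      # power-on startup strategy state that was verified
    passed: bool
    detail: str = "" # error information taken from the monitored log


def screen_failures(results: List[VerificationResult]) -> List[VerificationResult]:
    """Keep only the results that need attention."""
    return [r for r in results if not r.passed]


def format_report(results: List[VerificationResult]) -> List[str]:
    """Produce one display line per failed verification."""
    return [f"{r.host} [{r.policy}] FAIL: {r.detail}" for r in screen_failures(results)]
```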
The method and the system effectively solve the problems of low efficiency and incomplete scene caused by single verification of a single server, can be applied to a clustered server system, and effectively improve the efficiency and comprehensiveness of the power-on self-check verification of the server during startup.
The method and the device can verify all the states of the power-on startup strategy at one time and verify all the strategy states of the power-on startup strategy for multiple times in a circulating manner, so that manual operation is reduced, and the coverage scene is comprehensive.
Example two
As shown in FIG. 4, the technical solution of the present invention further provides an apparatus for implementing power-on self-check verification in a clustered system, including:
a communication connection module 11, used for establishing communication connections between the BMC of the control end server and the BMCs of a plurality of servers to be tested in the cluster;
an acquisition and installation module 12, used for the BMC of the control end server to acquire the BMC information of each server to be tested and to call the corresponding installation template according to that BMC information to install the server to be tested;
and a verification module 13, used for the control end server to set a power-on startup strategy and, according to the set strategy, to control the BMCs of the servers to be tested in batches to perform cyclic power-on startup strategy verification, wherein the power-on startup strategy verification includes the system power-on self-check.
Further, as shown in FIG. 5, the verification module 13 specifically includes:
a definition and setting submodule 131, used for the control end server to define all states of the power-on startup strategy, set the power-on startup strategy, and send the set strategy to the servers to be tested in batches, the BMC of each server to be tested receiving the power-on startup strategy sent by the control end server;
an instruction sending submodule 132, used for the control end server to send a power-off shutdown instruction and then a power-on startup instruction to the BMCs of the servers to be tested in batches according to the set power-on startup strategy;
an instruction execution submodule 133, used for the BMCs of the servers to be tested, after receiving the power-off shutdown instruction and the power-on startup instruction sent by the control end server, to control the servers to be tested in batches to power off and shut down and then power on and start up in sequence;
a BMC restart verification submodule 134, used for the BMC of the server to be tested to verify whether it has restarted successfully; if the restart is successful, the next step is carried out; if the restart is unsuccessful, then after a first time period it is determined whether the cumulative BMC verification time period is smaller than a second time period; if it is smaller, the next step is carried out; if it is not smaller, the BMC restart verification continues;
and a system IP verification submodule 135, used for the BMC of the server to be tested to verify whether the system IP can be pinged; if so, the next step is carried out; if not, then after a third time period it is determined whether the cumulative system IP verification time period is smaller than a fourth time period; if it is smaller, the next step is carried out; if it is not smaller, the system IP verification continues.
Further, as shown in FIG. 6, the verification module 13 further includes: a log monitoring and result sending submodule 136, used for the BMC of the server to be tested to monitor the log of the cyclic power-on startup strategy verification process and send the verification result to the BMC of the control end server.
The BMC of the control end server receives the verification results sent by the servers to be tested, can screen the information, and displays it on the control end server interface.
The BMC of the server to be tested monitors the verification log in real time and feeds back error information in the log, so that the control end server can monitor the power-on self-check verification of the servers to be tested in real time, make timely adjustments, and ensure that the power-on self-check verification completes successfully and efficiently.
The method and the system effectively solve the problems of low efficiency and incomplete scene caused by single verification of a single server, can be applied to a clustered server system, and effectively improve the efficiency and comprehensiveness of the power-on self-check verification of the server during startup.
The method and the device can verify all the states of the power-on startup strategy at one time and verify all the strategy states of the power-on startup strategy for multiple times in a circulating manner, so that manual operation is reduced, and the coverage scene is comprehensive.
Although the embodiments of the present invention have been described with reference to the accompanying drawings, it is not intended to limit the scope of the present invention, and it should be understood by those skilled in the art that various modifications and variations can be made without inventive efforts by those skilled in the art based on the technical solution of the present invention.

Claims (9)

1. A method for realizing power-on self-check verification of a clustered system is characterized by comprising the following steps:
the method comprises the steps that a control end server BMC establishes communication connection with a plurality of end-to-be-tested servers BMC in a clustering manner;
the control end server BMC acquires BMC information of the end server to be tested, and calls a corresponding installation template according to the BMC information of the end server to be tested to install the end server to be tested;
the control end server sets a power-on startup strategy and controls the BMC of the end to be tested to carry out circular power-on startup strategy verification according to the set power-on startup strategy in batch, wherein the power-on startup strategy verification comprises system power-on self-test.
2. The method as claimed in claim 1, wherein the BMC information of the end-to-be-tested server includes IP information of BMC, login password information, and user name information.
3. The method for implementing power-on self-check verification of the clustered system as claimed in claim 1, wherein the installation template is pre-arranged in the control end server and corresponds to the model and function of the end server to be tested one by one.
4. The method as claimed in claim 1, wherein the step in which the control-end server sets a power-on startup policy and, according to the set policy, controls the BMCs of the servers under test in batches to perform cyclic power-on startup policy verification comprises the following steps:
the control-end server defines all states of the power-on startup policy, sets the power-on startup policy, and sends the set policy to the servers under test in batches, and the BMC of each server under test receives the power-on startup policy sent by the control-end server;
the control-end server sends, in batches and in sequence, a power-off shutdown instruction and a power-on startup instruction to the BMCs of the servers under test according to the set power-on startup policy;
after receiving the power-off shutdown instruction and the power-on startup instruction sent by the control-end server, the BMC of each server under test controls the server under test, in batches, to power off and shut down and then to power on and start up in sequence;
the BMC of the server under test verifies whether the BMC restart succeeded; if the restart succeeded, the next step is carried out; if the restart did not succeed, then after a first time period it is determined whether the cumulative BMC verification time is less than a second time period; if the cumulative time is less than the second time period, the next step is carried out, and if it is not less than the second time period, the BMC restart verification continues;
and the BMC of the server under test verifies whether the system IP can be pinged; if it can, the next step is carried out; if it cannot, then after a third time period it is determined whether the cumulative system IP verification time is less than a fourth time period; if the cumulative time is less than the fourth time period, the next step is carried out, and if it is not less than the fourth time period, the system IP verification continues.
5. The method for implementing power-on self-check verification of a clustered system as claimed in claim 4, further comprising: the BMC of the server under test monitors logs during the cyclic power-on startup policy verification and sends the verification result to the control-end server BMC.
6. The method as claimed in claim 4, wherein all the states of the power-on startup policy include: the system powers on, the system remains shut down, and the system returns to its last state.
7. A device for implementing power-on self-check verification of a clustered system, characterized by comprising:
a communication connection module, used for establishing communication connections between the control-end server BMC and the BMCs of a plurality of servers under test in the cluster;
an installation module, by which the control-end server BMC acquires the BMC information of each server under test and, according to that BMC information, calls the corresponding installation template to install the server under test;
and a verification module, used for the control-end server to set a power-on startup policy and, according to the set policy, to control the BMCs of the servers under test in batches to perform cyclic power-on startup policy verification, wherein the power-on startup policy verification comprises a system power-on self-test.
8. The device for implementing power-on self-check verification of a clustered system as claimed in claim 7, wherein the verification module specifically comprises:
a definition and setting submodule, by which the control-end server defines all states of the power-on startup policy, sets the power-on startup policy, and sends the set policy to the servers under test in batches, and the BMC of each server under test receives the power-on startup policy sent by the control-end server;
an instruction sending submodule, used for the control-end server to send, in batches and in sequence, a power-off shutdown instruction and a power-on startup instruction to the BMCs of the servers under test according to the set power-on startup policy;
an instruction execution submodule, used for controlling the servers under test, in batches, to power off and shut down and then to power on and start up in sequence after the BMC of each server under test receives the power-off shutdown instruction and the power-on startup instruction sent by the control-end server;
a BMC restart verification submodule, by which the BMC of the server under test verifies whether the BMC restart succeeded; if the restart succeeded, the next step is carried out; if the restart did not succeed, then after a first time period it is determined whether the cumulative BMC verification time is less than a second time period; if the cumulative time is less than the second time period, the next step is carried out, and if it is not less than the second time period, the BMC restart verification continues;
and a system IP verification submodule, by which the BMC of the server under test verifies whether the system IP can be pinged; if it can, the next step is carried out; if it cannot, then after a third time period it is determined whether the cumulative system IP verification time is less than a fourth time period; if the cumulative time is less than the fourth time period, the next step is carried out, and if it is not less than the fourth time period, the system IP verification continues.
9. The device for implementing power-on self-check verification of a clustered system as claimed in claim 8, further comprising: a log monitoring module, by which the BMC of the server under test monitors logs during the cyclic power-on startup policy verification and sends the verification result to the control-end server BMC.
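The timed checks recited in claims 4 and 8 (a wait interval followed by a comparison of cumulative verification time against a limit) resemble a polling loop with a timeout. The sketch below is one possible reading, not the claimed method itself: the translated wording about which branch retries and which proceeds is ambiguous, so this sketch assumes the conventional behaviour of retrying until the check passes or the cumulative time reaches the limit. The name `poll_until` and its parameters are hypothetical; `interval` stands in for the first (or third) time period and `limit` for the second (or fourth).

```python
import time
from typing import Callable


def poll_until(
    check: Callable[[], bool],
    interval: float,
    limit: float,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Repeat `check()` until it passes or the cumulative time reaches `limit`.

    Assumed reading of the claims: after a failed check, wait `interval`
    seconds, then compare the cumulative verification time with `limit`.
    Returns True if a check passed, False if the limit was reached first.
    """
    start = clock()
    while True:
        if check():
            return True
        sleep(interval)  # wait the "first"/"third" time period
        if clock() - start >= limit:
            # Cumulative time is no longer below the "second"/"fourth" period.
            return False
```

The same helper could serve both checks: `check` would be a caller-supplied function that tests whether the BMC has restarted, or whether the system IP answers a ping. The `sleep` and `clock` parameters exist only so the loop can be exercised without real delays.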
CN202010429642.9A 2020-05-20 2020-05-20 Method and device for realizing power-on self-check verification of clustered system Active CN111752773B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010429642.9A CN111752773B (en) 2020-05-20 2020-05-20 Method and device for realizing power-on self-check verification of clustered system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010429642.9A CN111752773B (en) 2020-05-20 2020-05-20 Method and device for realizing power-on self-check verification of clustered system

Publications (2)

Publication Number Publication Date
CN111752773A (en) 2020-10-09
CN111752773B (en) 2023-01-06

Family

ID=72673607

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010429642.9A Active CN111752773B (en) 2020-05-20 2020-05-20 Method and device for realizing power-on self-check verification of clustered system

Country Status (1)

Country Link
CN (1) CN111752773B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102541711A (en) * 2011-12-31 2012-07-04 曙光信息产业股份有限公司 Method for testing X86 architecture server mainboards
CN105068900A (en) * 2015-07-27 2015-11-18 浪潮电子信息产业股份有限公司 Testing method for remote control server cold reboot
CN106933710A (en) * 2017-02-19 2017-07-07 郑州云海信息技术有限公司 The method of testing that DC is restarted is carried out to server based on WOL functions
CN109634626A (en) * 2018-12-18 2019-04-16 郑州云海信息技术有限公司 A kind of method and system of the Remote Installation Server system drive based on BMC

Also Published As

Publication number Publication date
CN111752773B (en) 2023-01-06

Similar Documents

Publication Publication Date Title
CN105808394B (en) Server self-healing method and device
CN102571498B (en) Fault injection control method and device
CN111488233A (en) Method and system for processing bandwidth loss problem of PCIe device
CN108429629A (en) Equipment fault restoration methods and device
WO2019105221A1 (en) Energy storage system startup method and energy storage device
US10880153B2 (en) Method and system for providing service redundancy between a master server and a slave server
CN111694710A (en) Method, device and equipment for monitoring faults of substrate management controller and storage medium
CN111737064A (en) BMC system control method and device, storage medium and computer equipment
CN113645095A (en) Automatic switch testing method, equipment and medium based on snmp alarm information
CN107528705B (en) Fault processing method and device
CN111352662B (en) Server starting sequence control method, system, terminal and storage medium
CN111752773B (en) Method and device for realizing power-on self-check verification of clustered system
CN106411643B (en) BMC detection method and device
CN116974941A (en) Testing method for management interface function of intelligent platform of baseboard management controller
CN113868001B (en) Method, system and computer storage medium for checking memory repair result
CN111309509A (en) Method and system for solving channel switching failure based on server BMC
CN114138574A (en) Controller testing method, device, server and storage medium
CN110399266B (en) Method and device for testing new user creation function of BMC (baseboard management controller) system and related components
CN110879546A (en) Method for realizing double-chip power supply management by combining software and hardware
CN108988050A (en) A kind of Wi-Fi intelligent socket and Wi-Fi network availability safeguards system
CN115250249B (en) IPv6 Ready-based automatic testing method, device, medium and equipment
TWI759926B (en) System and method for power on testing
CN116302851B (en) FPGA logic abnormality monitoring and recovering method, device, equipment and medium
KR102262942B1 (en) Gateway self recovery method by the wireless bridge of wireless network system system
CN111274075B (en) Method and system for automatically testing random power failure of BOX (BOX) host

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant