CN116521496A - Method, system, computer device and storage medium for verifying server performance - Google Patents


Info

Publication number
CN116521496A
Authority
CN
China
Prior art keywords
error
hardware
scheme
real
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310432839.1A
Other languages
Chinese (zh)
Inventor
赵相斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN202310432839.1A
Publication of CN116521496A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3409 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3466 Performance evaluation by tracing or monitoring
    • G06F11/3476 Data logging
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A10/00 TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE at coastal zones; at river basins
    • Y02A10/40 Controlling or monitoring, e.g. of flood or hurricane; Forecasting, e.g. risk assessment or mapping

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a server performance verification method, system, computer device and storage medium. The method comprises: receiving an error injection instruction and generating a hardware real error occurrence scheme based on the error injection instruction; sending the hardware real error occurrence scheme to target hardware so that the target hardware generates real errors based on the scheme; acquiring log information of the real errors, and acquiring the hardware real error occurrence scheme and register information; and processing the real errors based on the log information of the real errors, the hardware real error occurrence scheme and the register information to verify the server performance. Because the errors actually occur on the hardware, a variety of real hardware errors can be produced and recognized by the server; capturing the hardware real error occurrence scheme, the register information and the log information improves the efficiency of server performance testing.

Description

Method, system, computer device and storage medium for verifying server performance
Technical Field
The present invention relates to the field of server testing, and in particular, to a method, a system, a computer device, and a storage medium for verifying server performance.
Background
In the prior art, server performance is usually verified by running error injection tests through the XDP (extended debug port) interface on the server motherboard. The injected error is a simulated error, or an error that appears only at the hardware interface and does not actually occur in the hardware. The PEI card and MEI card provided by Intel can only inject errors one at a time or of a single type, which limits the accuracy and efficiency of error injection testing.
Disclosure of Invention
The object of the invention is to provide a server performance verification method, system, computer device and storage medium.
The technical solution of the invention is as follows. In a first aspect, the present invention provides a server performance verification method, the method comprising:
receiving an error injection instruction and generating a hardware real error occurrence scheme based on the error injection instruction;
sending the hardware real error occurrence scheme to target hardware so that the target hardware generates real errors based on the hardware real error occurrence scheme;
acquiring log information of the real errors, and acquiring the hardware real error occurrence scheme and register information;
processing the real errors based on the log information of the real errors, the hardware real error occurrence scheme and the register information to verify the server performance.
In a preferred embodiment, the receiving an error injection instruction and generating a hardware real error occurrence scheme based on the error injection instruction includes:
receiving an error injection instruction;
reading server hardware configuration information, wherein the server hardware configuration information at least comprises: hardware model, hardware quantity, memory capacity and memory location;
generating a hardware real error occurrence scheme based on the error injection instruction and the server hardware configuration information, wherein the hardware real error occurrence scheme at least comprises an error list, an error priority and an error type.
In a preferred embodiment, before the receiving an error injection instruction and generating a hardware real error occurrence scheme based on the error injection instruction, the method further includes:
displaying preset error injection schemes on a human-computer interaction interface for a user to select from in order to input an error injection instruction, wherein the preset error injection schemes comprise any one of custom errors, random errors and fault flooding.
In a preferred embodiment, in response to the error injection scheme selected by the user being a custom error, the generating a hardware real error occurrence scheme based on the error injection instruction and the server hardware configuration information includes:
receiving custom hardware type information, custom error occurrence count information and error location information input by the user;
generating a hardware real error occurrence scheme based on the custom hardware type information, the custom error occurrence count information, the error location information and the server hardware configuration information.
In a preferred embodiment, in response to the error injection scheme selected by the user being a random error, the generating a hardware real error occurrence scheme based on the error injection instruction and the server hardware configuration information includes:
randomly generating a hardware real error occurrence scheme based on the server hardware configuration information.
In a preferred embodiment, in response to the error injection scheme selected by the user being fault flooding, the generating a hardware real error occurrence scheme based on the error injection instruction and the server hardware configuration information includes:
randomly generating errors based on the server hardware configuration information for a preset duration to generate a hardware real error occurrence scheme.
In a preferred embodiment, before the sending the hardware real error occurrence scheme to the target hardware so that the target hardware generates real errors based on the hardware real error occurrence scheme, the method further includes:
parsing the hardware real error occurrence scheme to obtain hardware real error category information, wherein the hardware real error category information comprises at least one of CPU hardware errors, memory hardware errors, PCIE hardware errors and other errors;
parsing the hardware real error occurrence scheme to obtain target hardware layer information corresponding to the hardware real error category information;
parsing the hardware real error occurrence scheme to obtain target error type information corresponding to the target hardware layer information;
the sending the hardware real error occurrence scheme to the target hardware includes:
sending the target hardware layer information and the target error type information to the target hardware based on the hardware real error category information.
In a preferred embodiment, the parsing the hardware real error occurrence scheme to obtain hardware real error category information includes:
parsing the hardware real error occurrence scheme based on the error priority and the error list to obtain the hardware real error category information.
In a preferred embodiment, a first error random number is also generated when the hardware real error occurrence scheme is parsed to obtain the target error type information corresponding to the target hardware layer information;
before the parsing the hardware real error occurrence scheme to obtain the target error type information corresponding to the target hardware layer information, the method further includes:
correcting the target hardware layer information based on the first error random number.
In a preferred embodiment, a second error random number is also generated when the hardware real error occurrence scheme is parsed to obtain the target error type information corresponding to the target hardware layer information;
before the sending the target hardware layer information and the target error type information to the target hardware based on the hardware real error category information, the method further includes:
correcting the target error type information based on the second error random number.
In a preferred embodiment, the sending the hardware real error occurrence scheme to the target hardware so that the target hardware generates real errors based on the hardware real error occurrence scheme includes:
issuing the target hardware layer information and the target error type information to the target hardware in the form of an error packet or an error stream so as to generate real errors.
In a preferred embodiment, the acquiring log information of the real errors includes:
capturing the log information of the real errors through a preset interface, wherein the preset interface comprises at least one of a serial port, an XDP interface, an IPMI interface, a Redfish interface and an SSH interface; and the log information of the real errors is reported by the target hardware, after the real errors occur, to the OS kernel and the BMC through the UEFI system.
In a preferred embodiment, the acquiring the hardware real error occurrence scheme and register information includes:
acquiring the hardware real error occurrence scheme;
acquiring register information of the trigger register settings for error transfer and handling, wherein the register information at least comprises SMI link information and CSMI link information.
In a preferred embodiment, the acquiring register information of the trigger register settings for error transfer and handling includes:
acquiring the register information of the trigger register settings to which the error is transferred in the UEFI system after the error occurs.
In a preferred embodiment, the processing the real errors based on the log information of the real errors, the hardware real error occurrence scheme and the register information to verify the server performance includes:
judging whether the log of the real errors is consistent with the hardware real error occurrence scheme, and judging whether the register information is consistent with the hardware real error occurrence scheme;
if the log of the real errors is consistent with the hardware real error occurrence scheme and the register information is consistent with the hardware real error occurrence scheme, verifying that the server performance is qualified.
In a preferred embodiment, the judging whether the log of the real errors is consistent with the hardware real error occurrence scheme includes:
judging whether the log of the real errors is consistent with the error information in the hardware real error occurrence scheme, wherein the error information comprises the error type, the error location, the error count and the error occurrence time.
In a preferred embodiment, the judging whether the register information is consistent with the hardware real error occurrence scheme includes:
judging whether the error type in the register information is consistent with the error type in the hardware real error occurrence scheme.
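The two consistency checks described above can be illustrated with a minimal Python sketch. The `ErrorInfo` record, its field names and the `verify` function are hypothetical and do not appear in the application; exact equality is assumed for all fields, including the occurrence time, whereas a practical check would probably need a tolerance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorInfo:
    """Fields the text lists for the log comparison (names are hypothetical)."""
    error_type: str
    location: str
    count: int
    time: str  # treated as an opaque string here; a real check may need a tolerance


def verify(planned: ErrorInfo, logged: ErrorInfo, register_error_type: str) -> bool:
    """Return True only when both the log and the register agree with the scheme."""
    log_consistent = logged == planned  # dataclass equality compares all four fields
    register_consistent = register_error_type == planned.error_type
    return log_consistent and register_consistent


planned = ErrorInfo("single_bit", "DIMM0", 3, "t0")
print(verify(planned, planned, "single_bit"))  # True
print(verify(planned, planned, "crc"))         # False
```

In this sketch the server performance counts as qualified only when both comparisons succeed, mirroring the "and" condition in the text.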
In a second aspect, the present invention also provides a server performance verification system, the system comprising:
a receiving and generating module, used for receiving an error injection instruction and generating a hardware real error occurrence scheme based on the error injection instruction;
a sending module, used for sending the hardware real error occurrence scheme to target hardware so that the target hardware generates real errors based on the hardware real error occurrence scheme;
an acquisition module, used for acquiring log information of the real errors and acquiring the hardware real error occurrence scheme and register information;
and a processing module, used for processing the real errors based on the log information of the real errors, the hardware real error occurrence scheme and the register information so as to verify the server performance.
In a third aspect, the present invention also provides a computer apparatus comprising:
one or more processors;
and a memory associated with the one or more processors, the memory for storing program instructions that, when read for execution by the one or more processors, perform the server performance verification method of any one of the first aspects.
In a fourth aspect, the present invention also provides a computer-readable storage medium storing computer instructions that cause the computer to perform the server performance verification method according to any one of the first aspects.
The invention has the following advantages. It provides a server performance verification method, system, computer device and storage medium, the method comprising: receiving an error injection instruction and generating a hardware real error occurrence scheme based on the error injection instruction; sending the hardware real error occurrence scheme to target hardware so that the target hardware generates real errors based on the scheme; acquiring log information of the real errors, and acquiring the hardware real error occurrence scheme and register information; and processing the real errors based on the log information of the real errors, the hardware real error occurrence scheme and the register information to verify the server performance. Because the errors actually occur on the hardware, a variety of real hardware errors can be produced and recognized by the server; capturing the hardware real error occurrence scheme, the register information and the log information improves the efficiency of server performance testing.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of a server performance verification architecture in the present application;
FIG. 2 is a flow chart of a method for verifying server performance provided by the present application;
FIG. 3 is a block diagram of server performance verification provided herein;
FIG. 4 is a block diagram of a server performance verification system provided herein;
FIG. 5 is a schematic diagram of a computer device provided in the present application.
Detailed Description
For the purposes of making the objects, technical solutions and advantages of the present application more apparent, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.
As described in the Background, in the prior art the server motherboard has an XDP (extended debug port) interface, a JTAG (Joint Test Action Group) type interface that follows the international standard test protocol (IEEE 1149.1 compliant) and is mainly used for error injection testing. In the prior art, the injected error is a simulated error, or an error that appears only at the hardware interface and does not actually occur in the hardware. The physical PEI card or MEI card provided by Intel can only inject one error at a time and only errors of a single type, and cannot continuously and randomly generate a large number of hardware errors.
To solve the above problems, the present application proposes a server performance verification method, system, computer device and storage medium, where the method includes: receiving an error injection instruction and generating a hardware real error occurrence scheme based on the error injection instruction; sending the hardware real error occurrence scheme to target hardware so that the target hardware generates real errors based on the scheme; acquiring log information of the real errors, and acquiring the hardware real error occurrence scheme and register information; and processing the real errors based on the log information of the real errors, the hardware real error occurrence scheme and the register information to verify the server performance. Because the errors actually occur on the hardware, a variety of real hardware errors can be produced and recognized by the server; capturing the hardware real error occurrence scheme, the register information and the log information improves the efficiency of server performance testing.
The following describes the aspects of the present application in detail with reference to the drawings and various embodiments.
Embodiment one: the embodiment describes a server performance verification architecture in the present application.
Specifically, referring to fig. 1, the architecture includes:
server hardware and hardware interfaces, comprising a CPU (central processing unit) hardware interface, a memory hardware interface, a PCIE (PCI Express, a bus and interface standard) hardware interface, a PCH (Platform Controller Hub, the south bridge) hardware interface and the corresponding hardware integrated and deployed in the server, and, where needed, other hardware interfaces set according to actual requirements, wherein all of these hardware interfaces are visible and interactive; the hardware receives and executes the errors issued by the error verification system through the hardware interfaces;
a human-computer interaction interface and log capture interfaces, wherein the human-computer interaction interface is connected to the error verification system and is used by the user to input an error injection instruction, and to display the error injection schemes for the user to select, monitor the progress of error occurrence, review the error occurrence history, review the error occurrence scheme and display the verification result after verification; the log capture interfaces include, but are not limited to, a serial port, an XDP interface, an IPMI interface, a Redfish interface and an SSH interface, and are used for capturing logs after fault injection and providing log information to the error handling mechanism;
an error generating device, in which the error verification system is arranged and which contains an error generating module and an error diagnosing module. The error generating module is responsible for generating a hardware real error occurrence scheme according to the error injection instruction input by the user and for parsing the hardware real error occurrence scheme into error occurrence instructions that are issued to the hardware. The error diagnosing module is responsible for analyzing the logs captured through the log capture interfaces and judging whether the errors that actually occurred are consistent with the hardware real error occurrence scheme.
Embodiment two: based on the server performance verification architecture described in Embodiment one, this embodiment describes the server performance verification process of the present application with reference to FIG. 2 and FIG. 3.
Specifically, referring to FIG. 2 and FIG. 3, the present application provides a server performance verification method, where the method includes:
S210, receiving an error injection instruction and generating a hardware real error occurrence scheme based on the error injection instruction.
Specifically, the system is connected to the human-computer interaction interface. The user inputs an error injection instruction on the human-computer interaction interface, and the system generates a hardware real error occurrence scheme after receiving the error injection instruction passed on by the human-computer interaction interface.
In one embodiment, the receiving an error injection instruction and generating a hardware real error occurrence scheme based on the error injection instruction includes:
S211, displaying preset error injection schemes on the human-computer interaction interface for the user to select from in order to input an error injection instruction, wherein the preset error injection schemes comprise any one of custom errors, random errors and fault flooding.
Specifically, to improve ease of use, when the user inputs an error injection instruction through the input/output functions of the human-computer interaction interface, several error injection schemes are offered on that interface, including but not limited to: custom errors (the user can customize the hardware type, error type, number of error occurrences, error location and so on), random errors (errors are generated randomly according to the server hardware configuration) and fault flooding (a large number of hardware errors are generated continuously and randomly according to the server hardware configuration).
S212, receiving an error injection instruction.
The system receives the preset error injection scheme selected by the user together with its specific content; the user can select any one of custom errors, random errors and fault flooding. If the preset error injection schemes also include other schemes, the user may select those as well; the selection is not limited to the custom errors, random errors and fault flooding mentioned above.
S213, reading server hardware configuration information, wherein the server hardware configuration information at least comprises: hardware model, hardware quantity, memory capacity and memory location.
Specifically, the system reads the hardware configuration of the server, such as the hardware model, hardware quantity, memory capacity and location, and confirms which configured hardware is currently available, so that this information can be used to draw up the hardware real error occurrence scheme.
S214, generating a hardware real error occurrence scheme based on the error injection instruction and the server hardware configuration information, wherein the hardware real error occurrence scheme at least comprises an error list, an error priority and an error type.
Specifically, (1) in response to the error injection scheme selected by the user being a custom error, the generating a hardware real error occurrence scheme based on the error injection instruction and the server hardware configuration information includes:
S2141, receiving custom hardware type information, custom error occurrence count information and error location information input by the user;
S2142, generating a hardware real error occurrence scheme based on the custom hardware type information, the custom error occurrence count information, the error location information and the server hardware configuration information.
If the custom hardware type information, custom error occurrence count information or error location information input by the user conflicts with the server hardware configuration information that was read, the conflicting content is sent to the human-computer interaction interface and displayed as a reminder. For example, if the user enters CPU as the custom hardware type but the server hardware configuration information shows that CPU hardware is currently unavailable, the user's injection instruction conflicts with the configuration and the error cannot be executed; the conflict message, namely that CPU hardware is unavailable, is then sent to the human-computer interaction interface for display.
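The conflict check in this step can be sketched in Python as follows. This is only an illustration under stated assumptions: `HardwareConfig`, `CustomRequest`, `find_conflicts`, the hardware labels and the location labels are all hypothetical, and the application does not specify how the configuration is represented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HardwareConfig:
    """Hypothetical summary of what the system read from the server."""
    available_types: frozenset  # e.g. {"MEMORY", "PCIE"}
    locations: dict             # hardware type -> set of valid location labels


@dataclass(frozen=True)
class CustomRequest:
    """Hypothetical user input for the custom error scheme."""
    hardware_type: str
    occurrences: int
    location: str


def find_conflicts(request, config):
    """Return human-readable conflict messages; an empty list means no conflict."""
    conflicts = []
    if request.hardware_type not in config.available_types:
        conflicts.append(f"{request.hardware_type} hardware is unavailable")
    elif request.location not in config.locations.get(request.hardware_type, set()):
        conflicts.append(
            f"location {request.location!r} does not exist for {request.hardware_type}"
        )
    if request.occurrences < 1:
        conflicts.append("error occurrence count must be at least 1")
    return conflicts


config = HardwareConfig(
    available_types=frozenset({"MEMORY", "PCIE"}),
    locations={"MEMORY": {"DIMM0", "DIMM1"}, "PCIE": {"SLOT1"}},
)
# The CPU example from the text: the request is rejected with a reminder message.
print(find_conflicts(CustomRequest("CPU", 3, "CPU0"), config))
```

The returned messages stand in for the conflict content that would be shown on the human-computer interaction interface.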
(2) In response to the error injection scheme selected by the user being a random error, the generating a hardware real error occurrence scheme based on the error injection instruction and the server hardware configuration information includes:
S2143, randomly generating a hardware real error occurrence scheme based on the server hardware configuration information.
(3) In response to the error injection scheme selected by the user being fault flooding, the generating a hardware real error occurrence scheme based on the error injection instruction and the server hardware configuration information includes:
S2144, randomly generating errors based on the server hardware configuration information for a preset duration to generate a hardware real error occurrence scheme.
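The three branches of S214 can be pictured with the following Python sketch. All names (`InjectionMode`, `PlannedError`, `ErrorScheme`, `generate_scheme`), the error type catalogue and the priority table are assumptions made for illustration. The preset duration of fault flooding is approximated here by a fixed number of generated errors (`flood_count`), which is a simplification and not something the application states.

```python
import random
from dataclasses import dataclass, field
from enum import Enum


class InjectionMode(Enum):
    CUSTOM = "custom"
    RANDOM = "random"
    FLOOD = "flood"


@dataclass
class PlannedError:
    hardware: str    # e.g. "CPU", "MEMORY", "PCIE"
    error_type: str  # e.g. "single_bit", "crc"
    priority: int    # assumption: lower value is handled first


@dataclass
class ErrorScheme:
    errors: list = field(default_factory=list)  # the error list of the scheme


# Hypothetical catalogues; the application does not enumerate them.
ERROR_TYPES = ("single_bit", "multi_bit", "crc", "read_write")
PRIORITY = {"CPU": 0, "MEMORY": 1, "PCIE": 2}


def random_error(available_hardware, rng):
    """Pick one error at random from the hardware that is actually present."""
    hardware = rng.choice(available_hardware)
    return PlannedError(hardware, rng.choice(ERROR_TYPES), PRIORITY[hardware])


def generate_scheme(mode, available_hardware, rng, custom=None, flood_count=100):
    """Build a scheme for the selected mode from the available hardware."""
    if mode is InjectionMode.CUSTOM:
        hardware, error_type, occurrences = custom
        errors = [PlannedError(hardware, error_type, PRIORITY[hardware])
                  for _ in range(occurrences)]
    elif mode is InjectionMode.RANDOM:
        errors = [random_error(available_hardware, rng)]
    else:  # FLOOD: many random errors, standing in for "a preset duration"
        errors = [random_error(available_hardware, rng) for _ in range(flood_count)]
    return ErrorScheme(errors)


rng = random.Random(0)
flood = generate_scheme(InjectionMode.FLOOD, ["CPU", "MEMORY"], rng, flood_count=50)
print(len(flood.errors))  # 50
```

Restricting the random choice to `available_hardware` reflects the requirement that the scheme is generated from the server hardware configuration information.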
SA10, parsing the hardware real error occurrence scheme to obtain hardware real error category information, wherein the hardware real error category information comprises at least one of CPU hardware errors, memory hardware errors, PCIE hardware errors and other errors.
Preferably, this step includes: parsing the hardware real error occurrence scheme based on the error priority and the error list to obtain the hardware real error category information.
Specifically, after the hardware real error occurrence scheme has been generated, the system captures it and parses it step by step. Because the server integrates a CPU hardware interface, a memory hardware interface, a PCIE hardware interface, a PCH hardware interface and the corresponding hardware, along with any other hardware interfaces set according to actual needs, a hardware parsing module is deployed in the system for each piece of hardware and its interface. After capturing the hardware real error occurrence scheme, the system parses it in the corresponding hardware parsing modules according to the error priority and the error list.
For example, if the error list in the hardware real error occurrence scheme contains a CPU hardware error and a memory hardware error, and the error priority ranks CPU hardware errors above memory hardware errors, then the CPU hardware error is parsed first in the CPU hardware parsing module and the memory hardware error is parsed afterwards in the memory parsing module.
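The priority-ordered dispatch in this example can be sketched in Python as below. The parser functions, the `PARSERS` table and the dictionary layout of an error entry are hypothetical; the sketch only shows the ordering by priority followed by routing to a per-hardware parsing module.

```python
def parse_cpu(error):
    return f"CPU parser handled {error['type']}"


def parse_memory(error):
    return f"memory parser handled {error['type']}"


# One hypothetical parsing module per hardware category.
PARSERS = {"CPU": parse_cpu, "MEMORY": parse_memory}


def dispatch(error_list, priority):
    """Parse errors in priority order (lower number first), each in its own module."""
    ordered = sorted(error_list, key=lambda e: priority[e["category"]])
    return [PARSERS[e["category"]](e) for e in ordered]


error_list = [
    {"category": "MEMORY", "type": "single_bit"},
    {"category": "CPU", "type": "crc"},
]
priority = {"CPU": 0, "MEMORY": 1}  # CPU errors rank above memory errors

for line in dispatch(error_list, priority):
    print(line)
# CPU parser handled crc
# memory parser handled single_bit
```

Python's `sorted` is stable, so errors of equal priority keep their order from the error list.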
SA20, parsing the hardware real error occurrence scheme to obtain target hardware layer information corresponding to the hardware real error category information.
Specifically, after the error category to which the hardware real error occurrence scheme belongs has been identified, parsing continues in order to obtain the hardware layer information of the error.
For example, after the hardware real error occurrence scheme has been parsed as belonging to the CPU hardware error category, the corresponding hardware layer information is parsed further. The hardware layers comprise a transmission layer, a data layer and a processing layer. The transmission layer is responsible for sending, receiving and outputting data; taking PCIE hardware as an example, its two directions are conventionally called RX (receive) and TX (transmit). The data layer stores large amounts of data, and the processing layer handles the processing and computation of data. The deeper the hardware layer, the more severe the resulting errors.
When the hardware real error occurrence scheme is parsed to obtain the target error type information corresponding to the target hardware layer information, a first error random number is also generated.
SA30, correcting the target hardware layer information based on the first error random number.
The first error random number is generated while the hardware layer is being parsed and is used to correct the error after the target hardware layer information has been parsed. Its purpose is to keep the errors from being shaped by the algorithm when the next level is parsed, which would otherwise settle into a fixed pattern such as the common normal distribution. The random number lets the errors cover the whole range in a controlled way and reduces the risk of uneven coverage.
SA40, analyzing the hardware true error occurrence scheme to obtain the target error type information corresponding to the target hardware layer information.
A second error random number is also generated when the hardware real error occurrence scheme is analyzed to obtain the target error type information corresponding to the target hardware layer information.
Specifically, after the error has been resolved to a hardware layer, it is refined further within that layer to obtain the target error type information. Examples include single-bit errors and bit flips, which can occur at any hardware layer and are caused by the inversion of one or more bits in binary data. Many other error types exist as well, such as CRC errors and read/write process errors.
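As an illustration of the single-bit and multi-bit errors described above, the following minimal sketch inverts selected bits in a byte string. The function name `flip_bits` and the bit-numbering convention are assumptions made for this example and are not taken from the application.

```python
def flip_bits(data: bytes, bit_positions: list[int]) -> bytes:
    """Return a copy of `data` with each listed bit inverted.

    Assumed convention: bit 0 is the least significant bit of byte 0.
    """
    corrupted = bytearray(data)
    for position in bit_positions:
        byte_index, bit_index = divmod(position, 8)
        corrupted[byte_index] ^= 1 << bit_index  # XOR with a mask inverts one bit
    return bytes(corrupted)


original = bytes([0b00001111, 0b10100000])
single_bit_error = flip_bits(original, [0])      # one bit inverted
multi_bit_error = flip_bits(original, [0, 15])   # two bits inverted
```

Applying the same flip a second time restores the original data, which makes this kind of error convenient to inject and undo in a test.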
SA50, correcting the target error type information based on the second error random number.
The second error random number is generated while the error type is being analyzed and is used to correct the target error type information once it has been obtained. The purpose is to prevent the error distribution from being shaped by the algorithm into a fixed pattern, such as the common normal distribution. The random number lets the errors cover the whole range in a controllable way and reduces the risk of uneven coverage.
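The application does not specify how the random number "corrects" the analyzed result. One possible reading, sketched below under that assumption, is that the random number shifts the analyzed choice so that every option is covered even when the analysis itself is biased. The names `HARDWARE_LAYERS` and `correct_with_random` are hypothetical.

```python
import random
from collections import Counter

HARDWARE_LAYERS = ["transmission", "data", "processing"]


def correct_with_random(parsed_index: int, error_random: int, n_choices: int) -> int:
    """Shift the analyzed index by the random number, wrapping around the range."""
    return (parsed_index + error_random) % n_choices


rng = random.Random(2023)  # fixed seed so the sketch is repeatable
counts = Counter()
for _ in range(3000):
    parsed = 0  # a deliberately biased analysis that always picks the first layer
    error_random = rng.randrange(len(HARDWARE_LAYERS))
    corrected = correct_with_random(parsed, error_random, len(HARDWARE_LAYERS))
    counts[HARDWARE_LAYERS[corrected]] += 1
```

Even though the analysis step always returns the same layer, the corrected selections are spread across all three layers, which is the coverage property the text describes.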
S220, sending the hardware real error occurrence scheme to the target hardware so that the target hardware can generate real errors based on the hardware real error occurrence scheme.
Specifically, the sending the hardware real error occurrence scheme to the target hardware includes:
S221, sending the target hardware layer information and the target error type information to the target hardware based on the hardware real error category information.
More specifically, the target hardware layer information and the target error type information are issued to the target hardware in the form of an error packet or an error stream to generate a real error.
After the error type has been analyzed, the errors are issued to the hardware in the form of error packets or error streams. Hardware errors manifest in different ways, which fall roughly into the following groups: hardware physical layer failure, in which a physical layer of the hardware is completely paralyzed and cannot work; link failure, in which one segment or all segments of the data transmission link fail, so that data cannot be read or added to the calculation sequence; data errors, in which data becomes unusable because of transposition, deletion, compilation errors and the like during reading, writing and transmission; and other types of errors that cause the hardware to run incorrectly.
After receiving the target hardware layer information and the target error type information issued by the system, the target hardware executes them to generate a real error.
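The application does not define the format of an error packet or error stream. The sketch below assumes, purely for illustration, a packet that carries the category, hardware layer and error type as JSON, and an error stream that is simply a sequence of such packets. `ErrorPacket`, its fields and `error_stream` are hypothetical names.

```python
import json
from dataclasses import asdict, dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class ErrorPacket:
    category: str    # e.g. "CPU", "memory", "PCIE"
    layer: str       # e.g. "transmission", "data", "processing"
    error_type: str  # e.g. "single_bit", "crc"

    def to_bytes(self) -> bytes:
        """Serialize the packet (assumed wire format: UTF-8 JSON)."""
        return json.dumps(asdict(self), sort_keys=True).encode("utf-8")

    @classmethod
    def from_bytes(cls, raw: bytes) -> "ErrorPacket":
        """Rebuild a packet on the receiving side."""
        return cls(**json.loads(raw.decode("utf-8")))


def error_stream(packets: Iterable[ErrorPacket]) -> Iterator[bytes]:
    """Yield serialized packets one after another, modelling an error stream."""
    for packet in packets:
        yield packet.to_bytes()


packet = ErrorPacket(category="PCIE", layer="transmission", error_type="crc")
received = [ErrorPacket.from_bytes(raw) for raw in error_stream([packet, packet])]
```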
S230, acquiring log information of the real errors, and acquiring a hardware real error generation scheme and register information.
In one embodiment, the obtaining the log information of the real error includes:
S231, capturing the log information of the real errors through a preset interface, wherein the preset interface comprises at least one of a serial port, an XDP (eXtended Debug Port) interface, an IPMI (Intelligent Platform Management Interface) interface, a Redfish interface and an SSH interface; the log information of the real error is reported by the target hardware, after the real error occurs, through a UEFI (Unified Extensible Firmware Interface) system to the OS kernel (operating system kernel) and the BMC (baseboard management controller).
Specifically, the target hardware executes the scheme and an error occurs; the error is passed to the UEFI system, which processes it. The UEFI system may send the error back to the hardware to be corrected again, or, once errors have accumulated to a certain value, treat the error as a real error and trigger the register. The target hardware reports the error to the UEFI system, the UEFI system reports it to the OS kernel and the BMC, and log information is generated (including but not limited to messages, dmesg, etc.).
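The accumulation behaviour described above, in which an error is treated as a real error only after it has accumulated to a certain value, can be sketched as a per-source counter with a threshold. This is a simplified model and not UEFI code; `ErrorAccumulator` and the threshold value are assumptions for illustration.

```python
from collections import defaultdict


class ErrorAccumulator:
    """Count errors per source and flag a real error once a threshold is reached."""

    def __init__(self, threshold: int) -> None:
        if threshold < 1:
            raise ValueError("threshold must be at least 1")
        self.threshold = threshold
        self._counts: dict[str, int] = defaultdict(int)

    def report(self, source: str) -> bool:
        """Record one error from `source`; return True if it now counts as real."""
        self._counts[source] += 1
        return self._counts[source] >= self.threshold


accumulator = ErrorAccumulator(threshold=3)
dimm0_results = [accumulator.report("dimm0") for _ in range(4)]
dimm1_result = accumulator.report("dimm1")  # counted independently of dimm0
```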
In one embodiment, the acquiring the hardware real error occurrence scheme and the register information includes:
S232, acquiring the hardware real error occurrence scheme.
S233, acquiring the register information of the register setting triggered by error transfer and processing, wherein the register information comprises at least SMI (System Management Interrupt) link information and CSMI link information.
Specifically, acquiring the register information of the register setting triggered by error transfer and processing includes:
acquiring the register information of the register setting that is triggered after the error occurs and has been passed to the UEFI system.
After the error occurs, it is passed to the UEFI system, where it is transferred and processed and triggers the register setting; the register information (including but not limited to the SMI link, the CSMI link, etc.) is then passed to the system for error processing.
S240, processing the real errors based on the log information of the real errors, the hardware real error occurrence scheme and the register information to verify the server performance.
In one embodiment, the processing the real error to verify the server performance based on the log information of the real error, the hardware real error occurrence scheme, and the register information includes:
Judging whether the log of the real error is consistent with the hardware real error occurrence scheme or not, and judging whether the register information is consistent with the hardware real error occurrence scheme or not;
and if the log of the real error is consistent with the hardware real error occurrence scheme and the register information is consistent with the hardware real error occurrence scheme, verifying that the server performance is qualified.
Preferably, the determining whether the log of the real error is consistent with the hardware real error occurrence scheme includes:
judging whether the log of the real error is consistent with error information in the hardware real error occurrence scheme, wherein the error information comprises error types, error generation positions, error times and error generation time.
The judging whether the register information is consistent with the hardware true error occurrence scheme comprises the following steps:
judging whether the error type in the register information is consistent with the error type in the hardware real error occurrence scheme. The register information comprises an error type list; consistency is judged by comparing whether the error type list in the register information matches the error type list in the hardware real error occurrence scheme, that is, whether the error types in the scheme agree with the register information.
The error processing flow pulls the hardware real error occurrence scheme, the register information and the log information and then runs the fault processing flow. Error processing is divided into manual checking and automatic checking. Its main purpose is to compare whether the hardware real error occurrence scheme matches the real error log and whether it matches the register information, so as to verify whether the RAS (reliability, availability and serviceability) function of the server is normal. The automatic check executes automated test cases, and the final result is presented in the man-machine interface. The manual check displays the captured information directly to an operator for manual testing.
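The two comparisons performed by the automatic check can be sketched as follows. The sketch assumes that the scheme and the captured log have already been normalized into the same record form (error type, location, count and time) and that the register information has been reduced to a list of error types; `ErrorRecord` and the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorRecord:
    error_type: str
    location: str
    count: int
    time: str  # error generation time, kept as an opaque string here


def log_matches_scheme(scheme: list[ErrorRecord], log: list[ErrorRecord]) -> bool:
    """The log is consistent if it contains exactly the planned errors."""
    return set(scheme) == set(log)


def registers_match_scheme(scheme: list[ErrorRecord], register_types: list[str]) -> bool:
    """The register information is consistent if its error type list matches the scheme's."""
    return {record.error_type for record in scheme} == set(register_types)


def verify_server(scheme: list[ErrorRecord], log: list[ErrorRecord],
                  register_types: list[str]) -> bool:
    """The server passes only when both comparisons succeed."""
    return log_matches_scheme(scheme, log) and registers_match_scheme(scheme, register_types)


scheme = [ErrorRecord("crc", "pcie0", 2, "t0"), ErrorRecord("single_bit", "dimm1", 1, "t1")]
passed = verify_server(scheme, list(reversed(scheme)), ["single_bit", "crc"])
failed = verify_server(scheme, scheme[:1], ["single_bit", "crc"])
```

A real check would likely tolerate small differences in the recorded time; exact matching is used here only to keep the sketch short.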
According to the server performance verification method provided by this embodiment, real hardware errors are generated so that a variety of real hardware errors occur and are identified by the server; the hardware real error occurrence scheme, the register information and the log information are captured, which improves the efficiency of server performance testing.
Further, a variety of real hardware is integrated and deployed on the server, so that a variety of real hardware errors occur and are identified by the server.
Further, an error occurrence scheme that matches the configuration of the server under test is formulated from the server configuration information that is read, so the method adapts well to a wide range of servers.
Further, large-scale and continuous injection of real hardware errors is realized.
Further, the hardware errors are refined step by step through analysis, so that a large number of varied and complex hardware errors are generated, which closely approximates the real environment in which customers use the server.
Embodiment three: corresponding to the first and second embodiments described above, the server performance verification system provided in the present application will be described below with reference to fig. 4. The system may be implemented in hardware or software, or may be implemented in a combination of hardware and software, which is not limited in this application.
In one example, the present application provides a server performance verification system comprising:
a receiving and generating module 410, configured to receive an error injection instruction and generate a hardware real error occurrence scheme based on the error injection instruction;
a sending module 420, configured to send the hardware real error occurrence scheme to a target hardware, so that the target hardware generates a real error based on the hardware real error occurrence scheme;
the acquiring module 430 is configured to acquire log information of the real error, and acquire a hardware real error occurrence scheme and register information;
A processing module 440, configured to process the real error based on the log information of the real error, the hardware real error occurrence scheme and the register information to verify the server performance.
In one embodiment, the reception generation module 410 includes:
a receiving unit 411 configured to receive an error injection instruction;
a reading unit 412, configured to read server hardware configuration information, where the server hardware configuration information at least includes: hardware model, hardware quantity, memory capacity and memory location;
the generating unit 413 is configured to generate a hardware real error occurrence scheme based on the error injection instruction and the server hardware configuration information, where the hardware real error occurrence scheme includes at least an error list, an error priority, and an error type.
Preferably, the system further comprises:
the display module 450 is configured to display a preset error injection scheme on the man-machine interface for a user to select to input the error injection instruction before the receiving and generating module 410 receives the error injection instruction and generates the hardware real error occurrence scheme based on the error injection instruction, where the preset error injection scheme includes any one of a custom error, a random error and a fault flood.
More preferably, the generating unit 413 is specifically configured to: in response to the error injection scheme selected by the user being the custom error, receive custom hardware type information, custom error occurrence frequency information and error position information input by the user;
generate a hardware real error occurrence scheme based on the custom hardware type information, the custom error occurrence frequency information, the error position information and the server hardware configuration information.
More preferably, the generating unit 413 is specifically configured to: in response to the error injection scheme selected by the user being the random error, randomly generate a hardware real error occurrence scheme based on the server hardware configuration information.
More preferably, the generating unit 413 is specifically configured to: in response to the error injection scheme selected by the user being the fault flood, randomly generate errors based on the server hardware configuration information for a preset duration to generate a hardware real error occurrence scheme.
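The three behaviours of the generating unit can be sketched as a single dispatch on the selected injection mode. The sketch makes several assumptions: the server hardware configuration is reduced to a mapping from hardware type to locations, a scheme is a list of (hardware, location) targets, and the preset duration of a fault flood is approximated by a fixed number of errors. All names are hypothetical.

```python
import random
from typing import Optional

Target = tuple[str, str]  # (hardware type, location)


def random_target(config: dict[str, list[str]], rng: random.Random) -> Target:
    """Pick a hardware type and one of its locations from the server configuration."""
    hardware = rng.choice(sorted(config))
    return hardware, rng.choice(config[hardware])


def generate_scheme(mode: str, config: dict[str, list[str]], rng: random.Random,
                    custom: Optional[dict] = None, flood_count: int = 0) -> list[Target]:
    if mode == "custom":
        # The user-supplied target must exist on the server under test.
        if custom is None or custom["location"] not in config.get(custom["hardware"], []):
            raise ValueError("custom error must target hardware present on the server")
        return [(custom["hardware"], custom["location"])] * custom["frequency"]
    if mode == "random":
        return [random_target(config, rng)]
    if mode == "fault_flood":
        # Stand-in for "a preset duration": a fixed number of random errors.
        return [random_target(config, rng) for _ in range(flood_count)]
    raise ValueError(f"unknown injection mode: {mode}")


config = {"CPU": ["socket0"], "memory": ["dimm0", "dimm1"]}
rng = random.Random(0)
custom_scheme = generate_scheme(
    "custom", config, rng, custom={"hardware": "memory", "location": "dimm1", "frequency": 2})
random_scheme = generate_scheme("random", config, rng)
flood_scheme = generate_scheme("fault_flood", config, rng, flood_count=5)
```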
Preferably, the system further comprises:
a first parsing module 460, configured to, before the sending module 420 sends the hardware real error occurrence scheme to the target hardware so that the target hardware generates a real error based on the hardware real error occurrence scheme, parse the hardware real error occurrence scheme to obtain hardware real error category information, where the hardware real error category information includes at least one of a CPU hardware error, a memory hardware error, a PCIE hardware error, and other errors;
The second parsing module 470 is configured to parse the hardware real error occurrence scheme to obtain target hardware layer information corresponding to the hardware real error category information;
a third parsing module 480, configured to parse the hardware real error occurrence scheme to obtain target error type information corresponding to the target hardware layer information;
the sending module 420 is specifically configured to: and sending the target hardware layer information and the target error type information to target hardware based on the hardware true error type information.
More preferably, the first parsing module 460 is specifically configured to:
and analyzing the hardware true error occurrence scheme based on the error priority and the error list to obtain hardware true error category information.
More preferably, the first parsing module 460 is further configured to generate a first error random number when parsing the hardware real error occurrence scheme to obtain target error type information corresponding to the target hardware layer information;
the system further comprises:
a correction module 490, configured to correct the target hardware layer information based on the first error random number before the second parsing module 470 parses the hardware real error occurrence scheme to obtain the target hardware layer information corresponding to the hardware real error category information.
More preferably, the second parsing module 470 is further configured to generate a second error random number when parsing the hardware real error occurrence scheme to obtain the target error type information corresponding to the target hardware layer information;
the correction module 490 is further configured to correct the target error type information based on the second error random number before the sending module 420 sends the target hardware layer information and the target error type information to target hardware based on the hardware real error type information.
More preferably, the sending module 420 is specifically configured to: and issuing the target hardware layer information and the target error type information to target hardware in the form of an error packet or an error stream so as to generate a real error.
More preferably, the obtaining module 430 includes:
a grabbing unit 431, configured to capture the log information of the real error through a preset interface, where the preset interface includes at least one of a serial port, an XDP interface, an IPMI interface, a Redfish interface, and an SSH interface; the log information of the real error is reported by the target hardware, after the real error occurs, through a UEFI system to the OS kernel and the BMC.
More preferably, the obtaining module 430 further includes:
A first obtaining unit 432, configured to obtain a hardware real error occurrence scheme;
a second obtaining unit 433, configured to obtain the register information of the register setting triggered by error transfer and processing, where the register information includes at least SMI link information and CSMI link information.
More preferably, the second obtaining unit 433 is specifically configured to: acquire the register information of the register setting that is triggered after the error occurs and has been passed to the UEFI system.
More preferably, the processing module 440 is specifically configured to determine whether the log of the real error is consistent with the hardware real error occurrence scheme, and determine whether the register information is consistent with the hardware real error occurrence scheme;
if the log of the real error is consistent with the hardware real error occurrence scheme and the register information is consistent with the hardware real error occurrence scheme, the processing module 440 verifies that the server performance is acceptable.
More preferably, the processing module 440 includes:
the first determining unit 441 is configured to determine whether the log of the real error is consistent with error information in the hardware real error occurrence scheme, where the error information includes an error type, an error generation position, an error number and an error generation time.
More preferably, the processing module 440 further includes:
a second judging unit 442, configured to judge whether the error type in the register information is consistent with the error type in the hardware real error occurrence scheme.
Embodiment four: corresponding to the first to third embodiments, a description will be given below of a computer device provided in the present application with reference to fig. 5. As shown in fig. 5, in one example, the present application provides a computer device comprising:
one or more processors;
and a memory associated with the one or more processors, the memory for storing program instructions that, when read for execution by the one or more processors, perform the following:
receiving an error injection instruction and generating a hardware true error occurrence scheme based on the error injection instruction;
transmitting the hardware real error occurrence scheme to target hardware so that the target hardware can generate real errors based on the hardware real error occurrence scheme;
acquiring log information of the real errors, and acquiring a hardware real error occurrence scheme and register information;
the real error is processed based on the log information of the real error, the hardware real error occurrence scheme, and the register information to verify the server performance.
The program instructions, when read for execution by the one or more processors, further perform the operations of:
receiving an error injection instruction;
reading server hardware configuration information, wherein the server hardware configuration information at least comprises: hardware model, hardware quantity, memory capacity and memory location;
generating a hardware real error generation scheme based on the error injection instruction and the server hardware configuration information, wherein the hardware real error generation scheme at least comprises an error list, an error priority and an error type.
The program instructions, when read for execution by the one or more processors, further perform the operations of:
displaying a preset error injection scheme on a human-computer interaction interface for a user to select to input an error injection instruction, wherein the preset error injection scheme comprises any one of a user-defined error, a random error and fault flood.
The program instructions, when read for execution by the one or more processors, further perform the operations of:
receiving user-defined hardware type information, user-defined error occurrence frequency information and error position information which are input by a user;
generating a hardware true error occurrence scheme based on the custom hardware type information, the custom error occurrence frequency information, the error position information and the server hardware configuration information.
The program instructions, when read for execution by the one or more processors, further perform the operations of:
and randomly generating a hardware true error occurrence scheme based on the server hardware configuration information.
The program instructions, when read for execution by the one or more processors, further perform the operations of:
and randomly generating errors based on the server hardware configuration information for a preset duration to generate a hardware true error generation scheme.
The program instructions, when read for execution by the one or more processors, further perform the operations of:
analyzing the hardware real error occurrence scheme to obtain hardware real error category information, wherein the hardware real error category information comprises at least one of CPU hardware errors, memory hardware errors, PCIE hardware errors and other errors;
analyzing the hardware true error generation scheme to obtain target hardware layer information corresponding to the hardware true error category information;
analyzing the hardware real error occurrence scheme to obtain target error type information corresponding to the target hardware layer information;
the sending the hardware real error occurrence scheme to the target hardware comprises the following steps:
and sending the target hardware layer information and the target error type information to target hardware based on the hardware true error type information.
The program instructions, when read for execution by the one or more processors, further perform the operations of:
and analyzing the hardware true error occurrence scheme based on the error priority and the error list to obtain hardware true error category information.
A first error random number is also generated when the hardware real error occurrence scheme is analyzed to obtain the target hardware layer information corresponding to the hardware real error category information;
the program instructions, when read for execution by the one or more processors, further perform the operations of:
and correcting the target hardware layer information based on the first error random number.
A second error random number is also generated when the hardware real error occurrence scheme is analyzed to obtain the target error type information corresponding to the target hardware layer information.
The program instructions, when read for execution by the one or more processors, further perform the operations of:
and correcting the target error type information based on the second error random number.
The program instructions, when read for execution by the one or more processors, further perform the operations of:
and issuing the target hardware layer information and the target error type information to target hardware in the form of an error packet or an error stream so as to generate a real error.
The program instructions, when read for execution by the one or more processors, further perform the operations of:
capturing the log information of the real errors through a preset interface, wherein the preset interface comprises at least one of a serial port, an XDP interface, an IPMI interface, a Redfish interface and an SSH interface; the log information of the real error is reported by the target hardware, after the real error occurs, through a UEFI system to the OS kernel and the BMC.
The program instructions, when read for execution by the one or more processors, further perform the operations of:
acquiring a hardware true error occurrence scheme;
acquiring the register information of the register setting triggered by error transfer and processing, wherein the register information at least comprises SMI link information and CSMI link information.
The program instructions, when read for execution by the one or more processors, further perform the operations of:
and acquiring the register information which is transmitted to the trigger register setting in the UEFI system after the error occurs.
The program instructions, when read for execution by the one or more processors, further perform the operations of:
judging whether the log of the real error is consistent with the hardware real error occurrence scheme or not, and judging whether the register information is consistent with the hardware real error occurrence scheme or not;
And if the log of the real error is consistent with the hardware real error occurrence scheme and the register information is consistent with the hardware real error occurrence scheme, verifying that the server performance is qualified.
The program instructions, when read for execution by the one or more processors, further perform the operations of:
judging whether the log of the real error is consistent with error information in the hardware real error occurrence scheme, wherein the error information comprises error types, error generation positions, error times and error generation time.
The program instructions, when read for execution by the one or more processors, further perform the operations of:
and judging whether the error type in the register information is consistent with the error type in the hardware true error occurrence scheme.
The program instructions, when read and executed by the one or more processors, may further perform operations corresponding to the steps in the foregoing method embodiments, and reference may be made to the foregoing description, which is not repeated herein. With reference to FIG. 5, an exemplary architecture for a computer device is shown, which may include a processor 510, a video display adapter 511, a disk drive 512, an input/output interface 513, a network interface 514, and a memory 520. The processor 510, the video display adapter 511, the disk drive 512, the input/output interface 513, the network interface 514, and the memory 520 may be communicatively coupled via a communication bus 530.
The processor 510 may be implemented by a general-purpose central processing unit (Central Processing Unit, CPU), a microprocessor, an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), or one or more integrated circuits, etc., for executing relevant programs to implement the technical solutions provided herein.
Memory 520 may be implemented in the form of read-only memory (Read Only Memory, ROM), random access memory (Random Access Memory, RAM), static storage devices, dynamic storage devices, or the like. The memory 520 may store an operating system 521 for controlling the operation of the computer device 500, and a Basic Input Output System (BIOS) 522 for controlling the low-level operation of the computer device 500. In addition, a web browser 523, data storage management 524, and an icon font processing system 525, etc. may also be stored. The icon font processing system 525 may be an application program that specifically implements the operations of the foregoing steps in the embodiments of the present application. In general, when the technical solutions provided in the present application are implemented by software or firmware, relevant program codes are stored in the memory 520 and invoked by the processor 510 to be executed.
The input/output interface 513 is used for connecting with an input/output module to realize information input and output. The input/output module may be configured as a component in a device (not shown) or may be external to the device to provide corresponding functionality. Wherein the input devices may include a keyboard, mouse, touch screen, microphone, various types of sensors, etc., and the output devices may include a display, speaker, vibrator, indicator lights, etc.
The network interface 514 is used to connect communication modules (not shown) to enable communication interactions of the device with other devices. The communication module may implement communication through a wired manner (such as USB, network cable, etc.), or may implement communication through a wireless manner (such as mobile network, WIFI, bluetooth, etc.).
Bus 530 includes a path to transfer information between components of the device (e.g., processor 510, video display adapter 511, disk drive 512, input/output interface 513, network interface 514, and memory 520).
In addition, the computer device 500 may also obtain information of specific acquisition conditions from the virtual resource object acquisition condition information database 541 for making condition judgment, and so on.
It should be noted that although the above-described computer device 500 illustrates only a processor 510, a video display adapter 511, a disk drive 512, an input/output interface 513, a network interface 514, a memory 520, a bus 530, etc., the computer device may include other components necessary to achieve proper operation in an implementation. Furthermore, it will be understood by those skilled in the art that the above-described apparatus may include only the components necessary to implement the present application, and not all the components shown in the drawings.
From the above description of embodiments, it will be apparent to those skilled in the art that the present application may be implemented in software plus a necessary general purpose hardware platform. Based on such understanding, the technical solutions of the present application may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, etc., and include several instructions to cause a computer device (which may be a personal computer, a cloud server, or a network device, etc.) to perform the methods described in the various embodiments or some parts of the embodiments of the present application.
Embodiment five: corresponding to the first to fourth embodiments described above, a computer-readable storage medium provided in the present application will be described below. In one example, the present application provides a computer-readable storage medium storing computer instructions that cause the computer to:
receiving an error injection instruction and generating a hardware true error occurrence scheme based on the error injection instruction;
Transmitting the hardware real error occurrence scheme to target hardware so that the target hardware can generate real errors based on the hardware real error occurrence scheme;
acquiring log information of the real errors, and acquiring a hardware real error occurrence scheme and register information;
the real error is processed based on the log information of the real error, the hardware real error occurrence scheme, and the register information to verify the server performance.
The computer instructions cause the computer to further perform the operations of:
receiving an error injection instruction;
reading server hardware configuration information, wherein the server hardware configuration information at least comprises: hardware model, hardware quantity, memory capacity and memory location;
generating a hardware real error generation scheme based on the error injection instruction and the server hardware configuration information, wherein the hardware real error generation scheme at least comprises an error list, an error priority and an error type.
The computer instructions cause the computer to further perform the operations of:
displaying a preset error injection scheme on a human-computer interaction interface for a user to select to input an error injection instruction, wherein the preset error injection scheme comprises any one of a user-defined error, a random error and fault flood.
The computer instructions cause the computer to further perform the operations of:
receiving custom hardware type information, custom error occurrence frequency information and error position information which are input by a user;
generating a hardware real error occurrence scheme based on the custom hardware type information, the custom error occurrence frequency information, the error position information and the server hardware configuration information.
The computer instructions cause the computer to further perform the operations of:
randomly generating a hardware real error occurrence scheme based on the server hardware configuration information.
The computer instructions cause the computer to further perform the operations of:
randomly generating errors based on the server hardware configuration information for a preset duration to generate a hardware real error occurrence scheme.
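The three preset injection schemes (custom error, random error and fault flood) can be illustrated, purely as a sketch, by the Python functions below. The function names, the dictionary fields, the ERROR_TYPES list and the rate_per_s parameter are hypothetical and are introduced here only to make the differences between the three modes concrete.

```python
import random

ERROR_TYPES = ["cpu", "memory", "pcie"]  # illustrative categories only


def custom_scheme(hw_type, frequency, location):
    """Custom mode: repeat one user-chosen error `frequency` times at one location."""
    return [{"type": hw_type, "location": location} for _ in range(frequency)]


def random_scheme(locations, count, rng):
    """Random mode: pick error types and locations at random from the configuration."""
    return [
        {"type": rng.choice(ERROR_TYPES), "location": rng.choice(locations)}
        for _ in range(count)
    ]


def flood_scheme(locations, duration_s, rate_per_s, rng):
    """Fault-flood mode: random errors generated continuously over a preset duration."""
    total = int(duration_s * rate_per_s)
    return [
        {
            "type": rng.choice(ERROR_TYPES),
            "location": rng.choice(locations),
            "at_s": i / rate_per_s,  # planned injection time, in seconds
        }
        for i in range(total)
    ]
```

Passing a seeded random.Random instance as rng makes a random or flood scheme reproducible, which may be convenient when the same scheme is later compared against the captured logs.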
The computer instructions cause the computer to further perform the operations of:
parsing the hardware real error occurrence scheme to obtain hardware real error category information, wherein the hardware real error category information comprises at least one of CPU hardware errors, memory hardware errors, PCIe hardware errors and other errors;
parsing the hardware real error occurrence scheme to obtain target hardware layer information corresponding to the hardware real error category information;
parsing the hardware real error occurrence scheme to obtain target error type information corresponding to the target hardware layer information;
wherein the sending of the hardware real error occurrence scheme to the target hardware comprises:
sending the target hardware layer information and the target error type information to the target hardware based on the hardware real error category information.
The computer instructions cause the computer to further perform the operations of:
parsing the hardware real error occurrence scheme based on the error priority and the error list to obtain the hardware real error category information.
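The parsing operations above (category, then hardware layer, then error type, ordered by priority) can be sketched as follows. The CATEGORY_TO_LAYER mapping, the layer names and the entry fields are assumptions made for illustration; the application does not define a concrete data layout.

```python
from collections import defaultdict

# Hypothetical mapping from an error category to the hardware layer that receives it.
CATEGORY_TO_LAYER = {
    "cpu": "cpu_layer",
    "memory": "memory_layer",
    "pcie": "pcie_layer",
}


def parse_scheme(error_list):
    """Group (layer, error_type) pairs by category, in ascending priority order."""
    ordered = sorted(error_list, key=lambda entry: entry["priority"])
    dispatch = defaultdict(list)
    for entry in ordered:
        # Anything outside the known categories falls under "other errors".
        category = entry["category"] if entry["category"] in CATEGORY_TO_LAYER else "other"
        layer = CATEGORY_TO_LAYER.get(category, "other_layer")
        dispatch[category].append((layer, entry["error_type"]))
    return dict(dispatch)
```

The returned dictionary gives, for each error category, the pairs of target hardware layer and target error type that would then be sent to the target hardware.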
A first error random number is also generated when the hardware real error occurrence scheme is parsed to obtain the target error type information corresponding to the target hardware layer information;
the computer instructions cause the computer to further perform the operations of:
correcting the target hardware layer information based on the first error random number.
A second error random number is also generated when the hardware real error occurrence scheme is parsed to obtain the target error type information corresponding to the target hardware layer information.
The computer instructions cause the computer to further perform the operations of:
correcting the target error type information based on the second error random number.
The computer instructions cause the computer to further perform the operations of:
issuing the target hardware layer information and the target error type information to the target hardware in the form of an error packet or an error stream so as to generate a real error.
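As a minimal sketch of what issuing in the form of an error packet or an error stream could look like, the following Python code packs one error into a fixed-size binary packet and yields a sequence of such packets as a stream. The packet layout (one byte for a layer identifier, one byte for an error type identifier, four bytes for an address) is purely an assumption for this example and is not the format used by any particular hardware or by the present application.

```python
import struct

# Assumed layout: 1-byte layer id, 1-byte error type id, 4-byte big-endian address.
PACKET_FORMAT = ">BBI"


def encode_error_packet(layer_id, error_type_id, address):
    """Pack one error description into a 6-byte packet."""
    return struct.pack(PACKET_FORMAT, layer_id, error_type_id, address)


def decode_error_packet(packet):
    """Unpack a packet back into (layer_id, error_type_id, address)."""
    return struct.unpack(PACKET_FORMAT, packet)


def error_stream(entries):
    """Yield packets one at a time, modelling the error-stream form."""
    for layer_id, error_type_id, address in entries:
        yield encode_error_packet(layer_id, error_type_id, address)
```

A single call to encode_error_packet corresponds to the error-packet form, while iterating over error_stream corresponds to the error-stream form in which errors are delivered one after another.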
The computer instructions cause the computer to further perform the operations of:
capturing log information of the real error through a preset interface, wherein the preset interface comprises at least one of a serial port, an XDP interface, an IPMI interface, a Redfish interface and an SSH interface; and the log information of the real error is reported by the target hardware, through the UEFI system, to the OS kernel and the BMC after the real error occurs.
The computer instructions cause the computer to further perform the operations of:
acquiring the hardware real error occurrence scheme;
acquiring register information of the trigger register settings for error transfer and handling, wherein the register information at least comprises SMI link information and CSMI link information.
The computer instructions cause the computer to further perform the operations of:
acquiring the register information that is transferred to the trigger register settings in the UEFI system after the error occurs.
The computer instructions cause the computer to further perform the operations of:
judging whether the log of the real error is consistent with the hardware real error occurrence scheme, and judging whether the register information is consistent with the hardware real error occurrence scheme;
if the log of the real error is consistent with the hardware real error occurrence scheme and the register information is consistent with the hardware real error occurrence scheme, determining that the server performance is qualified.
The computer instructions cause the computer to further perform the operations of:
judging whether the log of the real error is consistent with error information in the hardware real error occurrence scheme, wherein the error information comprises the error type, the error occurrence position, the number of errors and the error occurrence time.
The computer instructions cause the computer to further perform the operations of:
judging whether the error type in the register information is consistent with the error type in the hardware real error occurrence scheme.
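The consistency judgment described above can be sketched in Python as follows. The record fields (type, location, count, time_s), the one-to-one ordering of scheme entries, log entries and register records, and the one-second time tolerance are all assumptions made for this illustration rather than details taken from the present application.

```python
TIME_TOLERANCE_S = 1.0  # assumed tolerance when comparing error occurrence times


def log_matches(planned, logged):
    """Compare error type, position, number of errors and occurrence time."""
    return (
        planned["type"] == logged["type"]
        and planned["location"] == logged["location"]
        and planned["count"] == logged["count"]
        and abs(planned["time_s"] - logged["time_s"]) <= TIME_TOLERANCE_S
    )


def verify(scheme, logs, registers):
    """Return True only if every planned error matches both its log and its register record."""
    if not (len(scheme) == len(logs) == len(registers)):
        return False
    for planned, logged, reg in zip(scheme, logs, registers):
        if not log_matches(planned, logged):
            return False
        # For register information, only the error type is compared.
        if planned["type"] != reg["type"]:
            return False
    return True
```

Under these assumptions the server performance would be regarded as qualified only when verify returns True, that is, when both the log and the register information agree with the scheme for every injected error.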
In this specification, the embodiments are described in a progressive manner; identical or similar parts of the embodiments may be referred to one another, and each embodiment mainly describes its differences from the others. In particular, since the device embodiments are substantially similar to the method embodiments, they are described relatively briefly, and the relevant points can be found in the description of the method embodiments. The apparatus embodiments described above are merely illustrative: the modules illustrated as separate components may or may not be physically separate, and the components shown as modules may or may not be physical modules, i.e., they may be located in one place or distributed over a plurality of network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the present invention without undue burden.
In addition, it is to be understood that the terms "first," "second," "third," and "fourth" are used herein for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined by "first," "second," "third," or "fourth" may explicitly or implicitly include one or more such features.
The above embodiments merely illustrate the technical concept and features of the present invention. Their purpose is to enable those skilled in the art to understand and implement the present invention, and they are not intended to limit its scope of protection. All modifications made according to the spirit of the main technical solution of the invention shall be covered by the protection scope of the invention.

Claims (20)

1. A method for verifying server performance, the method comprising:
receiving an error injection instruction and generating a hardware real error occurrence scheme based on the error injection instruction;
transmitting the hardware real error occurrence scheme to target hardware so that the target hardware generates a real error based on the hardware real error occurrence scheme;
acquiring log information of the real error, and acquiring the hardware real error occurrence scheme and register information;
processing the real error based on the log information of the real error, the hardware real error occurrence scheme, and the register information to verify the server performance.
2. The server performance verification method according to claim 1, wherein the receiving an error injection instruction and generating a hardware true error occurrence scheme based on the error injection instruction comprises:
receiving an error injection instruction;
reading server hardware configuration information, wherein the server hardware configuration information at least comprises: hardware model, hardware quantity, content capacity and memory location;
generating a hardware real error occurrence scheme based on the error injection instruction and the server hardware configuration information, wherein the hardware real error occurrence scheme at least comprises an error list, an error priority and an error type.
3. The server performance verification method according to claim 2, wherein before receiving an error injection instruction and generating a hardware true error occurrence scheme based on the error injection instruction, the method further comprises:
displaying preset error injection schemes on a human-computer interaction interface so that a user can select one of them to input an error injection instruction, wherein the preset error injection schemes comprise any one of a custom error, a random error and a fault flood.
4. The server performance verification method according to claim 3, wherein, in response to a user-selected error injection scheme being a custom error, the generating a hardware real error occurrence scheme based on the error injection instruction and the server hardware configuration information comprises:
receiving custom hardware type information, custom error occurrence frequency information and error position information which are input by a user;
generating a hardware real error occurrence scheme based on the custom hardware type information, the custom error occurrence frequency information, the error position information and the server hardware configuration information.
5. The server performance verification method according to claim 3, wherein, in response to the error injection scheme selected by the user being a random error, the generating a hardware real error occurrence scheme based on the error injection instruction and the server hardware configuration information comprises:
randomly generating a hardware real error occurrence scheme based on the server hardware configuration information.
6. The server performance verification method according to claim 3, wherein, in response to the error injection scheme selected by the user being a fault flood, the generating a hardware real error occurrence scheme based on the error injection instruction and the server hardware configuration information comprises:
randomly generating errors based on the server hardware configuration information for a preset duration to generate a hardware real error occurrence scheme.
7. The method according to any one of claims 2-6, wherein before the sending the hardware real error occurrence scheme to the target hardware for the target hardware to generate the real error based on the hardware real error occurrence scheme, the method further comprises:
parsing the hardware real error occurrence scheme to obtain hardware real error category information, wherein the hardware real error category information comprises at least one of CPU hardware errors, memory hardware errors, bus and interface standard hardware errors and integrated south bridge errors;
parsing the hardware real error occurrence scheme to obtain target hardware layer information corresponding to the hardware real error category information;
parsing the hardware real error occurrence scheme to obtain target error type information corresponding to the target hardware layer information;
wherein the sending of the hardware real error occurrence scheme to the target hardware comprises:
sending the target hardware layer information and the target error type information to the target hardware based on the hardware real error category information.
8. The server performance verification method according to claim 7, wherein the parsing the hardware real error occurrence scheme to obtain hardware real error category information includes:
parsing the hardware real error occurrence scheme based on the error priority and the error list to obtain the hardware real error category information.
9. The server performance verification method according to claim 8, wherein a first error random number is further generated when parsing the hardware real error occurrence scheme to obtain the target error type information corresponding to the target hardware layer information;
before the parsing of the hardware real error occurrence scheme to obtain the target error type information corresponding to the target hardware layer information, the method further comprises:
correcting the target hardware layer information based on the first error random number.
10. The server performance verification method according to claim 9, wherein a second error random number is further generated when parsing the hardware real error occurrence scheme to obtain the target error type information corresponding to the target hardware layer information;
before the sending of the target hardware layer information and the target error type information to the target hardware based on the hardware real error category information, the method further comprises:
correcting the target error type information based on the second error random number.
11. The server performance verification method according to claim 10, wherein the sending the hardware real error occurrence scheme to target hardware for the target hardware to generate a real error based on the hardware real error occurrence scheme comprises:
issuing the target hardware layer information and the target error type information to the target hardware in the form of an error packet or an error stream so as to generate a real error.
12. The server performance verification method according to claim 11, wherein the obtaining the log information of the real error includes:
capturing log information of the real error through a preset interface, wherein the preset interface comprises at least one of a serial port, an extended debugging port interface, an intelligent platform management interface, a Redfish interface and an SSH interface; and the log information of the real error is reported by the target hardware, through a unified extensible firmware interface system, to an operating system kernel and a baseboard management controller after the real error occurs.
13. The server performance verification method according to claim 12, wherein the acquiring hardware real error occurrence scheme and register information includes:
acquiring the hardware real error occurrence scheme;
acquiring register information of the trigger register settings for error transfer and handling, wherein the register information at least comprises serial management interface link information and CSMI link information.
14. The server performance verification method according to claim 13, wherein the acquiring register information of the trigger register settings for error transfer and handling comprises:
acquiring the register information that is transferred to the trigger register settings in the unified extensible firmware interface system after the error occurs.
15. The server performance verification method according to claim 14, wherein the processing the real error based on the log information of the real error, the hardware real error occurrence scheme, and the register information to verify the server performance includes:
judging whether the log of the real error is consistent with the hardware real error occurrence scheme, and judging whether the register information is consistent with the hardware real error occurrence scheme;
if the log of the real error is consistent with the hardware real error occurrence scheme and the register information is consistent with the hardware real error occurrence scheme, determining that the server performance is qualified.
16. The server performance verification method according to claim 15, wherein the determining whether the log of the real error is consistent with the hardware real error occurrence scheme includes:
judging whether the log of the real error is consistent with error information in the hardware real error occurrence scheme, wherein the error information comprises the error type, the error occurrence position, the number of errors and the error occurrence time.
17. The server performance verification method according to claim 15, wherein the determining whether the register information is consistent with the hardware real error occurrence scheme includes:
judging whether the error type in the register information is consistent with the error type in the hardware real error occurrence scheme.
18. A server performance verification system, the system comprising:
a receiving and generating module, configured to receive an error injection instruction and generate a hardware real error occurrence scheme based on the error injection instruction;
a sending module, configured to send the hardware real error occurrence scheme to target hardware so that the target hardware generates a real error based on the hardware real error occurrence scheme;
an acquisition module, configured to acquire log information of the real error and to acquire the hardware real error occurrence scheme and register information;
and a processing module, configured to process the real error based on the log information of the real error, the hardware real error occurrence scheme and the register information so as to verify the server performance.
19. A computer device, the computer device comprising:
one or more processors;
and a memory associated with the one or more processors, the memory for storing program instructions that, when read for execution by the one or more processors, perform the server performance verification method of any one of claims 1-17.
20. A computer-readable storage medium storing computer instructions that cause the computer to perform the server performance verification method of any one of claims 1-17.
CN202310432839.1A 2023-04-21 2023-04-21 Method, system, computer device and storage medium for verifying server performance Pending CN116521496A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310432839.1A CN116521496A (en) 2023-04-21 2023-04-21 Method, system, computer device and storage medium for verifying server performance

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310432839.1A CN116521496A (en) 2023-04-21 2023-04-21 Method, system, computer device and storage medium for verifying server performance

Publications (1)

Publication Number Publication Date
CN116521496A true CN116521496A (en) 2023-08-01

Family

ID=87393405

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310432839.1A Pending CN116521496A (en) 2023-04-21 2023-04-21 Method, system, computer device and storage medium for verifying server performance

Country Status (1)

Country Link
CN (1) CN116521496A (en)

Similar Documents

Publication Publication Date Title
US7324922B2 (en) Run-time performance verification system
CN108388489B (en) Server fault diagnosis method, system, equipment and storage medium
US9569325B2 (en) Method and system for automated test and result comparison
EP2696534B1 (en) Method and device for monitoring quick path interconnect link
CN109800159A (en) Program debugging method, program debugging device, terminal device and storage medium
US11748218B2 (en) Methods, electronic devices, storage systems, and computer program products for error detection
CN100365994C (en) Method and system for regulating ethernet
CN109388604B (en) Hot plug control method, device and storage medium based on PCIe
CN101286129A (en) Embedded systems debugging
CN109753391A (en) The systems, devices and methods of the functional test of one or more structures of processor
EP3692443A1 (en) Application regression detection in computing systems
CN112732499A (en) Test method and device based on micro-service architecture and computer system
CN112380046A (en) Calculation result checking method, system, device, equipment and storage medium
CN112286750A (en) GPIO (general purpose input/output) verification method and device, electronic equipment and medium
CN106445787B (en) Method and device for monitoring server core dump file and electronic equipment
CN114003445A (en) I2C monitoring function test method, system, terminal and storage medium of BMC
CN111124828B (en) Data processing method, device, equipment and storage medium
CN116521496A (en) Method, system, computer device and storage medium for verifying server performance
CN115794530A (en) Hardware connection testing method, device, equipment and readable storage medium
CN112463574A (en) Software testing method, device, system, equipment and storage medium
Carreira et al. Assessing the effects of communication faults on parallel applications
CN114328065A (en) Interrupt verification method and device and electronic equipment
CN113722143A (en) Program flow monitoring method and device, electronic equipment and storage medium
CN112667512A (en) Data drive test method, device, equipment and computer readable storage medium
CN116382968B (en) Fault detection method and device for external equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination